Section 01
Unify-Agent: A New World-Knowledge-Grounded Image Synthesis Method Based on an Agent Architecture (Introduction)
Unify-Agent reframes image generation as a four-stage agent workflow: prompt understanding, multimodal evidence search, grounded re-description, and final synthesis. It is trained on 143K high-quality agent trajectories, and its world-knowledge grounding is evaluated on the FactIP benchmark, where experimental results show performance close to that of the strongest closed-source models.
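To make the control flow of the four stages concrete, here is a minimal sketch in Python. It is not Unify-Agent's implementation: the `Evidence` and `Trajectory` types, the `run_agent` function, and all `toy_*` stage functions are hypothetical names introduced for illustration, and the "search" stage is a dictionary lookup standing in for real multimodal retrieval. It only shows how each stage's output could feed the next.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Evidence:
    """One retrieved item (hypothetical schema): kind is 'text' or 'image'."""
    kind: str
    content: str


@dataclass
class Trajectory:
    """Record of one pass through the four stages."""
    prompt: str
    queries: List[str] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)
    grounded_prompt: str = ""
    image: str = ""


def run_agent(
    prompt: str,
    understand: Callable[[str], List[str]],
    search: Callable[[str], List[Evidence]],
    redescribe: Callable[[str, List[Evidence]], str],
    synthesize: Callable[[str], str],
) -> Trajectory:
    """Chain the four stages; each stage is injected as a plain callable."""
    traj = Trajectory(prompt=prompt)
    traj.queries = understand(prompt)              # 1. prompt understanding
    for query in traj.queries:                     # 2. multimodal evidence search
        traj.evidence.extend(search(query))
    traj.grounded_prompt = redescribe(prompt, traj.evidence)  # 3. grounded re-description
    traj.image = synthesize(traj.grounded_prompt)  # 4. final synthesis
    return traj


# Toy stand-ins (assumptions, not real components): a tiny fact table
# replaces web/image search, and a string replaces the generated image.
KNOWLEDGE = {"Marie Curie": "physicist, dark dress, hair pinned up"}


def toy_understand(prompt: str) -> List[str]:
    # Return the known entities that the prompt mentions.
    return [name for name in KNOWLEDGE if name in prompt]


def toy_search(query: str) -> List[Evidence]:
    return [Evidence("text", KNOWLEDGE[query])]


def toy_redescribe(prompt: str, evidence: List[Evidence]) -> str:
    # Append retrieved facts to the prompt; leave it unchanged if none exist.
    facts = "; ".join(item.content for item in evidence)
    return f"{prompt} ({facts})" if facts else prompt


def toy_synthesize(grounded_prompt: str) -> str:
    return f"<image for: {grounded_prompt}>"


if __name__ == "__main__":
    result = run_agent(
        "Marie Curie in her lab",
        toy_understand, toy_search, toy_redescribe, toy_synthesize,
    )
    print(result.grounded_prompt)
```

Passing the stages in as callables keeps the sketch independent of any particular model or search backend; a prompt that mentions no known entity simply flows through with an unchanged description.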