UAE
Paper
This repository hosts the official pre-trained weights for the paper "Can Understanding and Generation Truly Benefit Together -- or Just Coexist?" (https://arxiv.org/abs/2509.09666).
GitHub
The official code is available at https://github.com/PKU-YuanGroup/UAE.
Abstract
The field's long-standing split between "understanding" and "generation" leaves a central question open: can they truly benefit each other, or merely coexist? Despite progress, most "unified" models still run the two in parallel, limiting cross-task flow. We argue that real unification requires a bidirectional contract in which generation and understanding mutually enhance each other. Motivated by this, we revisit their relationship through an Auto-Encoder lens: understanding as the encoder (I2T) that compresses images into text, and generation as the decoder (T2I) that reconstructs images from that text. Using reconstruction as the single, unifying training signal, measured by semantic alignment between the input and its reconstruction, our UAE framework enforces closed-loop consistency and coherent information flow, turning coexistence into mutual gain.
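The closed loop described in the abstract can be illustrated with a small toy sketch. This is not the UAE implementation: the function names (`reconstruction_score`, `toy_understand`, `toy_generate`, `toy_embed`) are hypothetical, the "image" is just a short list of floats, and cosine similarity is assumed here as one possible stand-in for the semantic alignment measure, which the abstract does not specify.

```python
import math
from typing import Callable, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reconstruction_score(
    image: List[float],
    understand: Callable[[List[float]], str],     # encoder: image -> text (I2T)
    generate: Callable[[str], List[float]],       # decoder: text -> image (T2I)
    embed: Callable[[List[float]], List[float]],  # semantic embedding (assumed)
) -> float:
    """Encode the image to text, decode it back, and score the round trip."""
    caption = understand(image)
    reconstruction = generate(caption)
    return cosine_similarity(embed(image), embed(reconstruction))


# Toy stand-ins (hypothetical): the "caption" is a lossy, rounded description.
def toy_understand(image: List[float]) -> str:
    return " ".join(f"{value:.1f}" for value in image)


def toy_generate(caption: str) -> List[float]:
    return [float(token) for token in caption.split()]


def toy_embed(image: List[float]) -> List[float]:
    return list(image)


if __name__ == "__main__":
    score = reconstruction_score(
        [0.12, 0.48, 0.91], toy_understand, toy_generate, toy_embed
    )
    print(f"reconstruction score: {score:.4f}")
```

In this toy setup, a more detailed caption (less rounding) yields a reconstruction closer to the input and a higher score, which mirrors the intuition that better understanding and better generation reinforce each other under a single reconstruction signal.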
Model tree for zhiyuanyan1/UAE
Base model
stabilityai/stable-diffusion-3.5-large