LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas
Abstract
Despite their impressive visual fidelity, existing personalized generative models lack interactive control over spatial composition and scale poorly to multiple subjects. To address these limitations, we present LayerComposer, an interactive framework for personalized, multi-subject text-to-image generation. Our approach introduces two main contributions: (1) a layered canvas, a novel representation in which each subject is placed on a distinct layer, enabling occlusion-free composition; and (2) a locking mechanism that preserves selected layers with high fidelity while allowing the remaining layers to adapt flexibly to the surrounding context. Similar to professional image-editing software, the proposed layered canvas allows users to place, resize, or lock input subjects through intuitive layer manipulation. Our versatile locking mechanism requires no architectural changes, relying instead on inherent positional embeddings combined with a new complementary data-sampling strategy. Extensive experiments demonstrate that LayerComposer achieves superior spatial control and identity preservation compared to state-of-the-art methods in multi-subject personalized image generation.
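The user-facing interactions the abstract describes (placing, resizing, and locking subjects on distinct layers) can be sketched as a simple data structure. This is a minimal illustration of the interface, not the paper's implementation; the class and field names (`Layer`, `LayeredCanvas`, `locked`, normalized coordinates) are assumptions for exposition only.

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    """One subject on its own canvas layer (hypothetical structure)."""
    subject_id: str
    x: float       # top-left corner, normalized to [0, 1]
    y: float
    width: float   # normalized extent on the canvas
    height: float
    locked: bool = False  # locked layers would be preserved with high fidelity

@dataclass
class LayeredCanvas:
    """Ordered stack of subject layers; later layers occlude earlier ones."""
    layers: list = field(default_factory=list)

    def place(self, subject_id, x, y, width, height):
        layer = Layer(subject_id, x, y, width, height)
        self.layers.append(layer)
        return layer

    def resize(self, subject_id, width, height):
        for layer in self.layers:
            if layer.subject_id == subject_id:
                layer.width, layer.height = width, height

    def lock(self, subject_id):
        for layer in self.layers:
            if layer.subject_id == subject_id:
                layer.locked = True

# Example: compose two subjects and lock one of them.
canvas = LayeredCanvas()
canvas.place("person_a", 0.10, 0.20, 0.30, 0.60)
canvas.place("person_b", 0.55, 0.20, 0.30, 0.60)
canvas.lock("person_a")
locked_ids = [l.subject_id for l in canvas.layers if l.locked]
print(locked_ids)
```

In the paper's framing, a generator conditioned on such a canvas would keep the locked layer ("person_a" here) pixel-faithful while regenerating the unlocked layer to fit the scene.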
Community
LayerComposer introduces an interactive, Photoshop-like text-to-image generation framework that lets users place, resize, and lock multiple personalized subjects on separate layers, enabling scalable, high-fidelity, and spatially controlled multi-subject image synthesis.
The following related papers were recommended by the Semantic Scholar API:
- ComposeMe: Attribute-Specific Image Prompts for Controllable Human Image Generation (2025)
- ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation (2025)
- Does FLUX Already Know How to Perform Physically Plausible Image Composition? (2025)
- Griffin: Generative Reference and Layout Guided Image Composition (2025)
- MultiCrafter: High-Fidelity Multi-Subject Generation via Spatially Disentangled Attention and Identity-Aware Reinforcement Learning (2025)
- SIGMA-GEN: Structure and Identity Guided Multi-subject Assembly for Image Generation (2025)
- ReMix: Towards a Unified View of Consistent Character Generation and Editing (2025)
Models citing this paper: 0. Datasets citing this paper: 0. Spaces citing this paper: 0.