LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas Paper • 2510.20820 • Published 3 days ago • 7
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives Paper • 2510.20822 • Published 3 days ago • 32
Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall Paper • 2510.19304 • Published 5 days ago • 22
GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published 5 days ago • 37
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing Paper • 2510.19808 • Published 4 days ago • 21
Unified Reinforcement and Imitation Learning for Vision-Language Models Paper • 2510.19307 • Published 5 days ago • 22
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published 5 days ago • 101
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing Paper • 2510.17803 • Published 6 days ago • 12
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 9 days ago • 73
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery Paper • 2510.15869 • Published 9 days ago • 40
WithAnyone: Towards Controllable and ID Consistent Image Generation Paper • 2510.14975 • Published 10 days ago • 77
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 20 days ago • 446
DreamOmni2: Multimodal Instruction-based Editing and Generation Paper • 2510.06679 • Published 19 days ago • 73
VIST3A: Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator Paper • 2510.13454 • Published 12 days ago • 6
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published 11 days ago • 69
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training Paper • 2510.12586 • Published 12 days ago • 106