PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 11 days ago • 166
DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation Paper • 2601.22904 • Published 11 days ago • 15
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory Paper • 2601.16296 • Published 19 days ago • 28
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated Dec 16, 2025 • 42
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models Paper • 2601.03425 • Published Jan 6 • 16
Spiritus: An AI-Assisted Tool for Creating 2D Characters and Animations Paper • 2503.09127 • Published Mar 12, 2025 • 2
PersonaLive! Expressive Portrait Image Animation for Live Streaming Paper • 2512.11253 • Published Dec 12, 2025 • 36
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published Nov 27, 2025 • 32
RELIC: Interactive Video World Model with Long-Horizon Memory Paper • 2512.04040 • Published Dec 3, 2025 • 24
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 237
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow Paper • 2511.20462 • Published Nov 25, 2025 • 32
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12, 2025 • 70
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17, 2025 • 52
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published Nov 17, 2025 • 69
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 99
Vision Language Models: 2025 Update Collection This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 67 items • Updated May 12, 2025 • 6
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 506
MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency Paper • 2510.25897 • Published Oct 29, 2025 • 17
TradingAgents: Multi-Agents LLM Financial Trading Framework Paper • 2412.20138 • Published Dec 28, 2024 • 16