DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published 4 days ago • 27
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives Paper • 2510.20822 • Published 4 days ago • 34
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas Paper • 2510.20820 • Published 4 days ago • 7
Qwen3VL Collection Nexa AI infra to support Qwen3VL running on GPU/NPU/CPU • 22 items • Updated 3 days ago • 3
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published 5 days ago • 94
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published 6 days ago • 64
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 10 days ago • 44
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published 10 days ago • 48
TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling Paper • 2510.04533 • Published 21 days ago • 47
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Paper • 2510.01284 • Published 27 days ago • 30
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published 28 days ago • 42
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models Paper • 2510.04618 • Published 21 days ago • 108
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published 29 days ago • 44
OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing Paper • 2509.24900 • Published 28 days ago • 53
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published 27 days ago • 136