Composing Concepts from Images and Videos via Concept-prompt Binding Paper • 2512.09824 • Published Dec 10, 2025 • 27
MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment Paper • 2512.06628 • Published Dec 7, 2025 • 12
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Paper • 2511.23475 • Published Nov 28, 2025 • 42
Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation Paper • 2509.18824 • Published Sep 23, 2025 • 22
pyannote/speaker-diarization-3.1 Automatic Speech Recognition • Updated May 10, 2024 • 13.9M • 1.44k
deepseek-ai/DeepSeek-Prover-V2-671B Text Generation • 685B • Updated Apr 30, 2025 • 286 • 816