Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows • Paper • 2512.16969 • Published 6 days ago • 101 • 9
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling • Paper • 2511.20785 • Published 29 days ago • 153 • 7
Next-Embedding Prediction Makes Strong Vision Learners • Paper • 2512.16922 • Published 6 days ago • 77 • 3
WorldGen: From Text to Traversable and Interactive 3D Worlds • Paper • 2511.16825 • Published Nov 20 • 21 • 4
EgoX: Egocentric Video Generation from a Single Exocentric Video • Paper • 2512.08269 • Published 16 days ago • 108 • 3
Generalist Foundation Models Are Not Clinical Enough for Hospital Operations • Paper • 2511.13703 • Published Nov 17 • 21 • 3
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length • Paper • 2512.04677 • Published 20 days ago • 168 • 6
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance • Paper • 2512.08765 • Published 15 days ago • 125 • 5
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning • Paper • 2511.18659 • Published about 1 month ago • 18 • 3
ColPali: Efficient Document Retrieval with Vision Language Models • Paper • 2407.01449 • Published Jun 27, 2024 • 49 • 2
MMSearch-Plus: A Simple Yet Challenging Benchmark for Multimodal Browsing Agents • Paper • 2508.21475 • Published Aug 29 • 2 • 1
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models • Paper • 2512.02556 • Published 22 days ago • 228 • 6