Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts Paper • 2502.14865 • Published Feb 20 • 1
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding Paper • 2503.10621 • Published Mar 13
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs Paper • 2505.18152 • Published May 23 • 1
How Good are Foundation Models in Step-by-Step Embodied Reasoning? Paper • 2509.15293 • Published Sep 18
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published Mar 6 • 72
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10 • 65