RefineBench: Evaluating Refinement Capability of Language Models via Checklists • Paper • 2511.22173 • Published Nov 27, 2025 • 14
Energy-Based Transformers are Scalable Learners and Thinkers • Paper • 2507.02092 • Published Jul 2, 2025 • 69
Flow-GRPO: Training Flow Matching Models via Online RL • Paper • 2505.05470 • Published May 8, 2025 • 86
On Path to Multimodal Generalist: General-Level and General-Bench • Paper • 2505.04620 • Published May 7, 2025 • 82
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models • Paper • 2505.04921 • Published May 8, 2025 • 185
API Agents vs. GUI Agents: Divergence and Convergence • Paper • 2503.11069 • Published Mar 14, 2025 • 36