Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents • Paper • 2512.20092 • Published 4 days ago • 4
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios • Paper • 2512.18470 • Published 6 days ago • 8
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior • Paper • 2512.20757 • Published 3 days ago • 14
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • Paper • 2512.20848 • Published 3 days ago • 24
SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models • Paper • 2512.18542 • Published 6 days ago • 2
UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models • Paper • 2512.17385 • Published 8 days ago • 17
SpatialTree: How Spatial Abilities Branch Out in MLLMs • Paper • 2512.20617 • Published 3 days ago • 41
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies • Paper • 2512.19673 • Published 4 days ago • 59
Improving Recursive Transformers with Mixture of LoRAs • Paper • 2512.12880 • Published 12 days ago • 4
Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision • Paper • 2512.15489 • Published 9 days ago • 6
SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories • Paper • 2512.17419 • Published 8 days ago • 9
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience • Paper • 2512.17260 • Published 8 days ago • 47
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward • Paper • 2512.16912 • Published 8 days ago • 10
Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale • Paper • 2512.10398 • Published 16 days ago • 6
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning • Paper • 2512.10534 • Published 16 days ago • 31