Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents Paper • 2510.24702 • Published Oct 28, 2025 • 28
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29, 2025 • 45
Simulating Environments with Reasoning Models for Agent Training Paper • 2511.01824 • Published Nov 3, 2025 • 2
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 25 days ago • 36
ReasoningTransferability/UniReason-Qwen3-14B-think-SFT Text Generation • 15B • Updated Sep 28, 2025 • 12
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24, 2025 • 80
ReasoningTransferability/UniReason-Qwen3-14B-no-think-SFT Text Generation • 15B • Updated Aug 25, 2025 • 16 • 1
ReasoningTransferability/UniReason-Qwen3-14B-RL Text Generation • 15B • Updated Aug 25, 2025 • 21 • 3
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17, 2025 • 39