Running on CPU Upgrade Featured 2.67k The Smol Training Playbook 📚 2.67k The secrets to building world-class LLMs
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published Jun 30 • 50
Running 3.6k The Ultra-Scale Playbook 🌌 3.6k The ultimate guide to training LLM on large GPU Clusters