R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents Paper • 2504.07164 • Published Apr 9 • 1
The Smol Training Playbook 📚 • 2.68k • The secrets to building world-class LLMs
The Ultra-Scale Playbook 🌌 • 3.6k • The ultimate guide to training LLM on large GPU Clusters
FineWeb: decanting the web for the finest text data at scale 🍷 • 1.23k • Generate high-quality text data for LLMs using FineWeb
Tokenization in Transformers v5: Simpler, Clearer, and More Modular Article • 7 days ago • 74
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed Paper • 2512.14067 • Published 9 days ago • 12