Unlocking On-Policy Distillation for Any Model Family 📝 73 • Apply on-policy distillation to any model family
The Smol Training Playbook 📚 2.72k • The secrets to building world-class LLMs
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated 18 days ago • 277k • 1.55k
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning Text Generation • 24B • Updated Apr 20 • 101 • 58
bartowski/DeepSeek-R1-Distill-Qwen-32B-abliterated-GGUF Text Generation • Updated Jan 25 • 6.91k • 126