Running on CPU Upgrade • 928 • The Smol Training Playbook: The Secrets to Building World-Class LLMs
Article On the Shifting Global Compute Landscape By huggingface and 1 other • 4 days ago • 23
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 27 days ago • 462
ServiceNow-AI/Apriel-1.5-15b-Thinker Image-Text-to-Text • 15B • Updated 27 days ago • 55.1k • 427
Running • 197 • FineVision: Open Data is All You Need • A new open-source dataset for training VLMs
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 292
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 372
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • Jul 18 • 50
Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity Paper • 2506.09250 • Published Jun 10 • 27