Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated 25 days ago • 292
Rated Games Dataset Collection Datasets where each row is a rated chess game • 10 items • Updated Jul 10 • 8
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs Paper • 2410.02677 • Published Oct 3, 2024 • 1
Article How to Choose the Best Open Source LLM for Your Project in 2025 By dvilasuero • Sep 9 • 72
ITA-Bench: Italian Benchmarks for LLMs Collection A collection of Italian benchmarks for Large Language Models. See also our Github repo: https://github.com/SapienzaNLP/ita-bench • 19 items • Updated Dec 4, 2024 • 7
Article NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset By nvidia and 4 others • Aug 20 • 18
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 365
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 189
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30 • 65
Article CinePile 2.0: Making Stronger Datasets with Adversarial Refinement Oct 23, 2024 • 18
Article Back to The Future: Evaluating AI Agents on Predicting Future Events Jul 17 • 44