Models: 5,172

Active filters: quantized

unsloth/Qwen-Image-Layered-GGUF

Image-Text-to-Image • 20B • Updated Jan 9 • 4.43k • 42

0xSero/GLM-4.7-REAP-218B-A32B-W4A16

Text Generation • 2B • Updated Jan 15 • 384 • 19

bullpoint/Qwen3-Coder-Next-AWQ-4bit

Text Generation • 14B • Updated 15 days ago • 216k • 5

vincentzed-hf/Qwen3-Coder-Next-NVFP4

Text Generation • Updated 3 days ago • 2.46k • 5

Kilinskiy/Step-3.5-Flash-Ablitirated

Text Generation • 197B • Updated 12 days ago • 240 • 5

huawei-csl/Qwen3-4B-PreSINQ-GGUF

Text Generation • 4B • Updated 8 days ago • 109 • 3

tacos4me/Step-3.5-Flash-NVFP4

Text Generation • 111B • Updated 5 days ago • 936 • 2

inferencerlabs/MiniMax-M2.5-MLX-9bit

Text Generation • 229B • Updated 5 days ago • 643 • 2

marksverdhei/MiniMax-M2.5-GGUF

229B • Updated 5 days ago • 1.04k • 2

argmaxinc/whisperkit-coreml_01-30-24

Automatic Speech Recognition • Updated Mar 8, 2024 • 99 • 78

afrideva/TeenyTinyLlama-460m-GGUF

Text Generation • 0.5B • Updated May 12, 2024 • 57 • 1

MaziyarPanahi/Mistral-Small-Instruct-2409-GGUF

Text Generation • 22B • Updated Jan 1, 2025 • 145k • 4

MaziyarPanahi/QwQ-32B-GGUF

Text Generation • 33B • Updated Mar 7, 2025 • 144k • 4

boltuix/NeuroBERT-Tiny

Text Classification • 4.42M • Updated Jun 30, 2025 • 22 • 12

MaziyarPanahi/Qwen3-14B-GGUF

Text Generation • 15B • Updated Apr 28, 2025 • 230k • 8

MaziyarPanahi/INTELLECT-2-GGUF

Text Generation • 33B • Updated May 12, 2025 • 144k • 3

QuantStack/Wan2.1_T2V_14B_FusionX_VACE-GGUF

Image-to-Video • 17B • Updated Jun 12, 2025 • 2.62k • 57

QuantStack/Phantom_Wan_14B_FusionX-GGUF

Image-to-Video • 14B • Updated Jun 12, 2025 • 380 • 36

nvidia/Qwen3-235B-A22B-NVFP4

Text Generation • 133B • Updated Jul 8, 2025 • 5.17k • 14

nvidia/DeepSeek-R1-0528-NVFP4-v2

Text Generation • 394B • Updated Sep 2, 2025 • 134k • 14

Ayeshas21/sentence-transformers-all-MiniLM-L6-v2-quantized

Sentence Similarity • Updated Jul 27, 2025 • 4 • 1

RedHatAI/Voxtral-Mini-3B-2507-FP8-dynamic

Automatic Speech Recognition • 5B • Updated 13 days ago • 200 • 10

Immac/NetaYume-Lumina-Image-2.0-GGUF

Text-to-Image • 3B • Updated Aug 23, 2025 • 307 • 6

nvidia/Qwen3-8B-NVFP4

Text Generation • 5B • Updated Sep 9, 2025 • 20.7k • 14

nvidia/Qwen3-14B-FP8

Text Generation • 15B • Updated Sep 9, 2025 • 3.34k • 3

drwlf/MedraN-E4B-Uncensored-MLX-Quantized

Updated Sep 26, 2025 • 1

nvidia/gpt-oss-120b-Eagle3-short-context

Text Generation • Updated 23 days ago • 7.41k • 15

magiccodingman/Qwen3-4B-Instruct-2507-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 4B • Updated Dec 3, 2025 • 84 • 2

nvidia/DeepSeek-V3.1-NVFP4

Text Generation • 394B • Updated Jan 13 • 52.5k • 13

magiccodingman/Qwen3-4B-Instruct-2507-Unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 4B • Updated Dec 5, 2025 • 634 • 7