AI & ML interests

Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗

Organization Card
Welcome to the home of exciting quantized models! We'd love to see increased adoption of powerful state-of-the-art open models, and quantization is a key component to making them work on more types of hardware.
Resources:
- Llama 3.1 Quantized Models: Optimised quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗 (see the vLLM sketch below).
- Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.
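As a concrete starting point, here is a minimal sketch of serving one of the Llama 3.1 AWQ INT4 quants with vLLM. The repo id, context length and sampling settings below are illustrative assumptions; adjust them to the exact checkpoint and hardware you use.

```python
# Minimal sketch: offline generation with vLLM from an AWQ INT4 quant.
# Assumes `vllm` is installed and a CUDA GPU with enough memory is available.
from vllm import LLM, SamplingParams

# Repo id is an assumption; swap in the exact Llama 3.1 AWQ checkpoint you want.
llm = LLM(
    model="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    quantization="awq",      # use vLLM's AWQ kernels
    max_model_len=4096,      # keep the KV cache small for illustration
)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)
outputs = llm.generate(["Explain in one paragraph why AWQ INT4 helps throughput."], params)
print(outputs[0].outputs[0].text)
```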
Collections

Optimised AWQ Quants for high-throughput deployments of Gemma 2! Compatible with Transformers, TGI & vLLM 🤗
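A minimal sketch of loading the `hugging-quants/gemma-2-9b-it-AWQ-INT4` checkpoint with Transformers follows; it assumes `transformers` and `autoawq` are installed and a CUDA GPU is available, and the prompt is only a placeholder.

```python
# Minimal sketch: chat-style generation from the AWQ INT4 Gemma 2 quant with Transformers.
# Assumes `transformers` and `autoawq` are installed and a CUDA GPU is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/gemma-2-9b-it-AWQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # AWQ kernels run with fp16 activations
    device_map="auto",
)

messages = [{"role": "user", "content": "Why quantize a 9B model to INT4?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same repo id can also be pointed at TGI or vLLM for serving, along the lines of the vLLM sketch above.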
			
	
Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models (a usage sketch follows the list):
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF • Text Generation • 3B • 876 downloads • 52 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF • Text Generation • 3B • 18k downloads • 20 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF • Text Generation • 1B • 21.6k downloads • 35 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF • Text Generation • 1B • 38.8k downloads • 17 likes
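These GGUF repos can be run with llama.cpp or its Python bindings. Below is a minimal sketch using `llama-cpp-python`; the filename glob and generation settings are assumptions, so check the exact `.gguf` filename in the repo you pick.

```python
# Minimal sketch: chat completion from a Llama 3.2 GGUF quant via llama-cpp-python.
# Assumes `llama-cpp-python` and `huggingface_hub` are installed.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF",
    filename="*q4_k_m.gguf",  # glob-matches the quantized weights file in the repo
    n_ctx=4096,               # context window for this session
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In two sentences, what is a GGUF file?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```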
Models (21)

- hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm • Image-Text-to-Text • 109B • 5 downloads • 2 likes
- hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm-unfused • Image-Text-to-Text • 109B • 4 downloads • 2 likes
- hugging-quants/gemma-2-9b-it-AWQ-INT4 • Text Generation • 2B • 43.5k downloads • 6 likes
- hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4 • Text Generation • 6B • 11.6k downloads
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF • Text Generation • 1B • 38.8k downloads • 17 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF • Text Generation • 1B • 21.6k downloads • 35 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF • Text Generation • 3B • 18k downloads • 20 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF • Text Generation • 3B • 876 downloads • 52 likes
- hugging-quants/Meta-Llama-3.1-405B-BNB-NF4 • Text Generation • 211B • 7 downloads • 2 likes
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4 • Text Generation • 214B • 75 downloads • 5 likes
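The 405B repos at the end of the list ship pre-quantized bitsandbytes NF4 weights, so Transformers can load them without passing a separate quantization config. The sketch below is illustrative rather than a turnkey recipe: it assumes `bitsandbytes` and `accelerate` are installed and that you have enough GPU memory for a 405B model (or swap in a smaller BNB-NF4 checkpoint).

```python
# Minimal sketch: loading a pre-quantized bitsandbytes NF4 checkpoint with Transformers.
# Assumes `transformers`, `accelerate` and `bitsandbytes` are installed; the 405B weights
# still require a multi-GPU node, so this is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The quantization config is stored in the repo, so no BitsAndBytesConfig is needed here.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("Quantization lets large models", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```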
				
Datasets (0)

None public yet
