Matricardi Fabio
FM-1976
18 followers · 99 following
https://medium.com/@fabio.matricardi
ThePoorGpuGuy
fabiomatricardi
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
Liked a model, about 22 hours ago: LiquidAI/LFM2-2.6B-Exp
Reacted to codelion's post with 🚀, about 22 hours ago:

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:
→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
Liked a model, about 22 hours ago: codelion/dhara-70m
Organizations
None yet
FM-1976's models (11), sorted by recently updated
FM-1976/gemma-2b-docjoybot-lora-F16-GGUF • 10.4M • Updated May 9 • 14 • 1
FM-1976/Gaia-LLM-8B-Q4_K_M-GGUF • 8B • Updated May 9 • 15 • 1
FM-1976/Qwen-1.5B-Tweet-Generations-F16-GGUF • 2.18M • Updated May 8 • 12 • 1
FM-1976/SmolLM2-360M-it-llamafile • Text Generation • Updated Apr 15 • 12
FM-1976/Qwen2.5-1.6b-llamafile • Text Generation • Updated Apr 15 • 23 • 1
FM-1976/Lite-Oute-1-300M-Instruct-openvino • Text Generation • Updated Mar 7 • 10
FM-1976/stablelm-zephyr-3b-openvino-4bit • Updated Feb 24 • 16
FM-1976/ov_Llama-SmolTalk-3.2-1B-Instruct • Text Generation • Updated Nov 29, 2024 • 14
FM-1976/ov_NuExtract-1.5-tiny • Text Generation • Updated Nov 29, 2024 • 10
FM-1976/NuExtract-1.5-tiny-ONNX • Updated Nov 28, 2024 • 9
FM-1976/gemma-2-2b-it-Q5_K_M-GGUF • Text Generation • 3B • Updated Oct 13, 2024 • 14 • 1