ℏεsam
hesamation
AI & ML interests
post-training / reasoning models / RAG
Recent Activity
upvoted an article 29 days ago
We Got Claude to Fine-Tune an Open Source LLM
upvoted a paper 30 days ago
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
upvoted a paper about 1 month ago
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices