Stefano Fiorucci PRO

anakin87

theanakin87
anakin87
stefano-fiorucci

AI & ML interests

Language Models: orchestration, post-training, GRPO, synthetic data... Contributing to the Haystack LLM framework 🏗️

Recent Activity

upvoted a paper 3 days ago

Extracting alignment data in open models

updated a collection 3 days ago

📝 Cool LLM papers

upvoted an article 4 days ago

Extract Text and Knowledge from Images with Open Vision Language Models

anakin87's collections (4)

📝 Cool LLM papers

Starting from 2024-11-15

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 135
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26 • 56
Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3 • 56
Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 122

Gemma Neogenesis 💎🌍🇮🇹

Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy

anakin87/gemma-2-9b-neogenesis-ita

Text Generation • 9B • Updated Mar 10 • 774 • 11
anakin87/gemma-2-2b-neogenesis-ita

Text Generation • 3B • Updated Jan 16 • 789 • 6
Gemma 2 9B Neogenesis ITA 💎

Running on Zero • 9B Italian strong model 💪
Gemma 2 2B Neogenesis ITA 💎

Running on Zero • Chat with an Italian Small Model

Qwen Scheduler GRPO

Train an SLM to create a schedule from a list of events and priorities - Article: https://t.ly/-Dejx - Code: https://t.ly/1J_VG

anakin87/events-scheduling

Viewer • Updated Apr 26 • 600 • 85
anakin87/qwen-scheduler-7b-grpo

Text Generation • Updated Apr 26 • 6

🇮🇹 Italian Merges

I tried to merge two of the best Italian LLMs using Mergekit. The results are acceptable, but I could not improve on the best existing model.

anakin87/Llama-3-8b-ita-ties-pro

Text Generation • 8B • Updated May 24, 2024 • 769 • 1
anakin87/Llama-3-8b-ita-slerp

Text Generation • 8B • Updated May 24, 2024 • 778 • 1
anakin87/Llama-3-8b-ita-ties

Text Generation • 8B • Updated May 24, 2024 • 773 • 3
NikolayKozloff/Llama-3-8b-ita-ties-Q8_0-GGUF

8B • Updated May 19, 2024 • 5 • 1
