In a Training Loop 🔄

139 288

Arunkumar Venkataramanan

ArunkumarVR

https://arunkumarramanan.github.io

AI & ML interests

AGI Research: Reasoning, Safety & Alignment (Superalignment), Generative AI (GenAI), Multi-Modal Foundation Models (FMs), Large Language Models (LLMs), Transformers & Diffusion Models, Open LLM Training, Optimization & Finetuning, Serving & Inference

Recent Activity

upvoted a paper about 10 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 252

liked a model 3 days ago

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Text Generation • 8B • Updated May 29 • 491k • 1k

upvoted a collection 4 days ago

Moonlight-A3B

Moonshot's Compute-efficient MoE LLM, first Scaling Up of Muon Optimizer • 3 items • Updated Nov 2 • 9

upvoted a paper 4 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 127

liked a model 4 days ago

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26 • 4.32M • 825

liked a model 5 days ago

google/functiongemma-270m-it

Text Generation • 0.3B • Updated 6 days ago • 24.9k • 561

upvoted a collection 5 days ago

FunctionGemma

3 items • Updated 6 days ago • 28

liked 2 datasets 5 days ago

google/mobile-actions

Viewer • Updated 6 days ago • 9.65k • 2.47k • 153

allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 654k • 491

liked 3 datasets 6 days ago

nvidia/OpenCodeReasoning

Viewer • Updated May 4 • 753k • 4.06k • 519

nvidia/OpenMathReasoning

Viewer • Updated May 27 • 5.68M • 18k • 390

open-thoughts/OpenThoughts3-1.2M

Viewer • Updated Jun 9 • 1.2M • 12.3k • 196

upvoted an article 6 days ago

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

Article • 7 days ago • 32

liked 3 datasets 8 days ago

nvidia/Nemotron-CC-v2.1

Viewer • Updated 2 days ago • 3.8B • 17.3k • 28

nvidia/Nemotron-Pretraining-Code-v2

Viewer • Updated 2 days ago • 836M • 4.5k • 75

nvidia/Nemotron-CC-Code-v1

Viewer • Updated 2 days ago • 216M • 1.62k • 11

upvoted a collection 8 days ago

Nemotron v3 Pre-Training

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 1 day ago • 6

liked 3 datasets 8 days ago

nvidia/Nemotron-CC-v2

Viewer • Updated 2 days ago • 8.79B • 25k • 96

nvidia/Nemotron-Pretraining-Code-v1

Viewer • Updated 2 days ago • 936M • 4.07k • 55

nvidia/Nemotron-Pretraining-SFT-v1

Viewer • Updated 2 days ago • 299M • 8.89k • 48