Wenhan Ma

CuteNPC

https://github.com/CuteNPC

AI & ML interests

Large Language Model

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

Paper • 2512.17495 • Published 6 days ago • 17

upvoted a paper 23 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 24 days ago • 93

liked a model about 1 month ago

Lansechen/deepseek-v2-lite-16b-chat-R1-Distill-bs17k-batch32

Text Generation • 16B • Updated Feb 22 • 10 • 1

authored a paper about 2 months ago

Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Paper • 2510.11370 • Published Oct 13 • 3

authored a paper 7 months ago

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4 • 80

upvoted a paper 7 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 263

authored a paper 8 months ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12 • 82

upvoted a paper 8 months ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12 • 82

liked a dataset 9 months ago

agentica-org/DeepScaleR-Preview-Dataset

Viewer • Updated Feb 10 • 40.3k • 8.5k • 183

liked 5 datasets 10 months ago

liked 4 datasets about 1 year ago

trl-lib/ultrafeedback_binarized

Viewer • Updated Sep 12, 2024 • 63.1k • 6.01k • 20

Dahoas/synthetic-instruct-gptj-pairwise

Viewer • Updated Jan 9, 2023 • 33.1k • 572 • 57

peiyi9979/Math-Shepherd

Viewer • Updated Jan 3, 2024 • 445k • 576 • 100

meta-math/MetaMathQA

Viewer • Updated Dec 21, 2023 • 395k • 9.66k • 425

upvoted a paper about 1 year ago

Self-Boosting Large Language Models with Synthetic Preference Data

Paper • 2410.06961 • Published Oct 9, 2024 • 16

liked a dataset about 1 year ago

openai/gsm8k

Benchmark • Updated 5 days ago • 17.6k • 426k • 1.07k
