
Yuzhe Gu

vanilla1116

https://guyuzhe.site/

Liqu1d-G

AI & ML interests

LLM; Reasoning; Hallucination; Self-Improvement

Recent Activity

commented on a paper 19 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

authored a paper 24 days ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

authored a paper 24 days ago

Intern-S1: A Scientific Multimodal Foundation Model



commented on a paper 19 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

authored 4 papers 24 days ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5, 2025 • 37

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 259

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 25 days ago • 34

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

upvoted a paper 24 days ago

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published 25 days ago • 31

submitted a paper to Daily Papers 24 days ago

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published 25 days ago • 31

upvoted a paper 24 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

submitted a paper to Daily Papers 24 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

upvoted a paper 24 days ago

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 25 days ago • 34

submitted a paper to Daily Papers 24 days ago

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 25 days ago • 34

upvoted a paper about 1 month ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 47

authored a paper 3 months ago

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9, 2025 • 109

upvoted a paper 3 months ago

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9, 2025 • 109

upvoted a paper 5 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 259

liked 3 models 5 months ago

authored a paper 6 months ago

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Paper • 2507.16814 • Published Jul 22, 2025 • 21

upvoted a paper 6 months ago

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Paper • 2507.16814 • Published Jul 22, 2025 • 21
