Rui-Jie Zhu

ridger

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets

Paper • 2510.19944 • Published 6 days ago • 15

upvoted a paper 26 days ago

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Paper • 2509.25849 • Published 28 days ago • 47

upvoted a paper about 1 month ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25 • 47

upvoted a paper about 2 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 189

upvoted 2 papers 3 months ago

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published Aug 11 • 109

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published Jul 31 • 112

upvoted a paper 4 months ago

A Systematic Analysis of Hybrid Linear Attention

Paper • 2507.06457 • Published Jul 8 • 24

upvoted an article 4 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 701

upvoted 2 papers 4 months ago

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8 • 43

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92

upvoted an article 4 months ago

All LLMs Will Be Sparse BitNet Hybrids

Article • May 14 • 16

upvoted a collection 4 months ago

Hybrid Linear Attention Research

Collection

All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11 • 12

upvoted 2 papers 4 months ago

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

Paper • 2507.01925 • Published Jul 2 • 38

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2 • 130

upvoted a collection 4 months ago

ERNIE 4.5

Collection

Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174

upvoted a paper 4 months ago

Essential-Web v1.0: 24T tokens of organized web data

Paper • 2506.14111 • Published Jun 17 • 46

upvoted a paper 5 months ago

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published Jun 11 • 55

upvoted a paper 6 months ago

Efficient Pretraining Length Scaling

Paper • 2504.14992 • Published Apr 21 • 20

upvoted 2 papers 7 months ago

ARFlow: Autogressive Flow with Hybrid Linear Attention

Paper • 2501.16085 • Published Jan 27 • 1

A Comprehensive Survey on Long Context Language Modeling

Paper • 2503.17407 • Published Mar 20 • 49