arxiv:2509.22921
Tu Nguyen
tumeteor
AI & ML interests
LLM/RL, Graphs, IR/NLP
Recent Activity
upvoted a paper about 1 month ago
Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning
authored a paper about 1 month ago
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective
upvoted a paper about 1 month ago
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective
Organizations
None yet