CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning Paper • 2509.20712 • Published Sep 25
Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval Paper • 2408.10613 • Published Aug 20, 2024
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts Paper • 2410.16077 • Published Oct 21, 2024
Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts Paper • 2502.12928 • Published Feb 18
UniAttn: Reducing Inference Costs via Softmax Unification for Post-Training LLMs Paper • 2502.00439 • Published Feb 1
LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference Paper • 2505.12260 • Published May 18
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization Paper • 2508.07629 • Published Aug 11
Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal Paper • 2404.17808 • Published Apr 27, 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts Paper • 2407.09816 • Published Jul 13, 2024