Apertus: Democratizing Open and Compliant LLMs for Global Language Environments Paper • 2509.14233 • Published Sep 17, 2025
Bandits with Preference Feedback: A Stackelberg Game Perspective Paper • 2406.16745 • Published Jun 24, 2024
Contextual Bilevel Reinforcement Learning for Incentive Alignment Paper • 2406.01575 • Published Jun 3, 2024