LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads? Paper • 2510.09595 • Published Oct 10
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence Paper • 2507.21046 • Published Jul 28
AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes Paper • 2506.14728 • Published Jun 17
On Path to Multimodal Historical Reasoning: HistBench and HistAgent Paper • 2505.20246 • Published May 26
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution Paper • 2505.20286 • Published May 26
GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models Paper • 2505.10983 • Published May 16
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety Paper • 2504.09689 • Published Apr 13