UI-Genie-Agent-3B is a multimodal large language model (MLLM) trained for mobile GUI automation. It is part of the UI-Genie framework, a self-improving approach that enhances MLLM-based mobile GUI agents through iterative co-evolution of the agent and its reward model.

The model achieves state-of-the-art performance on mobile GUI benchmarks without manual annotation: its training trajectories are generated synthetically under the guidance of our specialized reward model, UI-Genie-RM.

| Model | Low-Level Tasks | High-Level Tasks |
|---|---|---|
| UI-Genie-Agent-3B | 93.8% SR | 72.9% SR |
| UI-TARS-2B | 89.3% SR | 68.9% SR |
| Qwen2.5-VL-3B | 90.8% SR | 63.7% SR |

| Model | Success Rate | Sub-Goal Success Rate |
|---|---|---|
| UI-Genie-Agent-3B | 28.8% | 35.4% |
| AutoGLM | 36.2% | - |
| Qwen2.5-VL-7B | 14.9% | 18.7% |
Our model is trained on a combination of:

The model supports a comprehensive action space for mobile interactions:

| Action Type | Parameters | Description |
|---|---|---|
| `open` | `app_name`, `action_desc` | Launch applications |
| `click` | `coordinate`/`som`, `action_desc` | Tap UI elements |
| `swipe` | `coordinate`/`som`, `direction`, `distance`, `action_desc` | Scroll the screen |
| `long_press` | `coordinate`/`som`, `action_desc` | Long press interactions |
| `type` | `text`, `action_desc` | Text input |
| `system_button` | `button`, `action_desc` | System button presses |
| `wait` | `time`, `action_desc` | Wait operations |
| `terminate` | `status`, `action_desc` | Task completion |
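
The action space above can be checked mechanically before an action is sent to a device. The following is a minimal sketch, not the model's official output schema: it assumes each predicted action has already been parsed into a Python dict with a hypothetical `action_type` key, and that parameters appear as top-level keys named as in the table. The `validate_action` helper and the example values are illustrative assumptions.

```python
from __future__ import annotations

from typing import Any

# Required parameters per action type, transcribed from the action-space table.
# A tuple means "any one of these": the table lists coordinate/som as alternatives.
ACTION_SPEC = {
    "open": ["app_name", "action_desc"],
    "click": [("coordinate", "som"), "action_desc"],
    "swipe": [("coordinate", "som"), "direction", "distance", "action_desc"],
    "long_press": [("coordinate", "som"), "action_desc"],
    "type": ["text", "action_desc"],
    "system_button": ["button", "action_desc"],
    "wait": ["time", "action_desc"],
    "terminate": ["status", "action_desc"],
}


def validate_action(action: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the action looks well-formed.

    Assumption: the action type is stored under the hypothetical key "action_type".
    Only the presence of parameters is checked, not their value types.
    """
    name = action.get("action_type")
    if name not in ACTION_SPEC:
        return [f"unknown action type: {name!r}"]
    problems = []
    for param in ACTION_SPEC[name]:
        if isinstance(param, tuple):
            # Alternative parameters: at least one must be present.
            if not any(p in action for p in param):
                problems.append(f"{name}: needs one of {'/'.join(param)}")
        elif param not in action:
            problems.append(f"{name}: missing {param}")
    return problems


# Illustrative values only; coordinates and descriptions are made up.
print(validate_action(
    {"action_type": "click", "coordinate": [540, 1200], "action_desc": "Tap Search"}
))  # []
print(validate_action(
    {"action_type": "swipe", "som": 3, "action_desc": "Scroll down"}
))  # ['swipe: missing direction', 'swipe: missing distance']
```

Keeping the specification in a single table-like dict means a change to the action space only requires editing `ACTION_SPEC`, not the validation logic.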
@misc{xiao2025uigenieselfimprovingapproachiteratively,
title={UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents},
author={Han Xiao and Guozhi Wang and Yuxiang Chai and Zimu Lu and Weifeng Lin and Hao He and Lue Fan and Liuyang Bian and Rui Hu and Liang Liu and Shuai Ren and Yafei Wen and Xiaoxin Chen and Aojun Zhou and Hongsheng Li},
year={2025},
eprint={2505.21496},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.21496},
}