HanXiao

HanXiao1999 · Euphoria16

AI & ML interests

None yet

Recent Activity

updated a collection 39 minutes ago

updated a collection 40 minutes ago

Organizations

None yet

authored a paper 3 months ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28 • 81

authored 11 papers 5 months ago

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 15

ImageBind-LLM: Multi-modality Instruction Tuning

Paper • 2309.03905 • Published Sep 7, 2023 • 17

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

Paper • 2403.16999 • Published Mar 25, 2024 • 5

Token-Label Alignment for Vision Transformers

Paper • 2210.06455 • Published Oct 12, 2022

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

Paper • 2406.18583 • Published Jun 5, 2024

AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents

Paper • 2407.17490 • Published Jul 3, 2024 • 31

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16, 2024 • 46

LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects

Paper • 2504.19838 • Published Apr 28 • 22

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Paper • 2505.03733 • Published May 6 • 17

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Paper • 2505.10557 • Published May 15 • 47

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

Paper • 2505.21496 • Published May 27 • 38