MageBench-Leaderboard

daiqi committed on Nov 27, 2024

Commit

df18eb0

verified

1 Parent(s): d622bf5

Update src/about.py

Files changed (1)

src/about.py

@@ -21,11 +21,20 @@ NUM_FEWSHOT = 0 # Change with your few shot
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">Demo leaderboard</h1>"""
+TITLE = """<h1 align="center" id="space-title">MageBench Leaderboard</h1>"""
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-Intro text
+![overview image](./assets/overview.pdf)
+MageBench is a reasoning-oriented multimodal intelligent agent benchmark introduced in the paper "xxx".
+The tasks we selected meet the following criteria:
+- Simple environment,
+- Reflect a certain reasoning ability，
+- High level of visual involvement.
+In our paper, we demonstrate that our benchmark can generalize well to other scenarios.
+We hope our work can empower future research in the fields of intelligent agents, robotics, and more.
 """
 # Which evaluations are you running? how can people reproduce what you have?