Update src/about.py
- src/about.py +11 -2
src/about.py
CHANGED
@@ -21,11 +21,20 @@ NUM_FEWSHOT = 0 # Change with your few shot
 
 
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">
+TITLE = """<h1 align="center" id="space-title">MageBench Leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-
+
+
+MageBench is a reasoning-oriented multimodal intelligent agent benchmark introduced in the paper "xxx".
+The tasks we selected meet the following criteria:
+- Simple environment,
+- Reflect a certain reasoning ability,
+- High level of visual involvement.
+
+In our paper, we demonstrate that our benchmark can generalize well to other scenarios.
+We hope our work can empower future research in the fields of intelligent agents, robotics, and more.
 """
 
 # Which evaluations are you running? how can people reproduce what you have?