Update constants.py
constants.py  CHANGED  +5 -0
@@ -82,6 +82,11 @@ TABLE_INTRODUCTION = """In the table below, we summarize each task performance o
 We use accuracy (%) as the primary evaluation metric for each task.
 SEED-Bench-1 calculates the overall accuracy by dividing the total number of correct QA answers by the total number of QA questions.
 SEED-Bench-2 represents the overall accuracy using the average accuracy of each dimension.
+For the PPL evaluation method, we compute the loss for each candidate answer and select the candidate with the lowest loss. For details, please refer to [InternLM_Xcomposer_VL_interface](https://github.com/AILab-CVC/SEED-Bench/blob/387a067b6ba99ae5e8231f39ae2d2e453765765c/SEED-Bench-2/model/InternLM_Xcomposer_VL_interface.py#L74).
+For the PPL A/B/C/D evaluation method, please refer to [EVAL_SEED.md](https://github.com/QwenLM/Qwen-VL/blob/master/eval_mm/seed_bench/EVAL_SEED.md) for more information.
+For the Generate evaluation method, please refer to [Evaluation.md](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md#seed-bench) for details.
+NG indicates that the evaluation method is Not Given.
+If you have any questions, please feel free to contact us.
 """
 
 LEADERBORAD_INFO = """
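A minimal sketch of the PPL selection rule described in the added text, assuming a Hugging Face-style causal LM; `ppl_select`, `model`, and `tokenizer` are illustrative names, not part of the SEED-Bench code (the actual logic lives in the InternLM_Xcomposer_VL_interface.py file linked above):

```python
# Illustrative sketch of the PPL evaluation: score each candidate answer by the
# language-modeling loss of "question + candidate" and keep the lowest-loss one.
# `model`/`tokenizer` are placeholders (assumption, not the SEED-Bench code).
import torch

def ppl_select(model, tokenizer, question: str, candidates: list[str]) -> int:
    """Return the index of the candidate with the lowest language-modeling loss."""
    losses = []
    for cand in candidates:
        ids = tokenizer(question + " " + cand, return_tensors="pt").input_ids
        with torch.no_grad():
            # labels=ids makes the model return the mean cross-entropy loss;
            # the linked reference restricts the loss to the answer tokens.
            loss = model(input_ids=ids, labels=ids).loss.item()
        losses.append(loss)
    return losses.index(min(losses))  # lowest loss = lowest perplexity = prediction
```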
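For contrast, a similarly hedged sketch of the Generate evaluation, where the model free-generates an answer and the first option letter in the output is taken as its choice; the function name and the parsing rule are assumptions for illustration (see the linked LLaVA Evaluation.md for the actual pipeline):

```python
# Illustrative sketch of the Generate evaluation: let the model generate text
# and parse out the chosen option letter. Names and the parsing rule are
# assumptions for illustration only.
def generate_select(model, tokenizer, prompt_with_options: str) -> str:
    ids = tokenizer(prompt_with_options, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=8)
    # Decode only the newly generated tokens (the prompt is echoed back first).
    text = tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
    # Take the first A/B/C/D appearing in the generation as the predicted choice.
    for letter in ("A", "B", "C", "D"):
        if letter in text:
            return letter
    return ""  # no option letter recovered
```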