update
app.py CHANGED
@@ -25,9 +25,7 @@ def make_default_md(arena_df, elo_results):
 | [Vote](https://chat.lmsys.org) | [Blog](https://lmsys.org/blog/2023-05-03-arena/) | [GitHub](https://github.com/lm-sys/FastChat) | [Paper](https://arxiv.org/abs/2306.05685) | [Dataset](https://github.com/lm-sys/FastChat/blob/main/docs/dataset_release.md) | [Twitter](https://twitter.com/lmsysorg) | [Discord](https://discord.gg/HSWAKCrnFx) |
 
 LMSYS [Chatbot Arena](https://lmsys.org/blog/2023-05-03-arena/) is a crowdsourced open platform for LLM evals.
-We've collected over **500,000** human preference votes to rank LLMs with the Elo ranking system.
-
-Code to recreate leaderboard tables and plots in this [notebook]({notebook_url}) and more discussions in this blog [post](https://lmsys.org/blog/2023-12-07-leaderboard/).
+We've collected over **500,000** human preference votes to rank LLMs with the Elo ranking system.
 """
 return leaderboard_md
 
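For context on the Elo mention in the hunk above: the ratings are fit from pairwise human preference votes. Below is a minimal sketch of a classic online Elo update over battle records, in the app's language (Python). It is an illustration only, not the leaderboard's exact pipeline (that lives in the linked notebook); the constants and the battles record format are assumptions.

from collections import defaultdict

def compute_elo_sketch(battles, k=4, base=10, scale=400, init_rating=1000):
    # battles: iterable of (model_a, model_b, winner) where winner is
    # "model_a", "model_b", or "tie" -- a hypothetical record format.
    rating = defaultdict(lambda: init_rating)
    for model_a, model_b, winner in battles:
        ra, rb = rating[model_a], rating[model_b]
        # Expected score of model_a under the Elo logistic model.
        ea = 1 / (1 + base ** ((rb - ra) / scale))
        sa = {"model_a": 1.0, "model_b": 0.0, "tie": 0.5}[winner]
        rating[model_a] = ra + k * (sa - ea)  # winner moves up ...
        rating[model_b] = rb + k * (ea - sa)  # ... loser down, symmetrically
    return dict(rating)

votes = [("gpt-4", "llama-2-70b", "model_a"), ("gpt-4", "claude-2", "tie")]
print(compute_elo_sketch(votes))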
@@ -37,9 +35,10 @@ def make_arena_leaderboard_md(arena_df):
 total_models = len(arena_df)
 space = " "
 leaderboard_md = f"""
-Total #models: **{total_models}**.{space} Total #votes: **{"{:,}".format(total_votes)}**.{space} Last updated: April
+Total #models: **{total_models}**.{space} Total #votes: **{"{:,}".format(total_votes)}**.{space} Last updated: April 11, 2024.
 
 📣 **NEW!** View leaderboard for different categories (e.g., coding, long user query)!
+Code to recreate leaderboard tables and plots in this [notebook]({notebook_url}). Cast your vote 🗳️ at [chat.lmsys.org](https://chat.lmsys.org)!
 """
 return leaderboard_md
 
@@ -405,7 +404,7 @@ def build_leaderboard_tab(elo_results_file, leaderboard_table_file, show_plot=Fa
 gr.Markdown(
 f"""Note: we take the 95% confidence interval into account when determining a model's ranking.
 A model is ranked higher only if its lower bound of model score is higher than the upper bound of the other model's score.
-See Figure 3 below for visualization of the confidence intervals.
+See Figure 3 below for visualization of the confidence intervals. More details in [notebook]({notebook_url}).
 """,
 elem_id="leaderboard_markdown"
 )
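The note in this last hunk encodes a concrete ranking rule: model A outranks model B only when A's 95% confidence-interval lower bound exceeds B's upper bound, so models with overlapping intervals share a rank. A minimal sketch of that rule, assuming a hypothetical per-model dict of (lower, upper) bounds (the real app reads its scores from elo_results_file):

def rank_with_ci(intervals):
    # intervals: dict model -> (lower, upper) bounds of the 95% CI.
    ranks = {}
    for m, (lo_m, hi_m) in intervals.items():
        # Rank = 1 + number of models whose lower bound clears m's upper bound.
        ranks[m] = 1 + sum(
            1 for o, (lo_o, _) in intervals.items() if o != m and lo_o > hi_m
        )
    return ranks

cis = {"model-x": (1180, 1210), "model-y": (1195, 1225), "model-z": (1100, 1130)}
print(rank_with_ci(cis))  # x and y overlap and share rank 1; z ranks 3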