open_asr_leaderboard_cl / README.md

astroza

Update leaderboard configuration and results processing for Chilean Spanish ASR evaluation

13a06cd 30 days ago

metadata

title: Open Asr Leaderboard CL
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
sdk_version: 4.44.0
tags:
  - leaderboard

Chilean Spanish ASR Leaderboard

A simple Gradio-based leaderboard that displays evaluation results for ASR models on Chilean Spanish.

Quick Start

This simplified version reads results from a CSV file and displays them in two tabs:

🏅 Chilean Spanish ASR Leaderboard: Shows model rankings based on WER and RTFx metrics
📝 About: Detailed information about the evaluation methodology and datasets

Running the Leaderboard

# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl

# Install dependencies
pip install gradio pandas

# Run the application
python app.py

The application will load results from results.csv and display them in a simple, clean interface.

Results Format

The results.csv file should contain the following columns:

model_id: The model identifier (e.g., "openai/whisper-large-v3")
wer: Word Error Rate (lower is better)
rtfx: Inverse Real-Time Factor, i.e. audio duration divided by processing time (higher is better)
Additional metadata columns (dataset, num_samples, etc.)
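
As a rough illustration of the format above, the following sketch parses CSV text and checks that the required columns are present. It uses only the Python standard library and is not the code in app.py; the function name read_results, the sample model names, and every number in SAMPLE are hypothetical placeholders, and whether wer is stored as a percentage or a fraction is an assumption left open here.

```python
import csv
import io

# Column names taken from the list above.
REQUIRED_COLUMNS = {"model_id", "wer", "rtfx"}


def read_results(csv_text):
    """Parse results CSV text and check that the required columns exist."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"results.csv is missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        # Metric columns are parsed as floats; other columns stay as strings.
        row["wer"] = float(row["wer"])
        row["rtfx"] = float(row["rtfx"])
        rows.append(row)
    return rows


# Placeholder values for illustration only, not real evaluation results.
SAMPLE = """model_id,dataset,num_samples,wer,rtfx
example/model-a,dataset-1,100,12.5,150.0
example/model-b,dataset-1,100,9.0,80.0
"""

rows = read_results(SAMPLE)
```

To check a real file, the same function could be called with the contents of results.csv, for example read_results(open("results.csv", encoding="utf-8").read()).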

Configuration

Title and Content: Edit src/about.py to modify the title, introduction text, and about section
Styling: Customize appearance in src/display/css_html_js.py
Data Processing: Modify the load_results() function in app.py to change how results are aggregated and displayed
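
For readers who want to change the aggregation, here is a minimal sketch of one possible approach. It is not the actual load_results() from app.py: the function name aggregate_results, the assumption of one input row per model and dataset, and the sample values are all hypothetical. It averages WER and RTFx per model, then sorts by lowest average WER, using higher average RTFx to break ties.

```python
from collections import defaultdict
from statistics import mean


def aggregate_results(rows):
    """Average wer and rtfx per model and sort the result for display."""
    per_model = defaultdict(lambda: {"wer": [], "rtfx": []})
    for row in rows:
        per_model[row["model_id"]]["wer"].append(float(row["wer"]))
        per_model[row["model_id"]]["rtfx"].append(float(row["rtfx"]))

    table = [
        {
            "model_id": model_id,
            "avg_wer": mean(values["wer"]),
            "avg_rtfx": mean(values["rtfx"]),
        }
        for model_id, values in per_model.items()
    ]
    # Lowest average WER first; ties are broken by higher average RTFx.
    table.sort(key=lambda r: (r["avg_wer"], -r["avg_rtfx"]))
    return table


# Placeholder rows for illustration only, not real evaluation results.
example_rows = [
    {"model_id": "example/model-a", "dataset": "dataset-1", "wer": 10.0, "rtfx": 100.0},
    {"model_id": "example/model-a", "dataset": "dataset-2", "wer": 20.0, "rtfx": 200.0},
    {"model_id": "example/model-b", "dataset": "dataset-1", "wer": 5.0, "rtfx": 50.0},
    {"model_id": "example/model-b", "dataset": "dataset-2", "wer": 15.0, "rtfx": 70.0},
]

leaderboard = aggregate_results(example_rows)
```

A plain mean gives every dataset equal weight regardless of its size; weighting by num_samples would be a different design choice.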

About the Evaluation

This leaderboard evaluates ASR models on Chilean Spanish using three datasets:

Common Voice (Chilean Spanish subset)
Google Chilean Spanish
Datarisas

Models are ranked by average Word Error Rate (WER) across all datasets, with inverse Real-Time Factor (RTFx) as a secondary metric for inference speed.
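
The two metrics can be illustrated with a short sketch. WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, and RTFx is audio duration divided by processing time. The function names below are hypothetical, and the sketch applies no text normalization (casing, punctuation, accents), which the actual evaluation code may handle differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def rtfx(audio_seconds, processing_seconds):
    """Inverse real-time factor: seconds of audio per second of compute."""
    return audio_seconds / processing_seconds
```

For example, word_error_rate("a b c d", "a c d") counts one deleted word out of four reference words, giving 0.25, and rtfx(60, 2) gives 30.0, meaning the audio was transcribed 30 times faster than real time.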

Models Evaluated

openai/whisper-large-v3
openai/whisper-large-v3-turbo
openai/whisper-small
rcastrovexler/whisper-small-es-cl (Chilean Spanish fine-tuned)
nvidia/canary-1b-v2
nvidia/parakeet-tdt-0.6b-v3
microsoft/Phi-4-multimodal-instruct
mistralai/Voxtral-Mini-3B-2507
elevenlabs/scribe_v1

For the detailed methodology and the complete evaluation framework, see the Modal-based evaluation code in the original repository.

Citation

@misc{astroza2024chilean,
  title={Chilean Spanish ASR Test Dataset},
  author={Alonso Astroza},
  year={2025},
  howpublished={\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}