metadata
title: Open Asr Leaderboard CL
emoji: π₯
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
sdk_version: 4.44.0
tags:
- leaderboard
Chilean Spanish ASR Leaderboard
Simple Gradio-based leaderboard displaying ASR evaluation results for Chilean Spanish models.
Quick Start
This simplified version reads results from a CSV file and displays them in two tabs:
- Chilean Spanish ASR Leaderboard: Shows model rankings based on WER and RTFx metrics
- About: Detailed information about the evaluation methodology and datasets
Running the Leaderboard
# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl
# Install dependencies
pip install gradio pandas
# Run the application
python app.py
The application will load results from results.csv and display them in a simple, clean interface.
Results Format
The results.csv file should contain the following columns:
- model_id: The model identifier (e.g., "openai/whisper-large-v3")
- wer: Word Error Rate (lower is better)
- rtfx: Inverse real-time factor (higher is better)
- Additional metadata columns (dataset, num_samples, etc.)
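As a rough illustration of the format above, the following sketch reads a results file with Python's standard csv module and checks that the required columns are present. It is not the code in app.py (which depends on pandas); the function name read_results, the REQUIRED_COLUMNS set, and the sample rows are assumptions made up for this example, and the numbers are not real results.

```python
import csv
import io

# Columns the README lists as required; everything else is optional metadata.
REQUIRED_COLUMNS = {"model_id", "wer", "rtfx"}


def read_results(handle):
    """Read leaderboard rows from a CSV file object, converting metrics to floats."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"results file is missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        row["wer"] = float(row["wer"])
        row["rtfx"] = float(row["rtfx"])
        rows.append(row)
    return rows


# Illustrative data only: hypothetical model names and made-up numbers.
SAMPLE = """model_id,wer,rtfx,dataset,num_samples
example/model-a,12.5,80.0,demo_set,100
example/model-b,9.0,40.0,demo_set,100
"""

rows = read_results(io.StringIO(SAMPLE))
for row in rows:
    print(row["model_id"], row["wer"], row["rtfx"])
```

To read a real file, pass an open file object instead of the in-memory sample, for example open("results.csv", newline="", encoding="utf-8").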
Configuration
- Title and Content: Edit src/about.py to modify the title, introduction text, and about section
- Styling: Customize appearance in src/display/css_html_js.py
- Data Processing: Modify the load_results() function in app.py to change how results are aggregated and displayed
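For the data processing step, one possible aggregation is sketched below: group rows by model, average the metrics over datasets, and sort by WER with RTFx as a tie-breaker. This is a hypothetical stand-in, not the actual load_results() in app.py; the function name aggregate, the output keys avg_wer and avg_rtfx, the choice of a plain arithmetic mean, and the sample rows are all assumptions.

```python
from collections import defaultdict


def aggregate(rows):
    """Average WER and RTFx per model, then sort by WER (ascending), RTFx (descending)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["model_id"]].append(row)

    table = []
    for model_id, items in grouped.items():
        table.append({
            "model_id": model_id,
            "avg_wer": sum(r["wer"] for r in items) / len(items),
            "avg_rtfx": sum(r["rtfx"] for r in items) / len(items),
        })

    # Lower WER ranks first; among equal WER, higher RTFx ranks first.
    table.sort(key=lambda r: (r["avg_wer"], -r["avg_rtfx"]))
    return table


# Illustrative data only: hypothetical models and made-up numbers.
rows = [
    {"model_id": "example/model-a", "dataset": "set_1", "wer": 10.0, "rtfx": 50.0},
    {"model_id": "example/model-a", "dataset": "set_2", "wer": 14.0, "rtfx": 70.0},
    {"model_id": "example/model-b", "dataset": "set_1", "wer": 8.0, "rtfx": 20.0},
    {"model_id": "example/model-b", "dataset": "set_2", "wer": 12.0, "rtfx": 40.0},
]

leaderboard = aggregate(rows)
for rank, entry in enumerate(leaderboard, start=1):
    print(rank, entry["model_id"], entry["avg_wer"], entry["avg_rtfx"])
```

A weighted mean (for instance by num_samples) would be an equally reasonable choice; the unweighted mean is used here only to keep the sketch short.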
About the Evaluation
This leaderboard evaluates ASR models on Chilean Spanish using three datasets:
- Common Voice (Chilean Spanish subset)
- Google Chilean Spanish
- Datarisas
Models are ranked by average Word Error Rate (WER) across all datasets, with inverse real-time factor (RTFx) as a secondary metric for inference speed.
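To make the two metrics concrete, here is a minimal sketch of how they are commonly defined: WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and RTFx as seconds of audio divided by seconds of processing time. The function names are assumptions for this example, and no text normalization is applied, whereas evaluation pipelines typically normalize casing, punctuation, and similar details before scoring. It is not the evaluation code used to produce the leaderboard.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                cur[j - 1] + 1,      # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


def rtfx(audio_seconds, processing_seconds):
    """Inverse real-time factor: how many seconds of audio are handled per second."""
    return audio_seconds / processing_seconds


# One substitution ("cómo" vs "como") and one deletion ("hoy") over 4 reference words.
wer = word_error_rate("hola cómo estás hoy", "hola como estás")
print(wer)  # 0.5

# 60 seconds of audio transcribed in 1.5 seconds.
print(rtfx(60.0, 1.5))  # 40.0
```

The example also shows why normalization matters: a missing accent alone counts as a full word error under this unnormalized definition.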
Models Evaluated
- openai/whisper-large-v3
- openai/whisper-large-v3-turbo
- openai/whisper-small
- rcastrovexler/whisper-small-es-cl (Chilean Spanish fine-tuned)
- nvidia/canary-1b-v2
- nvidia/parakeet-tdt-0.6b-v3
- microsoft/Phi-4-multimodal-instruct
- mistralai/Voxtral-Mini-3B-2507
- elevenlabs/scribe_v1
For detailed methodology and complete evaluation framework, see the Modal-based evaluation code in the original repository.
Citation
@misc{astroza2024chilean,
title={Chilean Spanish ASR Test Dataset},
author={Alonso Astroza},
year={2025},
howpublished={\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}