---
language:
- fon
thumbnail: null
tags:
- automatic-speech-recognition
- CTC
- Attention
- Transformer
- Conformer
- pytorch
- speechbrain
license: apache-2.0
datasets:
- beethogedeon/fongbe-speech
metrics:
- wer
- cer
---

# Fongbe ASR model without diacritics

### How to use for inference

The snippet below downloads the model from the Hugging Face Hub and transcribes the example file bundled with the repository. For batched decoding, see the sketch at the end of this card.

```python
from speechbrain.inference.ASR import EncoderASR

asr_model = EncoderASR.from_hparams(
    source="whettenr/asr-fon-without-diacritics",
    savedir="pretrained_models/asr-fongbe-without-diacritics",
)

# transcribe_file fetches example.wav from the model repository
transcription = asr_model.transcribe_file(
    "whettenr/asr-fon-without-diacritics/example.wav"
)
print(transcription)
# expected output:
# huzuhuzu gɔngɔn ɖe ɖo dandan
```

### Details of model

~100M parameters: a 12-layer Conformer encoder followed by a feed-forward (FFNN) decoder.

### Details of training

- pretrained using BEST-RQ on ~140 hours of Fongbe speech:
  - FFSTC 2 + beethogedeon/fongbe-speech (~40 hours)
  - cappfm (~100 hours)
- fine-tuned with CTC loss on the training sets of:
  - FFSTC 2
  - beethogedeon/fongbe-speech

A toy sketch of the CTC objective appears at the end of this card.

### Citation

```bibtex
@inproceedings{kponou25_interspeech,
  title     = {{Extending the Fongbe to French Speech Translation Corpus: resources, models and benchmark}},
  author    = {D. Fortuné Kponou and Salima Mdhaffar and Fréjus A. A. Laleye and Eugène C. Ezin and Yannick Estève},
  year      = {2025},
  booktitle = {{Interspeech 2025}},
  pages     = {4533--4537},
  doi       = {10.21437/Interspeech.2025-1801},
  issn      = {2958-1796},
}
```
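### Batch inference

`EncoderASR` also exposes `transcribe_batch` for decoding waveforms already loaded in memory. The following is a minimal sketch of that usage; `my_recording.wav` is a hypothetical placeholder for a local Fongbe audio file, not a file shipped with this repository.

```python
import torch
from speechbrain.inference.ASR import EncoderASR

asr_model = EncoderASR.from_hparams(
    source="whettenr/asr-fon-without-diacritics",
    savedir="pretrained_models/asr-fongbe-without-diacritics",
)

# "my_recording.wav" is a hypothetical local file; replace it with your own audio.
wav = asr_model.load_audio("my_recording.wav")  # loaded and resampled for the model
wavs = wav.unsqueeze(0)                         # (batch, time)
wav_lens = torch.tensor([1.0])                  # lengths relative to the longest item

predicted_words, predicted_tokens = asr_model.transcribe_batch(wavs, wav_lens)
print(predicted_words[0])
```

To decode several files of different durations in one batch, pad the waveforms to a common length and set each entry of `wav_lens` to that utterance's length divided by the longest one.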
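### CTC objective (sketch)

Fine-tuning optimizes a CTC loss over the encoder's per-frame output distribution. Below is a toy illustration of that objective in plain PyTorch; the vocabulary size, sequence lengths, and blank index are illustrative assumptions, not this model's actual configuration.

```python
import torch
import torch.nn as nn

# Toy dimensions only; vocab size and lengths are assumptions, not the model's values.
time_steps, batch, vocab = 200, 4, 60   # vocab includes the CTC blank at index 0
target_len = 30

# Stand-in for encoder log-probabilities: shape (time, batch, vocab).
logits = torch.randn(time_steps, batch, vocab, requires_grad=True)
log_probs = logits.log_softmax(dim=-1)

# Random label ids in [1, vocab) so they never collide with the blank (0).
targets = torch.randint(1, vocab, (batch, target_len), dtype=torch.long)
input_lengths = torch.full((batch,), time_steps, dtype=torch.long)
target_lengths = torch.full((batch,), target_len, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()   # in fine-tuning, these gradients update the Conformer encoder
print(loss.item())
```

CTC marginalizes over all monotonic alignments between the per-frame predictions and the shorter target transcript, which is why fine-tuning needs no frame-level alignment of the Fongbe audio.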