pushing fr model
- README.md +77 -0
- asr.ckpt +3 -0
- normalizer.ckpt +3 -0
- tokenizer.ckpt +3 -0
    	
        README.md
    ADDED
    
    | @@ -0,0 +1,77 @@ | |
| 1 | 
            +
            ---
         | 
| 2 | 
            +
            language: "fr"
         | 
| 3 | 
            +
            thumbnail:
         | 
| 4 | 
            +
            tags:
         | 
| 5 | 
            +
            - ASR
         | 
| 6 | 
            +
            - CTC
         | 
| 7 | 
            +
            - Attention
         | 
| 8 | 
            +
            - pytorch
         | 
| 9 | 
            +
            license: "apache-2.0"
         | 
| 10 | 
            +
            datasets:
         | 
| 11 | 
            +
            - commonvoice
         | 
| 12 | 
            +
            metrics:
         | 
| 13 | 
            +
            - wer
         | 
| 14 | 
            +
            - cer
         | 
| 15 | 
            +
            ---
         | 
| 16 | 
            +
             | 
| 17 | 
            +
            # CRDNN with CTC/Attention trained on CommonVoice French (No LM)
         | 
| 18 | 
            +
             | 
| 19 | 
            +
            This repository provides all the necessary tools to perform automatic speech
         | 
| 20 | 
            +
            recognition from an end-to-end system pretrained on CommonVoice (FR) within
         | 
| 21 | 
            +
            SpeechBrain. For a better experience we encourage you to learn more about
         | 
| 22 | 
            +
            [SpeechBrain](https://speechbrain.github.io). The performance of the given ASR model is:
         | 
| 23 | 
            +
             | 
| 24 | 
            +
            | Release | Test CER | Test WER | GPUs |
         | 
| 25 | 
            +
            |:-------------:|:--------------:|:--------------:| :--------:|
         | 
| 26 | 
            +
            | 07-03-21 | 6.54 | 17.70 | 2xV100 16GB |
         | 
| 27 | 
            +
             | 
| 28 | 
            +
            ## Pipeline description
         | 
| 29 | 
            +
             | 
| 30 | 
            +
            This ASR system is composed of 2 different but linked blocks:
         | 
| 31 | 
            +
            1. Tokenizer (unigram) that transforms words into subword units and is trained on
         | 
| 32 | 
            +
            the train transcriptions (train.tsv) of CommonVoice (FR).
         | 
| 33 | 
            +
            2. Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
         | 
| 34 | 
            +
            N blocks of convolutional neural networks with normalisation and pooling on the
         | 
| 35 | 
            +
            frequency domain. Then, a bidirectional LSTM is connected to a final DNN to obtain
         | 
| 36 | 
            +
            the final acoustic representation that is given to the CTC and attention decoders.
         | 
| 37 | 
            +
             | 
| 38 | 
            +
            ## Intended uses & limitations
         | 
| 39 | 
            +
             | 
| 40 | 
            +
            This model has been primarily developed to be run within SpeechBrain as a pretrained ASR model
         | 
| 41 | 
            +
            for the French language. Thanks to the flexibility of SpeechBrain, either of the 2 blocks
         | 
| 42 | 
            +
            detailed above can be extracted and connected to your custom pipeline as long as SpeechBrain is
         | 
| 43 | 
            +
            installed.
         | 
| 44 | 
            +
             | 
| 45 | 
            +
            ## Install SpeechBrain
         | 
| 46 | 
            +
             | 
| 47 | 
            +
            First of all, please install SpeechBrain with the following command:
         | 
| 48 | 
            +
             | 
| 49 | 
            +
            ```
         | 
| 50 | 
            +
            pip install \\we hide ! SpeechBrain is still private :p
         | 
| 51 | 
            +
            ```
         | 
| 52 | 
            +
             | 
| 53 | 
            +
            Please note that we encourage you to read our tutorials and learn more about
         | 
| 54 | 
            +
            [SpeechBrain](https://speechbrain.github.io).
         | 
| 55 | 
            +
             | 
| 56 | 
            +
            ### Transcribing your own audio files
         | 
| 57 | 
            +
             | 
| 58 | 
            +
            ```python
         | 
| 59 | 
            +
            from speechbrain.pretrained import EncoderDecoderASR
         | 
| 60 | 
            +
             | 
| 61 | 
            +
            asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-crdnn-commonvoice-fr")
         | 
| 62 | 
            +
            asr_model.transcribe_file("path_to_your_file.wav")
         | 
| 63 | 
            +
             | 
| 64 | 
            +
            ```
         | 
| 65 | 
            +
             | 
| 66 | 
            +
            #### Referencing SpeechBrain
         | 
| 67 | 
            +
             | 
| 68 | 
            +
            ```
         | 
| 69 | 
            +
            @misc{SB2021,
         | 
| 70 | 
            +
                author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua },
         | 
| 71 | 
            +
                title = {SpeechBrain},
         | 
| 72 | 
            +
                year = {2021},
         | 
| 73 | 
            +
                publisher = {GitHub},
         | 
| 74 | 
            +
                journal = {GitHub repository},
         | 
| 75 | 
            +
                howpublished = {\url{https://github.com/speechbrain/speechbrain}},
         | 
| 76 | 
            +
              }
         | 
| 77 | 
            +
            ```
         | 
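The Test CER and Test WER columns in the README above are edit-distance metrics: the number of character (or word) substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. The sketch below illustrates that definition only. The function names are hypothetical, and it is not the scoring code used to produce the reported 6.54 / 17.70 figures, which may apply different text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (reference must be non-empty)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (reference must be non-empty)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# One inserted word against a four-word reference.
print(wer("bonjour tout le monde", "bonjour a tout le monde"))  # 25.0
```

Because both metrics normalize by the reference length, they can exceed 100 when the hypothesis contains many insertions.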
    	
        asr.ckpt
    ADDED
    
    | @@ -0,0 +1,3 @@ | |
| 1 | 
            +
            version https://git-lfs.github.com/spec/v1
         | 
| 2 | 
            +
            oid sha256:42d4644742da8f95e68124d8d04605907b11fb10e82c5800982d098380b2cd49
         | 
| 3 | 
            +
            size 592775161
         | 
    	
        normalizer.ckpt
    ADDED
    
    | @@ -0,0 +1,3 @@ | |
| 1 | 
            +
            version https://git-lfs.github.com/spec/v1
         | 
| 2 | 
            +
            oid sha256:75bf1f53645ef67244c27a9474c63144b79ac6453c827af6a45e2c5e385fcdf7
         | 
| 3 | 
            +
            size 1783
         | 
    	
        tokenizer.ckpt
    ADDED
    
    | @@ -0,0 +1,3 @@ | |
| 1 | 
            +
            version https://git-lfs.github.com/spec/v1
         | 
| 2 | 
            +
            oid sha256:fd21b3558352be835d8f8855f8c677c5794133c7e1d59aec47f6ba40dc2ca63e
         | 
| 3 | 
            +
            size 244544
         | 
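The three `.ckpt` entries in this commit are Git LFS pointer files, not the checkpoints themselves: each records a spec version, a SHA-256 object id, and the size in bytes of the real file. As a minimal sketch, assuming the checkpoint has already been downloaded and read into memory, a pointer can be parsed and checked against the data as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of Git LFS or SpeechBrain.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(
        line.strip().split(" ", 1)
        for line in text.strip().splitlines()
        if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes in `data` have the size and hash the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


# Pointer contents copied from the normalizer.ckpt entry in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:75bf1f53645ef67244c27a9474c63144b79ac6453c827af6a45e2c5e385fcdf7
size 1783
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Checking the size first is cheap and catches truncated downloads before the hash is computed.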

