Update README.md
README.md
CHANGED
@@ -12,14 +12,15 @@ license: cc-by-4.0
 
 ## OWSM: Open Whisper-style Speech Model
 
-
+OWSM aims to develop fully open speech foundation models using publicly available data and open-source toolkits, including [ESPnet](https://github.com/espnet/espnet).
 
-
+Inference examples can be found on our [project page](https://www.wavlab.org/activities/2024/owsm/).
+Our demo is available [here](https://huggingface.co/spaces/pyf98/OWSM_v3_demo).
 
 **[OWSM v3.1](https://arxiv.org/abs/2401.16658) is an improved version of OWSM v3. It significantly outperforms OWSM v3 in almost all evaluation benchmarks.**
 We do not include any new training data. Instead, we utilize a state-of-the-art speech encoder, [E-Branchformer](https://arxiv.org/abs/2210.00077).
 
-
+The model in this repo has 1.02B parameters in total and is trained on 180k hours of public speech data.
 Specifically, it supports the following speech-to-text tasks:
 - Speech recognition
 - Any-to-any-language speech translation