Commit a365074 by Dominik Macháček
Parent(s): 819ac6c

Update README.md

update server description

README.md CHANGED
@@ -16,7 +16,9 @@ Alternative, less restrictive, but slower backend is [whisper-timestamped](https:
 
 The backend is loaded only when chosen. The unused one does not have to be installed.
 
-## Usage
+## Usage
+
+### Realtime simulation from audio file
 
 ```
 usage: whisper_online.py [-h] [--min-chunk-size MIN_CHUNK_SIZE] [--model {tiny.en,tiny,base.en,base,small.en,small,medium.en,medium,large-v1,large-v2,large}] [--model_cache_dir MODEL_CACHE_DIR] [--model_dir MODEL_DIR] [--lan LAN] [--task {transcribe,translate}]
@@ -72,7 +74,7 @@ python3 whisper_online.py en-demo16.wav --language en --min-chunk-size 1 > out.t
 
 [See description here](https://github.com/ufal/whisper_streaming/blob/d915d790a62d7be4e7392dde1480e7981eb142ae/whisper_online.py#L361)
 
-
+### As a module
 
 TL;DR: use OnlineASRProcessor object and its methods insert_audio_chunk and process_iter.
 
@@ -110,9 +112,9 @@ print(o) # do something with the last output
 online.init() # refresh if you're going to re-use the object for the next audio
 ```
 
-
+### Server
 
-`whisper_online_server.py`
+`whisper_online_server.py` has the same model options as `whisper_online.py`, plus `--host` and `--port` of the TCP connection.
 
 Client example:
 
@@ -120,9 +122,9 @@ Client example:
 arecord -f S16_LE -c1 -r 16000 -t raw -D default | nc localhost 43001
 ```
 
-- arecord
+- arecord sends realtime audio from a sound device, in raw audio format -- 16000 sampling rate, mono channel, S16\_LE -- signed 16-bit integer, little-endian. (use the alternative to arecord that works for you)
 
-- nc is netcat
+- nc is netcat with server's host and port
 
 
 ## Background
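
The client pipeline described in the Server section (`arecord` piped into `nc`) amounts to writing raw PCM bytes to a TCP socket. For readers without `nc`, a minimal Python sketch of the sending half might look like the following. The function name `stream_raw_audio` and the `chunk_seconds` parameter are hypothetical, introduced only for illustration; the default host and port are taken from the client example, and the audio format constants mirror the `arecord` flags. Unlike `nc`, this sketch only sends audio and does not read anything the server writes back.

```python
import socket

SAMPLE_RATE = 16000   # Hz, matches `-r 16000`
CHANNELS = 1          # mono, matches `-c1`
SAMPLE_WIDTH = 2      # bytes per sample: S16_LE is signed 16-bit little-endian
BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH  # 32000 bytes of raw audio per second


def stream_raw_audio(source, host="localhost", port=43001, chunk_seconds=0.5):
    """Send raw PCM bytes from a binary file-like `source` to host:port.

    `source` is assumed to already hold 16 kHz mono S16_LE audio;
    no conversion or validation is done here.
    Returns the total number of bytes sent.
    """
    chunk_bytes = int(BYTES_PER_SECOND * chunk_seconds)
    sent = 0
    with socket.create_connection((host, port)) as conn:
        while True:
            chunk = source.read(chunk_bytes)
            if not chunk:  # empty read means end of input
                break
            conn.sendall(chunk)
            sent += len(chunk)
    return sent
```

Passing `sys.stdin.buffer` as `source` would let a script built on this function stand in for `nc` at the end of the `arecord` pipe; passing an open raw audio file sends pre-recorded audio as fast as the socket accepts it, not in real time.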