misc(readme): add example queries
    	
        README.md
    CHANGED
    
    | @@ -23,6 +23,39 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for | |
| 23 | 
             
            | /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
         | 
| 24 | 
             
            | /docs                        | Visual documentation                              | 
         | 
| 25 |  | 
| 26 | 
             
            ## Specifications
         | 
| 27 |  | 
| 28 | 
             
            | spec               | value                 | description                                                                                                |
         | 
| @@ -33,3 +66,6 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for | |
| 33 | 
             
            | KV cache data type | `float8` (e4m3)       | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space                |
         | 
| 34 | 
             
            | PyTorch Compile    | ✅                    | Enable the use of `torch.compile` to further optimize the model's execution                                |
         | 
| 35 | 
             
            | CUDA Graphs        | ✅                    | Enable the use of so-called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce the overhead of executing GPU computations | 
         | 
| 23 | 
             
            | /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
         | 
| 24 | 
             
            | /docs                        | Visual documentation                              | 
         | 
| 25 |  | 
| 26 | 
            +
            ## Getting started
         | 
| 27 | 
            +
             | 
| 28 | 
            +
            - **Getting text output from audio file**
         | 
| 29 | 
            +
             | 
| 30 | 
            +
            ```bash
         | 
| 31 | 
            +
            curl http://localhost:8000/api/v1/audio/transcriptions \
         | 
| 32 | 
            +
              --request POST \
         | 
| 33 | 
            +
              --header 'Content-Type: multipart/form-data' \
         | 
| 34 | 
            +
              -F "file=@</path/to/audio/file>" \
         | 
| 35 | 
            +
              -F "response_format=text"
         | 
| 36 | 
            +
            ```
         | 
| 37 | 
            +
             | 
| 38 | 
            +
            - **Getting JSON output from audio file**
         | 
| 39 | 
            +
             | 
| 40 | 
            +
            ```bash
         | 
| 41 | 
            +
            curl http://localhost:8000/api/v1/audio/transcriptions \
         | 
| 42 | 
            +
              --request POST \
         | 
| 43 | 
            +
              --header 'Content-Type: multipart/form-data' \
         | 
| 44 | 
            +
              -F "file=@</path/to/audio/file>" \
         | 
| 45 | 
            +
              -F "response_format=json"
         | 
| 46 | 
            +
            ```
         | 
| 47 | 
            +
             | 
| 48 | 
            +
            - **Getting segmented JSON output from audio file**
         | 
| 49 | 
            +
              
         | 
| 50 | 
            +
            ```bash
         | 
| 51 | 
            +
            curl http://localhost:8000/api/v1/audio/transcriptions \
         | 
| 52 | 
            +
              --request POST \
         | 
| 53 | 
            +
              --header 'Content-Type: multipart/form-data' \
         | 
| 54 | 
            +
              -F "file=@</path/to/audio/file>" \
         | 
| 55 | 
            +
              -F "response_format=verbose_json"
         | 
| 56 | 
            +
            ```
         | 
| 57 | 
            +
             | 
| 58 | 
            +
             | 
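The three requests above differ only in the `response_format` form field, so they can be wrapped in one small shell helper. The following is a minimal sketch, not part of this repository: the function names `is_supported_format` and `transcribe` and the `TRANSCRIBE_URL` variable are hypothetical, and the list of accepted formats is assumed to be only the three values shown above (the server may accept others).

```bash
#!/usr/bin/env bash
# Hypothetical helper wrapping the curl examples above; not part of the repository.

# Endpoint taken from the examples above; override it through the environment if needed.
TRANSCRIBE_URL="${TRANSCRIBE_URL:-http://localhost:8000/api/v1/audio/transcriptions}"

# Succeeds only for the response formats shown in this README (assumption:
# the server may support more, this sketch deliberately does not guess them).
is_supported_format() {
  case "$1" in
    text|json|verbose_json) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: transcribe <audio-file> [response_format]
# Returns 2 for an unsupported format, 1 for an unreadable file,
# otherwise curl's own exit status.
transcribe() {
  local file="$1" format="${2:-text}"

  if ! is_supported_format "$format"; then
    echo "unsupported response_format: $format" >&2
    return 2
  fi
  if [ ! -r "$file" ]; then
    echo "cannot read audio file: $file" >&2
    return 1
  fi

  # -F makes curl send a multipart/form-data body; each field is written as name=value.
  curl --fail --silent --show-error "$TRANSCRIBE_URL" \
    --request POST \
    -F "file=@${file}" \
    -F "response_format=${format}"
}
```

After sourcing the script, a call such as `transcribe ./sample.wav verbose_json` (with `./sample.wav` standing in for any local audio file) sends the same request as the last example, provided the server is running.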
| 59 | 
             
            ## Specifications
         | 
| 60 |  | 
| 61 | 
             
            | spec               | value                 | description                                                                                                |
         | 
|  | |
| 66 | 
             
            | KV cache data type | `float8` (e4m3)       | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space                |
         | 
| 67 | 
             
            | PyTorch Compile    | ✅                    | Enable the use of `torch.compile` to further optimize the model's execution                                |
         | 
| 68 | 
             
            | CUDA Graphs        | ✅                    | Enable the use of so-called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce the overhead of executing GPU computations | 
         | 
| 69 | 
            +
             | 
| 70 | 
            +
             | 
| 71 | 
            +
             | 

