Commit 6670ba7 (verified) by X-iZhang · 1 Parent(s): 0bba3db

Update app.py

Files changed (1)
  1. app.py +3 -3
app.py CHANGED
@@ -463,11 +463,11 @@ def main():
 
  **🚨 Performance Warning**
 
- This demo is running on **CPU-only** mode. A single inference may take **5-10 minutes** depending on the model and parameters.
+ This demo is running on **CPU-only** mode. A single inference may take **25-30 minutes** depending on the model and parameters.
 
  **Recommendations for faster inference:**
- - Use smaller models (Libra-v1.0-3B is faster than 7B models) The model has already been loaded ⏬
- - Please do not attempt to load other models, as this may cause a runtime error: "Workload evicted, storage limit exceeded (50G)"
+ - Use smaller models (Libra-v1.0-3B is faster than 7B models) **The model has already been loaded**
+ - Please do not attempt to load other models, as this may cause a **runtime error**: "Workload evicted, storage limit exceeded (50G)"
  - Reduce `Max New Tokens` to 64-128 (default: 128)
  - Disable baseline comparison
  - For GPU acceleration, please [run the demo locally](https://github.com/X-iZhang/CCD#gradio-web-interface)