Update README.md
README.md
CHANGED
@@ -61,7 +61,7 @@ For further information, please see the [llamafile
 README](https://github.com/mozilla-ocho/llamafile/).
 
 Having **trouble?** See the ["Gotchas"
-section](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas)
+section](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas-and-troubleshooting)
 of the README.
 
 ## About Upload Limits
@@ -117,7 +117,7 @@ Your choice of quantization format depends on three things:
 
 1. Will it fit in RAM or VRAM?
 2. Is your use case reading (e.g. summarization) or writing (e.g. chatbot)?
-3. llamafiles bigger than 4.30 GB are hard to run on Windows (see [gotchas](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas))
+3. llamafiles bigger than 4.30 GB are hard to run on Windows (see [gotchas](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas-and-troubleshooting))
 
 Good quants for writing (prediction speed) are Q5\_K\_M, and Q4\_0. Text
 generation is bounded by memory speed, so smaller quants help, but they