Update README.md
The process to generate these models is roughly as follows (illustrative command sketches for each step appear after the list):

1. Convert the original model's tensors to [GGUF](<https://huggingface.co/docs/hub/en/gguf>) F16*
2. Estimate the Perplexity score for the F16 model (baseline) using the [wikitext-2-raw-v1](<https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1>) dataset, and save the [logits](<https://huggingface.co/eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF/tree/main/logits>)
3. Generate an [imatrix](<https://huggingface.co/eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF/tree/main/imatrix>) from selected calibration datasets
4. Determine each tensor's and layer's Importance Score contribution using a modified version of `llama-imatrix`
5. Select an appropriate quant level for each tensor using a modified version of `llama-quantize`
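
For steps 1 to 3, the sketch below uses the standard llama.cpp tooling (`convert_hf_to_gguf.py`, `llama-perplexity`, `llama-imatrix`). It is illustrative rather than the exact invocation used for this repository: model paths, output names, and the calibration file `calibration.txt` are placeholders, and the logits step assumes the usual llama.cpp pattern of saving the F16 baseline via `--kl-divergence-base`.

```bash
# Step 1: convert the original model's tensors to GGUF F16
python convert_hf_to_gguf.py ./DeepSeek-R1-Distill-Qwen-7B \
  --outtype f16 --outfile DeepSeek-R1-Distill-Qwen-7B-F16.gguf

# Step 2: baseline perplexity on wikitext-2-raw-v1, saving the F16
# logits so quantized variants can later be scored against them
./llama-perplexity -m DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  -f wikitext-2-raw-v1.txt \
  --kl-divergence-base DeepSeek-R1-Distill-Qwen-7B-F16.logits

# Step 3: build an importance matrix from the calibration dataset(s)
./llama-imatrix -m DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  -f calibration.txt -o imatrix.dat
```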
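
For steps 4 and 5, the modified `llama-imatrix` and `llama-quantize` are the author's own builds, so the sketch below substitutes the closest mainline equivalents: `llama-imatrix --show-statistics` for per-tensor and per-layer importance reporting, and `llama-quantize --tensor-type` for overriding the quant level of individual tensors. The tensor names and quant types shown are placeholders, not the selections actually used for these models.

```bash
# Step 4: report per-tensor / per-layer statistics from the imatrix
./llama-imatrix --in-file imatrix.dat --show-statistics

# Step 5: quantize, overriding selected tensors based on their scores
./llama-quantize --imatrix imatrix.dat \
  --tensor-type attn_v=q6_k \
  --tensor-type ffn_down=q5_k \
  DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf Q4_K_M
```

Re-running `llama-perplexity` on each quantized file with `--kl-divergence` against the saved F16 logits then quantifies how much quality each tensor-level choice trades for size.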