Update README.md
The process to generate these models is roughly as follows (illustrative command sketches for each step appear after the list):

1. Convert the original model's tensors to [GGUF](<https://huggingface.co/docs/hub/en/gguf>) F16*
2. Estimate the Perplexity score for the F16 model (baseline) using the [wikitext-2-raw-v1](<https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1>) dataset, and save the [logits](<https://huggingface.co/eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF/tree/main/logits>)
3. Generate an [imatrix](<https://huggingface.co/eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF/tree/main/imatrix>) from selected calibration datasets
4. Determine each tensor's and layer's Importance Score contribution using a modified version of `llama-imatrix`
5. Select an appropriate quant level for each tensor using a modified version of `llama-quantize`
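
For steps 1 to 3, the sketch below uses the standard llama.cpp tooling (`convert_hf_to_gguf.py`, `llama-perplexity`, `llama-imatrix`). It is illustrative rather than the exact invocation used for this repository: model paths, output names, and the calibration file `calibration.txt` are placeholders, and the logits step assumes the usual llama.cpp pattern of saving the F16 baseline via `--kl-divergence-base`.

```bash
# Step 1: convert the original model's tensors to GGUF F16
python convert_hf_to_gguf.py ./DeepSeek-R1-Distill-Qwen-7B \
  --outtype f16 --outfile DeepSeek-R1-Distill-Qwen-7B-F16.gguf

# Step 2: baseline perplexity on wikitext-2-raw-v1, saving the F16
# logits so quantized variants can later be scored against them
./llama-perplexity -m DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  -f wikitext-2-raw-v1.txt \
  --kl-divergence-base DeepSeek-R1-Distill-Qwen-7B-F16.logits

# Step 3: build an importance matrix from the calibration dataset(s)
./llama-imatrix -m DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  -f calibration.txt -o imatrix.dat
```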
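
For steps 4 and 5, the modified `llama-imatrix` and `llama-quantize` are the author's own builds, so the sketch below substitutes the closest mainline equivalents: `llama-imatrix --show-statistics` for per-tensor and per-layer importance reporting, and `llama-quantize --tensor-type` for overriding the quant level of individual tensors. The tensor names and quant types shown are placeholders, not the selections actually used for these models.

```bash
# Step 4: report per-tensor / per-layer statistics from the imatrix
./llama-imatrix --in-file imatrix.dat --show-statistics

# Step 5: quantize, overriding selected tensors based on their scores
./llama-quantize --imatrix imatrix.dat \
  --tensor-type attn_v=q6_k \
  --tensor-type ffn_down=q5_k \
  DeepSeek-R1-Distill-Qwen-7B-F16.gguf \
  DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf Q4_K_M
```

Re-running `llama-perplexity` on each quantized file with `--kl-divergence` against the saved F16 logits then quantifies how much quality each tensor-level choice trades for size.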