QuantTrio/DeepSeek-V3.1-AWQ

tclf90 committed on Aug 26

Commit c8d6023 · verified · 1 Parent(s): 850ce88

Update README.md

move-up installation instruction

Files changed (1)

README.md +14 -14

@@ -16,6 +16,20 @@ base_model_relation: quantized
 # DeepSeek-V3.1-AWQ
 Base model: [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1)
+### 【Dependencies / Installation】
+As of **2025-08-23**, create a fresh Python environment and run:
+```bash
+# ❗there are glitches with vllm 0.10.1.1, still looking for resolutions❗
+# ❗downgrade vllm for now ❗
+pip install vllm==0.9.0
+pip install transformers==4.53
+# ❗patch up AWQ MoE quant config, otherwise some modules cannot be properly loaded❗
+SITE_PACKAGES=$(pip -V | awk '{print $4}' | sed 's/\/pip$//')
+cp awq_marlin.py "$SITE_PACKAGES/vllm/model_executor/layers/quantization/awq_marlin.py"
+```
 ### 【vLLM Single Node with 8 GPUs — Startup Command】
 ```
 CONTEXT_LENGTH=32768
@@ -35,20 +49,6 @@ vllm serve \
     --port 8000
 ```
-### 【Dependencies / Installation】
-As of **2025-08-23**, create a fresh Python environment and run:
-```bash
-# ❗there are glitches with vllm 0.10.1.1, still looking for resolutions❗
-# ❗downgrade vllm for now ❗
-pip install vllm==0.9.0
-pip install transformers==4.53
-# ❗patch up AWQ MoE quant config, otherwise some modules cannot be properly loaded❗
-SITE_PACKAGES=$(pip -V | awk '{print $4}' | sed 's/\/pip$//')
-cp awq_marlin.py "$SITE_PACKAGES/vllm/model_executor/layers/quantization/awq_marlin.py"
-```
 ### 【Logs】
 ```
 2025-08-23
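
Review note on the patch step in the moved installation block: `SITE_PACKAGES` is derived from the fourth whitespace-separated field of `pip -V`, which would presumably be truncated if the environment path contains spaces, and could point at the wrong environment if `pip` and the active interpreter differ. A minimal sketch of an alternative lookup, assuming `python3` on `PATH` is the interpreter vLLM was installed into; `pkg_dir` is a hypothetical helper name, not something the README or vLLM provides:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the README): print the directory of an
# installed Python package by asking the interpreter, instead of parsing
# the output of `pip -V`.
# Assumption: `python3` is the interpreter of the target environment.
pkg_dir() {
    python3 - "$1" <<'PY'
import importlib.util
import os
import sys

name = sys.argv[1]
# find_spec locates a top-level package without importing it.
spec = importlib.util.find_spec(name)
if spec is None or spec.origin is None:
    sys.exit(f"package not found: {name}")
# spec.origin is the package's __init__.py; its parent is the package dir.
print(os.path.dirname(spec.origin))
PY
}
```

With this helper, the copy step could read `cp awq_marlin.py "$(pkg_dir vllm)/model_executor/layers/quantization/awq_marlin.py"`. The function exits non-zero when the package is missing, so the copy fails early rather than writing to an unintended path.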