Update README.md
README.md
CHANGED
@@ -23,6 +23,10 @@ base_model: deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
 **Original model**: [DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct)<br>
 **GGUF quantization:** provided by [bartowski](https://huggingface.co/bartowski) based on `llama.cpp` release [b3166](https://github.com/ggerganov/llama.cpp/releases/tag/b3166)<br>
 
+## Model Settings:
+
+Flash attention MUST be **disabled** for this model to work.
+
 ## Model Summary:
 
 This is a brand new Mixture of Experts (MoE) model from DeepSeek, specializing in coding instructions.<br>
@@ -42,6 +46,7 @@ This will format the prompt as follows:
 User: {user_message}
 
 Assistant: {assistant_message}
+```
 
 ## Technical Details
 
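
The setting this commit adds (flash attention must be disabled) can be illustrated with a small launcher sketch. It assumes a `llama.cpp` build around release b3166, where the CLI binary is `llama-cli` and flash attention is off unless the bare `-fa` / `--flash-attn` switch is passed. The helper name `build_llama_cli_command` and the model file name are hypothetical, not part of the README.

```python
import shlex

# Switches that turn flash attention ON in llama.cpp builds of this era
# (assumption: they are bare flags that take no value).
FLASH_ATTN_FLAGS = {"-fa", "--flash-attn"}


def build_llama_cli_command(model_path, prompt, extra_args=()):
    """Return an argv list for llama-cli with flash attention left disabled."""
    # Drop any flash-attention switch passed in by mistake; leaving it out
    # keeps the default behaviour, which is flash attention disabled.
    safe_args = [arg for arg in extra_args if arg not in FLASH_ATTN_FLAGS]
    return ["llama-cli", "-m", model_path, "-p", prompt, *safe_args]


if __name__ == "__main__":
    cmd = build_llama_cli_command(
        "DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf",  # hypothetical file name
        "Write a function that reverses a string.",
        extra_args=["-n", "256", "-fa"],
    )
    # Print the shell-quoted command instead of running it.
    print(shlex.join(cmd))
```

The sketch only builds and prints the command, so it can be checked without a model file. Front ends that expose a flash attention toggle in their settings need that toggle switched off instead.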