Update README.md
README.md
CHANGED
|
@@ -72,19 +72,6 @@ tags:
|
|
| 72 |
font-family: 'Fira Code', 'Courier New', monospace;
|
| 73 |
color: #4fc1ff;
|
| 74 |
}
|
| 75 |
-
.code-block-dark {
|
| 76 |
-
background-color: #1e1e1e;
|
| 77 |
-
color: #dcdcdc;
|
| 78 |
-
padding: 16px;
|
| 79 |
-
border-radius: 8px;
|
| 80 |
-
font-family: 'Fira Code', 'Courier New', monospace;
|
| 81 |
-
overflow-x: auto;
|
| 82 |
-
white-space: pre-wrap;
|
| 83 |
-
border: 1px solid #3c3c3c;
|
| 84 |
-
}
|
| 85 |
-
.code-block-dark .comment {
|
| 86 |
-
color: #6a9955;
|
| 87 |
-
}
|
| 88 |
a {
|
| 89 |
color: #569cd6;
|
| 90 |
text-decoration: none;
|
|
@@ -152,13 +139,11 @@ tags:
|
|
| 152 |
<h2>How to Download and Use EXL Quants</h2>
|
| 153 |
<p>Each quantization size for a model is stored in a separate HF repository branch. You can download a specific quant size by its branch.</p>
|
| 154 |
<p>For example, to download the <code class="inline-code-dark">4.0bpw_H6</code> quant:</p>
|
| 155 |
-
<
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
|
| 159 |
-
|
| 160 |
-
</div>
|
| 161 |
-
<p style="margin-top: 15px;">These quants can be run with any inference client that supports the EXL3 format, such as <a href="https://github.com/theroyallab/tabbyapi" target="_blank"><b>TabbyAPI</b></a>. Please refer to <a href="https://github.com/theroyallab/tabbyAPI/wiki/01.-Getting-Started" target="_blank">documentation</a> for set up instructions.</p>
|
| 162 |
</div>
|
| 163 |
|
| 164 |
<div class="card-dark">
|
|
|
|
| 72 |
font-family: 'Fira Code', 'Courier New', monospace;
|
| 73 |
color: #4fc1ff;
|
| 74 |
}
|
| 75 |
a {
|
| 76 |
color: #569cd6;
|
| 77 |
text-decoration: none;
|
|
|
|
| 139 |
<h2>How to Download and Use EXL Quants</h2>
|
| 140 |
<p>Each quantization size for a model is stored in a separate HF repository branch. You can download a specific quant size by its branch.</p>
|
| 141 |
<p>For example, to download the <code class="inline-code-dark">4.0bpw_H6</code> quant:</p>
|
| 142 |
+
<p>Install huggingface-cli:</p>
|
| 143 |
+
<pre><code>pip install -U "huggingface_hub[cli]"</code></pre>
|
| 144 |
+
<p>Download a quant by targeting the branch (revision) for its size:</p>
|
| 145 |
+
<pre><code>huggingface-cli download ArtusDev/MODEL_NAME --revision "4.0bpw_H6" --local-dir ./</code></pre>
|
| 146 |
+
<p style="margin-top: 15px;">EXL3 quants can be run with any inference client that supports the EXL3 format, such as <a href="https://github.com/theroyallab/tabbyapi" target="_blank"><b>TabbyAPI</b></a>. Please refer to the <a href="https://github.com/theroyallab/tabbyAPI/wiki/01.-Getting-Started" target="_blank">documentation</a> for setup instructions.</p>
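For readers who would rather script the download than use the CLI, the same branch-per-quant layout can be targeted from Python. This is a minimal sketch and not part of the commit: it assumes the `huggingface_hub` package is installed, the helper names `build_download_kwargs` and `download_quant` are hypothetical, and `ArtusDev/MODEL_NAME` is the README's own placeholder that must be replaced with a real repository.

```python
from typing import Any, Dict


def build_download_kwargs(repo_id: str, revision: str, local_dir: str = "./") -> Dict[str, Any]:
    """Collect the arguments for huggingface_hub.snapshot_download.

    `revision` is the branch name that holds the wanted quant size,
    for example "4.0bpw_H6".
    """
    if "/" not in repo_id:
        raise ValueError("repo_id must look like 'owner/name'")
    return {"repo_id": repo_id, "revision": revision, "local_dir": local_dir}


def download_quant(repo_id: str, revision: str, local_dir: str = "./") -> str:
    """Download one quant branch and return the local folder path."""
    # Imported here so the helper above can be used without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(repo_id, revision, local_dir))


# Example call (needs network access and a real repository name):
# download_quant("ArtusDev/MODEL_NAME", "4.0bpw_H6", local_dir="./")
```

Keeping the argument building separate from the network call makes it easy to check the revision and target directory before any files are fetched.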
|
| 147 |
</div>
|
| 148 |
|
| 149 |
<div class="card-dark">
|