Vezora/Qwen2.5-Coder-32B-Instruct-fp8-W8A16


Vezora committed on Nov 12, 2024

Commit 7761c1c (verified) · 1 Parent(s): b64d635

Update README.md

Files changed (1)

README.md +1 -1


@@ -8,7 +8,7 @@ This model is optimized for use with [VLLM](https://github.com/vllm-project/vllm
 ### Key Features of FP8 Marlin
-The Marlin kernel achieves impressive efficiency by packing 4 8-bit values into an int32 and performing a 4xFP8 to 4xFP16/BF16 dequantization using bit arithmetic and SIMT operations. This approach yields nearly a **2x speedup** over FP16 on most models while maintaining **near lossless quality**.
+The NeuralMagic FP8 Marlin kernel achieves impressive efficiency by packing 4 8-bit values into an int32 and performing a 4xFP8 to 4xFP16/BF16 dequantization using bit arithmetic and SIMT operations. This approach yields nearly a **2x speedup** over FP16 on most models while maintaining **near lossless quality**.
 #### FP8 Advantages on NVIDIA GPUs
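
Note on the changed line: the packing and bit-arithmetic dequantization it describes can be illustrated with a small NumPy sketch. This is not the kernel's code. It assumes the E4M3 FP8 layout (1 sign, 4 exponent, 3 mantissa bits, bias 7), emulates the idea on the CPU, and uses hypothetical helper names `pack4_fp8` and `dequant_fp8_to_fp16`. It also does not treat the E4M3 NaN encodings specially.

```python
import numpy as np

def pack4_fp8(fp8_bytes):
    """Pack groups of four raw FP8 bytes into one uint32 each (byte 0 in the low bits)."""
    b = np.asarray(fp8_bytes, dtype=np.uint8).reshape(-1, 4).astype(np.uint32)
    return b[:, 0] | (b[:, 1] << 8) | (b[:, 2] << 16) | (b[:, 3] << 24)

def dequant_fp8_to_fp16(packed):
    """Unpack each uint32 into four E4M3 values and widen them to FP16 with bit arithmetic.

    Assumption: E4M3 layout with exponent bias 7; NaN encodings are not handled.
    """
    packed = np.asarray(packed, dtype=np.uint32)
    shifts = np.array([0, 8, 16, 24], dtype=np.uint32)
    raw = ((packed[:, None] >> shifts) & 0xFF).astype(np.uint16)

    # Keep the sign in bit 15 and drop the 7 exponent/mantissa bits into
    # bits 13..7, so they line up with the low 4 exponent bits and the
    # top 3 mantissa bits of an FP16 value.
    sign = (raw & 0x80) << 8
    magnitude = (raw & 0x7F) << 7
    half = (sign | magnitude).astype(np.uint16).view(np.float16)

    # The exponent biases differ (15 for FP16, 7 for E4M3), so rescale by 2**(15 - 7).
    return (half.astype(np.float32) * 256.0).astype(np.float16)

if __name__ == "__main__":
    packed = pack4_fp8([0x38, 0x7E, 0xC0, 0x01])
    print(hex(int(packed[0])))
    print(dequant_fp8_to_fp16(packed))
```

The shift-then-rescale step avoids any per-element branching or table lookup, which is the property that makes this style of conversion cheap inside a GPU kernel.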