---
license: mit
base_model: MiniMaxAI/MiniMax-M2
base_model_relation: quantized
quantized_by: turboderp
tags:
  - exl3
---

EXL3 quants of MiniMax-M2

⚠️ Requires ExLlamaV3 v0.0.12 (or v0.0.11 dev branch)

Base bitrates:

- 2.00 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight

Optimized:

- 2.04 bits per weight
- 2.27 bits per weight
- 3.04 bits per weight
- 3.50 bits per weight
- 4.03 bits per weight
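
As a rough guide to how much storage each bitrate implies, the sketch below multiplies a parameter count by the bits per weight. The parameter count (about 230 billion) is an assumption that is not stated in this README, and the calculation treats the bitrate as an average over every parameter, so the real files will differ somewhat. The name `estimate_weight_gib` is made up for illustration.

```python
# Rough weight-size estimate: parameters * bits-per-weight / 8 bytes.
# ASSUMPTION: ~230e9 total parameters for MiniMax-M2 (not stated in this README).
ASSUMED_PARAMS = 230e9

# Bitrates listed in this README (base and optimized).
BITRATES = [2.00, 2.04, 2.27, 3.00, 3.04, 3.50, 4.00, 4.03]


def estimate_weight_gib(bpw: float, n_params: float = ASSUMED_PARAMS) -> float:
    """Approximate size of the quantized weights in GiB.

    Treats bpw as an average over all parameters and ignores the
    KV cache, activations, and any tensors stored at other precisions.
    """
    return n_params * bpw / 8 / 2**30


if __name__ == "__main__":
    for bpw in BITRATES:
        print(f"{bpw:.2f} bpw: about {estimate_weight_gib(bpw):.1f} GiB of weights")
```

The estimate covers weights only; memory needed at inference time also depends on context length and cache settings.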

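| Quant | KL-div | Perplexity | HumanEval@1 |
|---|---|---|---|
| 2.00 bpw | 0.400 | 10.92 | 80.5% |
| 2.04 bpw | 0.297 | 10.23 | 87.1% |
| 2.27 bpw | 0.252 | 9.78 | 88.4% |
| 3.00 bpw | 0.141 | 8.99 | 87.8% |
| 3.04 bpw | 0.117 | 8.73 | 87.2% |
| 3.50 bpw | 0.094 | 8.78 | 88.4% |
| 4.00 bpw | 0.087 | 8.58 | 89.6% |
| 4.03 bpw | 0.077 | 8.61 | 87.8% |
| original | - | 8.51 | 87.2%¹ |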

¹ Unconfirmed
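
One way to read the table is to pick the quant with the lowest KL divergence that fits a bitrate budget, and to express its perplexity relative to the original model. The sketch below copies the numbers from the table; the helper names `best_under` and `ppl_increase_pct` are made up for illustration, and ranking by KL divergence alone is an assumption about what matters most.

```python
# (bpw, KL-div, perplexity, HumanEval@1 %) copied from the table above.
RESULTS = [
    (2.00, 0.400, 10.92, 80.5),
    (2.04, 0.297, 10.23, 87.1),
    (2.27, 0.252, 9.78, 88.4),
    (3.00, 0.141, 8.99, 87.8),
    (3.04, 0.117, 8.73, 87.2),
    (3.50, 0.094, 8.78, 88.4),
    (4.00, 0.087, 8.58, 89.6),
    (4.03, 0.077, 8.61, 87.8),
]
ORIGINAL_PPL = 8.51  # perplexity of the unquantized model, from the table


def ppl_increase_pct(ppl: float, baseline: float = ORIGINAL_PPL) -> float:
    """Perplexity increase over the baseline, in percent."""
    return (ppl / baseline - 1.0) * 100.0


def best_under(max_bpw: float):
    """Return the lowest-KL-div row with bitrate <= max_bpw, or None."""
    candidates = [row for row in RESULTS if row[0] <= max_bpw]
    return min(candidates, key=lambda row: row[1]) if candidates else None


if __name__ == "__main__":
    row = best_under(3.1)
    if row is not None:
        bpw, kld, ppl, humaneval = row
        print(f"{bpw:.2f} bpw: KL-div {kld}, ppl +{ppl_increase_pct(ppl):.1f}%")
```

The HumanEval scores vary by a few points between neighboring bitrates without a clear trend, so this sketch leaves them out of the ranking.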
