Is AWQ quantization possible for this model?
#17 opened by VivekMalipatel23
I am planning to run this on two 3090s with pipeline parallelism, but it looks like the 3090 doesn't support FP8. Could we get an AWQ-quantized version of this model and the other newer Qwen variants?
Check out cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit, it works on 2 x 3090. :)
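For anyone landing here later, below is a minimal sketch of what serving that AWQ quant across two 3090s could look like with vLLM. This is an untested assumption about the setup, not a verified config: it assumes a recent vLLM build that supports pipeline parallelism in the offline `LLM` entry point (older versions only exposed it through the API server, i.e. `vllm serve --pipeline-parallel-size 2`).

```python
# Sketch: serving the AWQ 4-bit quant across two 3090s with vLLM.
# Assumes vLLM is installed with CUDA support and that this version
# supports pipeline parallelism in the offline LLM entry point.
from vllm import LLM, SamplingParams

llm = LLM(
    model="cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit",
    quantization="awq",        # 4-bit AWQ weights
    pipeline_parallel_size=2,  # split the layer stack across the two 3090s
    dtype="float16",           # 3090 (Ampere) has no FP8 support
)

outputs = llm.generate(
    ["Write a Python function that reverses a string."],
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

With two GPUs, tensor parallelism (`tensor_parallel_size=2`) is the more common alternative; pipeline parallelism mainly helps when the cards are connected over plain PCIe without NVLink, since it needs far less inter-GPU bandwidth.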