Is AWQ quantization possible for this model?

#17
by VivekMalipatel23 - opened

I am planning to run this on two 3090s with pipeline parallelism, but it looks like the 3090 doesn't support FP8. Can we get an AWQ quantized version of this model and of other newer Qwen variants?

Check cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit; it works on 2 x 3090. :)
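For anyone else trying this setup, a minimal sketch of serving the AWQ checkpoint across two GPUs with vLLM might look like the following (this assumes vLLM is installed with AWQ support; the exact flags and memory settings are illustrative, not tested on this specific model):

```shell
# Launch an OpenAI-compatible server, splitting the model across 2 GPUs
# with pipeline parallelism (one pipeline stage per 3090).
vllm serve cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit \
    --pipeline-parallel-size 2 \
    --quantization awq \
    --max-model-len 32768
```

Tensor parallelism (`--tensor-parallel-size 2`) is the other common way to split across two cards; pipeline parallelism avoids the need for fast inter-GPU links, at the cost of some throughput.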
