How can I run this model on Atlas 800I A2 (64G) with vLLM? If it is possible, what should I pay attention to?
#24, opened by FrankDubai
I noticed that line 294 of modeling_dots_vision.py has a flash-attention (FA) class registered under the key "ascend_fa". Can I use this key to run the model on Atlas servers?
I also noticed the code comment saying that "ascend_fa" may lead to some loss of accuracy. Is there any optimization for this?