Flash attn hotfix (#951)
* use previous arg
* use eager so the legacy attention implementation can still be patched
src/axolotl/utils/models.py (changed)
```diff
@@ -324,6 +324,10 @@ def load_model(
         model_config._attn_implementation = (  # pylint: disable=protected-access
             "flash_attention_2"
         )
+    else:
+        model_config._attn_implementation = (  # pylint: disable=protected-access
+            "eager"
+        )
 
     try:
         if cfg.is_llama_derived_model and not cfg.trust_remote_code and not cfg.gptq:
```
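For illustration, here is a minimal sketch of the selection this hunk performs, written against the public transformers config API. The helper name `pick_attn_implementation`, the `use_flash_attention` flag, and the example model ID are placeholders for this sketch, not part of axolotl:

```python
# Minimal sketch (assumptions: a transformers-style PretrainedConfig and a
# boolean flash-attention flag; names below are illustrative, not axolotl's API).
from transformers import AutoConfig, PretrainedConfig


def pick_attn_implementation(base_model: str, use_flash_attention: bool) -> PretrainedConfig:
    model_config = AutoConfig.from_pretrained(base_model)
    if use_flash_attention:
        # Let transformers dispatch to its FlashAttention-2 kernels.
        model_config._attn_implementation = (  # pylint: disable=protected-access
            "flash_attention_2"
        )
    else:
        # "eager" keeps the original (legacy) attention forward pass, which is
        # the code path that attention monkey-patches can still hook into.
        model_config._attn_implementation = (  # pylint: disable=protected-access
            "eager"
        )
    return model_config


# Example: request eager attention so downstream patches still apply.
# config = pick_attn_implementation("meta-llama/Llama-2-7b-hf", use_flash_attention=False)
```

The reason for forcing "eager" in the else branch, as the commit message states, is that it keeps the legacy attention modules in place so they can be patched; leaving the setting untouched could let transformers pick a different implementation that bypasses those modules.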