Make compatible with newer transformers
Issue
The model fails to load with new Transformers versions due to removed classes:
ImportError: cannot import name 'LlamaFlashAttention2' from 'transformers.models.llama.modeling_llama'
Root Cause
In modeling_deepseekv2.py (lines 37-39), the code imports:
from transformers.models.llama.modeling_llama import (
    LlamaAttention,
    LlamaFlashAttention2
)
These classes were removed in Transformers 4.47+ as part of the attention refactoring.
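As a quick sanity check, the failure reproduces with a plain import attempt, no DeepSeek code involved:

import transformers

print(transformers.__version__)
try:
    # Present in older releases, removed during the attention refactoring.
    from transformers.models.llama.modeling_llama import LlamaFlashAttention2  # noqa: F401
    print("LlamaFlashAttention2 is importable")
except ImportError as exc:
    print(f"Import failed as described above: {exc}")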
Proposed Fix
Since DeepSeek-OCR uses MLA (Multi-head Latent Attention) by default (config.use_mla = True), the Llama attention classes are only used as fallbacks for MHA mode. 
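For context, here is a minimal sketch of how the attention class is presumably selected; the key construction below is an assumption for illustration, not a quote from modeling_deepseekv2.py:

# Illustrative only; the actual selection code in modeling_deepseekv2.py may differ.
# ATTENTION_CLASSES is the module-level dict shown in the options below.
# The "mla_"/"mha_" prefix is assumed to come from config.use_mla, so the
# Llama-backed entries are only reached when use_mla is False.
def pick_attention_class(config, attn_implementation):
    prefix = "mla_" if getattr(config, "use_mla", True) else "mha_"
    return ATTENTION_CLASSES[prefix + attn_implementation]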
Option 1: Remove MHA support (simplest)
- Remove the imports (lines 37-39)
- Update the ATTENTION_CLASSES dict (lines 1022-1029):
ATTENTION_CLASSES = {
    "eager": DeepseekV2Attention,
    "flash_attention_2": DeepseekV2FlashAttention2,
    "mla_eager": DeepseekV2Attention,
    "mla_flash_attention_2": DeepseekV2FlashAttention2,
    # Removed mha_eager and mha_flash_attention_2
}
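After applying the edit (this goes for any of the options), a quick smoke test is to reload the checkpoint and confirm the ImportError is gone. The model id below is assumed; substitute the one you actually use:

from transformers import AutoModel, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-OCR"  # assumed checkpoint id; adjust as needed
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
print(type(model).__name__)  # should instantiate without the ImportError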
Option 2: Use DeepSeek attention for MHA mode (backward compatible)
Keep the same keys but map to DeepSeek classes:
ATTENTION_CLASSES = {
    "eager": DeepseekV2Attention,
    "flash_attention_2": DeepseekV2FlashAttention2,
    "mla_eager": DeepseekV2Attention,
    "mla_flash_attention_2": DeepseekV2FlashAttention2,
    "mha_eager": DeepseekV2Attention,  # Changed
    "mha_flash_attention_2": DeepseekV2FlashAttention2,  # Changed
}
Option 3: Conditional import (most flexible)
try:
    # Older Transformers releases still ship the Llama attention subclasses.
    from transformers.models.llama.modeling_llama import (
        LlamaAttention,
        LlamaFlashAttention2
    )
    HAS_LLAMA_ATTENTION = True
except ImportError:
    # Newer Transformers removed them in the attention refactoring.
    HAS_LLAMA_ATTENTION = False

ATTENTION_CLASSES = {
    "eager": DeepseekV2Attention,
    "flash_attention_2": DeepseekV2FlashAttention2,
    "mla_eager": DeepseekV2Attention,
    "mla_flash_attention_2": DeepseekV2FlashAttention2,
}

if HAS_LLAMA_ATTENTION:
    # Preserve the original MHA behaviour when the Llama classes exist.
    ATTENTION_CLASSES.update({
        "mha_eager": LlamaAttention,
        "mha_flash_attention_2": LlamaFlashAttention2
    })
else:
    # Fall back to the DeepSeek classes so every key still resolves.
    ATTENTION_CLASSES.update({
        "mha_eager": DeepseekV2Attention,
        "mha_flash_attention_2": DeepseekV2FlashAttention2
    })
This works because DeepSeek-OCR uses MLA by default anyway.
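If you go with the conditional import, a one-off check that every key still resolves to a concrete class, on both old and new Transformers versions, looks like:

# Every mode should map to some attention class regardless of the installed
# Transformers version; on newer versions the mha_* entries fall back to DeepSeek's.
for key, cls in sorted(ATTENTION_CLASSES.items()):
    print(f"{key:26s} -> {cls.__name__}")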
All the same issues are still there; nothing will open and nothing works. This is a useless app, fix it.
@harpreetsahota
Does it really use MLA by default? Over here it says "use_mla": false, and mapping "mha_flash_attention_2" to DeepseekV2FlashAttention2 still does not work for me, though I am not sure whether that is an unrelated issue.
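For reference, the flag can be read straight from the checkpoint's config (model id assumed; substitute the one you are loading):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-OCR", trust_remote_code=True)
print(getattr(cfg, "use_mla", "not set"))  # False would select the mha_* entries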

