rednote-hilab/dots.ocr · Fix: Resolve TypeError for video

Fix: Resolve TypeError for video_processor during model loading.

#38

by prithivMLmods - opened 3 days ago

base: refs/heads/main ← from: refs/pr/38



Subject: Fix: Resolve `TypeError` for `video_processor` during model loading

Description:

This pull request addresses a `TypeError` raised when loading the dots.ocr model with recent versions of the transformers library. The error message, "Received a NoneType for argument 'video_processor', but a BaseVideoProcessor was expected," is triggered because `DotsVLProcessor` inherits from `Qwen2_5_VLProcessor`, which now expects a `video_processor`.

The Problem:

The current implementation of `DotsVLProcessor` does not handle the `video_processor` argument in its constructor. As the base classes in the transformers library have evolved, this argument has become a required part of processor initialization. Because `DotsVLProcessor` never supplies one, `None` is passed in its place and the `TypeError` is raised.
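The failure mode can be illustrated without installing transformers. The sketch below is a simplified stand-in, not the library's actual code: `BaseProcessorSketch`, `ParentProcessorSketch`, and `ChildProcessorSketch` are hypothetical names, and the validation loop only approximates how a processor base class might check the components it declares.

```python
class BaseProcessorSketch:
    """Hypothetical stand-in for a processor base class (not the transformers API)."""

    # Names of the components this class expects to receive.
    attributes = ["image_processor", "tokenizer"]

    def __init__(self, *args, **kwargs):
        # Pair positional arguments with the declared attribute names.
        provided = dict(zip(self.attributes, args))
        for name in self.attributes:
            value = kwargs.get(name, provided.get(name))
            if value is None:
                raise TypeError(
                    f"Received a NoneType for argument '{name}', "
                    "but a processor object was expected."
                )
            setattr(self, name, value)


class ParentProcessorSketch(BaseProcessorSketch):
    # Assumption: a newer parent class additionally declares a video processor.
    attributes = ["image_processor", "tokenizer", "video_processor"]


class ChildProcessorSketch(ParentProcessorSketch):
    # Mirrors the unpatched pattern: only two components are forwarded.
    def __init__(self, image_processor=None, tokenizer=None, **kwargs):
        super().__init__(image_processor, tokenizer)


def try_build():
    """Return the error text if construction fails, else 'loaded'."""
    try:
        ChildProcessorSketch(image_processor=object(), tokenizer=object())
    except TypeError as err:
        return str(err)
    return "loaded"


message = try_build()
print(message)
```

In this stand-in, the child inherits the parent's three-entry `attributes` list but forwards only two components, so the check on `video_processor` fails.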

The Solution:

This is resolved by a small change to the `DotsVLProcessor` class. Its `__init__` method now accepts `video_processor=None` without forwarding it to the parent constructor, and the class-level `attributes` list names only `image_processor` and `tokenizer`. This satisfies the parent class's checks without altering the model's core OCR functionality.

The change is as follows:

```python
from transformers import Qwen2_5_VLProcessor


class DotsVLProcessor(Qwen2_5_VLProcessor):
    # Declare only the components this processor actually uses.
    attributes = ["image_processor", "tokenizer"]
    def __init__(self, image_processor=None, tokenizer=None, video_processor=None, chat_template=None, **kwargs):
        # video_processor is accepted for compatibility but not forwarded to the parent.
        super().__init__(image_processor, tokenizer, chat_template=chat_template)
        self.image_token = "<|imgpad|>" if not hasattr(tokenizer, "image_token") else tokenizer.image_token
        self.image_token_id = 151665 if not hasattr(tokenizer, "image_token_id") else tokenizer.image_token_id
```

This ensures that the model remains compatible with recent library updates and can be loaded without error.
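To try the change before it is merged, the processor can be loaded from the pull-request ref. The following is a minimal sketch, assuming the Hub exposes this PR as `refs/pr/38` (as the page header indicates) and that `from_pretrained` accepts that ref through its `revision` argument. `pr_load_kwargs` and `load_processor_from_pr` are hypothetical helper names. The call needs network access, and `trust_remote_code=True` because the processor class ships with the model repository.

```python
def pr_load_kwargs(pr_number: int) -> dict:
    """Build from_pretrained keyword arguments that target a Hub pull-request ref."""
    return {
        "revision": f"refs/pr/{pr_number}",  # PR ref instead of the main branch
        "trust_remote_code": True,           # processor code lives in the model repo
    }


def load_processor_from_pr(repo_id: str, pr_number: int):
    """Load the processor from the PR ref (needs transformers and network access)."""
    # Imported lazily so pr_load_kwargs stays usable without transformers installed.
    from transformers import AutoProcessor

    return AutoProcessor.from_pretrained(repo_id, **pr_load_kwargs(pr_number))
```

Calling `load_processor_from_pr("rednote-hilab/dots.ocr", 38)` should then construct the processor without the `TypeError`, provided the installed transformers version matches the one the fix targets.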

The updated implementation, running with `transformers==4.57.1`, is available here:

HF Space: https://huggingface.co/spaces/prithivMLmods/Multimodal-OCR3

Fix: Resolve TypeError for video_processor during model loading. (300acbad)


Ready to merge

This branch is ready to get merged automatically.
