rednote-hilab/dots.ocr · chat

chat_template error

#26

by yanshuang - opened Sep 11


I get an error when I run the code below:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

model_path = "/juicefs-algorithm/models/nlp/huggingface/ocr/DotsOCR"


processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

image_path = "demo/demo_image1.jpg"
prompt = """Please output the layout information from the PDF image, including each layout element's bbox, its category, and the corresponding text content within the bbox.

1. Bbox format: [x1, y1, x2, y2]

2. Layout Categories: The possible categories are ['Caption', 'Footnote', 'Formula', 'List-item', 'Page-footer', 'Page-header', 'Picture', 'Section-header', 'Table', 'Text', 'Title'].

3. Text Extraction & Formatting Rules:
    - Picture: For the 'Picture' category, the text field should be omitted.
    - Formula: Format its text as LaTeX.
    - Table: Format its text as HTML.
    - All Others (Text, Title, etc.): Format their text as Markdown.

4. Constraints:
    - The output text must be the original text from the image, with no translation.
    - All layout elements must be sorted according to human reading order.

5. Final Output: The entire output must be a single JSON object.
"""

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": image_path
            },
            {"type": "text", "text": prompt}
        ]
    }
]

# Preparation for inference
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
print(f"\ntext = #{text}#")

It raises the following error:

Traceback (most recent call last):
  File "/data/shuang_yan/codes/cs_ai_system/tools_homework_grouping_sort/test_tokenizer_dotsocr.py", line 50, in <module>
    text = processor.apply_chat_template(
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1695, in apply_chat_template
    rendered_chat = compiled_template.render(
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 5, in top-level template code
TypeError: can only concatenate str (not "list") to str

I asked Gemini, and it said the cause is a problem with how the chat_template is defined.

Has anyone else run into this problem?
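For anyone reading the traceback: the message suggests that line 5 of the template joins a string to the message `content`, which in the script is a list of image/text parts. The sketch below only reproduces that Python-level error in isolation. `render_user_turn` and the `<user>` marker are made-up stand-ins for illustration, not the actual dots.ocr template.

```python
def render_user_turn(content):
    # Hypothetical stand-in for a template line that assumes `content`
    # is a plain string and concatenates it to a role marker.
    return "<user>" + content


# Same shape as the `content` field in the script above: a list of parts.
structured_content = [
    {"type": "image", "image": "demo/demo_image1.jpg"},
    {"type": "text", "text": "some prompt"},
]

# A plain string renders without trouble.
plain_result = render_user_turn("some prompt")

# A list of parts triggers the same TypeError seen in the traceback.
try:
    render_user_turn(structured_content)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(plain_result)
print(error_message)
```

If this is indeed what happens, the template being rendered expects string content, while the script passes the structured list form.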

luguoyixiazi

Sep 19

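The root cause is that you are using a newer version of transformers. To put it bluntly, for this project you have to replicate the environment 1:1. Otherwise you hit the many breaking changes introduced in later versions, and you will not be able to patch your way around them.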
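One way to check whether an environment matches the project's pins is to compare installed versions against them. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the `"0.0.0"` pin is a placeholder, not the version the project actually requires; take the real pins from the project's own requirements.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins):
    """Map each package whose installed version differs from its pin
    to a (wanted, found) tuple. `found` is None if it is not installed."""
    mismatches = {}
    for package, wanted in pins.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


# Placeholder pin for illustration only; replace with the project's real pins.
example_pins = {"transformers": "0.0.0"}

for package, (wanted, found) in find_mismatches(example_pins).items():
    print(f"{package}: wanted {wanted}, found {found}")
```

An empty result from `find_mismatches` means every listed package matches its pin exactly.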
