chat_template error
#26, opened by yanshuang
I get an error when I run the code below:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
model_path = "/juicefs-algorithm/models/nlp/huggingface/ocr/DotsOCR"
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
image_path = "demo/demo_image1.jpg"
prompt = """Please output the layout information from the PDF image, including each layout element's bbox, its category, and the corresponding text content within the bbox.
1. Bbox format: [x1, y1, x2, y2]
2. Layout Categories: The possible categories are ['Caption', 'Footnote', 'Formula', 'List-item', 'Page-footer', 'Page-header', 'Picture', 'Section-header', 'Table', 'Text', 'Title'].
3. Text Extraction & Formatting Rules:
- Picture: For the 'Picture' category, the text field should be omitted.
- Formula: Format its text as LaTeX.
- Table: Format its text as HTML.
- All Others (Text, Title, etc.): Format their text as Markdown.
4. Constraints:
- The output text must be the original text from the image, with no translation.
- All layout elements must be sorted according to human reading order.
5. Final Output: The entire output must be a single JSON object.
"""
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": image_path
            },
            {"type": "text", "text": prompt}
        ]
    }
]
# Preparation for inference
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
print(f"\ntext = #{text}#")
It raises the following error:
Traceback (most recent call last):
  File "/data/shuang_yan/codes/cs_ai_system/tools_homework_grouping_sort/test_tokenizer_dotsocr.py", line 50, in <module>
    text = processor.apply_chat_template(
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1695, in apply_chat_template
    rendered_chat = compiled_template.render(
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/data/shuang_yan/qx_agent/lib/python3.10/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 5, in top-level template code
TypeError: can only concatenate str (not "list") to str
I asked Gemini, and it said the cause is a problem with how the chat_template is defined:
Has anyone else run into this problem?
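The traceback points at a string concatenation inside the template, so one plausible reading is that the template expects each message's content to be a plain string while the script passes a list of parts. The template's line 5 is not shown in this thread, so the sketch below is only an assumption: `render_turn` and the `<|user|>` style marker are hypothetical stand-ins for a template that concatenates the content directly.

```python
# Hypothetical stand-in for a chat template that assumes string content.
# It is NOT the model's real template, which is not shown in this thread.
def render_turn(message):
    # Mirrors a template expression such as: '<|' + role + '|>' + content
    return "<|" + message["role"] + "|>" + message["content"]


# Content given as a plain string.
string_message = {"role": "user", "content": "Describe the page layout."}

# Content given as a list of typed parts, as in the script above.
list_message = {
    "role": "user",
    "content": [
        {"type": "image", "image": "demo/demo_image1.jpg"},
        {"type": "text", "text": "Describe the page layout."},
    ],
}

# Works: str + str.
rendered = render_turn(string_message)

# Fails: str + list raises TypeError, matching the kind of error in the traceback.
try:
    render_turn(list_message)
    error_text = None
except TypeError as exc:
    error_text = str(exc)

print(rendered)
print(error_text)
```

If this reading is right, the mismatch is between the message format and the template that gets loaded, which is consistent with the behaviour changing from one library version to another.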
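The underlying problem is that you are using a newer version of transformers. To put it bluntly, for this project you have to replicate the environment 1:1; otherwise you will hit many breaking changes from later versions that you simply cannot patch around.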
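A quick way to compare your environment with the one the project expects is to print the installed versions of the relevant packages. This is a minimal sketch: the helper name `installed_version` is made up for illustration, and the package list is a guess at what matters here. Check the output against whatever versions the project's own requirements specify.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages that appear in the script and traceback above.
for name in ("transformers", "jinja2", "torch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```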
