Improve model card: add `library_name`, refine `pipeline_tag`, add HF paper and project links

#29
Opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +17 -8
README.md CHANGED
@@ -1,14 +1,16 @@
  ---
- pipeline_tag: image-text-to-text
  language:
  - multilingual
+ license: mit
+ pipeline_tag: image-to-text
  tags:
  - deepseek
  - vision-language
  - ocr
  - custom_code
- license: mit
+ library_name: transformers
  ---
+
  <div align="center">
  <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" width="60%" alt="DeepSeek AI" />
  </div>
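These front-matter fields are what the Hub indexes: `pipeline_tag` determines which task filter the model appears under, and `library_name: transformers` is what lets the model page offer a Transformers loading snippet. A minimal sketch for reading the metadata back with `huggingface_hub` is shown below; it is not part of the PR, and the printed values assume the front matter proposed in this hunk has been merged:

```python
from huggingface_hub import model_info

# Fetch the card metadata as the Hub sees it (assumes this PR's front matter is merged).
info = model_info("deepseek-ai/DeepSeek-OCR")

print(info.pipeline_tag)   # expected "image-to-text" with this PR's metadata
print(info.library_name)   # expected "transformers"
print(info.tags)           # should still include "deepseek", "ocr", "custom_code", ...
```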
@@ -39,21 +41,25 @@ license: mit
  <p align="center">
  <a href="https://github.com/deepseek-ai/DeepSeek-OCR"><b>🌟 Github</b></a> |
  <a href="https://huggingface.co/deepseek-ai/DeepSeek-OCR"><b>📥 Model Download</b></a> |
- <a href="https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf"><b>📄 Paper Link</b></a> |
+ <a href="https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf"><b>📄 PDF Paper Link</b></a> |
  <a href="https://arxiv.org/abs/2510.18234"><b>📄 Arxiv Paper Link</b></a> |
+ <a href="https://huggingface.co/papers/2510.18234"><b>📄 Hugging Face Paper Link</b></a>
  </p>
  <h2>
  <p align="center">
- <a href="">DeepSeek-OCR: Contexts Optical Compression</a>
+ <a href="https://huggingface.co/papers/2510.18234">DeepSeek-OCR: Contexts Optical Compression</a>
  </p>
  </h2>
  <p align="center">
  <img src="assets/fig1.png" style="width: 1000px" align=center>
  </p>
  <p align="center">
- <a href="">Explore the boundaries of visual-text compression.</a>
+ <a href="https://huggingface.co/papers/2510.18234">Explore the boundaries of visual-text compression.</a>
  </p>

+ ## Project Page
+ https://www.deepseek.com/
+
  ## Usage
  Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8:
@@ -78,8 +84,10 @@ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
  model = AutoModel.from_pretrained(model_name, _attn_implementation='flash_attention_2', trust_remote_code=True, use_safetensors=True)
  model = model.eval().cuda().to(torch.bfloat16)

- # prompt = "<image>\nFree OCR. "
- prompt = "<image>\n<|grounding|>Convert the document to markdown. "
+ # prompt = "<image>
+ Free OCR. "
+ prompt = "<image>
+ <|grounding|>Convert the document to markdown. "
  image_file = 'your_image.jpg'
  output_path = 'your/output/dir'
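The hunk above shows only a fragment of the README's usage example. For context, here is a self-contained sketch of how the pieces fit together; the imports, the `model_name` assignment, and the final `model.infer(...)` call with its keyword arguments are not visible in this excerpt and are assumptions about the repository's custom code, not a verbatim quote of the README:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed repo id; the excerpt above does not show how model_name is defined.
model_name = 'deepseek-ai/DeepSeek-OCR'

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# flash_attention_2 requires the flash-attn package and a supported NVIDIA GPU.
model = AutoModel.from_pretrained(model_name, _attn_implementation='flash_attention_2',
                                  trust_remote_code=True, use_safetensors=True)
model = model.eval().cuda().to(torch.bfloat16)

# Two prompt styles from the README: plain OCR, or grounded markdown conversion.
# prompt = "<image>\nFree OCR. "
prompt = "<image>\n<|grounding|>Convert the document to markdown. "
image_file = 'your_image.jpg'
output_path = 'your/output/dir'

# Assumed entry point exposed by the model's remote code (loaded via trust_remote_code=True);
# the argument names here are illustrative, not taken from this diff.
res = model.infer(tokenizer, prompt=prompt, image_file=image_file,
                  output_path=output_path, save_results=True)
```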
 
@@ -125,4 +133,5 @@ We also appreciate the benchmarks: [Fox](https://github.com/ucaslcl/Fox), [Omini
  author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
  journal={arXiv preprint arXiv:2510.18234},
  year={2025}
- }
+ }
+ ```
 