Theo Viel committed on
Commit af7c76e · 1 Parent(s): c35c339
Files changed (2)
  1. README.md +6 -3
  2. viz.png +3 -0
README.md CHANGED
@@ -18,6 +18,11 @@ tags:
18
 
19
  ## **Model Overview**
20
 
21
  ### **Description**
22
 
23
  The **NeMo Retriever Graphic Elements v1** model is a specialized object detection system designed to identify and extract key elements from charts and graphs. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
@@ -73,7 +78,7 @@ The **NeMo Retriever Graphic Elements v1** is designed for automating extraction
73
  **Architecture Type**: YOLOX <br>
74
  **Network Architecture**: DarkNet53 Backbone \+ FPN Decoupled head (one 1x1 convolution \+ 2 parallel 3x3 convolutions (one for the classification and one for the bounding box prediction). YOLOX is a single-stage object detector that improves on Yolo-v3. <br>
75
  **This model was developed based on the Yolo architecture** <br>
76
- **Number of model parameters**: $5.4*10^7$ <br>
77
 
78
  ### Input
79
 
@@ -173,12 +178,10 @@ plt.show()
173
  Note that this repository only provides minimal code to infer the model.
174
  If you wish to do additional training, [refer to the original repo](https://github.com/Megvii-BaseDetection/YOLOX).
175
 
176
- <!---
177
  3. Advanced post-processing
178
 
179
  Additional post-processing might be required to use the model as part of a data extraction pipeline.
180
  We provide examples in the notebook `Demo.ipynb`.
181
- --->
182
 
183
  <!---
184
  ### Software Integration
 
18
 
19
  ## **Model Overview**
20
 
21
+ ![viz.png](viz.png)
22
+ *Preview of the model output on the example image.*
23
+
24
+ The input of this model is expected to be a chart image. You can use [Nemoretriever Page Elements v3](https://huggingface.co/nvidia/nemoretriever-page-elements-v3) to detect and crop such images.
25
+
26
  ### **Description**
27
 
28
  The **NeMo Retriever Graphic Elements v1** model is a specialized object detection system designed to identify and extract key elements from charts and graphs. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
 
78
  **Architecture Type**: YOLOX <br>
79
  **Network Architecture**: DarkNet53 Backbone \+ FPN Decoupled head (one 1x1 convolution \+ 2 parallel 3x3 convolutions (one for the classification and one for the bounding box prediction). YOLOX is a single-stage object detector that improves on Yolo-v3. <br>
80
  **This model was developed based on the Yolo architecture** <br>
81
+ **Number of model parameters**: 5.4e7 <br>
82
 
83
  ### Input
84
 
 
178
  Note that this repository only provides minimal code to infer the model.
179
  If you wish to do additional training, [refer to the original repo](https://github.com/Megvii-BaseDetection/YOLOX).
180
 
 
181
  3. Advanced post-processing
182
 
183
  Additional post-processing might be required to use the model as part of a data extraction pipeline.
184
  We provide examples in the notebook `Demo.ipynb`.
 
185
 
186
  <!---
187
  ### Software Integration
viz.png ADDED

Git LFS Details

  • SHA256: 8057a578d8e521eef82d33ed79a1c72d3eca19fc802c6e736b7d7912ca25449b
  • Pointer size: 131 Bytes
  • Size of remote file: 232 kB
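
The README line added in this commit says the model expects a chart image, typically cropped from a full page with a page-element detector. The sketch below illustrates only the cropping arithmetic. It assumes, purely for illustration, that the detector returns boxes as normalized `(x_min, y_min, x_max, y_max)` fractions of the page size; the helper name `to_pixel_box` and that box format are hypothetical and are not taken from either model's repository.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]


def to_pixel_box(box: Box, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a normalized (x_min, y_min, x_max, y_max) box to pixel coordinates.

    Assumption (not from the source): coordinates are fractions of the page
    size in [0, 1]. Values outside that range are clamped to the page bounds.
    """
    x_min, y_min, x_max, y_max = (min(max(v, 0.0), 1.0) for v in box)
    if x_max <= x_min or y_max <= y_min:
        raise ValueError(f"degenerate box after clamping: {box}")
    # Round outward so the crop never cuts into the detected region.
    left = math.floor(x_min * width)
    top = math.floor(y_min * height)
    right = math.ceil(x_max * width)
    bottom = math.ceil(y_max * height)
    return left, top, right, bottom


if __name__ == "__main__":
    # Hypothetical detection covering the lower-middle part of an 800x600 page.
    print(to_pixel_box((0.25, 0.5, 0.75, 1.0), 800, 600))  # (200, 300, 600, 600)
```

The returned tuple follows the `(left, upper, right, lower)` order used by Pillow's `Image.crop`, so it could be passed to that method directly before running the chart model on the cropped image.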