Add viz

Files changed (2)

README.md +6 -3
viz.png +3 -0

README.md CHANGED

@@ -18,6 +18,11 @@ tags:
 ## **Model Overview**
+![viz.png](viz.png)
+*Preview of the model output on the example image.*
+The input of this model is expected to be a chart image. You can use the [Nemoretriever Page Element v3](https://huggingface.co/nvidia/nemoretriever-page-elements-v3) to detect and crop such images.
 ### **Description**
 The **NeMo Retriever Graphic Elements v1** model is a specialized object detection system designed to identify and extract key elements from charts and graphs. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
@@ -73,7 +78,7 @@ The **NeMo Retriever Graphic Elements v1** is designed for automating extraction
 **Architecture Type**: YOLOX <br>
 **Network Architecture**: DarkNet53 Backbone \+ FPN Decoupled head (one 1x1 convolution \+ 2 parallel 3x3 convolutions (one for the classification and one for the bounding box prediction). YOLOX is a single-stage object detector that improves on Yolo-v3. <br>
 **This model was developed based on the Yolo architecture** <br>
-**Number of model parameters**: $5.4*10^7$ <br>
+**Number of model parameters**: 5.4e7 <br>
 ### Input
@@ -173,12 +178,10 @@ plt.show()
 Note that this repository only provides minimal code to infer the model.
 If you wish to do additional training, [refer to the original repo](https://github.com/Megvii-BaseDetection/YOLOX).
-<!---
 3. Advanced post-processing
 Additional post-processing might be required to use the model as part of a data extraction pipeline.
 We provide examples in the notebook `Demo.ipynb`.
---->
 <!---
 ### Software Integration


viz.png ADDED

Git LFS Details

SHA256: 8057a578d8e521eef82d33ed79a1c72d3eca19fc802c6e736b7d7912ca25449b
Pointer size: 131 Bytes
Size of remote file: 232 kB
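
A Git LFS object id is the SHA-256 of the file's contents, so a downloaded copy of `viz.png` can be checked against the digest listed above. The following is a minimal sketch of such a check, not part of this change: the local path `viz.png` and the helper names `sha256_of_file` and `matches_lfs_oid` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Digest copied from the Git LFS details listed for viz.png.
EXPECTED_SHA256 = "8057a578d8e521eef82d33ed79a1c72d3eca19fc802c6e736b7d7912ca25449b"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_oid(path, expected_hex=EXPECTED_SHA256):
    """Compare a local file's SHA-256 with an expected LFS object id (hypothetical helper)."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    local_copy = Path("viz.png")  # assumed location of the downloaded file
    if local_copy.exists():
        print("digest matches" if matches_lfs_oid(local_copy) else "digest mismatch")
    else:
        print("viz.png not found; download it before running the check")
```

A mismatch usually means the checkout contains the small LFS pointer file (131 bytes here) rather than the image itself.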