---
title: Rex-Thinker Demo
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
app_file: demo/app.py
pinned: false
license: apache-2.0
---

# Rex-Thinker Demo

This is a demo application for Rex-Thinker-GRPO, a visual reasoning model that combines GroundingDINO for object detection with advanced referring expression comprehension.

## Features

- **Object Detection**: Uses GroundingDINO to detect objects based on category names
- **Referring Expression Comprehension**: Identifies specific objects based on detailed descriptions
- **Interactive Web Interface**: Easy-to-use Gradio interface with real-time streaming
- **Visual Reasoning**: Shows the model's thinking process with detailed explanations

## How to Use

1. **Upload an Image**: Click "Input Image" and upload your image
2. **Set Object Category**: Enter the general category of objects you want to detect (e.g., "person", "car", "dog")
3. **Enter Referring Expression**: Provide a detailed description of the specific object you want to identify (e.g., "person wearing red shirt and black hat")
4. **Adjust Visualization Settings**: Change the draw width and font size to make the drawn boxes and labels easier to read
5. **Run the Model**: Click "Run with Streaming" to see the results

## Examples

The demo includes several pre-loaded examples:

- Tomato detection
- Helmet identification
- Person in vehicle
- Text recognition on clothing
- Pet detection

## Technical Details

- **Base Model**: Rex-Thinker-GRPO-7B
- **Object Detection**: GroundingDINO with SwinT backbone
- **Framework**: Gradio for web interface
- **Inference**: Supports streaming text generation
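
As a rough illustration of streaming text generation: Gradio treats a generator function as a streaming handler, and each yielded value replaces the current content of the output component. The sketch below shows only that accumulation pattern in plain Python. The name `stream_text` and the chunk source are hypothetical, and the actual handler in `demo/app.py` may be structured differently.

```python
from typing import Iterable, Iterator


def stream_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text accumulated so far after each incoming chunk.

    Hypothetical helper: `chunks` stands in for whatever incremental
    output the model produces during generation.
    """
    text = ""
    for chunk in chunks:
        text += chunk
        # Yield the full text so far, so a UI can simply overwrite
        # the output box with the latest value.
        yield text


if __name__ == "__main__":
    for partial in stream_text(["The ", "person ", "on the left"]):
        print(partial)
```

Yielding the accumulated text rather than each individual chunk keeps the consumer simple, since it never has to track earlier chunks itself.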

## Model Information

Rex-Thinker-GRPO is a multimodal reasoning model that:

1. Uses GroundingDINO to propose candidate object locations
2. Applies visual reasoning to identify specific objects based on referring expressions
3. Provides detailed explanations of its reasoning process
4. Outputs precise bounding box coordinates for detected objects
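
The four steps above can be sketched as a small two-stage pipeline. This is a minimal illustration under stated assumptions, not the model's actual interface: `Box`, `Result`, `refer`, and `leftmost_reasoner` are hypothetical names, the candidate boxes are hard-coded where a detector's proposals would go, and a toy rule stands in for the model's visual reasoning.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: (x0, y0) top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Result:
    reasoning: str    # explanation of how the selection was made
    boxes: List[Box]  # boxes that match the referring expression


# A reasoner takes the candidates and the expression, and returns
# an explanation plus the indices of the selected candidates.
Reasoner = Callable[[Sequence[Box], str], Tuple[str, List[int]]]


def refer(candidates: Sequence[Box], expression: str, reasoner: Reasoner) -> Result:
    """Select the candidate boxes that match `expression`."""
    reasoning, picked = reasoner(candidates, expression)
    # Ignore any index that does not point at a real candidate.
    boxes = [candidates[i] for i in picked if 0 <= i < len(candidates)]
    return Result(reasoning=reasoning, boxes=boxes)


def leftmost_reasoner(candidates: Sequence[Box], expression: str) -> Tuple[str, List[int]]:
    """Toy stand-in for the model: always picks the leftmost candidate."""
    if not candidates:
        return "No candidates were proposed.", []
    idx = min(range(len(candidates)), key=lambda i: candidates[i].x0)
    return f"Candidate {idx} has the smallest x0, so it is the leftmost.", [idx]


if __name__ == "__main__":
    # Hard-coded stand-ins for detector proposals.
    proposals = [Box(50, 10, 90, 80), Box(5, 12, 40, 85)]
    result = refer(proposals, "the leftmost person", leftmost_reasoner)
    print(result.reasoning)
    print(result.boxes)
```

Keeping proposal and selection separate mirrors the description above: the reasoning step only chooses among boxes the detector has already produced, so the output coordinates always come from a proposed candidate.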

For more information, visit the [original repository](https://github.com/IDEA-Research/Rex-Thinker-GRPO).