	---
	title: Rex-Thinker Demo
	emoji: 🔍
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.44.1
	app_file: demo/app.py
	pinned: false
	license: apache-2.0
	---

	# Rex-Thinker Demo

	This is a demo application for Rex-Thinker-GRPO, a visual reasoning model for referring expression comprehension. The demo uses GroundingDINO to detect candidate objects, then has Rex-Thinker-GRPO reason over those candidates to pick the ones that match a description.

	## Features

	- Object Detection: Uses GroundingDINO to detect objects based on category names
	- Referring Expression Comprehension: Identifies specific objects based on detailed descriptions
	- Interactive Web Interface: Easy-to-use Gradio interface with real-time streaming
	- Visual Reasoning: Shows the model's thinking process with detailed explanations

	## How to Use

	1. Upload an Image: Click on "Input Image" to upload your image
	2. Set Object Category: Enter the general category of objects you want to detect (e.g., "person", "car", "dog")
	3. Enter Referring Expression: Provide a detailed description of the specific object you want to identify (e.g., "person wearing red shirt and black hat")
	4. Adjust Visualization Settings: Modify draw width and font size for better visualization
	5. Run the Model: Click "Run with Streaming" to see the results
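
The five steps above map onto a single handler that takes the image, the category, the referring expression, and the two visualization settings. The following is a minimal sketch of such a handler, not the code in `demo/app.py`: the function name, the parameter names, the default values, and the placeholder text chunks are all assumptions made for illustration. It is written as a generator because Gradio shows each value yielded by a generator event handler in turn, which is one common way to get a streaming text box.

```python
from typing import Iterator


def run_with_streaming(
    image_path: str,
    category: str,
    expression: str,
    draw_width: int = 5,   # hypothetical default
    font_size: int = 20,   # hypothetical default
) -> Iterator[str]:
    """Yield the reasoning text accumulated so far, one chunk at a time."""
    if not category.strip() or not expression.strip():
        raise ValueError("category and referring expression must be non-empty")

    # Placeholder for the model's token stream. A real handler would read
    # chunks from the model here, and would pass image_path, draw_width and
    # font_size to the detection and drawing steps, which this sketch omits.
    chunks = [
        f"Looking for '{expression}' among detected '{category}' objects. ",
        "Comparing each candidate against the description. ",
        "Done.",
    ]
    text = ""
    for chunk in chunks:
        text += chunk
        yield text  # each yielded value replaces the text shown so far


if __name__ == "__main__":
    for partial in run_with_streaming(
        "photo.jpg", "person", "person wearing red shirt and black hat"
    ):
        print(partial)
```

Yielding the accumulated text rather than only the newest chunk keeps the handler stateless from the interface's point of view: whatever was yielded last is what the user sees.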

	## Examples

	The demo includes several pre-loaded examples:
	- Tomato detection
	- Helmet identification
	- Person in vehicle
	- Text recognition on clothing
	- Pet detection

	## Technical Details

	- Base Model: Rex-Thinker-GRPO-7B
	- Object Detection: GroundingDINO with SwinT backbone
	- Framework: Gradio for web interface
	- Inference: Supports streaming text generation
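
When passing GroundingDINO proposals to a later stage, box format is a common source of bugs. To my understanding, the GroundingDINO reference implementation returns boxes as normalized `(cx, cy, w, h)` values, while drawing code usually wants pixel `(x0, y0, x1, y1)` corners. The sketch below shows that conversion in plain Python. The helper name is hypothetical, and whether this demo converts boxes this way is an assumption, so check the format against the GroundingDINO version you use.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height
    # Clamp to the image so boxes that spill over the edge stay drawable.
    x0 = min(max(x0, 0.0), float(image_width))
    y0 = min(max(y0, 0.0), float(image_height))
    x1 = min(max(x1, 0.0), float(image_width))
    y1 = min(max(y1, 0.0), float(image_height))
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # A box centered in a 200x100 image, covering half of each dimension.
    print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.5), 200, 100))
```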

	## Model Information

	Rex-Thinker-GRPO is a multimodal reasoning model. In this demo it runs in a pipeline that:
	1. Uses GroundingDINO to propose candidate object locations
	2. Applies visual reasoning to identify specific objects based on referring expressions
	3. Provides detailed explanations of its reasoning process
	4. Outputs precise bounding box coordinates for detected objects
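
The four steps above can be sketched as a propose-then-select loop. This is an illustration of the control flow only, under stated assumptions: the `refer` function, the `Result` type, and the `matches` callback are hypothetical names, and the callback stands in for the model's reasoning, which in the real system is a multimodal language model rather than a Python predicate.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class Result:
    explanation: str  # one line of reasoning per candidate
    boxes: List[Box]  # candidates accepted as matching the expression


def refer(
    candidates: List[Box],
    expression: str,
    matches: Callable[[Box, str], bool],
) -> Result:
    """Check each proposed box against the expression and keep the matches."""
    steps = []
    selected = []
    for index, box in enumerate(candidates, start=1):
        accepted = matches(box, expression)
        steps.append(f"Candidate {index}: {'match' if accepted else 'no match'}")
        if accepted:
            selected.append(box)
    return Result(explanation="\n".join(steps), boxes=selected)


if __name__ == "__main__":
    # Two made-up proposals; the stand-in rule accepts boxes on the left.
    proposals = [(10.0, 20.0, 60.0, 120.0), (200.0, 30.0, 260.0, 140.0)]
    result = refer(proposals, "person on the left", lambda box, _: box[0] < 100)
    print(result.explanation)
    print(result.boxes)
```

Separating proposal from selection means the reasoning step only has to judge a short list of candidate boxes instead of localizing objects from scratch.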

	For more information, visit the [original repository](https://github.com/IDEA-Research/Rex-Thinker-GRPO).