---
title: AIDetector
emoji: 📉
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
license: mit
---

# Advanced AI Text Detector 🔍

An advanced AI text detection system that identifies AI-generated content, particularly from ChatGPT and similar language models.

## Features

### 🤖 Dual Detection Methods
- **Transformer-based Detection**: Uses a fine-tuned RoBERTa model trained specifically to detect ChatGPT output
- **Statistical Analysis**: Employs multiple linguistic metrics for robust detection

### 📊 Comprehensive Analysis Metrics
- **Burstiness Analysis**: Measures sentence length variation (human text is typically more "bursty")
- **Vocabulary Diversity**: Analyzes lexical richness and word variety
- **Repetition Detection**: Identifies repeated phrases and patterns
- **Perplexity Scoring**: Evaluates text predictability (lower perplexity means more predictable text)
- **Punctuation Patterns**: Analyzes punctuation consistency
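
Two of the metrics above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code in `app.py`: the function names, the naive regex sentence splitter, and the choice of coefficient of variation for burstiness and type-token ratio for vocabulary diversity are all assumptions.

```python
import re
import statistics


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths, measured in words.

    Higher values mean sentence lengths vary more.
    """
    lengths = [len(s.split()) for s in split_sentences(text)]
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


def type_token_ratio(text: str) -> float:
    """Unique words divided by total words; a simple lexical-richness measure."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0
```

Text made of equally long sentences scores a burstiness of 0, while text that mixes very short and very long sentences scores higher.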

### 🎯 High Accuracy Features
- Multi-method ensemble approach for improved accuracy
- Confidence scoring system
- Detailed explanations for each detection
- Visual probability distribution

## How It Works

1. **Input Processing**: The text is tokenized and prepared for analysis
2. **Transformer Analysis**: If available, the RoBERTa model provides an initial AI probability
3. **Statistical Analysis**: Multiple linguistic features are extracted and analyzed
4. **Score Combination**: Results are weighted and combined into the final prediction
5. **Result Generation**: A detailed report is produced with the classification, confidence, and explanations
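
Steps 2 and 4 can be illustrated with a small weighted average that falls back to the statistical score when the transformer model is unavailable. This is a sketch under stated assumptions: the function name `combine_scores` and the default weight of 0.7 are hypothetical, and the weights the app actually uses are not documented here.

```python
from typing import Optional


def combine_scores(
    statistical_prob: float,
    transformer_prob: Optional[float] = None,
    transformer_weight: float = 0.7,  # hypothetical weight, not taken from app.py
) -> float:
    """Blend two AI probabilities (each in [0, 1]) into a single score."""
    if not 0.0 <= transformer_weight <= 1.0:
        raise ValueError("transformer_weight must be between 0 and 1")
    if transformer_prob is None:
        # Transformer model unavailable: rely on the statistical score alone.
        combined = statistical_prob
    else:
        combined = (
            transformer_weight * transformer_prob
            + (1.0 - transformer_weight) * statistical_prob
        )
    # Clamp so that out-of-range inputs cannot produce an invalid probability.
    return min(1.0, max(0.0, combined))
```

Keeping the transformer score optional means the same function covers both the full ensemble and the statistics-only path.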

## Detection Categories

- **AI-Generated**: >80% AI probability (High confidence)
- **Likely AI-Generated**: 60-80% AI probability (Medium confidence)
- **Uncertain**: 40-60% AI probability (Low confidence)
- **Likely Human-Written**: 20-40% AI probability (Medium confidence)
- **Human-Written**: <20% AI probability (High confidence)
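
The categories above map onto a simple threshold function. In this sketch the function name `classify` is hypothetical, and the handling of shared edges is an assumption: the list gives 60% and 40% to two bands each, so the code assigns an exact 60% or 40% to the higher band.

```python
def classify(ai_probability: float) -> tuple[str, str]:
    """Map an AI probability in [0, 1] to a (category, confidence) pair."""
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("ai_probability must be between 0 and 1")
    if ai_probability > 0.80:
        return ("AI-Generated", "High")
    if ai_probability >= 0.60:  # assumption: exactly 60% counts as "Likely AI-Generated"
        return ("Likely AI-Generated", "Medium")
    if ai_probability >= 0.40:  # assumption: exactly 40% counts as "Uncertain"
        return ("Uncertain", "Low")
    if ai_probability >= 0.20:
        return ("Likely Human-Written", "Medium")
    return ("Human-Written", "High")
```

For example, `classify(0.95)` falls in the top band and `classify(0.5)` is reported as uncertain.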

## Usage Tips

- Provide at least 100 words for optimal accuracy
- Longer texts generally yield more reliable results
- The detector works best with English text
- Results are probabilistic; use them as guidance, not absolute truth

## Technical Stack

- **Gradio**: Interactive web interface
- **Transformers**: Hugging Face transformer models
- **PyTorch**: Deep learning backend
- **SciPy/NumPy**: Statistical analysis

## Limitations

- Best performance with English text
- Requires sufficient text length (minimum 50 characters, optimal 100+ words)
- Detection accuracy may vary with highly technical or specialized content
- Should be used as a tool for guidance, not definitive judgment

## Deployment

This app is designed to run on Hugging Face Spaces. Upload the files to your Space and it will build and deploy automatically.

## Model Credit

This detector uses the `Hello-SimpleAI/chatgpt-detector-roberta` model from Hugging Face, combined with custom statistical analysis methods.
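
For readers who want to query the model directly, the sketch below loads it through the Transformers `pipeline` API and reads off the AI-class score. It is not the code in `app.py`. The helper names are hypothetical, and the label string `"ChatGPT"` is an assumption about the model's configuration; check `detector.model.config.id2label` before relying on it.

```python
MODEL_ID = "Hello-SimpleAI/chatgpt-detector-roberta"
AI_LABEL = "ChatGPT"  # assumption: verify against detector.model.config.id2label


def load_detector():
    """Download the model and build a text-classification pipeline."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def ai_probability(text: str, detector, ai_label: str = AI_LABEL) -> float:
    """Return the score the detector assigns to the AI class for one text."""
    # Passing a list and top_k=None yields, per input, a list of
    # {"label": ..., "score": ...} dicts covering every class.
    scores = detector([text], top_k=None, truncation=True)[0]
    for entry in scores:
        if entry["label"] == ai_label:
            return float(entry["score"])
    raise KeyError(f"label {ai_label!r} not found in detector output")
```

A typical call would be `ai_probability(some_text, load_detector())`. The first call to `load_detector()` downloads the model weights, so it needs network access.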

---

**Note**: AI detection is a rapidly evolving field. No detector is 100% accurate, and results should be interpreted with appropriate context and judgment.