
title: Pas2 Llm Hallucination Detector
emoji: 🐠
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.20.1
app_file: app.py
pinned: false
license: mit
short_description: pas2 is an llm-as-a-judge system used to verify outputs

PAS2 - Hallucination Detection System

PAS2 detects hallucinations in AI responses by paraphrasing the user's query, collecting a response for each paraphrase, and using a second model as a judge to check the responses for factual inconsistencies.

Features

Paraphrase Generation: Automatically generates semantically equivalent variations of user queries
Multi-Model Architecture: Uses Mistral Large for responses and OpenAI's o3-mini as a judge
Real-time Progress Tracking: Visual feedback during the analysis process
Permanent Cloud Storage: User feedback and results are stored in MongoDB Atlas, so they persist across Space restarts
Interactive Web Interface: Clean, responsive Gradio interface with example queries
Detailed Analysis: Provides confidence scores, reasoning, and specific conflicting facts
Statistics Dashboard: Real-time tracking of hallucination detection statistics

Setup

Clone this repository
Install dependencies:
```
pip install -r requirements.txt
```
Set up your API keys as environment variables:
- HF_MISTRAL_API_KEY: Your Mistral AI API key
- HF_OPENAI_API_KEY: Your OpenAI API key
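
As a minimal sketch of how an application could read these two variables at startup and fail early when one is missing: the helper name `load_api_keys` is an assumption for illustration and is not taken from `app.py`.

```python
import os

# The two variable names come from the setup step above.
REQUIRED_KEYS = ("HF_MISTRAL_API_KEY", "HF_OPENAI_API_KEY")


def load_api_keys(env=None):
    """Return the required API keys, raising if any is missing or empty.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    (Hypothetical helper, not part of the PAS2 code base.)
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Calling `load_api_keys()` with no argument reads the real environment, so a missing key is reported by name before any model request is made.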

Deployment on Hugging Face Spaces

Create a new Space on Hugging Face
Select "Gradio" as the SDK
Add your repository
Set up a MongoDB Atlas database (see below)
Set the following secrets in your Space's settings:
- HF_MISTRAL_API_KEY
- HF_OPENAI_API_KEY
- MONGODB_URI

MongoDB Atlas Setup

For data storage that persists across Hugging Face Space restarts:

Create a free MongoDB Atlas account
Create a new cluster (the free tier is sufficient)
In the "Database Access" menu, create a database user with read/write permissions
In the "Network Access" menu, add IP 0.0.0.0/0 to allow access from anywhere (required for Hugging Face Spaces)
In the "Databases" section, click "Connect" and choose "Connect your application"
Copy the connection string and replace <password> with your database user's password
Set this as your MONGODB_URI secret in your Hugging Face Space's settings
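
Passwords containing characters such as `@`, `/` or `:` need to be percent-encoded before they go into a MongoDB connection string. The sketch below shows one way to fill in the `<password>` placeholder safely. The helper name `fill_password`, the user name, and the cluster host are made up for illustration; use the connection string copied from Atlas.

```python
from urllib.parse import quote_plus


def fill_password(uri_template, password):
    """Substitute a percent-encoded password for the <password> placeholder."""
    if "<password>" not in uri_template:
        raise ValueError("connection string has no <password> placeholder")
    return uri_template.replace("<password>", quote_plus(password))


# Hypothetical connection string; the real one comes from the Atlas "Connect" dialog.
template = (
    "mongodb+srv://pas2_user:<password>@cluster0.example.mongodb.net/"
    "?retryWrites=true&w=majority"
)
uri = fill_password(template, "p@ss/word:1")
```

The resulting string is what would be stored as the `MONGODB_URI` secret.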

Usage

Enter a factual question or select from example queries
Click "Detect Hallucinations" to start the analysis
Review the detailed results:
- Hallucination detection status
- Confidence score
- Original and paraphrased responses
- Detailed reasoning and analysis
Provide feedback to help improve the system

How It Works

Query Processing:
- Your question is paraphrased multiple ways
- Each version is sent to Mistral Large
- Responses are collected and compared
Hallucination Detection:
- OpenAI's o3-mini analyzes responses
- Identifies factual inconsistencies
- Provides confidence scores and reasoning
Feedback Collection:
- User feedback is stored in MongoDB Atlas
- Because storage is cloud-based, data survives Space restarts
- Statistics are updated in real-time
- Data can be exported for further analysis
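
The query-processing and detection stages above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the code in `app.py`: the function `detect`, its callable parameters, and the result keys are hypothetical, and the stand-in functions replace the real Mistral Large and o3-mini calls so the example runs offline.

```python
from typing import Callable, Dict, List


def detect(
    query: str,
    paraphrase: Callable[[str], List[str]],
    respond: Callable[[str], str],
    judge: Callable[[str, List[str]], Dict],
) -> Dict:
    """Run the three stages with caller-supplied model functions."""
    questions = [query] + paraphrase(query)    # original plus its paraphrases
    answers = [respond(q) for q in questions]  # one response per version
    verdict = judge(query, answers)            # judge compares the responses
    return {"questions": questions, "answers": answers, **verdict}


# Stand-in functions (assumptions) so the sketch needs no API access.
def fake_paraphrase(q: str) -> List[str]:
    return [q.lower(), "Could you tell me: " + q]


def fake_respond(q: str) -> str:
    return "Paris"


def fake_judge(query: str, answers: List[str]) -> Dict:
    consistent = len(set(answers)) == 1
    return {
        "hallucination_detected": not consistent,
        "confidence_score": 1.0 if consistent else 0.5,
    }


result = detect(
    "What is the capital of France?", fake_paraphrase, fake_respond, fake_judge
)
```

Passing the model calls in as functions keeps the control flow testable without network access; in a real deployment they would wrap the respective API clients.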

Data Persistence

The application uses MongoDB Atlas for data storage, providing several benefits:

Permanent Storage: Data persists even when Hugging Face Spaces restart
Scalability: MongoDB scales as your data grows
Cloud-based: No reliance on Space-specific storage that can be lost
Query Capabilities: Powerful query functionality for data analysis
Export Options: Built-in methods to export data to CSV

Contributing

Contributions are welcome! Please feel free to submit pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

This application uses a combination of paraphrasing techniques and model-as-judge approaches to identify potential hallucinations in LLM responses. It provides confidence scores, identifies conflicting facts, and offers detailed reasoning for its judgments.

Features

Generates paraphrased versions of input queries
Evaluates responses using semantic similarity analysis
Provides match percentage and similarity metrics
Includes visualization tools for similarity matrices
Web interface for interactive testing
Benchmarking capabilities for bulk evaluation

Installation

```
git clone https://github.com/serhanylmz/pas2
cd pas2
pip install -r requirements.txt
```

Set up your OpenAI API key in a .env file:

```
OPENAI_API_KEY=your_api_key_here
```

Usage

Web Interface

Run the Gradio interface:

```
python pas2-gradio.py
```

Benchmark Tool

Run the benchmark tool:

```
python pas2-benchmark.py --json_file your_data.json --num_samples 10
```

Library Usage

```
from pas2 import PAS2

detector = PAS2()
hallucinated, response, questions, answers = detector.detect_hallucination(
    "your question",
    n_paraphrases=5,
    similarity_threshold=0.9,
    match_percentage_threshold=0.7
)
```

Configuration

Default model: gpt-4o-2024-08-06
Default embedding model: text-embedding-3-small
Adjustable similarity and match percentage thresholds
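
One plausible reading of the two thresholds, offered as an assumption and not as the library's confirmed logic: each paraphrase answer counts as a match when its embedding's cosine similarity to the original answer is at least `similarity_threshold`, and the result is flagged as a hallucination when the share of matches falls below `match_percentage_threshold`. The function names and the toy two-dimensional vectors below are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def is_hallucinated(reference, others,
                    similarity_threshold=0.9,
                    match_percentage_threshold=0.7):
    """Return (flag, match_percentage) under the assumed threshold semantics."""
    matches = [cosine(reference, e) >= similarity_threshold for e in others]
    match_percentage = sum(matches) / len(matches)
    return match_percentage < match_percentage_threshold, match_percentage


# Toy embeddings standing in for real embedding-model output.
reference = [1.0, 0.0]
others = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
flagged, pct = is_hallucinated(reference, others)
```

Here two of the three answers are similar enough to the reference, which gives a match percentage of about 0.67, below the 0.7 default, so the example is flagged.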

Output Files

Similarity matrix plots (PNG)
Match matrix plots (PNG)
Benchmark results (CSV, TXT)
User feedback logs (XLSX)

License

This project is licensed under the MIT License with an attribution requirement - see the LICENSE file for details.

Citation

If you use PAS2 in your research or project, please cite it as:

@software{pas2_2024,
  author = {Serhan Yilmaz},
  title = {PAS2 - Paraphrase-based AI System for Semantic Similarity},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/serhanylmz/pas2}
}

Attribution Requirements

When using PAS2, you must provide appropriate attribution by:

Including the copyright notice and license in any copy or substantial portion of the software
Citing the project in any publications, presentations, or documentation that uses or builds upon this work
Maintaining a link to the original repository in any forks or derivative works

Contact

Serhan Yilmaz, [email protected], Sabanci University