---
license: mit
tags:
  - image-classification
  - deepfake-detection
  - computer-vision
  - vision-transformer
  - sdxl
  - fake-face-detection
datasets:
  - xhlulu/140k-real-and-fake-faces
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: SDXL-Deepfake-Detector
    results:
      - task:
          type: image-classification
          name: Image Classification
        dataset:
          name: 140k Real and Fake Faces
          type: xhlulu/140k-real-and-fake-faces
        metrics:
          - type: accuracy
            value: 0.86
            name: Accuracy
---

# SDXL-Deepfake-Detector  
### Detecting AI-Generated Faces with Precision and Purpose  

>*Not just another classifier — a tool for digital truth.*
>
Developed by **[Sadra Milani Moghaddam](https://sadramilani.ir/)**

---

## Why This Matters  
As generative AI (like SDXL, DALL·E, and Midjourney) becomes more accessible, the line between real and synthetic media blurs — especially for vulnerable communities. This project started as a technical experiment but evolved into a **privacy-aware, open-source defense** against visual misinformation, with a focus on **ethical AI deployment**.

---

## Model Overview  

**SDXL-Deepfake-Detector** is a fine-tuned vision transformer that classifies face images as **artificial (0)** or **human (1)**, with a reported accuracy of **86%** on the 140k Real and Fake Faces dataset.

## Training Approach

This model was obtained by **fine-tuning** [`Organika/sdxl-detector`](https://huggingface.co/Organika/sdxl-detector), a vision transformer pre-trained specifically to detect SDXL-generated faces, on the [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) dataset.

This approach leverages:
- Prior knowledge of SDXL artifacts from the base model
- Broader generalization from a large-scale real/fake face dataset
- Efficient training on limited hardware (single RTX 3060)

The result is a lightweight, high-accuracy detector optimized for **both SDXL and general diffusion-based deepfakes**.

### Key Highlights
- **Architecture**: Fine-tuned Vision Transformer (ViT) via Hugging Face `transformers`
- **Dataset**: 140k balanced real/fake face images
- **License**: [MIT](https://opensource.org/licenses/MIT) — free for research and commercial use
- **Hardware**: Trained on a single NVIDIA RTX 3060 (12GB VRAM) — proving high impact doesn’t require massive resources

---

## Quick Start

### Dependencies
```bash
pip install transformers torch pillow
```
### Python Script
```python
#predict.py
import argparse
from transformers import AutoModelForImageClassification, AutoImageProcessor
from PIL import Image
import torch
import os

def main():
    parser = argparse.ArgumentParser(
        description="Classify an image as 'artificial' or 'human' using the SDXL-Deepfake-Detector."
    )
    parser.add_argument("--image", type=str, required=True, help="Path to the input image file")
    args = parser.parse_args()

    # Validate image path
    if not os.path.isfile(args.image):
        raise FileNotFoundError(f"Image file not found: {args.image}")

    # Load model and image processor from Hugging Face Hub
    # (AutoImageProcessor replaces the deprecated AutoFeatureExtractor for vision models)
    model_name = "SADRACODING/SDXL-Deepfake-Detector"
    print(f"Loading model '{model_name}'...")
    model = AutoModelForImageClassification.from_pretrained(model_name)
    image_processor = AutoImageProcessor.from_pretrained(model_name)

    # Set device (GPU if available)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    print(f"Running on device: {device}")

    # Load and preprocess image (force 3-channel RGB, e.g. for grayscale or RGBA inputs)
    image = Image.open(args.image).convert("RGB")
    inputs = image_processor(images=image, return_tensors="pt").to(device)

    # Inference
    with torch.no_grad():
        outputs = model(**inputs)
    
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_class_idx]

    # Output
    print("Prediction Result")
    print(f"Class Index: {predicted_class_idx}")
    print(f"Label      : {predicted_label}")

if __name__ == "__main__":
    main()
```
### How to use
```bash
python predict.py --image path/to/image
```
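To classify a whole folder rather than a single file, the same model can be driven through the `transformers` `pipeline` helper. The following is a minimal sketch, not part of the original repository: the `find_images` and `classify_folder` names and the list of accepted file suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical set of suffixes to treat as images; extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def classify_folder(folder, model_name="SADRACODING/SDXL-Deepfake-Detector"):
    """Map each image file name to its top (label, score) prediction."""
    # Imported lazily so that find_images works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_name)
    results = {}
    for path in find_images(folder):
        # The pipeline returns a list of {"label", "score"} dicts, best first.
        top = classifier(str(path))[0]
        results[path.name] = (top["label"], top["score"])
    return results
```

Calling `classify_folder("path/to/folder")` returns a dictionary keyed by file name. The model is downloaded from the Hub on first use.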

## Performance & Limitations

> **Note**: The 86% accuracy figure is preliminary. Final test accuracy will be reported after full evaluation. Preliminary results show strong generalization on SDXL- and diffusion-based face forgeries.

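The metrics listed in the model card metadata (accuracy, precision, recall, F1) can be computed from predicted and true label indices once an evaluation set is available. This is a minimal pure-Python sketch, not the author's evaluation code. It assumes that `artificial` (index `0`) is treated as the positive class, and the `binary_metrics` name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=0):
    """Compute accuracy, precision, recall and F1 for binary labels.

    `positive` is the label index treated as the positive class;
    0 ("artificial") is an assumption made for this sketch.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With `artificial` as the positive class, recall is the fraction of AI-generated faces that are caught, and precision is the fraction of `artificial` predictions that are correct.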
### Known Limitations
- Trained primarily on **frontal, well-lit, aligned face crops**, so it may underperform on:
  - Low-resolution or blurry images
  - Heavily occluded or non-frontal faces
  - GAN-generated faces (e.g., StyleGAN2/3)

Label mapping:
- `0` → `"artificial"` (AI-generated / deepfake)
- `1` → `"human"` (authentic human face)

> ⚠️ This tool is **not a forensic proof**, but a probabilistic detector. Use responsibly.

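Because the detector is probabilistic, it can help to turn the raw logits into a confidence value and decline to label low-confidence inputs. The sketch below is one way to do that, not a feature of the released script. The `decide` function and the `0.9` threshold are hypothetical, and the label mapping is hard-coded from the list above. In the Quick Start script, the input would be `logits[0].tolist()`.

```python
import math

# Label mapping as documented in this model card.
ID2LABEL = {0: "artificial", 1: "human"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, min_confidence=0.9):
    """Return (label, confidence); label is "uncertain" below the threshold.

    min_confidence is an illustrative value, not a calibrated one.
    """
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    label = ID2LABEL[idx] if probs[idx] >= min_confidence else "uncertain"
    return label, probs[idx]
```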
---

## Philosophy & Ethics

This model is open-source because:
- **Transparency** is essential in the fight against synthetic media.
- **Accessibility** ensures researchers, journalists, and civil society can audit and use detection tools without gatekeeping.
- **Privacy matters**: Once the weights have been downloaded from the Hugging Face Hub, inference runs **entirely offline**, so your images never leave your device.

As a developer from a vulnerable community, I believe AI safety tools must be **inclusive, ethical, and human-centered** — not just technically accurate.

---

## Acknowledgements

- **Dataset**: [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) by xhlulu  
- **Framework**: [Hugging Face Transformers](https://huggingface.co/docs/transformers)  
- **Model & Code**: [GitHub Repository](https://github.com/SadraCoding/SDXL-Deepfake-Detector) | [Hugging Face Hub](https://huggingface.co/SADRACODING/SDXL-Deepfake-Detector)

---

## How to Contribute

Fine-tune this model on your domain-specific data using Hugging Face `Trainer`.

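As a starting point, the sketch below shows what such a fine-tuning run could look like. It is not the training code used for this model. It assumes the extra `datasets` package is installed and that your images are arranged as `data_dir/train/artificial/` and `data_dir/train/human/`, which should give the same `0`/`1` label order as this model because the `imagefolder` loader sorts class names alphabetically (worth checking via `dataset["train"].features`). The `fine_tune` function and the values in `HYPERPARAMS` are hypothetical and would need tuning. Call it as `fine_tune("path/to/data", "path/to/output")`.

```python
# Hypothetical hyperparameters; adjust for your data and GPU memory.
HYPERPARAMS = {
    "per_device_train_batch_size": 16,
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    # Keep the raw "image" column so the on-the-fly transform can read it.
    "remove_unused_columns": False,
}


def fine_tune(data_dir, output_dir, base_model="SADRACODING/SDXL-Deepfake-Detector"):
    """Fine-tune the detector on an image folder and save the result."""
    # Heavy dependencies are imported lazily, only when training starts.
    import torch
    from datasets import load_dataset
    from transformers import (
        AutoImageProcessor,
        AutoModelForImageClassification,
        Trainer,
        TrainingArguments,
    )

    dataset = load_dataset("imagefolder", data_dir=data_dir)
    processor = AutoImageProcessor.from_pretrained(base_model)
    model = AutoModelForImageClassification.from_pretrained(base_model)

    def transform(batch):
        # Preprocess a batch of PIL images into model-ready tensors.
        images = [img.convert("RGB") for img in batch["image"]]
        inputs = processor(images=images, return_tensors="pt")
        inputs["labels"] = batch["label"]
        return inputs

    def collate(examples):
        # Stack per-example tensors into a single training batch.
        return {
            "pixel_values": torch.stack([e["pixel_values"] for e in examples]),
            "labels": torch.tensor([e["labels"] for e in examples]),
        }

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, **HYPERPARAMS),
        train_dataset=dataset["train"].with_transform(transform),
        data_collator=collate,
    )
    trainer.train()
    trainer.save_model(output_dir)
    processor.save_pretrained(output_dir)
    return trainer
```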
---

> *Built with curiosity, ethics, and a 12GB GPU — because impactful AI doesn’t require a data center, just purpose.*  
> — Sadra Milani Moghaddam