---
license: mit
tags:
  - image-classification
  - deepfake-detection
  - computer-vision
  - vision-transformer
  - sdxl
  - fake-face-detection
datasets:
  - xhlulu/140k-real-and-fake-faces
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: SDXL-Deepfake-Detector
    results:
      - task:
          type: image-classification
          name: Image Classification
        dataset:
          name: 140k Real and Fake Faces
          type: xhlulu/140k-real-and-fake-faces
        metrics:
          - type: accuracy
            value: 0.86
            name: Accuracy
---

# SDXL-Deepfake-Detector  
### Detecting AI-Generated Faces with Precision and Purpose  

>*Not just another classifier — a tool for digital truth.*
>
Developed by **[Sadra Milani Moghaddam](https://sadramilani.ir/)**

---

## Why This Matters  
As generative AI (like SDXL, DALL·E, and Midjourney) becomes more accessible, the line between real and synthetic media blurs — especially for vulnerable communities. This project started as a technical experiment but evolved into a **privacy-aware, open-source defense** against visual misinformation, with a focus on **ethical AI deployment**.

---

## Model Overview  

**SDXL-Deepfake-Detector** is a fine-tuned vision transformer that classifies human faces as **artificial (0)** or **human (1)**, achieving an accuracy of **86%**.

## Training Approach

This model was obtained by **fine-tuning** the [`Organika/sdxl-detector`](https://huggingface.co/Organika/sdxl-detector) model, a vision transformer pre-trained specifically to detect SDXL-generated faces, on the [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) dataset.

This approach leverages:
- Prior knowledge of SDXL artifacts from the base model
- Broader generalization from a large-scale real/fake face dataset
- Efficient training on limited hardware (single RTX 3060)

The result is a lightweight, high-accuracy detector optimized for **both SDXL and general diffusion-based deepfakes**.

### Key Highlights
- **Architecture**: Fine-tuned Vision Transformer (ViT) via Hugging Face `transformers`
- **Dataset**: 140k balanced real/fake face images
- **License**: [MIT](https://opensource.org/licenses/MIT) — free for research and commercial use
- **Hardware**: Trained on a single NVIDIA RTX 3060 (12GB VRAM) — proving high impact doesn’t require massive resources

---

## Quick Start

### Dependencies
```bash
pip install transformers torch pillow
```
### Python Script
```python
# predict.py
import argparse
import os

import torch
from PIL import Image
# AutoImageProcessor replaces the deprecated AutoFeatureExtractor for vision models
from transformers import AutoImageProcessor, AutoModelForImageClassification


def main():
    parser = argparse.ArgumentParser(
        description="Classify an image as 'artificial' or 'human' using the SDXL-Deepfake-Detector."
    )
    parser.add_argument("--image", type=str, required=True, help="Path to the input image file")
    args = parser.parse_args()

    # Validate image path
    if not os.path.isfile(args.image):
        raise FileNotFoundError(f"Image file not found: {args.image}")

    # Load model and image processor from Hugging Face Hub
    model_name = "SADRACODING/SDXL-Deepfake-Detector"
    print(f"Loading model '{model_name}'...")
    model = AutoModelForImageClassification.from_pretrained(model_name)
    image_processor = AutoImageProcessor.from_pretrained(model_name)

    # Set device (GPU if available)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    print(f"Running on device: {device}")

    # Load and preprocess image (force 3-channel RGB)
    image = Image.open(args.image).convert("RGB")
    inputs = image_processor(images=image, return_tensors="pt").to(device)

    # Inference without gradient tracking
    with torch.no_grad():
        outputs = model(**inputs)

    # Pick the highest-scoring class and map it to its label
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_class_idx]

    # Output
    print("Prediction Result")
    print(f"Class Index: {predicted_class_idx}")
    print(f"Label      : {predicted_label}")


if __name__ == "__main__":
    main()
```
### How to use
```bash
python predict.py --image path/to/image
```

## Performance & Limitations

> **Note**: The 86% accuracy reported above is provisional; final test accuracy will be reported after full evaluation. Preliminary results suggest the model generalizes to SDXL- and other diffusion-based face forgeries.
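
The card's metadata lists accuracy, precision, recall, and F1 as metrics. As a minimal sketch of how these relate, the helper below computes all four from confusion-matrix counts, treating `artificial` as the positive class. The function name and the counts in the usage lines are hypothetical and chosen only for illustration; they are not results measured for this model.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    tp/fp/fn/tn are counts with 'artificial' treated as the positive class
    (an assumption made for this sketch).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, NOT measurements from this model
metrics = classification_metrics(tp=45, fp=5, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting precision and recall alongside accuracy shows whether errors are mostly false alarms on real faces or missed fakes, which a single accuracy figure hides.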

### Known Limitations
- Trained primarily on **frontal, well-lit, aligned face crops** — may underperform on:
  - Low-resolution or blurry images
  - Heavily occluded or non-frontal faces
  - GAN-generated faces (e.g., StyleGAN2/3)
- Label mapping:  
  - `0` → `"artificial"` (AI-generated / Deepfake)  
  - `1` → `"human"` (authentic human face)

> ⚠️ This tool is **not a forensic proof**, but a probabilistic detector. Use responsibly.
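
Because the detector is probabilistic, it can help to look at the class probability rather than only the top label. The sketch below converts a pair of logits into probabilities with a softmax and returns `"uncertain"` when the top probability falls below a cutoff. The `decide` helper and the `min_confidence` value of 0.9 are assumptions for illustration, not part of the released model or a calibrated threshold.

```python
import math

# Label mapping as documented in this card
ID2LABEL = {0: "artificial", 1: "human"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, min_confidence=0.9):
    """Return (label, probability); label is 'uncertain' below the cutoff.

    min_confidence is a hypothetical threshold; tune it on your own data.
    """
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    label = ID2LABEL[idx] if probs[idx] >= min_confidence else "uncertain"
    return label, probs[idx]


# Example logits are made up for illustration
label, prob = decide([3.0, -1.0])
print(f"{label} ({prob:.3f})")
```

With the Quick Start script, the logits for a single image could be passed as `decide(outputs.logits[0].tolist())`.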

---

## Philosophy & Ethics

This model is open-source because:
- **Transparency** is essential in the fight against synthetic media.
- **Accessibility** ensures researchers, journalists, and civil society can audit and use detection tools without gatekeeping.
- **Privacy matters**: The model runs **entirely offline** — your images never leave your device.

As a developer from a vulnerable community, I believe AI safety tools must be **inclusive, ethical, and human-centered** — not just technically accurate.

---

## Acknowledgements

- **Dataset**: [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) by xhlulu  
- **Framework**: [Hugging Face Transformers](https://huggingface.co/docs/transformers)  
- **Model & Code**: [GitHub Repository](https://github.com/SadraCoding/SDXL-Deepfake-Detector) | [Hugging Face Hub](https://huggingface.co/SADRACODING/SDXL-Deepfake-Detector)

---

## How to Contribute

Fine-tune this model on your domain-specific data using Hugging Face `Trainer`.
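
A rough sketch of such a fine-tuning run is shown below. It assumes a hypothetical folder layout with `artificial/` and `human/` subfolders under a data root, and the hyperparameters (batch size, epochs, learning rate) are placeholders rather than the settings used to train this model. The helper names `list_labeled_images`, `FaceFolderDataset`, and `fine_tune` are illustrative only.

```python
from pathlib import Path

# Label ids follow this card's mapping: 0 = artificial, 1 = human
LABEL2ID = {"artificial": 0, "human": 1}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root):
    """Collect (path, label_id) pairs from root/artificial and root/human."""
    pairs = []
    for name, label_id in LABEL2ID.items():
        folder = Path(root) / name
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((str(path), label_id))
    return pairs


class FaceFolderDataset:
    """Map-style dataset (__len__/__getitem__) returning Trainer-ready dicts."""

    def __init__(self, pairs, processor):
        self.pairs = pairs
        self.processor = processor

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        from PIL import Image  # imported lazily; only needed when loading images

        path, label_id = self.pairs[index]
        image = Image.open(path).convert("RGB")
        encoded = self.processor(images=image, return_tensors="pt")
        # Drop the batch dimension; the collator re-batches the items
        return {"pixel_values": encoded["pixel_values"][0], "label": label_id}


def fine_tune(data_root, output_dir="finetuned-detector"):
    """Fine-tune the detector on a local folder (placeholder hyperparameters)."""
    from transformers import (
        AutoImageProcessor,
        AutoModelForImageClassification,
        Trainer,
        TrainingArguments,
    )

    model_name = "SADRACODING/SDXL-Deepfake-Detector"
    processor = AutoImageProcessor.from_pretrained(model_name)
    model = AutoModelForImageClassification.from_pretrained(model_name)

    train_dataset = FaceFolderDataset(list_labeled_images(data_root), processor)
    args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=8,  # placeholder value
        num_train_epochs=1,             # placeholder value
        learning_rate=2e-5,             # placeholder value
    )
    trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
    trainer.train()
    trainer.save_model(output_dir)


if __name__ == "__main__":
    fine_tune("path/to/your/data")
```

Holding out a separate validation folder and passing it as `eval_dataset` would be a sensible next step before trusting the fine-tuned weights.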

---

> *Built with curiosity, ethics, and a 12GB GPU — because impactful AI doesn’t require a data center, just purpose.*  
> — Sadra Milani Moghaddam