	---
	language: en
	library_name: pytorch
	license: mit
	tags:
	- deepfake-detection
	- image-classification
	- video-analysis
	- efficientvit
	- pytorch
	pipeline_tag: image-classification

	safetensors:
	  total: 1
	  format: safetensors
	  weight_dtype: float32
	  size_in_bytes: 80000000

	model-index:
	- name: Deepfake Detection with Improved EfficientViT
	  results:
	  - task:
	      type: image-classification
	      name: Deepfake Detection
	    dataset:
	      type: custom
	      name: FaceForensics++, Celeb-DF
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.8864
	    - name: Precision
	      type: precision
	      value: 0.8920
	    - name: Recall
	      type: recall
	      value: 0.8792
	    - name: F1-score
	      type: f1
	      value: 0.8856

	config: config.json
	metadata:
	  model_type: EfficientViT
	  num_parameters: 20026725
	  precision: float32
	  framework: pytorch
	  license: mit
	  model_format: safetensors
	  size: 82MB
	---

	# Deepfake Detection with Improved EfficientViT

	## Model Architecture

	![Model Architecture](assets/architecture.png)

	## Inference Pipeline

	![Inference Pipeline](assets/inference_pipeline.png)


	This repository contains a PyTorch model for deepfake detection based on an improved EfficientViT architecture, trained on video data.

	The model predicts whether a video is real (0) or fake (1) using both visual information and temporal cues.

	---

	## 🧩 Model Description

	- Architecture: Improved EfficientViT
	- Backbone: EfficientNet-B0 for feature extraction
	- Head: Transformer-based temporal modeling followed by a classification head
	- Input: Video frames (224×224 RGB images)
	- Output: Binary label (0 = Real, 1 = Fake) and frame-level probabilities

	Key Features:

	- Extracts faces from frames using MTCNN
	- Supports inference on raw video files
	- Provides frame-level probabilities for fine-grained analysis
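
The frame-level probabilities listed above have to be reduced to one video-level label at some point. This card does not document how that is done, so the sketch below simply assumes mean pooling followed by a 0.5 threshold. `aggregate_frame_probs`, its parameters, and the extra keys in the returned dictionary are hypothetical and not part of `inference.py`.

```python
from statistics import fmean
from typing import Sequence


def aggregate_frame_probs(frame_probs: Sequence[float], threshold: float = 0.5) -> dict:
    """Reduce per-frame fake probabilities to one video-level decision.

    Assumption: mean pooling and a fixed threshold; the real pipeline may differ.
    """
    if not frame_probs:
        raise ValueError("no frames were scored")
    mean_prob = fmean(frame_probs)
    return {
        "class": int(mean_prob >= threshold),  # 0 = Real, 1 = Fake
        "mean_fake_prob": mean_prob,
        "num_frames": len(frame_probs),
    }


# Four illustrative frame scores; three of them lean towards "fake".
result = aggregate_frame_probs([0.91, 0.85, 0.40, 0.77])
print(result["class"])
```

Mean pooling is tolerant of a few misclassified frames; a stricter rule (for example, the maximum frame probability) would flag a video on a single suspicious frame at the cost of more false positives.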

	---

	## 📁 Repository Structure

	```
	deepfake-efficientvit/
	│
	├── model.py           # ImprovedEfficientViT class
	├── inference.py       # Functions to run inference on videos
	├── model.pth          # Trained weights
	├── config.json        # Optional model metadata
	├── requirements.txt   # Required packages
	└── README.md
	```

	## ⚡ Installation
	Clone the repository and install the dependencies:

	- `git clone https://huggingface.co/faisalishfaq2005/deepfake-detection-efficientnet-vit`
	- `cd deepfake-detection-efficientnet-vit`
	- `pip install -r requirements.txt`

	## 🚀 Usage
	### 1. Programmatic Inference

	```python
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_file
	import torch
	from model import ImprovedEfficientViT
	from inference import predict_vedio

	# 1️⃣ Download the checkpoint from Hugging Face
	checkpoint_path = hf_hub_download(
	    repo_id="faisalishfaq2005/deepfake-detection-efficientnet-vit",
	    filename="model.safetensors",
	)

	# 2️⃣ Load the weights on CPU and put the model in evaluation mode
	state_dict = load_file(checkpoint_path, device="cpu")
	model = ImprovedEfficientViT()
	model.load_state_dict(state_dict)
	model.eval()

	# 3️⃣ Move to GPU if available
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model.to(device)

	# 4️⃣ Run inference on a video
	video_path = "sample_video.mp4"
	result = predict_vedio(video_path, model)
	print(result)
	# Example output: {'class': 1}
	```
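
The example output carries only the integer class. For logs or a user interface it can help to map it back to the labels given in the model description (0 = Real, 1 = Fake). `describe_result` below is a hypothetical helper, not part of `inference.py`, and it assumes the result is a dictionary with the `'class'` key shown in the example output.

```python
LABELS = {0: "Real", 1: "Fake"}  # mapping stated in the model description


def describe_result(result: dict) -> str:
    """Turn a prediction dict such as {'class': 1} into a readable label."""
    label_id = result.get("class")
    if label_id not in LABELS:
        raise ValueError(f"unexpected prediction: {result!r}")
    return LABELS[label_id]


print(describe_result({"class": 1}))
```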
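	### 2. Manual Download

	1. Go to the Hugging Face model page.
	2. Download `model.pth`, `model.py`, and `inference.py`.
	3. Place them in the same local folder.
	4. Install the requirements and run the prediction function from `inference.py` (imported as `predict_vedio` in the programmatic example).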
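
Depending on which file you download, the weights may be `model.pth` (listed in the repository structure) or `model.safetensors` (used in the programmatic example). The sketch below picks a loader from the file extension. It is not code from the repository: `pick_weight_format` and `load_weights` are hypothetical names, and it assumes a `.pth` file stores a plain `state_dict` rather than a full training checkpoint.

```python
from pathlib import Path


def pick_weight_format(path: str) -> str:
    """Decide how to read a weights file from its extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".safetensors":
        return "safetensors"
    if suffix in {".pth", ".pt"}:
        return "torch"
    raise ValueError(f"unsupported weights file: {path}")


def load_weights(path: str) -> dict:
    """Load a state_dict on CPU; heavy imports are deferred until needed."""
    if pick_weight_format(path) == "safetensors":
        from safetensors.torch import load_file
        return load_file(path, device="cpu")
    import torch
    # Assumption: the .pth file holds a plain state_dict, not a full checkpoint.
    return torch.load(path, map_location="cpu")


print(pick_weight_format("model.pth"))
```

The dictionary returned by `load_weights` can then be passed to `model.load_state_dict(...)` in the same way as in the programmatic example.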

	## 📄 License

	This model is released under the MIT License.
	You are free to use, modify, and distribute it, provided the copyright notice and license text are kept with any copies.

	## 📚 Citation

	If you use this model in your research, please cite:

	```bibtex
	@misc{faisalishfaq2025efficientvit,
	  title={Deepfake Detection with {EfficientNet} and {ViT}},
	  author={Faisal Ishfaq},
	  year={2025}
	}
	```