metadata
license: apache-2.0
tags:
  - pytorch
  - keypoint-detection
  - human-pose-estimation
  - heatmap-regression
  - computer-vision
  - detr
  - coco
model-index:
  - name: detr-pose-coco50
    results:
      - task:
          type: pose-estimation
          name: Human Pose Estimation
        dataset:
          type: COCO
          name: COCO 2017 (50-person subset)
        metrics:
          - type: MSELoss
            value: ~0.02
            name: Heatmap MSE

📌 DETR + Keypoint Estimation (COCO Subset)

Author: @Koushik


🧠 Model Overview

This project combines a DETR person detector with a CNN keypoint head.

The system detects people using DETR, then predicts 17 COCO-style keypoints for each detected person (top-down) using heatmap regression.
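
Heatmap regression usually trains the head to reproduce one target heatmap per keypoint, with a Gaussian peak at the annotated location, and scores the prediction with mean squared error. The following is a minimal NumPy sketch of that idea. The 64×64 heatmap size and the sigma of 2 pixels are illustrative assumptions, not values taken from this model.

```python
import numpy as np

def gaussian_heatmap(x, y, size=64, sigma=2.0):
    """Render a (size, size) heatmap with a Gaussian peak at pixel (x, y)."""
    xs = np.arange(size, dtype=np.float32)[None, :]  # column coordinates
    ys = np.arange(size, dtype=np.float32)[:, None]  # row coordinates
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

def heatmap_mse(pred, target):
    """Mean squared error over all heatmap pixels."""
    return float(np.mean((pred - target) ** 2))

# Hypothetical keypoint at column 20, row 30 of the heatmap grid
target = gaussian_heatmap(20, 30)
```

A perfect prediction gives an MSE of zero, and the loss grows as the predicted peak drifts away from the annotated location.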


📂 Files Included

| File | Description |
| --- | --- |
| pytorch_model.bin | Trained PyTorch model weights |
| 05_detr_pose_coco_colab.ipynb | Full Colab notebook (training + inference) |
| config.json | Basic model metadata |
| README.md | Project description |

📚 Dataset

  • Subset: 500 images from COCO val2017 with visible persons
  • Annotations: 17 keypoints per person
  • Source: COCO Keypoints

🏗️ Architecture

[ Input Image ]
      │
      ▼
[ DETR (Person BBox) ]
      │
      ▼
[ Crop + Resize (256×256) ]
      │
      ▼
[ CNN Keypoint Head ]
      │
      ▼
[ 17 Heatmaps (Keypoints) ]
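
The last stage of the pipeline yields heatmaps rather than coordinates, so a decoding step is needed to get keypoint positions in the original image. One common approach, sketched here, takes the peak of each heatmap and rescales it into the person box. The function name `decode_heatmaps`, the 64×64 heatmap resolution, and the sample box are assumptions for illustration. The notebook may decode differently.

```python
import numpy as np

def decode_heatmaps(heatmaps, box):
    """Map the peak of each heatmap back to image coordinates.

    heatmaps: array of shape (K, H, W), one heatmap per keypoint.
    box: (xmin, ymin, xmax, ymax) of the person crop in the original image.
    Returns a (K, 2) array of (x, y) positions and a (K,) array of peak scores.
    """
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1)
    idx = flat.argmax(axis=1)
    scores = flat.max(axis=1)
    ys, xs = np.unravel_index(idx, (h, w))
    xmin, ymin, xmax, ymax = box
    # Use pixel centers (+0.5) when rescaling from the heatmap grid to the box
    x_img = xmin + (xs + 0.5) / w * (xmax - xmin)
    y_img = ymin + (ys + 0.5) / h * (ymax - ymin)
    return np.stack([x_img, y_img], axis=1), scores

# Synthetic example: a single peak in the first heatmap
heatmaps = np.zeros((17, 64, 64), dtype=np.float32)
heatmaps[0, 32, 16] = 1.0
coords, scores = decode_heatmaps(heatmaps, box=(100.0, 50.0, 228.0, 306.0))
```

The peak score can serve as a rough confidence value, for example to hide keypoints whose heatmap is nearly flat.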

🚀 Quick Start

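import torch
# KeypointHead must be importable, e.g. copied from the notebook into model.py
from model import KeypointHead

model = KeypointHead()
# map_location='cpu' lets the weights load on machines without a GPU
model.load_state_dict(torch.load('pytorch_model.bin', map_location='cpu'))
model.eval()  # switch to inference mode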

🧪 Inference Demo

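import torch
from PIL import Image
from transformers import DetrImageProcessor, DetrForObjectDetection

img = Image.open('sample.jpg').convert('RGB')  # DETR expects a 3-channel RGB image
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
detector = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
detector.eval()

inputs = processor(images=img, return_tensors="pt")
with torch.no_grad():  # no gradients needed at inference time
    outputs = detector(**inputs)
# target_sizes expects (height, width); PIL's img.size is (width, height)
results = processor.post_process_object_detection(outputs, target_sizes=[img.size[::-1]], threshold=0.8)[0]

# DETR returns all COCO classes, so keep only the boxes labeled "person"
person_boxes = [box for box, label in zip(results['boxes'], results['labels'])
                if detector.config.id2label[label.item()] == 'person']

# Use person_boxes[0] (xmin, ymin, xmax, ymax) to crop the person
# Feed the resized crop into model(...) to get 17 heatmaps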
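
The crop step mentioned in the comments above could look roughly like the sketch below. It clamps the box to the image, crops, resizes to 256×256, and converts to a channels-first float array. The helper name `crop_person` and the scaling of pixel values to [0, 1] are assumptions. The notebook may normalize inputs differently.

```python
import numpy as np
from PIL import Image

def crop_person(img, box, size=256):
    """Crop a (xmin, ymin, xmax, ymax) box from a PIL image and resize it to size x size."""
    w, h = img.size
    xmin, ymin, xmax, ymax = box
    # Clamp the box to the image bounds and round to whole pixels
    left, top = max(0, int(xmin)), max(0, int(ymin))
    right, bottom = min(w, int(round(xmax))), min(h, int(round(ymax)))
    crop = img.crop((left, top, right, bottom)).resize((size, size))
    # HWC uint8 -> CHW float32; scaling to [0, 1] is an assumption
    arr = np.asarray(crop, dtype=np.float32) / 255.0
    return arr.transpose(2, 0, 1)

# Synthetic red image standing in for a real photo
img = Image.new("RGB", (640, 480), color=(255, 0, 0))
x = crop_person(img, box=(-5.0, 10.2, 200.7, 470.9))
```

With a DETR box tensor, `box.tolist()` gives the four floats, and `torch.from_numpy(x).unsqueeze(0)` adds the batch dimension before the array is passed to the keypoint head.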

🧠 Training (optional)

To fine-tune on your own dataset:

  • Convert your data to COCO format
  • Use the notebook provided (05_detr_pose_coco_colab.ipynb)
  • Change paths and re-train

✨ Credit