---
title: MIMO - Character Video Synthesis
emoji: 🎭
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.7.1
app_file: app.py
pinned: false
license: apache-2.0
python_version: "3.10"
---

# MIMO - Controllable Character Video Synthesis

**🎬 Complete Implementation Matching Research Paper**

Transform character images into animated videos with controllable motion and advanced video editing capabilities.

## Features

- **Character Animation**: Animate character images with driving 3D poses from motion datasets
- **Spatial 3D Motion**: Support for in-the-wild video with spatial 3D motion and interactive scenes
- **Real-time Processing**: Optimized for interactive use in the web interface
- **Multiple Templates**: Pre-built motion templates for various activities (sports, dance, martial arts, etc.)

## How to Use

1. **Upload a character image**: Choose a full-body, front-facing image with no occlusion or handheld objects
2. **Select motion template**: Pick from various pre-built motion templates in the gallery
3. **Generate**: Click "Run" to synthesize the character animation video

## Technical Details

- **Model Architecture**: Based on spatial decomposed modeling with UNet 2D/3D architectures
- **Motion Control**: Uses 3D pose guidance for precise motion control
- **Scene Handling**: Supports background separation and occlusion handling
- **Resolution**: Generates videos at 784x784 resolution
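
To illustrate the fixed 784x784 output resolution, here is a minimal sketch of how an arbitrary input image size could be fitted into that square. The README does not describe the Space's actual preprocessing, so the aspect-preserving resize with centered padding shown here is an assumption, and `letterbox_geometry` is a hypothetical helper name rather than part of the MIMO code.

```python
TARGET = 784  # output resolution stated in this README


def letterbox_geometry(width, height, target=TARGET):
    """Return (new_w, new_h, pad_left, pad_top) for fitting a width x height
    image into a target x target square while preserving aspect ratio.

    Assumption: aspect-preserving resize plus centered padding. The real
    pipeline may crop or stretch instead.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    # Scale so the longest side exactly matches the target size.
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    # Center the resized image; any odd leftover pixel goes to the right/bottom.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


if __name__ == "__main__":
    # A portrait 1080x1920 photo is scaled to 441x784 and padded horizontally.
    print(letterbox_geometry(1080, 1920))
```

Padding rather than cropping keeps the whole body in frame, which fits the requirement for a full-body input image.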

## Citation

If you find this work useful, please cite:

```bibtex
@inproceedings{men2025mimo,
  title={MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling},
  author={Men, Yifang and Yao, Yuan and Cui, Miaomiao and Bo, Liefeng},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2025 IEEE Conference on},
  year={2025}
}
```

## Links

- [Project Page](https://menyifang.github.io/projects/MIMO/index.html)
- [Paper](https://arxiv.org/abs/2409.16160)
- [Original Repository](https://github.com/menyifang/MIMO)
- [Video Demo](https://www.youtube.com/watch?v=skw9lPKFfcE)

## Acknowledgments

This work builds upon several excellent open-source projects including Moore-AnimateAnyone, SAM, 4D-Humans, and ProPainter.

---

**Note**: This Space requires GPU resources for optimal performance. Processing time may vary depending on video length and complexity.