MIMO - Controllable Character Video Synthesis
🎬 Complete Implementation Matching Research Paper
Transform character images into animated videos with controllable motion and advanced video editing capabilities.
Features
- Character Animation: Animate character images with driving 3D poses from motion datasets
- Spatial 3D Motion: Supports in-the-wild videos with spatial 3D motion and interactive scenes
- Interactive Processing: Optimized for use in the web interface; processing time varies with video length and complexity
- Multiple Templates: Pre-built motion templates for various activities (sports, dance, martial arts, etc.)
How to Use
- Upload a character image: Choose a full-body, front-facing image with no occlusion or handheld objects
- Select a motion template: Pick one of the pre-built motion templates in the gallery
- Generate: Click "Run" to synthesize the character animation video
Technical Details
- Model Architecture: Based on spatial decomposed modeling with UNet 2D/3D architectures
- Motion Control: Uses 3D pose guidance for precise motion control
- Scene Handling: Supports background separation and occlusion handling
- Resolution: Generates videos at 784x784 resolution
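Because the output resolution is a fixed 784x784 square, a non-square character image has to be fitted to that canvas at some point. This README does not say whether the Space pads, crops, or stretches its inputs, so the following is only a minimal sketch of one common option (aspect-preserving letterboxing). The names `letterbox_dims` and `Letterbox` are hypothetical helpers for illustration, not part of the MIMO code.

```python
from typing import NamedTuple

TARGET = 784  # output side length stated in Technical Details


class Letterbox(NamedTuple):
    new_width: int   # image width after scaling
    new_height: int  # image height after scaling
    pad_left: int    # horizontal padding before the image
    pad_top: int     # vertical padding above the image


def letterbox_dims(width: int, height: int, target: int = TARGET) -> Letterbox:
    """Scale (width, height) to fit a target x target square, keeping aspect ratio.

    Assumption: padding is split evenly; any odd leftover pixel goes to the
    right/bottom side. The real Space may preprocess images differently.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    return Letterbox(new_w, new_h, (target - new_w) // 2, (target - new_h) // 2)


if __name__ == "__main__":
    # A 1080x1920 portrait photo scales to 441x784 with 171 px of left padding.
    print(letterbox_dims(1080, 1920))
```

Letterboxing keeps the character's proportions intact, at the cost of blank borders on the shorter side; center cropping would fill the frame but could cut off limbs in a full-body image.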
Citation
If you find this work useful, please cite:
@inproceedings{men2025mimo,
title={MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling},
author={Men, Yifang and Yao, Yuan and Cui, Miaomiao and Bo, Liefeng},
booktitle={Computer Vision and Pattern Recognition (CVPR), 2025 IEEE Conference on},
year={2025}
}
Links
Acknowledgments
This work builds upon several excellent open-source projects including Moore-AnimateAnyone, SAM, 4D-Humans, and ProPainter.
Note: This Space requires GPU resources for optimal performance. Processing time may vary depending on video length and complexity.