# 🚨 LightVAE

**⚡ Efficient Video Autoencoder (VAE) Model Collection**

*From official models to LightX2V distilled and optimized versions: balancing quality, speed, and memory.*

For the VAE stage, the LightX2V team has carried out a series of deep optimizations, deriving two model series, LightVAE and LightTAE, which significantly reduce memory consumption and speed up inference while maintaining high quality.
## 💡 Core Advantages

- Far lower GPU memory footprint than the official VAE
- Much faster encoding and decoding
- Reconstruction and generation quality close to the official models
## 📦 Available Models

### 🎯 Wan2.1 Series VAE
| Model Name | Type | Architecture | Description |
|---|---|---|---|
| `Wan2.1_VAE` | Official VAE | Causal Conv3D | Wan2.1 official video VAE model. Highest quality, large memory usage, slow speed |
| `taew2_1` | Open-source small AE | Conv2D | Open-source model based on taeHV. Small memory footprint, fast speed, average quality |
| `lighttaew2_1` | LightTAE Series | Conv2D | Our distilled and optimized version of `taew2_1`. Small memory footprint, fast speed, quality close to official ✨ |
| `lightvaew2_1` | LightVAE Series | Causal Conv3D | Wan2.1 VAE architecture pruned by 75%, then retrained and distilled. Best balance: high quality + low memory + fast speed 🚀 |
### 🎯 Wan2.2 Series VAE

| Model Name | Type | Architecture | Description |
|---|---|---|---|
| `Wan2.2_VAE` | Official VAE | Causal Conv3D | Wan2.2 official video VAE model. Highest quality, large memory usage, slow speed |
| `taew2_2` | Open-source small AE | Conv2D | Open-source model based on taeHV. Small memory footprint, fast speed, average quality |
| `lighttaew2_2` | LightTAE Series | Conv2D | Our distilled and optimized version of `taew2_2`. Small memory footprint, fast speed, quality close to official ✨ |
## 📊 Wan2.1 Series Performance Comparison

- Precision: BF16
- Test hardware: NVIDIA H100

### Video Reconstruction (5 s, 81-frame video)
| Speed | Wan2.1_VAE | taew2_1 | lighttaew2_1 | lightvaew2_1 |
|---|---|---|---|---|
| Encode Speed | 4.1721 s | 0.3956 s | 0.3956 s | 1.5014 s |
| Decode Speed | 5.4649 s | 0.2463 s | 0.2463 s | 2.0697 s |

| GPU Memory | Wan2.1_VAE | taew2_1 | lighttaew2_1 | lightvaew2_1 |
|---|---|---|---|---|
| Encode Memory | 8.4954 GB | 0.00858 GB | 0.00858 GB | 4.7631 GB |
| Decode Memory | 10.1287 GB | 0.41199 GB | 0.41199 GB | 5.5673 GB |
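The numbers above come from the team's own benchmark harness. As a rough illustration of how encode/decode latency and peak-memory figures like these are typically collected, here is a self-contained PyTorch sketch; the `encoder`/`decoder` modules are trivial stand-ins (assumptions, not the actual VAE), so only the measurement pattern carries over:

```python
import torch
import torch.nn as nn

# Stand-in modules so the harness runs end to end; swap in a real VAE's
# encode/decode. Input shape follows the 81-frame test above (an assumption).
encoder = nn.Conv3d(3, 16, kernel_size=3, stride=2, padding=1).cuda().to(torch.bfloat16)
decoder = nn.ConvTranspose3d(16, 3, kernel_size=4, stride=2, padding=1).cuda().to(torch.bfloat16)

def benchmark(fn, warmup=2, iters=5):
    """Return average CUDA time per call (seconds) and peak GPU memory (GB)."""
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    start, end = (torch.cuda.Event(enable_timing=True) for _ in range(2))
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters / 1000.0, torch.cuda.max_memory_allocated() / 1024**3

video = torch.randn(1, 3, 81, 480, 832, dtype=torch.bfloat16, device="cuda")
with torch.no_grad():
    enc_s, enc_gb = benchmark(lambda: encoder(video))
    latent = encoder(video)
    dec_s, dec_gb = benchmark(lambda: decoder(latent))
print(f"encode: {enc_s:.4f} s, {enc_gb:.4f} GB | decode: {dec_s:.4f} s, {dec_gb:.4f} GB")
```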
### Video Generation

- Task: S2V (speech-to-video)
- Model: seko-talk

*Side-by-side demo videos comparing `Wan2.1_VAE`, `taew2_1`, `lighttaew2_1`, and `lightvaew2_1` are shown on the model page.*
## 📊 Wan2.2 Series Performance Comparison

- Precision: BF16
- Test hardware: NVIDIA H100

### Video Reconstruction
| Speed | Wan2.2_VAE | taew2_2 | lighttaew2_2 |
|---|---|---|---|
| Encode Speed | 1.1369 s | 0.3499 s | 0.3499 s |
| Decode Speed | 3.1268 s | 0.0891 s | 0.0891 s |

| GPU Memory | Wan2.2_VAE | taew2_2 | lighttaew2_2 |
|---|---|---|---|
| Encode Memory | 6.1991 GB | 0.0064 GB | 0.0064 GB |
| Decode Memory | 12.3487 GB | 0.4120 GB | 0.4120 GB |
### Video Generation

- Task: T2V (text-to-video)
- Model: Wan2.2-TI2V-5B

*Side-by-side demo videos comparing `Wan2.2_VAE`, `taew2_2`, and `lighttaew2_2` are shown on the model page.*
## 🎯 Model Selection Recommendations

### Selection by Use Case

- Highest quality, memory and speed not a concern: official `Wan2.1_VAE` / `Wan2.2_VAE`
- Lowest memory and fastest inference: `lighttaew2_1` / `lighttaew2_2` (LightTAE)
- Best quality/speed/memory balance on Wan2.1: `lightvaew2_1` (LightVAE)
- Quick previews where quality is secondary: `taew2_1` / `taew2_2`
### 🔥 Our Optimization Results Comparison

| Comparison | Open-Source TAE | LightTAE (Ours) | Official VAE | LightVAE (Ours) |
|---|---|---|---|---|
| Architecture | Conv2D | Conv2D | Causal Conv3D | Causal Conv3D |
| Memory Usage | Minimal (~0.4 GB) | Minimal (~0.4 GB) | Large (~8-12 GB) | Medium (~4-5 GB) |
| Inference Speed | Extremely fast ⚡⚡⚡⚡⚡ | Extremely fast ⚡⚡⚡⚡⚡ | Slow ⚡⚡ | Fast ⚡⚡⚡⚡ |
| Generation Quality | Average ⭐⭐⭐ | Close to official ⭐⭐⭐⭐⭐ | Highest ⭐⭐⭐⭐⭐ | Close to official ⭐⭐⭐⭐⭐ |
## 📝 Todo List

- LightX2V integration
- ComfyUI integration
- Training & distillation code
## 🚀 Usage

### Download VAE Models

```bash
# Download the full lightx2v/Autoencoders collection
# (official VAE plus TAE / LightTAE / LightVAE weights)
huggingface-cli download lightx2v/Autoencoders \
    --local-dir ./models/vae/
```
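If you prefer Python over the CLI, the same download can be scripted with `huggingface_hub`, equivalent to the command above:

```python
from huggingface_hub import snapshot_download

# Same effect as the huggingface-cli call above.
snapshot_download(repo_id="lightx2v/Autoencoders", local_dir="./models/vae/")
```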
### 🧪 Video Reconstruction Test

We provide a standalone script, `vid_recon.py`, for testing VAE models in isolation: it reads a video, encodes it with the VAE, then decodes the latent back to pixels so you can verify reconstruction quality.

**Script location:** `LightX2V/lightx2v/models/video_encoders/hf/vid_recon.py`

```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
```
**1. Test Official VAE (Wan2.1)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/Wan2.1_VAE.pth \
    --model_type vaew2_1 \
    --device cuda \
    --dtype bfloat16
```

**2. Test Official VAE (Wan2.2)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/Wan2.2_VAE.pth \
    --model_type vaew2_2 \
    --device cuda \
    --dtype bfloat16
```

**3. Test LightTAE (Wan2.1)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/lighttaew2_1.pth \
    --model_type taew2_1 \
    --device cuda \
    --dtype bfloat16
```

**4. Test LightTAE (Wan2.2)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/lighttaew2_2.pth \
    --model_type taew2_2 \
    --device cuda \
    --dtype bfloat16
```

**5. Test LightVAE (Wan2.1)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/lightvaew2_1.pth \
    --model_type vaew2_1 \
    --device cuda \
    --dtype bfloat16 \
    --use_lightvae
```

**6. Test TAE (Wan2.1)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/taew2_1.pth \
    --model_type taew2_1 \
    --device cuda \
    --dtype bfloat16
```
**7. Test TAE (Wan2.2)**

```bash
python -m lightx2v.models.video_encoders.hf.vid_recon \
    input_video.mp4 \
    --checkpoint ./models/vae/taew2_2.pth \
    --model_type taew2_2 \
    --device cuda \
    --dtype bfloat16
```
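To go beyond eyeballing the result, you can score the reconstruction frame by frame. A minimal PSNR sketch using `torchvision` (the reconstructed filename is an assumption; use whatever path `vid_recon.py` actually writes):

```python
import torch
from torchvision.io import read_video

# Load both clips as uint8 tensors of shape (T, H, W, C).
orig, _, _ = read_video("input_video.mp4", pts_unit="sec")
recon, _, _ = read_video("input_video_recon.mp4", pts_unit="sec")  # assumed output name

t = min(orig.shape[0], recon.shape[0])  # guard against off-by-one frame counts
orig, recon = orig[:t].float(), recon[:t].float()

mse = torch.mean((orig - recon) ** 2)
psnr = 10 * torch.log10(255.0 ** 2 / mse)
print(f"PSNR over {t} frames: {psnr.item():.2f} dB")
```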
### Use in LightX2V

Specify the VAE path in the configuration file.

**Using the official VAE series:**

```json
{
    "vae_path": "./models/vae/Wan2.1_VAE.pth"
}
```

**Using the LightVAE series:**

```json
{
    "use_lightvae": true,
    "vae_path": "./models/vae/lightvaew2_1.pth"
}
```

**Using the LightTAE series:**

```json
{
    "use_tae": true,
    "need_scaled": true,
    "tae_path": "./models/vae/lighttaew2_1.pth"
}
```

**Using the TAE series:**

```json
{
    "use_tae": true,
    "tae_path": "./models/vae/taew2_1.pth"
}
```
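If you generate these configs from scripts, a small helper keeps the variants straight. The function below is purely illustrative (not part of LightX2V); the key names are exactly those from the examples above:

```python
import json

def vae_config(variant: str, path: str) -> dict:
    """Build the VAE-related config keys shown above for a given variant."""
    if variant == "official":
        return {"vae_path": path}
    if variant == "lightvae":
        return {"use_lightvae": True, "vae_path": path}
    if variant == "lighttae":
        return {"use_tae": True, "need_scaled": True, "tae_path": path}
    if variant == "tae":
        return {"use_tae": True, "tae_path": path}
    raise ValueError(f"unknown variant: {variant}")

print(json.dumps(vae_config("lighttae", "./models/vae/lighttaew2_1.pth"), indent=2))
```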
Then run the inference script:

```bash
cd LightX2V/scripts
bash wan/run_wan_i2v.sh  # or another inference script
```
## ⚠️ Important Notes

**1. Compatibility**

- Wan2.1-series VAEs only work with Wan2.1 backbone models
- Wan2.2-series VAEs only work with Wan2.2 backbone models
- Do not mix VAE and backbone models from different versions
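When wiring paths up programmatically, a cheap guard catches accidental version mixing. This is purely illustrative (LightX2V exposes no such function); it just infers the Wan version from file or model names:

```python
# Illustrative check: infer the Wan version from checkpoint/model names
# and refuse mismatched VAE/backbone pairs.
def wan_version(name: str) -> str:
    for v in ("2.1", "2.2"):
        if v in name or v.replace(".", "_") in name:
            return v
    raise ValueError(f"cannot infer Wan version from: {name}")

def check_compatible(vae_path: str, backbone_name: str) -> None:
    if wan_version(vae_path) != wan_version(backbone_name):
        raise ValueError(f"VAE {vae_path!r} does not match backbone {backbone_name!r}")

check_compatible("./models/vae/lighttaew2_1.pth", "Wan2.1-I2V-14B-720P")  # passes
```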
## 🔗 Related Resources

**Documentation**

- LightX2V Quick Start: Quick Start Documentation
- Model Structure Description: Model Structure Documentation
- taeHV Project: GitHub - madebyollin/taeHV

**Related Models**

- Wan2.1 backbone models: Wan-AI Model Collection
- Wan2.2 backbone models: Wan-AI/Wan2.2-TI2V-5B
- LightX2V optimized models: lightx2v Model Collection
## 🤝 Community & Support

- GitHub Issues: https://github.com/ModelTC/LightX2V/issues
- HuggingFace: https://huggingface.co/lightx2v
- LightX2V Homepage: https://github.com/ModelTC/LightX2V

If you find this project helpful, please give us a ⭐ on GitHub!