# Qwen-Image Full Distillation Accelerated Model

## Model Introduction
This model is a distilled and accelerated version of Qwen-Image. The original model requires 40 inference steps with classifier-free guidance (CFG), for a total of 80 forward passes. The distilled model requires only 15 inference steps without CFG, i.e. just 15 forward passes, an approximately 5x speedup. The number of inference steps can be reduced further if needed, at the cost of some degradation in generation quality.
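The quoted speedup follows directly from the forward-pass counts: CFG runs a conditional and an unconditional pass per step, so it doubles the work. A minimal sketch of the arithmetic:

```python
# Forward passes = inference steps x passes per step (2 with CFG, 1 without).
original_passes = 40 * 2    # original model: 40 steps with CFG
distilled_passes = 15 * 1   # distilled model: 15 steps, no CFG
speedup = original_passes / distilled_passes
print(f"{original_passes} vs. {distilled_passes} forward passes -> ~{speedup:.1f}x speedup")
```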
The training framework is built on DiffSynth-Studio. The training data consists of 16,000 images generated by the original model from prompts randomly sampled from DiffusionDB. Training ran on 8× MI308X GPUs and took approximately one day.
## Performance Comparison
| | Original Model | Original Model | Accelerated Model |
|---|---|---|---|
| Inference Steps | 40 | 15 | 15 |
| CFG Scale | 4 | 1 | 1 |
| Forward Passes | 80 | 15 | 15 |
| Example 1 | ![]() | ![]() | ![]() |
| Example 2 | ![]() | ![]() | ![]() |
| Example 3 | ![]() | ![]() | ![]() |
## Inference Code
```shell
git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .
```
```python
import torch
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

# Load the distilled DiT weights together with the original Qwen-Image text encoder, VAE, and tokenizer.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="DiffSynth-Studio/Qwen-Image-Distill-Full", origin_file_pattern="diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
)

# Prompt (Chinese): "Exquisite portrait, underwater girl, flowing blue dress, hair drifting gently,
# crystal-clear light and shadow, surrounded by bubbles, serene face, fine details, dreamy and beautiful."
prompt = "精致肖像,水下少女,蓝裙飘逸,发丝轻扬,光影透澈,气泡环绕,面容恬静,细节精致,梦幻唯美。"

# 15 steps without CFG (cfg_scale=1) is the recommended setting for the distilled model.
image = pipe(prompt, seed=0, num_inference_steps=15, cfg_scale=1)
image.save("image.jpg")
```
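As noted in the model introduction, the step count can be lowered further at some cost in quality. A minimal variant reusing the same `pipe` from the snippet above (the 8-step value is an illustrative choice, not a tuned recommendation):

```python
# Lower-step run: trades some generation quality for additional speed.
fast_image = pipe(prompt, seed=0, num_inference_steps=8, cfg_scale=1)
fast_image.save("image_fast.jpg")
```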