---
license: apache-2.0
---

# Qwen-Image LoRA Distillation Acceleration Model

## Model Introduction

This model is a distilled, accelerated LoRA for [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image). It follows the same training procedure as [DiffSynth-Studio/Qwen-Image-Distill-Full](https://modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Distill-Full), but trains LoRA weights instead of the full model parameters, which makes it easier to integrate into various image generation frameworks.

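To give a rough sense of why a LoRA is lighter to ship and integrate than a fully fine-tuned model, the sketch below compares trainable parameter counts for a single linear layer. The layer size and rank are hypothetical values chosen for illustration; they are not the configuration used to train this model.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a low-rank update W + B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, for illustration only.
d_in, d_out, rank = 3072, 3072, 32
full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

With these assumed numbers, the low-rank update holds about 2% of the weights of the full matrix, which is why the LoRA file stays small relative to the base model.
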
The training framework is built on [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio). The training data consists of 16,000 images generated by the original model from prompts randomly sampled from [DiffusionDB](https://www.modelscope.cn/datasets/AI-ModelScope/diffusiondb). Training took approximately one day on 8 MI308X GPUs.

## Performance Comparison

||Original Model|Original Model|Accelerated Model|
|-|-|-|-|
|Inference Steps|40|15|15|
|CFG Scale|4|1|1|
|Forward Passes|80|15|15|
|Example 1||||
|Example 2||||
|Example 3||||

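The "Forward Passes" row follows from the other two rows. A minimal sketch of that arithmetic, assuming classifier-free guidance costs one extra (unconditional) forward pass per step and that this extra pass is skipped when the CFG scale is 1; `forward_passes` is a hypothetical helper written for this illustration, not part of DiffSynth-Studio:

```python
def forward_passes(num_inference_steps: int, cfg_scale: float) -> int:
    """Count denoiser forward passes for one image.

    Assumption: with cfg_scale != 1 each step runs a conditional and an
    unconditional pass; with cfg_scale == 1 only the conditional pass runs.
    """
    passes_per_step = 1 if cfg_scale == 1 else 2
    return num_inference_steps * passes_per_step


# The three columns of the comparison table: (steps, CFG scale).
settings = {
    "original, 40 steps, CFG 4": (40, 4),
    "original, 15 steps, CFG 1": (15, 1),
    "accelerated, 15 steps, CFG 1": (15, 1),
}
for name, (steps, cfg) in settings.items():
    print(f"{name}: {forward_passes(steps, cfg)} forward passes")
```

Under that assumption, the accelerated setting needs 15 of the 80 passes used by the original 40-step, CFG 4 setting.
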
## Inference Code

```shell
git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .
```

```python
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
from modelscope import snapshot_download
import torch

# Download the distilled LoRA weights to a local directory.
snapshot_download("DiffSynth-Studio/Qwen-Image-Distill-LoRA", local_dir="models/DiffSynth-Studio/Qwen-Image-Distill-LoRA")

# Load the base Qwen-Image pipeline: transformer (DiT), text encoder, VAE, and tokenizer.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
)

# Apply the distilled LoRA to the DiT.
pipe.load_lora(pipe.dit, "models/DiffSynth-Studio/Qwen-Image-Distill-LoRA/model.safetensors")

prompt = "Exquisite portrait, underwater girl, flowing blue dress, gently floating hair, translucent lighting, surrounded by bubbles, serene expression, intricate details, dreamy and ethereal."

# Accelerated setting: 15 steps with cfg_scale=1.
image = pipe(prompt, seed=0, num_inference_steps=15, cfg_scale=1)
image.save("image.jpg")
```