Qwen-Image Image Structure Control Model - Depth ControlNet

Model Introduction

This model is an image structure control model based on Qwen-Image, with a ControlNet architecture that enables structural control of generated images using depth maps. The training framework is built upon DiffSynth-Studio, and the dataset used for training is BLIP3o.

Result Demonstration

Depth Map Generated Image 1 Generated Image 2

Inference Code

git clone https://github.com/modelscope/DiffSynth-Studio.git  
cd DiffSynth-Studio
pip install -e .
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig, ControlNetInput
from PIL import Image
import torch
from modelscope import dataset_snapshot_download


pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
        ModelConfig(model_id="DiffSynth-Studio/Qwen-Image-Blockwise-ControlNet-Depth", origin_file_pattern="model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
)

dataset_snapshot_download(
    dataset_id="DiffSynth-Studio/example_image_dataset",
    local_dir="./data/example_image_dataset",
    allow_file_pattern="depth/image_1.jpg"
)

controlnet_image = Image.open("data/example_image_dataset/depth/image_1.jpg").resize((1328, 1328))

prompt = "Exquisite portrait, underwater girl, flowing blue dress, gently floating hair, translucent lighting, surrounded by bubbles, serene expression, intricate details, dreamy and ethereal." image = pipe( prompt, seed=0, blockwise_controlnet_inputs=[ControlNetInput(image=controlnet_image)] ) image.save("image.jpg")

Downloads last month

-

Downloads are not tracked for this model. How to track
Safetensors
Model size
1B params
Tensor type
BF16
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support