---
title: EditP23
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---
# EditP23: 3D Editing via Propagation of Image Prompts to Multi-View
[Project Page](https://editp23.github.io/)
[Paper (arXiv)](https://arxiv.org/abs/2506.20652)
This repository contains the official implementation for **EditP23**, a method for fast, mask-free 3D editing that propagates 2D image edits to multi-view representations in a 3D-consistent manner.
The edit is guided by an image pair, allowing users to leverage any preferred 2D editing tool, from manual painting to generative pipelines.
### Installation

<details>
<summary>Click to expand installation instructions</summary>

This project was tested on a Linux system with Python 3.11 and CUDA 12.6.

**1. Clone the Repository**

```bash
git clone --recurse-submodules https://github.com/editp23/EditP23.git
cd EditP23
```
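If you cloned without `--recurse-submodules`, you can fetch the submodules afterwards with the standard git command:

```bash
# Fetch submodules for a repository that was cloned without --recurse-submodules
git submodule update --init --recursive
```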
**2. Install Dependencies**

```bash
conda create -n editp23 python=3.11 -y
conda activate editp23
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126  # match your CUDA version (tested with torch 2.6, CUDA 12.6)
pip install diffusers==0.30.1 transformers accelerate pillow huggingface_hub numpy tqdm
```

</details>
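As an optional sanity check (an illustrative one-liner, not part of the repository), you can confirm that PyTorch was installed with CUDA support:

```bash
# Prints the installed torch version and True if a CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```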
### Quick Start

**1. Prepare Your Experiment Directory**

Create a directory for your experiment. Inside this directory, you must place three specific PNG files:

* `src.png`: The original, unedited view of your object.
* `edited.png`: The same view after you have applied your desired 2D edit.
* `src_mv.png`: The multi-view grid of the original object, which will be edited.

Your directory structure should look like this:

```text
examples/
└── robot_sunglasses/
    ├── src.png
    ├── edited.png
    └── src_mv.png
```
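Before running the edit, a quick check (illustrative only; adjust the path to your own experiment directory) confirms that all three inputs are in place:

```bash
# Verify that the three required inputs exist in the experiment directory
EXP_DIR=examples/robot_sunglasses
for f in src.png edited.png src_mv.png; do
  [ -f "$EXP_DIR/$f" ] && echo "found $f" || echo "missing $f"
done
```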
**2. Run the Editing Script**

Execute `src/main.py`, pointing it to your experiment directory. You can adjust the guidance parameters based on the complexity of your edit.

#### Execution Examples

* **Mild Edit (Appearance Change):**

```bash
python src/main.py --exp_dir examples/robot_sunglasses --tar_guidance_scale 5.0 --n_max 31
```

* **Hard Edit (Large Geometry Change):**

```bash
python src/main.py --exp_dir examples/deer_wings --tar_guidance_scale 21.0 --n_max 39
```

The output will be saved in the `output/` subdirectory within your experiment folder.
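The edited multi-view grid is written there as `edited_mv.png` (the file consumed by the reconstruction step below). Assuming the example directory above, you can check for it with:

```bash
# The edited multi-view grid produced by the editing script
ls examples/robot_sunglasses/output/edited_mv.png
```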
### Command-Line Arguments

* `--exp_dir`: (Required) Path to the experiment directory.
* `--T_steps`: Total number of denoising steps. Default: `50`.
* `--n_max`: Number of denoising steps during which edit-aware guidance is applied. Higher values can help with more complex edits; this value should not exceed `T_steps`. Default: `31`.
* `--src_guidance_scale`: CFG scale for the source condition. Typically can be left at the default. Default: `3.5`.
* `--tar_guidance_scale`: CFG scale for the target (edited) condition. Higher values apply the edit more strongly. Default: `5.0`.
* `--seed`: Random seed for reproducibility. Default: `18`.
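For reference, an explicit invocation with every argument spelled out (using the default values listed above) looks like this:

```bash
python src/main.py \
  --exp_dir examples/robot_sunglasses \
  --T_steps 50 \
  --n_max 31 \
  --src_guidance_scale 3.5 \
  --tar_guidance_scale 5.0 \
  --seed 18
```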
# Results in Multi-View

### Deer - Pixar style & Wings

*(Image grid: the original deer and its Pixar style and Wings edits, each shown from the conditioning view and three additional views.)*

### Person - Old & Zombie

*(Image grid: the original person and its Old and Zombie edits, each shown from the conditioning view and three additional views.)*
# Project Structure

The repository is organized as follows:

```text
EditP23/
├── examples/               # Example assets for quick testing
│   ├── deer_wings/
│   │   ├── src.png
│   │   ├── edited.png
│   │   └── src_mv.png
│   └── robot_sunglasses/
│       └── ...
├── assets/                 # Raw asset files
│   └── stormtrooper.glb
├── scripts/                # Helper scripts for data preparation
│   ├── render_mesh.py
│   └── img2mv.py
├── src/                    # Main source code
│   ├── __init__.py
│   ├── edit_mv.py
│   ├── main.py
│   ├── pipeline.py
│   └── utils.py
├── .gitignore
└── README.md
```
# Utilities

## Setup

This guide shows how to prepare inputs for **EditP23** and run an edit.
These helper scripts create the three PNG files every experiment needs:

| File | Purpose |
|--------------|-----------------------------------------------|
| `src.png` | Original single view (the one you will edit). |
| `edited.png` | Your 2D edit of `src.png`. |
| `src_mv.png` | 6-view grid of the original object. |
### 1. Generate `src.png` and `src_mv.png`

**EditP23** needs a **source view** (`src.png`) and a **multi-view grid** (`src_mv.png`).
The grid contains six additional views at fixed azimuth/elevation pairs:

Angles (azimuth, elevation): `(30°, 20°) (90°, -10°) (150°, 20°) (210°, -10°) (270°, 20°) (330°, -10°)`, plus `(0°, 20°)` for the prompt view.

We provide two methods to generate these inputs; both produce the source view and the multi-view grid from these angles on a clean, white background.
#### Method A: From a Single Image

You can generate the multi-view grid from a single image of an object using our `img2mv.py` script. This script leverages the Zero123++ pipeline with a checkpoint from InstantMesh, which is fine-tuned to produce white backgrounds.

```bash
# This script takes a single input image and generates the corresponding multi-view grid.
python scripts/img2mv.py \
    --input_image "examples/robot_sunglasses/src.png" \
    --output_dir "examples/robot_sunglasses/"
```

**Note:** In this case, `src.png` serves as the source view for EditP23.
#### Method B: From a 3D Mesh

If you have a 3D model, you can use our Blender script to render both the source view and the multi-view grid.

**Prerequisite:** This script requires Blender (`pip install bpy`).

```bash
# This script renders a source view and a multi-view grid from a 3D mesh.
python scripts/render_mesh.py \
    --mesh_path "assets/stormtrooper.glb" \
    --output_dir "examples/stormtrooper/"
```
### 2. Generate `edited.png`

Once you have your **source view**, you can use any 2D image editor to make your desired changes. This user-provided edit guides the 3D modification.

For quick edits, you can use readily available online tools, such as the following Hugging Face Spaces:

- [FlowEdit](https://huggingface.co/spaces/fallenshock/FlowEdit): Excellent for global, structural edits.
- [Flux-Inpainting](https://huggingface.co/spaces/black-forest-labs/FLUX.1-Fill-dev): Great for local modifications and inpainting.
## Reconstruction

After generating an edited multi-view image (`edited_mv.png`) with our main script, you can reconstruct it into a 3D model. We provide a helper script that uses the [InstantMesh](https://github.com/TencentARC/InstantMesh) framework to produce a textured `.obj` file and a turntable video.

### Additional Dependencies

First, you'll need to install several libraries required for the reconstruction process.

<details>
<summary>Click to expand installation instructions</summary>

```bash
# Install general dependencies
pip install opencv-python einops xatlas imageio[ffmpeg]

# Install NVIDIA's nvdiffrast library
pip install git+https://github.com/NVlabs/nvdiffrast/

# For video export, ensure ffmpeg is installed
# On conda, you can run:
conda install ffmpeg
```

</details>
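Before running the reconstruction, an import check (illustrative, not part of the repository) can confirm that the extra dependencies resolved correctly:

```bash
# Confirm the reconstruction dependencies import cleanly
python -c "import cv2, einops, xatlas, imageio, nvdiffrast.torch; print('reconstruction deps OK')"
```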
### Running the Reconstruction

The reconstruction script takes the multi-view PNG as input and generates the 3D assets. The necessary model config file (`instant-mesh-large.yaml`) is included in the `configs/` directory of the InstantMesh repository.

#### Example Command

```bash
python scripts/recon.py \
    external/instant-mesh/configs/instant-mesh-large.yaml \
    --input_file "examples/robot_sunglasses/output/edited_mv.png" \
    --output_dir "examples/robot_sunglasses/output/recon/"
```
### Command-Line Arguments

Here are the arguments for the `recon.py` script:

| Argument | Description | Default |
| :------------- | :----------------------------------------------------------------- | :----------- |
| `config` | **(Required)** Path to the InstantMesh model config file. | |
| `--input_file` | **(Required)** Path to the multi-view PNG file you want to reconstruct. | |
| `--output_dir` | Directory where the output `.obj` and `.mp4` files will be saved. | `"outputs/"` |
| `--scale` | Scale of the input cameras. | `1.0` |
| `--distance` | Camera distance for rendering the output video. | `4.5` |
| `--no_video` | Flag to disable saving the `.mp4` video. | `False` |
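For example, a run that also sets the optional arguments explicitly (the values shown are the defaults from the table, included only to illustrate the syntax):

```bash
python scripts/recon.py \
    external/instant-mesh/configs/instant-mesh-large.yaml \
    --input_file "examples/robot_sunglasses/output/edited_mv.png" \
    --output_dir "examples/robot_sunglasses/output/recon/" \
    --scale 1.0 \
    --distance 4.5
```

Add `--no_video` if you want to skip rendering the turntable `.mp4`.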