- point-cloud
---

## ___***GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors***___

<div align="center">

_**[Tian-Xing Xu<sup>1</sup>](https://scholar.google.com/citations?user=zHp0rMIAAAAJ&hl=zh-CN),
[Xiangjun Gao<sup>3</sup>](https://scholar.google.com/citations?user=qgdesEcAAAAJ&hl=en),
[Wenbo Hu<sup>2 †</sup>](https://wbhu.github.io),
[Xiaoyu Li<sup>2</sup>](https://xiaoyu258.github.io),
[Song-Hai Zhang<sup>1 †</sup>](https://scholar.google.com/citations?user=AWtV-EQAAAAJ&hl=en),
[Ying Shan<sup>2</sup>](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)**_
<br>
<sup>1</sup>Tsinghua University
<sup>2</sup>ARC Lab, Tencent PCG
<sup>3</sup>HKUST



<a href='https://arxiv.org/abs/2504.01016'><img src='https://img.shields.io/badge/arXiv-2504.01016-b31b1b.svg'></a>
<a href='https://geometrycrafter.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
<a href='https://huggingface.co/spaces/TencentARC/GeometryCrafter'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Demo-blue'></a>

</div>

## 📢 Notice

**GeometryCrafter is still under active development!**

We recommend communicating in English on issues, so that developers from around the world can discuss, share experiences, and answer questions together. For further implementation details, please contact `[email protected]`. For business licensing and other related inquiries, don't hesitate to contact `[email protected]`.

If you find GeometryCrafter useful, **please help ⭐ this repo**; stars matter a lot to open-source projects. Thanks!

## 🔆 Introduction

We present GeometryCrafter, a novel approach that estimates temporally consistent, high-quality point maps from open-world videos, facilitating downstream applications such as 3D/4D reconstruction and depth-based video editing or generation.

Release Notes:
- `[01/04/2025]` 🔥🔥🔥 **GeometryCrafter** is now released, have fun!
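
For readers new to the representation: a point map assigns a 3D coordinate to every pixel of every frame. A minimal sketch of the idea, where the array shapes and the pinhole back-projection are illustrative assumptions rather than the repository's exact output format:

```python
import numpy as np

# A point map stores one 3D point per pixel, so a video of T frames at H x W
# resolution yields an array of shape (T, H, W, 3). Shapes here are illustrative.
T, H, W = 4, 6, 8
point_maps = np.zeros((T, H, W, 3), dtype=np.float32)

# Under a pinhole camera, pixel (u, v) with depth d back-projects to 3D as:
fx = fy = 100.0          # focal lengths (assumed values)
cx, cy = W / 2, H / 2    # principal point (assumed values)
u, v, d = 5, 3, 2.0
point_maps[0, v, u] = [(u - cx) * d / fx, (v - cy) * d / fy, d]

print(point_maps.shape)  # (4, 6, 8, 3)
```

Temporal consistency then means these per-frame point clouds agree with each other across time, which is what distinguishes video point-map estimation from running a single-image model frame by frame.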

## 🚀 Quick Start

### Installation

1. Clone this repo:
```bash
git clone --recursive https://github.com/TencentARC/GeometryCrafter
```
2. Install dependencies (please refer to [requirements.txt](requirements.txt)):
```bash
pip install -r requirements.txt
```

### Inference

Run the inference code on our provided demo videos at 1.27 FPS. This requires a GPU with ~40 GB of memory for 110 frames at 1024x576 resolution:

```bash
python run.py \
    --video_path examples/video1.mp4 \
    --save_folder workspace/examples_output \
    --height 576 --width 1024
# The input video is resized to the target resolution (which must be divisible by 64) for processing;
# the output point maps are restored to the original resolution before saving.
# Use --downsample_ratio to downsample the input video, or reduce --decode_chunk_size, to lower memory usage.
```
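
The divisible-by-64 constraint can be handled with a tiny helper that snaps an arbitrary frame size to the nearest multiple of 64. This is an illustrative sketch, not a utility shipped with the repo:

```python
def snap_to_multiple(value: int, base: int = 64) -> int:
    """Round `value` to the nearest positive multiple of `base`."""
    return max(base, round(value / base) * base)

# e.g. a 1920x1080 source could be processed at 1920x1088
print(snap_to_multiple(1080))  # 1088
print(snap_to_multiple(576))   # 576
```

Since the output point maps are restored to the original resolution, the exact snapped size mainly trades memory and speed against sampling detail.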

Run the inference code with our deterministic variant at 1.50 FPS:

```bash
python run.py \
    --video_path examples/video1.mp4 \
    --save_folder workspace/examples_output \
    --height 576 --width 1024 \
    --model_type determ
```

Run low-resolution processing at 2.49 FPS, which requires a GPU with ~22 GB of memory:

```bash
python run.py \
    --video_path examples/video1.mp4 \
    --save_folder workspace/examples_output \
    --height 384 --width 640
```

### Visualization

Visualize the predicted point maps with `Viser`:

```bash
python visualize/vis_point_maps.py \
    --video_path examples/video1.mp4 \
    --data_path workspace/examples_output/video1.npz
```
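
Besides the Viser viewer, the saved `.npz` can be inspected directly with NumPy. The key name `point_map` below is an assumption for illustration (check `data.files` for the actual keys), and the snippet builds an in-memory stand-in file so it runs on its own:

```python
import io
import numpy as np

# Build an in-memory stand-in for workspace/examples_output/video1.npz;
# the real file is produced by run.py and may use different key names.
buf = io.BytesIO()
np.savez(buf, point_map=np.zeros((10, 384, 640, 3), dtype=np.float32))
buf.seek(0)

data = np.load(buf)
print(data.files)               # lists the stored arrays, e.g. ['point_map']
print(data["point_map"].shape)  # (frames, height, width, xyz) -> (10, 384, 640, 3)
```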

## 🤗 Gradio Demo

- Online demo: [**GeometryCrafter**](https://huggingface.co/spaces/TencentARC/GeometryCrafter)
- Local demo:
  ```bash
  gradio app.py
  ```

## 📊 Dataset Evaluation

Please check the `evaluation` folder.
- To create the datasets we use in the paper, run `evaluation/preprocess/gen_{dataset_name}.py`.
- You need to change `DATA_DIR` and `OUTPUT_DIR` first according to your working environment.
- You will then get the preprocessed datasets, containing the extracted RGB videos and point-map npz files. We also provide a catalog of these files.
- Script to run inference on all datasets:
  ```bash
  bash evaluation/run_batch.sh
  ```
  (Remember to replace `data_root_dir` and `save_root_dir` with your paths.)
- Script to evaluate all datasets (scale-invariant point map estimation):
  ```bash
  bash evaluation/eval.sh
  ```
  (Remember to replace `pred_data_root_dir` and `gt_data_root_dir` with your paths.)
- Script to evaluate all datasets (affine-invariant depth estimation):
  ```bash
  bash evaluation/eval_depth.sh
  ```
  (Remember to replace `pred_data_root_dir` and `gt_data_root_dir` with your paths.)
- We also provide comparison results for MoGe and the deterministic variant of our method. You can evaluate these methods under the same protocol by uncommenting the corresponding lines in `evaluation/run.sh`, `evaluation/eval.sh`, `evaluation/run_batch.sh`, and `evaluation/eval_depth.sh`.

## 🤝 Contributing

- Issues and pull requests are welcome.
- Contributions that optimize inference speed and memory usage, e.g., through model quantization, distillation, or other acceleration techniques, are especially welcome.