Commit 1e6bffc (verified) by multimodalart, parent d47e09c

Add suggested inference code

Files changed (1): README.md (+97 -0)
# Use it with our inference code

Set up:
```bash
sudo apt install ffmpeg # install if you haven't already
git clone https://github.com/krea-ai/realtime-video
cd realtime-video
uv sync
uv pip install flash_attn --no-build-isolation
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir-use-symlinks False --local-dir wan_models/Wan2.1-T2V-1.3B
huggingface-cli download krea/krea-realtime-video krea-realtime-video-14b.safetensors --local-dir-use-symlinks False --local-dir checkpoints/krea-realtime-video-14b.safetensors
```
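
Optionally, you can sanity-check that both downloads landed on disk before starting the server. This is a minimal sketch that assumes only the paths used in the commands above:

```py
from pathlib import Path

# Paths taken from the huggingface-cli commands above
assert Path("wan_models/Wan2.1-T2V-1.3B").is_dir(), "Wan2.1-T2V-1.3B base model missing"
assert any(Path("checkpoints").rglob("krea-realtime-video-14b.safetensors")), \
    "krea-realtime-video-14b checkpoint missing"
print("Model files in place")
```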

Run:
```bash
export MODEL_FOLDER=Wan-AI
export CUDA_VISIBLE_DEVICES=0 # pick the GPU you want to serve on
export DO_COMPILE=true

uvicorn release_server:app --host 0.0.0.0 --port 8000
```

Then use the web app at http://localhost:8000/ in your browser.
(For more advanced use cases and custom pipelines, check out our GitHub repository: https://github.com/krea-ai/realtime-video.)
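
If you prefer to script against the server rather than open the browser, a quick reachability check works too (a hypothetical smoke test; the web UI at / is the documented entry point):

```py
import urllib.request

# Hypothetical smoke test: confirms the uvicorn server is up and serving
with urllib.request.urlopen("http://localhost:8000/", timeout=5) as resp:
    print("Server responded with HTTP", resp.status)
```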

# Use it with 🧨 diffusers

Krea Realtime 14B can be used with the `diffusers` library via the new Modular Diffusers structure (text-to-video is supported for now; video-to-video is coming soon).

```bash
# Install diffusers from main
pip install git+https://github.com/huggingface/diffusers.git
```
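
To verify that the development install exposes the Modular Diffusers API used below, a quick import check (a minimal sketch; `ModularPipelineBlocks` is the same class the example imports) is:

```py
import diffusers
from diffusers import ModularPipelineBlocks  # available on recent main builds

print("diffusers", diffusers.__version__, "with Modular Diffusers available")
```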

```py
import torch
from collections import deque

from diffusers import ModularPipelineBlocks, FlowMatchEulerDiscreteScheduler
from diffusers.modular_pipelines import PipelineState, WanModularPipeline
from diffusers.utils import export_to_video


class WanRTStreamingPipeline(WanModularPipeline):
    """WanModularPipeline configured for Krea Realtime's streaming generation."""

    @property
    def default_sample_height(self):
        return 60

    @property
    def default_sample_width(self):
        return 104

    @property
    def frame_seq_length(self):
        return 1560

    @property
    def seq_length(self):
        return 32760

    @property
    def kv_cache_num_frames(self):
        return 3

    @property
    def frame_cache_len(self):
        # Number of decoded frames kept as context for the next block
        return 1 + (self.kv_cache_num_frames - 1) * 4


block_path = "krea/krea-realtime-video"
blocks = ModularPipelineBlocks.from_pretrained(block_path, trust_remote_code=True)
pipe = WanRTStreamingPipeline(blocks, block_path)

pipe.load_components(
    trust_remote_code=True,
    device_map="cuda",
    torch_dtype={"default": torch.bfloat16, "vae": torch.float32},
)
pipe.scheduler = FlowMatchEulerDiscreteScheduler(shift=5.0)

prompt = ["A cat sitting on a boat"]

num_frames_per_block = 3
num_blocks = 9

frames = []
state = PipelineState()
# Rolling cache of decoded frames that conditions each subsequent block
state.set("frame_cache_context", deque(maxlen=pipe.frame_cache_len))
for block_idx in range(num_blocks):
    # Each call generates one block of frames, reusing the streamed state
    state = pipe(
        state,
        prompt=prompt,
        num_inference_steps=6,
        num_blocks=num_blocks,
        num_frames_per_block=num_frames_per_block,
        block_idx=block_idx,
    )
    frames.extend(state.values["videos"][0])

export_to_video(frames, "krt.mp4")
```
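
Because frames are produced block by block, they can also be consumed incrementally instead of waiting for the whole clip. Here is a minimal variation of the generation loop above (same calls as the example; the growing-preview file naming is a hypothetical illustration):

```py
# Same setup as above; only the generation loop changes
frames = []
state = PipelineState()
state.set("frame_cache_context", deque(maxlen=pipe.frame_cache_len))
for block_idx in range(num_blocks):
    state = pipe(
        state,
        prompt=prompt,
        num_inference_steps=6,
        num_blocks=num_blocks,
        num_frames_per_block=num_frames_per_block,
        block_idx=block_idx,
    )
    frames.extend(state.values["videos"][0])
    # Hypothetical incremental output: re-export the clip so far after each block,
    # so playback can start as soon as the first block is done
    export_to_video(frames, f"krt_preview_{block_idx:02d}.mp4")
```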