Rolling Forcing

Autoregressive Long Video Diffusion in Real Time

Kunhao Liu¹, Wenbo Hu², Jiale Xu², Ying Shan², Shijian Lu¹
¹ Nanyang Technological University, ² ARC Lab, Tencent PCG

💡 TL;DR: REAL-TIME streaming generation of MULTI-MINUTE videos

  • 🚀 Real-Time at 16 FPS: Stream high-quality video directly from text on a single GPU.
  • 🎬 Minute-Long Videos: Generate coherent, multi-minute sequences with dramatically reduced drift.
  • ⚙️ Rolling-Window Strategy: Denoise frames together in a rolling window for mutual refinement, breaking the chain of error accumulation.
  • Long-Term Memory: The novel Attention Sink anchors your video, preserving global context over thousands of frames.
  • State-of-the-Art Performance: Outperforms all comparable open-source models in quality and consistency.

📚 Citation

If you find this codebase useful for your research, please cite our paper and consider giving the repo a ⭐️ on GitHub: https://github.com/TencentARC/RollingForcing

@article{liu2025rolling,
  title={Rolling Forcing: Autoregressive Long Video Diffusion in Real Time},
  author={Liu, Kunhao and Hu, Wenbo and Xu, Jiale and Shan, Ying and Lu, Shijian},
  journal={arXiv preprint arXiv:2509.25161},
  year={2025}
}