Ovi FusionModel - FP8 Quantized

This is the Ovi FusionModel quantized to FP8 with the e4m3_e4m3_dynamic_per_tensor scheme for faster inference.

Quantization Details

Video Model Blocks: 30 blocks quantized
Audio Model Blocks: 30 blocks quantized
Attention/FFN layers: e4m3_e4m3_dynamic_per_tensor
Other layers: e4m3_weightonly
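
To make the scheme names above more concrete, here is a minimal pure-Python sketch of dynamic per-tensor FP8 scaling. It assumes that "dynamic_per_tensor" means a single scale is computed from each tensor's largest absolute value at run time, and that "e4m3" is the usual 8-bit format with 4 exponent bits, 3 mantissa bits, and a maximum finite value of 448. The helper names (round_to_e4m3, quantize_per_tensor, dequantize) are hypothetical and only simulate the rounding; they are not part of Ovi or of the tool used to produce this checkpoint, and real inference uses GPU kernels instead.

```python
import math

E4M3_MAX = 448.0  # largest finite value of the float8 e4m3 (fn) format


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest e4m3-representable value (simulated in Python)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)   # saturate instead of overflowing
    _, e = math.frexp(a)        # a = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, -6)        # below 2**-6 the format is subnormal
    step = 2.0 ** (exp - 3)     # 3 mantissa bits -> 8 steps per power of two
    return sign * round(a / step) * step


def quantize_per_tensor(values):
    """Dynamic per-tensor scaling: one scale, derived from the data itself."""
    amax = max(abs(v) for v in values)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(v / scale) for v in values], scale


def dequantize(q, scale):
    """Map the quantized values back to the original range."""
    return [v * scale for v in q]


values = [0.1, -2.5, 7.0]
q, scale = quantize_per_tensor(values)
print(q, scale)              # quantized values and the shared scale
print(dequantize(q, scale))  # approximate reconstruction of `values`
```

In this example the largest magnitude (7.0) maps exactly onto 448, while 0.1 comes back as 0.1015625, which illustrates the rounding error that the 3-bit mantissa introduces. Under the same assumptions, a weight-only scheme would apply this step to the weights alone and leave activations in the higher-precision dtype.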

Usage

import sys
import os
import torch
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

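# Path to a local clone of the Ovi repository. Use an absolute path so the
# sys.path entry stays valid after the working directory changes below.
OVI_PATH = os.path.abspath("./workspace/Ovi")
sys.path.insert(0, OVI_PATH)
os.chdir(OVI_PATH)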

from ovi.ovi_fusion_engine import OviFusionEngine

# Download quantized weights
model_path = hf_hub_download(
    repo_id="wavespeed/Ovi-e4m3_e4m3_dynamic_per_tensor",
    filename="model.pth"
)

# Relative paths now resolve against OVI_PATH because of the os.chdir above
config = OmegaConf.load("config.yaml")
engine = OviFusionEngine(config=config, device="cuda", target_dtype=torch.bfloat16)

# Load quantized weights
engine.model.load_state_dict(torch.load(model_path))

# Model is already quantized, ready for inference

Model Card

Developed by: Alibaba/Character.AI
Model type: Video + Audio generation (FusionModel)
Quantization: FP8 (e4m3_e4m3_dynamic_per_tensor)
License: Check original Ovi repository

Original Model

Based on Ovi
