YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)
# Audio Flamingo 3 Caption Endpoint Template
Use this as a custom handler.py runtime for a Hugging Face Dedicated Endpoint.
Request contract
{
"inputs": {
"prompt": "Analyze this full song and summarize arrangement changes.",
"audio_base64": "<base64-encoded WAV bytes>",
"max_new_tokens": 1200,
"temperature": 0.1
}
}
Response contract
{
"generated_text": "..."
}
Setup
Fastest path from this repo:
python scripts/hf_clone.py af3-endpoint --repo-id YOUR_USERNAME/YOUR_AF3_ENDPOINT_REPO
Then deploy a Dedicated Endpoint from that model repo.
Important: make sure your endpoint repo contains top-level:
handler.pyrequirements.txtREADME.md
Use endpoint task custom so the runtime loads handler.py instead of a default Transformers pipeline.
Endpoint env vars
Required:
AF3_MODEL_ID=nvidia/audio-flamingo-3-hf
Optional runtime bootstrap (defaults shown):
AF3_BOOTSTRAP_RUNTIME=1AF3_TRANSFORMERS_SPEC=transformers==5.1.0AF3_RUNTIME_DIR=/tmp/af3_runtimeAF3_STUB_TORCHVISION=1
Notes
- Audio Flamingo 3 is large; use a GPU endpoint.
- First boot can take longer because the handler installs AF3-compatible runtime dependencies.
- This handler returns raw prose analysis. Use the local AF3+ChatGPT pipeline to normalize to LoRA sidecar JSON.
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support