NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity Paper • 2006.06280 • Published Jun 11, 2020
Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference Paper • 2409.12117 • Published Sep 18, 2024
Edit-A-Video: Single Video Editing with Object-Aware Consistency Paper • 2303.07945 • Published Mar 14, 2023
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech Paper • 2408.14739 • Published Aug 27, 2024
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models Paper • 2507.08128 • Published Jul 10, 2025 • 10
Music Flamingo: Scaling Music Understanding in Audio Language Models Paper • 2511.10289 • Published Nov 13, 2025 • 10
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning Paper • 2510.12000 • Published Oct 13, 2025 • 1
ETTA: Elucidating the Design Space of Text-to-Audio Models Paper • 2412.19351 • Published Dec 26, 2024
Music Flamingo 🎵 Space • Running on Zero • 84 • Upload audio or link a YouTube URL to get detailed music analysis
Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation Paper • 2506.03621 • Published Jun 4, 2025 • 22
Cosmos Collection • This collection is archived. https://huggingface.co/collections/nvidia/cosmos-predict25 • 31 items • Updated 11 days ago • 299
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation Paper • 2410.01680 • Published Oct 2, 2024 • 34