🎉 Welcome to Quantized LFM2-Audio-1.5B – Now With Fewer Bits, More Zest!
Are you tired of your audio models hogging all your RAM and GPU, eating up teraflops like they're snack chips? You, my friend, are in the right place. Here you'll find the lean, mean, quantized versions of the legendary LFM2-Audio-1.5B model, now slimmed down to run on computers that aren't secretly a spaceship.
🧪 What's Inside?
Every quant file here is basically the LFM2-Audio-1.5B you know and (hopefully) love, but put on a strict performance diet. Choose your flavor (a quick peek at what 4-bit quantization actually does follows the list):
- Q4 (nf4): For the speed demons and hardware minimalists.
- FP4: Like diet soda for float precision.
- Q6 (fp6): The "Goldilocks" of quantization: not too small, not too big.
- INT4/INT8: For the brave souls who want models so tiny they could run inside a toaster... maybe.
(If your PC starts making mysterious noises, just pretend it's applauding your cleverness.)
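Curious what those 4 bits buy you? Here is a minimal, illustrative round-trip through bitsandbytes' quantize_4bit/dequantize_4bit helpers. Those two functions are the real bitsandbytes API; the tensor and its size are made up, and real checkpoints quantize the whole model, not one matrix:

# Illustrative only: round-trip a pretend weight tensor through nf4 quantization.
# bitsandbytes' 4-bit kernels expect a CUDA tensor.
import torch
import bitsandbytes.functional as bnbF

weight = torch.randn(2048, 2048, device="cuda")               # pretend model weight
packed, state = bnbF.quantize_4bit(weight, quant_type="nf4")  # 4-bit NormalFloat storage
restored = bnbF.dequantize_4bit(packed, quant_state=state)    # back to float for compute
print("mean abs error:", (weight - restored).abs().mean().item())

The takeaway: weights live in 4 bits and get dequantized on the fly, which is where the RAM savings come from (at the cost of a little precision).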
🚀 How to Use (No PhD Required)
Plug, load, and chat with your model like it's your favorite AI pal. Example code below, now with comments for real humans:
# 1. Install dependencies
pip install liquid-audio torch torchaudio bitsandbytes
import torch
import torchaudio
from liquid_audio import LFM2AudioModel, LFM2AudioProcessor, ChatState, LFMModality
device = "cuda" if torch.cuda.is_available() else "cpu"
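# Heads-up: bitsandbytes' 4-bit kernels generally want a CUDA GPU; CPU runs may be slow or unsupported.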
# ---- Step 1: Load processor and model architecture from Hugging Face ----
model_id = "LiquidAI/LFM2-Audio-1.5B"  # Keep this for processor and config
processor = LFM2AudioProcessor.from_pretrained(model_id).eval()
model = LFM2AudioModel.from_pretrained(model_id)
# ---- Step 2: Load YOUR quantized weights ----
model.load_state_dict(torch.load("./LFM2-Audio-1.5B-Q4.pt", map_location=device))
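# Tip: on recent torch versions, torch.load(..., weights_only=True) is the safer way to load checkpoints.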
model.to(device)
model.eval()
# ---- Step 3: Setup prompt/chat ----
chat = ChatState(processor)
chat.new_turn("system")
chat.add_text("Respond with interleaved text and audio. ")
    
chat.end_turn()
chat.new_turn("user")
chat.add_text("My business specialized in chairs, can you give me something related to that?")  # Replace with your test input
chat.end_turn()
chat.new_turn("assistant")
# ---- Step 4: Generate response (text/audio) ----
text_out = []
audio_out = []
modality_out = []
for t in model.generate_interleaved(**chat, max_new_tokens=512, audio_temperature=1.0, audio_top_k=4):
    if t.numel() == 1:  # text token
        print(processor.text.decode(t), end="", flush=True)
        text_out.append(t)
        modality_out.append(LFMModality.TEXT)
    else:  # audio token
        audio_out.append(t)
        modality_out.append(LFMModality.AUDIO_OUT)
# ---- Step 5: Optionally decode and save audio response ----
if audio_out:
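    # Stack all but the final audio token into Mimi codes shaped (batch, codebooks, frames).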
    mimi_codes = torch.stack(audio_out[:-1], 1).unsqueeze(0)
    with torch.no_grad():
        waveform = processor.mimi.decode(mimi_codes)[0]
    torchaudio.save("answer.wav", waveform.cpu(), 24_000)
    print("Audio response saved as answer.wav")
And boom: your model is ready to do all the AI talking, thinking, and audio wiggling you need. If the output sounds weird, just say it's avant-garde.
📦 Supported Files (Snack-Sized AI)
A quick menu for file connoisseurs (a tiny flavor-picker helper follows the list):
- LFM2-Audio-1.5B-Q4.pt
- LFM2-Audio-1.5B-FP4.pt
- LFM2-Audio-1.5B-Q6.pt
- LFM2-Audio-1.5B-INT4.pt
- LFM2-Audio-1.5B-INT8.pt
Collect them all, Pokémon-style.
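Want to swap flavors without hand-editing paths? A tiny helper like this works (purely illustrative; it just assumes the .pt files sit in one folder):

# Purely illustrative: map a flavor name to its checkpoint file.
from pathlib import Path

QUANT_FILES = {
    "q4": "LFM2-Audio-1.5B-Q4.pt",
    "fp4": "LFM2-Audio-1.5B-FP4.pt",
    "q6": "LFM2-Audio-1.5B-Q6.pt",
    "int4": "LFM2-Audio-1.5B-INT4.pt",
    "int8": "LFM2-Audio-1.5B-INT8.pt",
}

def checkpoint_path(flavor: str, root: str = ".") -> Path:
    path = Path(root) / QUANT_FILES[flavor.lower()]
    if not path.exists():
        raise FileNotFoundError(f"{path.name} not found; grab it from this repo first.")
    return path

# Usage: model.load_state_dict(torch.load(checkpoint_path("q6"), map_location=device))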
🕵️‍♀️ License
Play nice: these are under the [LiquidAI/LFM2-Audio-1.5B license]. Read it with snacks. >ᴗ<
💬 Trouble?
Yell into Hugging Face Discussions or summon help from the Liquid4All repo. Carrier pigeons not supported.