Mitko Vasilev

mitkox

AI & ML interests

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

Recent Activity

posted an update about 6 hours ago

I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5. 2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but it feeds on data like a black hole on cheat day. But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol. 3 months ago, these numbers lived in my "maybe in 2030" dreams. Today it's running on my desk AND heats my home office during the winter!
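For readers who want to sanity-check the numbers in the update above, here is a minimal sketch of the arithmetic. It assumes the 2101 tokens/sec headline is the sum of input and output throughput (the figures 1492 and 609 are consistent with that) and that output is split evenly across the 40 clients, which the post does not state. The function name `throughput_summary` is hypothetical.

```python
def throughput_summary(input_tps: float, output_tps: float, clients: int) -> dict:
    """Combine input/output token rates and derive simple per-client figures.

    Assumption: load is spread evenly across clients, which real serving
    stacks rarely guarantee; treat the per-client value as a rough average.
    """
    total = input_tps + output_tps
    return {
        "total_tps": total,                            # combined tokens/sec
        "output_share": output_tps / total,            # fraction of tokens that are generated
        "output_tps_per_client": output_tps / clients, # average generation speed per client
    }


# Figures quoted in the post: 1492 t/s in, 609 t/s out, 40 concurrent clients.
summary = throughput_summary(input_tps=1492, output_tps=609, clients=40)
print(summary)
```

Under these assumptions each client sees roughly 15 generated tokens/sec on average, and a bit under a third of all processed tokens are output.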

posted an update 25 days ago

Got to 1199.8 tokens/sec with Devstral Small 2 on my desktop GPU workstation. vLLM nightly. Works out of the box with Mistral Vibe. Next, it's time to test the big one.

posted an update about 1 month ago

I run 20 AI coding agents locally on my desktop workstation at 400+ tokens/sec with MiniMax-M2. It's a Sonnet drop-in replacement in my Cursor, Claude Code, Droid, Kilo and Cline. It peaks at 11k tok/sec input and 433 tok/s output and can generate 1B+ tok/m. All with a 196k context window. I've been running it for 6 days now with this config. Today max performance was stable at 490.2 tokens/sec across 48 concurrent clients with MiniMax M2. Z8 Fury G5, Xeon 3455, 4xA6K. Aibrix 0.5.0, vLLM 0.11.2,

Organizations

New activity in open-acc/README about 1 year ago

Bye Apple and hi NVIDIA

#6 opened about 1 year ago by

New activity in mitkox/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B-mlx over 1 year ago

Upload folder using huggingface_hub

#1 opened over 1 year ago by

New activity in microsoft/kosmos-2.5 over 1 year ago

Apply for community grant: Academic project

#1 opened over 1 year ago by

New activity in google/gemma-7b almost 2 years ago

How long does this approval process take?

#10 opened almost 2 years ago by

New activity in TheBloke/WhiteRabbitNeo-33B-v1-GGUF almost 2 years ago

Not able to run this model?

#1 opened almost 2 years ago by