Mitko Vasilev
mitkox
AI & ML interests
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
Recent Activity
posted
an
update
about 6 hours ago
I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5.
2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but feeds on data like a black hole on cheat day.
But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol
3 months ago, these numbers lived in my "maybe in β2030 dreams. Today it's running on my desk AND heaths my home office during the winter!
posted
an
update
about 1 month ago
I run 20 AI coding agents locally on my desktop workstation at 400+ tokens/sec with MiniMax-M2. Itβs a Sonnet drop-in replacement in my Cursor, Claude Code, Droid, Kilo and Cline peak at 11k tok/sec input and 433 tok/s output, can generate 1B+ tok/m.All with 196k context window. I'm running it for 6 days now with this config.
Today max performance was stable at 490.2 tokens/sec across 48 concurrent clients and MiniMax M2.
Z8 Fury G5, Xeon 3455, 4xA6K. Aibrix 0.5.0, vLLM 0.11.2,