I’m just reading that the Ryzen AI 395 is supposed to be 30% slower than the DGX Spark at LLM inference… and capped at 96GB of GPU RAM… good thing I didn’t RTFM upfront, so I made the AMD faster with 128GB of unified RAM 🫡 The Z2 mini G1a can run Qwen3 Coder 30B BF16 at 26.8 tok/sec in ~60GB of GPU RAM
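For anyone wondering how to get past the 96GB cap: on Linux the amdgpu driver can also pull from system RAM via GTT, and raising the TTM limits at boot is the commonly cited knob on these Strix Halo boxes. A minimal sketch, assuming GRUB on Ubuntu and a full 128GB budget (33554432 × 4KiB pages); the model filename and llama.cpp flags are illustrative, not the exact command behind the benchmark above:

```
# /etc/default/grub — let the amdgpu GTT pool grow to the full 128GB
# 33554432 pages * 4KiB = 128GiB (trim this if you want OS headroom)
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash ttm.pages_limit=33554432 ttm.page_pool_size=33554432"

# apply and reboot
sudo update-grub && sudo reboot

# serve the model with a ROCm build of llama.cpp; -ngl 99 offloads all layers
# (model path is a placeholder for a BF16 GGUF of Qwen3 Coder 30B)
./llama-server -m qwen3-coder-30b-bf16.gguf -ngl 99 -c 32768 --host 0.0.0.0
```

The arithmetic checks out: at BF16 the 30B weights alone are roughly 30B × 2 bytes ≈ 60GB, which lines up with the ~60GB of GPU RAM reported.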
Say hello to my little friends! I just unboxed this trio of HP Z2 Mini G1a workstations!
Three is always better than one!
- 3x AMD Ryzen AI Max+ Pro 395
- 384GB RAM
- 24TB of RAID storage
- Ubuntu 24.04
- ROCm 7.0.2
- llama.cpp, vLLM, and AIBrix
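For the vLLM side of that stack, a single-node smoke test might look like this; a sketch assuming a ROCm-compatible build of vLLM, with the model ID and limits as illustrative choices:

```
# serve Qwen3 Coder in BF16 with an OpenAI-compatible API on port 8000
vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
  --dtype bfloat16 \
  --max-model-len 32768
```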
Small, cheap GPUs are about to become the Raspberry Pi of edge AI inference. Sprinkle some kubectl fairy dust on top, and suddenly it's a high-availability, self-healing, cloud-native, enterprise-grade AI cluster camping in a closet.
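In case the fairy dust sounds hand-wavy, here’s roughly what it amounts to; a sketch assuming k3s across the three boxes, with hostnames, the join token, and the container image as placeholders:

```
# join the three Z2 minis into one cluster (k3s shown; any distro works)
# on node 1:
curl -sfL https://get.k3s.io | sh -
# on nodes 2 and 3 (token lives in /var/lib/rancher/k3s/server/node-token):
curl -sfL https://get.k3s.io | K3S_URL=https://node1:6443 K3S_TOKEN=<token> sh -

# run three inference replicas and expose them behind one service
# (in practice you'd mount the GGUF and pass -m/-ngl via container args)
kubectl create deployment llm --image=ghcr.io/ggml-org/llama.cpp:server --replicas=3
kubectl expose deployment llm --port=8080
```

Three replicas, three nodes: the scheduler spreads them out, and when a box dies the deployment reschedules the pod. That’s the entire “self-healing, high-availability” part.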
Make sure you own your AI. AI in the cloud is not aligned with you; it’s aligned with the company that owns it.