Our guide shows how llama.cpp can offload a model across disk, RAM and VRAM, so it makes use of whatever memory is available. For example, Mac systems with unified memory are well suited to this.
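
To make the idea concrete, here is a rough sketch of how layers could be split across the three tiers. This is not llama.cpp's actual allocation logic. The function name `split_layers`, its parameters, and the assumption that every layer is the same size are all hypothetical and only there for illustration.

```python
def split_layers(n_layers, layer_gb, vram_gb, ram_gb):
    """Greedy split: fill VRAM first, then RAM, and leave the rest on disk.

    Assumes (hypothetically) that all layers have the same size `layer_gb`
    and ignores KV cache, activations and other overhead.
    """
    if layer_gb <= 0:
        raise ValueError("layer_gb must be positive")

    # Fastest tier first: as many whole layers as fit in VRAM.
    in_vram = min(n_layers, int(vram_gb // layer_gb))
    remaining = n_layers - in_vram

    # Next tier: as many of the remaining layers as fit in RAM.
    in_ram = min(remaining, int(ram_gb // layer_gb))

    # Whatever is left would have to be read from disk.
    on_disk = remaining - in_ram
    return {"vram": in_vram, "ram": in_ram, "disk": on_disk}


# Example: 60 layers of 2 GB each, with 24 GB VRAM and 64 GB RAM.
print(split_layers(60, 2.0, 24, 64))  # {'vram': 12, 'ram': 32, 'disk': 16}
```

On a unified memory system the VRAM and RAM tiers draw from the same pool, so in this toy model you would pass most of the total as `vram_gb`. The `vram` count is the kind of number you would hand to llama.cpp's `--n-gpu-layers` option.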