Adjust duration estimation for H200 performance - reduce conservative estimates de766da Luigi committed on Oct 12
Use actual parameter count for AOT decision instead of string matching e3e334f Luigi committed on Oct 12
Make AOT compilation conditional for models >= 2B parameters to optimize free tier usage 4500f92 Luigi committed on Oct 12
disable two models that cannot run or run too slowly on HF Spaces with ZeroGPU 3dc7ced Luigi committed on Oct 11
feat(models): add Granite-4.0-Micro and Qwen3-4B-Instruct-2507 to MODELS registry c30a7f7 Luigi committed on Oct 9
remove previously added breeze models (as they didn't work), add smollm 135m taiwan b3fd72e Luigi committed on Aug 4