@pagezyhf on Hugging Face: "What’s your biggest headache deploying Hugging Face models to the cloud

pagezyhf

posted an update Sep 22, 2025

What’s your biggest headache deploying Hugging Face models to the cloud—and how can we fix it for you?

nroggendorff

Sep 29, 2025

probably a good thing there aren't many responses here, yes?

nroggendorff

Sep 29, 2025

When I push a model with multiple shards to a repo that originally had fewer or more shards, the old weight files stay in the repo and get fetched, even if they belong to a different architecture (i.e. after pushing a 3-shard model to what was a 1-shard repo, loading only fetches the 1-shard model, because the old file was never overwritten or removed).

push to hub 1-shard model
push to hub 3-shard model
fetch from hub - only fetches model.safetensors, not model-00001-of-00003.safetensors through model-00003-of-00003.safetensors
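One way to reason about the steps above is to compare the files already in the repo with the files about to be pushed, and treat any leftover weight file as stale. The following is a minimal sketch of that comparison using only the standard library. The helper name `stale_weight_files` and the `WEIGHT_PATTERNS` tuple are hypothetical and not part of any Hugging Face API.

```python
from fnmatch import fnmatch

# Glob patterns treated as "weight files". This is an assumption for the
# sketch; adjust it to whatever layout your repo actually uses.
WEIGHT_PATTERNS = ("*.safetensors", "*.safetensors.index.json")


def stale_weight_files(repo_files, new_files, patterns=WEIGHT_PATTERNS):
    """Return weight files that exist in the repo but are absent from the new upload."""
    new = set(new_files)
    return sorted(
        name
        for name in repo_files
        if name not in new and any(fnmatch(name, p) for p in patterns)
    )


# Repo state after pushing the 1-shard model.
repo_files = ["config.json", "model.safetensors"]

# Files produced when saving the 3-shard model.
new_files = [
    "config.json",
    "model.safetensors.index.json",
    "model-00001-of-00003.safetensors",
    "model-00002-of-00003.safetensors",
    "model-00003-of-00003.safetensors",
]

print(stale_weight_files(repo_files, new_files))  # ['model.safetensors']
```

If you upload with `huggingface_hub`, `HfApi.upload_folder` accepts a `delete_patterns` argument that deletes matching remote files in the same commit, which may be enough to clear old shards. Whether that fits depends on how the push is done in your workflow.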

nroggendorff

Sep 29, 2025

probably wasn't fixed because it's not very common to use the same repo for different architectures like I do

LeroyDyer

Oct 1, 2025

edited Oct 2, 2025

...
Excuse me

nroggendorff

Oct 7, 2025

..wow.

nroggendorff

Oct 15, 2025

What about letting the user supply a pipeline.py in the repository that the loader will automatically parse and use? That would help if you have a custom architecture, for example.
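To make the suggestion concrete, here is a rough sketch of what such a loader could look like for a repo that has already been downloaded to a local directory. Everything here is hypothetical: the function name `load_repo_pipeline`, the convention that pipeline.py exposes a class called `Pipeline`, and the module name `repo_pipeline` are assumptions for illustration, not an existing Hugging Face feature.

```python
import importlib.util
from pathlib import Path


def load_repo_pipeline(repo_dir, entry_point="Pipeline"):
    """Load `entry_point` from <repo_dir>/pipeline.py, or return None if unavailable.

    Hypothetical convention: the repo author defines a class (or callable)
    named `Pipeline` in pipeline.py. Importing the file executes arbitrary
    code, so this should only be done for repos the user trusts.
    """
    path = Path(repo_dir) / "pipeline.py"
    if not path.is_file():
        return None  # No custom pipeline; the caller falls back to defaults.

    # Build a module from the file path and execute it.
    spec = importlib.util.spec_from_file_location("repo_pipeline", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Return the entry point if the file defines it, else None.
    return getattr(module, entry_point, None)
```

A caller would do something like `cls = load_repo_pipeline(local_dir)` and use `cls` only when it is not `None`. Because the file is executed on load, a real implementation would likely need an explicit opt-in from the user.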
