@sergiopaniego on Hugging Face: "Online training methods (e.g., GRPO) require real-time generation, a compute-…"

sergiopaniego

posted an update Oct 7, 2025

Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support, and in this new recipe we show how to use it for efficient online training. Run it on Colab ⚡, then scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training
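
For readers who want a feel for the setup before opening the recipe, here is a minimal sketch of turning on vLLM generation in TRL's `GRPOTrainer`. It is not taken from the recipe: the model (`Qwen/Qwen2-0.5B-Instruct`), dataset (`trl-lib/tldr`), output directory, and toy length-based reward are assumptions chosen for illustration, and the `use_vllm` / `vllm_mode` options are assumed to match a recent TRL release, so check them against the version you have installed.

```python
def reward_len(completions, **kwargs):
    """Toy reward (illustrative only): completions closer to 20 characters score higher."""
    return [-float(abs(20 - len(c))) for c in completions]


def build_trainer():
    """Build a GRPO trainer that generates rollouts with vLLM (sketch, not the recipe's code)."""
    # Imported lazily so the reward function above can be checked without TRL/vLLM installed.
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    args = GRPOConfig(
        output_dir="grpo-vllm-sketch",  # hypothetical output path
        use_vllm=True,                  # generate completions with vLLM instead of model.generate
        vllm_mode="colocate",           # assumption: vLLM shares the training GPU; "server" uses a separate vLLM process
    )
    return GRPOTrainer(
        model="Qwen/Qwen2-0.5B-Instruct",  # assumption: any small causal LM should work here
        reward_funcs=reward_len,
        args=args,
        train_dataset=load_dataset("trl-lib/tldr", split="train"),  # assumption: prompt-only dataset
    )
```

Calling `build_trainer().train()` on a GPU machine with `trl` and `vllm` installed would start training. Colocated mode is probably the simplest fit for a single Colab GPU, while a separate vLLM server is the more natural choice when scaling out.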

SelmaNajih001

Oct 7, 2025

Thanks for sharing!

In this post

sergiopaniego Sergio Paniego
SelmaNajih001 Selma Najih