Hamish Ivison
hamishivi
AI & ML interests
NLP :)
Recent Activity
updated a dataset 39 minutes ago: hamishivi/appworld_env_train_fixed
published a dataset 39 minutes ago: hamishivi/appworld_env_train_fixed
updated a model about 1 hour ago: hamishivi/1412_rl_rag_open_judge_citation_1237_step1500
TESS 2
Models associated with the paper "TESS-2: A Large-Scale, Generalist Diffusion Language Model". Code: https://github.com/hamishivi/tess-2
7b tulu 2.5
A small PPO run at 7B scale, following the "Unpacking DPO and PPO" paper.
Tulu V1 Suite
The set of models associated with the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources".
Large-Scale Data Selection for Instruction Tuning
Datasets and models associated with the paper "Large-Scale Data Selection for Instruction Tuning" (https://arxiv.org/abs/2503.01807)
Tulu 2 Llama 3 Update
Llama 3 models trained on the Tulu dataset, following https://arxiv.org/abs/2311.10702 (Tulu 2) and https://arxiv.org/abs/2406.09279 (Tulu 2.5).
Tulu V2 Suite
The set of models associated with the Tulu V2 technical report.
LM Preference Datasets
RLVE
Models for "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" - https://arxiv.org/abs/2511.07317