Hamish Ivison

hamishivi

https://ivison.id.au

AI & ML interests

NLP :)

Recent Activity

updated a dataset 39 minutes ago

hamishivi/appworld_env_train_fixed

published a dataset 39 minutes ago

hamishivi/appworld_env_train_fixed

updated a model about 1 hour ago

hamishivi/1412_rl_rag_open_judge_citation_1237_step1500

hamishivi's collections (8)

RLVE

Models for "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" - https://arxiv.org/abs/2511.07317

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published Nov 10, 2025 • 16
hamishivi/OpenThinker3-1.5B-RLVE

Text Generation • 2B • Updated Nov 11, 2025 • 2 • 2
hamishivi/Nemotron-Research-Reasoning-Qwen-1.5B-v2-RLVE

Text Generation • 2B • Updated Nov 11, 2025 • 2

TESS 2

Models associated with the paper "TESS-2: A Large-Scale, Generalist Diffusion Language Model". Code: https://github.com/hamishivi/tess-2

TESS 2: A Large-Scale Generalist Diffusion Language Model

Paper • 2502.13917 • Published Feb 19, 2025 • 6
hamishivi/tess2-v0.3

7B • Updated Feb 20, 2025 • 280 • 5
hamishivi/tess2-v0.1

7B • Updated Feb 20, 2025
hamishivi/tess2-v0.3-base

7B • Updated Feb 20, 2025 • 7

7b tulu 2.5

A small run at 7B scale with PPO, following the "Unpacking DPO and PPO" paper.

hamishivi/tulu-v2.5-7b-uf-mean-7b-uf-rm

Text Generation • 7B • Updated Jun 25, 2024 • 1
hamishivi/tulu-v2.5-7b-uf-mean-7b-uf-rm-value

Token Classification • 7B • Updated Jun 25, 2024
hamishivi/tulu-v2.5-7b-uf-rm

Text Classification • 7B • Updated Jun 25, 2024 • 3

Tulu V1 Suite

The set of models associated with the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources".

allenai/tulu-65b

Text Generation • Updated Jun 29, 2023 • 58 • 21
allenai/tulu-30b

Text Generation • Updated Jun 20, 2023 • 66 • 18
allenai/tulu-13b

Text Generation • Updated Jun 20, 2023 • 72 • 8
allenai/tulu-7b

Text Generation • Updated Jun 20, 2023 • 114 • 9

Large-Scale Data Selection for Instruction Tuning

Datasets and models associated with the paper "Large-Scale Data Selection for Instruction Tuning" (https://arxiv.org/abs/2503.01807)

Large-Scale Data Selection for Instruction Tuning

Paper • 2503.01807 • Published Mar 3, 2025 • 14
hamishivi/tulu-2-multitask-rrmax-326k-sft

7B • Updated Mar 4, 2025 • 6 • 1
hamishivi/rds-sels-multitask-rrmax-top326k

Viewer • Updated Mar 4, 2025 • 326k • 7 • 1
hamishivi/llama-3.1-tulu-3-multitask-rrmax-939k-sft

Updated Mar 4, 2025 • 6

Tulu 2 Llama 3 Update

Llama 3 models trained on the Tulu dataset, following https://arxiv.org/abs/2311.10702 (Tulu 2) and https://arxiv.org/abs/2406.09279 (Tulu 2.5).

allenai/llama-3.1-tulu-2-dpo-70b

71B • Updated Aug 15, 2024 • 49
allenai/llama-3.1-tulu-2-70b

71B • Updated Aug 15, 2024 • 59
allenai/llama-3.1-tulu-2-70b-uf-mean-rm

70B • Updated Aug 15, 2024 • 22
allenai/llama-3.1-tulu-2-dpo-8b

8B • Updated Aug 15, 2024 • 52 • 2

Tulu V2 Suite

The set of models associated with the Tulu V2 technical report.

allenai/tulu-2-dpo-70b

Text Generation • 69B • Updated Jan 31, 2024 • 2.52k • 158
allenai/tulu-2-dpo-13b

Text Generation • 13B • Updated May 17, 2024 • 2.25k • 21
allenai/tulu-2-dpo-7b

Text Generation • Updated May 14, 2024 • 2.47k • 20
allenai/tulu-2-70b

Text Generation • Updated Apr 19, 2024 • 187 • 8

LM Preference Datasets

lmsys/chatbot_arena_conversations

Viewer • Updated Sep 30, 2023 • 33k • 2.18k • 436
Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 21.7k • 1.65k
openai/summarize_from_feedback

Viewer • Updated Jan 3, 2023 • 194k • 4.32k • 216
openai/webgpt_comparisons

Viewer • Updated Dec 19, 2022 • 19.6k • 263 • 239
