diff-interpretation-tuning/loras

ttw committed on Oct 12

Commit e7ac230 · verified · 1 Parent(s): d4e8bb0

Update README.md

Files changed (1)

README.md +1 -1


@@ -12,7 +12,7 @@ datasets:
 This repository contains the weight diffs and DIT adapters used in the paper [Learning to Interpret Weight Differences in Language Models (Goel et al. 2025)](https://arxiv.org/abs/2510.05092).
 This paper introduces *Diff Interpretation Tuning*, a method that trains a LoRA adapter than can be applied to a model to get it to describe its own finetuning induced modifications.
-To play around with the weight diffs and DIT adapters from the paper, please check out our [Google Colab demo notebook](https://colab.research.google.com/drive/12YD_9GRT-y_hFOBqXzyI4eN_lJGKiXwN?usp=sharing).
+To play around with the weight diffs and DIT adapters from the paper, please check out our [Google Colab demo notebook](https://colab.research.google.com/drive/12YD_9GRT-y_hFOBqXzyI4eN_lJGKiXwN?usp=sharing#forceEdit=true&sandboxMode=true).
 This notebook shows how to load the weight diffs and adapters from this repo.
 The code used to train and evaluate our weight diffs and DIT adapters can be found at [github.com/Aviously/diff-interpretation-tuning](https://github.com/Aviously/diff-interpretation-tuning).
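
The only change in this commit is to the Colab link, so a quick check of the two URLs can confirm what was added. The sketch below is illustrative and not part of the repository: it parses both links with Python's standard library and shows that they point at the same notebook and differ only in the URL fragment. The names `forceEdit` and `sandboxMode` come from the diff itself; how Colab interprets them is not stated in the commit, so the code makes no claim about that.

```python
from urllib.parse import parse_qs, urlsplit

# Both URLs are copied from the removed (-) and added (+) lines of the diff.
OLD_URL = (
    "https://colab.research.google.com/drive/"
    "12YD_9GRT-y_hFOBqXzyI4eN_lJGKiXwN?usp=sharing"
)
NEW_URL = OLD_URL + "#forceEdit=true&sandboxMode=true"

old, new = urlsplit(OLD_URL), urlsplit(NEW_URL)

# Compare scheme, host, path, and query while ignoring the fragment.
same_target = old._replace(fragment="") == new._replace(fragment="")

# The fragment is written as key=value pairs, so parse_qs can split it.
fragment_flags = parse_qs(new.fragment)

print("same notebook:", same_target)
print("added fragment flags:", fragment_flags)
```

Because the fragment is never sent to the server as part of the request path or query, the notebook ID and the `usp=sharing` query parameter are unchanged by this commit.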