Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ datasets:
 - diff-interpretation-tuning/finetuning-data
 ---
 
-# Diff Interpretation Tuning
+# Diff Interpretation Tuning: Weight Diffs and Adapters
 This repository contains the weight diffs and DIT adapters used in the paper [Learning to Interpret Weight Differences in Language Models (Goel et al. 2025)](https://arxiv.org/abs/2510.05092).
 This paper introduces *Diff Interpretation Tuning*, a method that trains a LoRA adapter than can be applied to a model to get it to describe its own finetuning induced modifications.
 