DSI-large-NQ320k
This repository contains one of the models analyzed in our paper Reverse-Engineering the Retrieval Process in GenIR Models.
Training
The model is based on T5-large and was trained on the Natural Questions dataset as an atomic generative information retrieval (GenIR) model, reproducing DSI (Differentiable Search Index). The dataset can be found here.
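In an atomic setup, each document is represented by a single identifier token, so retrieval amounts to one decoding step followed by ranking the document-identifier logits. The sketch below illustrates that idea and is not the authors' released code. It assumes that the checkpoint loads with `T5ForConditionalGeneration`, that the repository ships a tokenizer, and that the document-identifier tokens occupy the tail of the output vocabulary starting at a hypothetical `docid_token_offset`. The helper names `first_step_logits` and `rank_atomic_docids` are illustrative only.

```python
from typing import List, Sequence, Tuple


def rank_atomic_docids(
    step_logits: Sequence[float],
    docid_token_offset: int,
    k: int = 10,
) -> List[Tuple[int, float]]:
    """Return the top-k (document index, score) pairs from one decoding step.

    Assumption: vocabulary positions from `docid_token_offset` onward are
    document-identifier tokens, one per document. The returned index is
    relative to that offset.
    """
    doc_scores = step_logits[docid_token_offset:]
    ranked = sorted(enumerate(doc_scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def first_step_logits(query: str, repo_id: str = "AnReu/DSI-large-NQ320k") -> List[float]:
    """Run the encoder and a single decoder step, returning the vocabulary logits.

    Imports are local so the ranking helper above works without torch installed.
    Assumption: the repository contains both model weights and tokenizer files.
    """
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = T5ForConditionalGeneration.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(query, return_tensors="pt")
    # A single decoder position, seeded with the decoder start token.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits
    return logits[0, 0].tolist()
```

A possible call would be `rank_atomic_docids(first_step_logits("who wrote hamlet"), docid_token_offset=offset)`, where `offset` has to be read from the checkpoint's actual vocabulary layout; it is left as a parameter here because that layout is not stated on this page.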
Model Overview
| Model | Huggingface URL |
|---|---|
| NQ10k | DSI-large-NQ10k |
| NQ100k | DSI-large-NQ100k |
| NQ320k | DSI-large-NQ320k |
| Trivia-QA | DSI-large-TriviaQA |
| Trivia-QA QG | DSI-large-TriviaQ |
Citation
@inproceedings{Reusch2025Reverse,
author = {Reusch, Anja and Belinkov, Yonatan},
title = {Reverse-Engineering the Retrieval Process in GenIR Models},
year = {2025},
isbn = {9798400715921},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3726302.3730076},
doi = {10.1145/3726302.3730076},
booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {668–677},
numpages = {10},
location = {Padua, Italy},
series = {SIGIR '25}
}
Model tree for AnReu/DSI-large-NQ320k
Base model
google-t5/t5-large