Update README.md
README.md
CHANGED
@@ -16,3 +16,19 @@ This is the model based on Mistral v0.3.
 **This is the diffusion-adapted base model, which has not yet undergone instruction tuning. We recommend further tuning this model on your dataset of interest, or checking out the [instruction tuned version](https://huggingface.co/hamishivi/tess2).**
 
 This model will only work with our custom codebase found [here](https://github.com/hamishivi/tess-2) -- please go there to see details on how to run training.
+
+## Citation
+
+If you find this work useful, please cite this work as follows.
+
+```bibtex
+@misc{taeivison2025tess2,
+title={{TESS 2: A Large-Scale Generalist Diffusion Language Model}},
+author={Jaesung Tae and Hamish Ivison and Sachin Kumar and Arman Cohan},
+year={2025},
+eprint={2502.13917},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2502.13917},
+}
+```