Commit c03980e · 1 Parent(s): adb08df

Improved README
README.md CHANGED

@@ -1,5 +1,5 @@
-## XLM-R Longformer Model
-XLM-R Longformer is a XLM-R model
+## XLM-R Longformer Model / XLM-Long
+XLM-R Longformer (or XLM-Long for short) is an XLM-R model that has been extended to allow sequence lengths up to 4096 tokens, instead of the regular 512. The model was pre-trained from the XLM-RoBERTa checkpoint using the Longformer [pre-training scheme](https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) on the English WikiText-103 corpus.
 
 The reason for this was to investigate methods for creating efficient Transformers for low-resource languages, such as Swedish, without the need to pre-train them on long-context datasets in each respective language. The trained model came as a result of a master's thesis project at [Peltarion](https://peltarion.com/) and was fine-tuned on multilingual question-answering tasks, with code available [here](https://github.com/MarkusSagen/Master-Thesis-Multilingual-Longformer#xlm-r).
 
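To illustrate the extended 4096-token context window described in the updated README, here is a minimal loading sketch using the 🤗 Transformers Auto classes. It assumes the checkpoint is published on the Hugging Face Hub as `markussagen/xlm-roberta-longformer-base-4096`; that repo id is an assumption and is not stated in this commit.

```python
# Minimal usage sketch. The repo id below is assumed, not taken from this commit.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "markussagen/xlm-roberta-longformer-base-4096"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# Encode a long (here Swedish) document; the extended model accepts up to
# 4096 tokens per sequence instead of XLM-R's usual 512.
text = "Detta är ett långt svenskt dokument. " * 200
inputs = tokenizer(text, max_length=4096, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(inputs["input_ids"].shape)        # [1, seq_len] with seq_len up to 4096
print(outputs.last_hidden_state.shape)  # [1, seq_len, hidden_size]
```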
