Upload folder using huggingface_hub
README.md CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 <div align="center">
 <h1>Llama-3-8B-Instruct-80K-QLoRA</h1>
 
-<a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/">[Data&Code]</a>
+<a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora">[Data&Code]</a>
 </div>
 
 We extend the context length of Llama-3-8B-Instruct to 80K using QLoRA and 3.5K long-context training data synthesized from GPT-4. The entire training cycle is highly efficient, taking 8 hours on an 8xA800 (80G) machine, yet the resulting model achieves remarkable performance on a series of downstream long-context evaluation benchmarks.
@@ -14,7 +14,7 @@ We extend the context length of Llama-3-8B-Instruct to 80K using QLoRA and 3.5K
 
 # Evaluation
 
-All the following evaluation results can be reproduced following the instructions [here](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/).
+All the following evaluation results can be reproduced following the instructions [here](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora).
 
 ## Needle in a Haystack
 We evaluate the model on the Needle-In-A-Haystack task using the official setting. The blue vertical line indicates the training context length, i.e. 80K.
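Since the card above describes a QLoRA adapter trained on top of Llama-3-8B-Instruct, a minimal loading sketch with Transformers and PEFT might look like the following. This is an illustration only: both repository ids (`meta-llama/Meta-Llama-3-8B-Instruct` as the base and a hypothetical `namespace/Llama-3-8B-Instruct-80K-QLoRA` adapter repo) and the generation settings are assumptions, not taken from this diff.

```python
# Minimal usage sketch, assuming the adapter is published as a standard PEFT repo.
# Both repo ids below are assumptions for illustration; they are not given in this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"          # base model the adapter extends
ADAPTER_ID = "namespace/Llama-3-8B-Instruct-80K-QLoRA"   # hypothetical adapter repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the LoRA weights produced by QLoRA training on top of the base model.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)

messages = [{"role": "user", "content": "Summarize the following long document: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base_model.device)

output_ids = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

This sketch does not by itself guarantee the full 80K window; the RoPE settings and any adapter merging used during training are documented in the linked longllm_qlora repository.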