# Unsloth Integration

<Tip warning={true}>

Section under construction. Feel free to contribute!

</Tip>
Unsloth is an open-source framework for fine-tuning and reinforcement learning that trains LLMs (like Llama, Mistral, Gemma, DeepSeek, and more) up to 2× faster with up to 70% less VRAM, while providing a streamlined, Hugging Face-compatible workflow for training, evaluation, and deployment.

The Unsloth library is fully compatible with [`SFTTrainer`]. Some benchmarks on a single A100 40GB are listed below:
| 1 A100 40GB     | Dataset   | 🤗  | 🤗 + FlashAttention 2 | 🦥 Unsloth | 🦥 VRAM saved |
| --------------- | --------- | --- | --------------------- | ---------- | ------------- |
| Code Llama 34b  | Slim Orca | 1x  | 1.01x                 | **1.94x**  | -22.7%        |
| Llama-2 7b      | Slim Orca | 1x  | 0.96x                 | **1.87x**  | -39.3%        |
| Mistral 7b      | Slim Orca | 1x  | 1.17x                 | **1.88x**  | -65.9%        |
| Tiny Llama 1.1b | Alpaca    | 1x  | 1.55x                 | **2.74x**  | -57.8%        |
First, install `unsloth` according to the [official documentation](https://github.com/unslothai/unsloth). Once installed, you can incorporate Unsloth into your workflow in a very simple manner: instead of loading [`~transformers.AutoModelForCausalLM`], you just need to load a `FastLanguageModel` as follows:
```python
import torch
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

max_length = 2048  # Supports automatic RoPE Scaling, so choose any number

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b",
    max_seq_length=max_length,
    dtype=None,  # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
    load_in_4bit=True,  # Use 4bit quantization to reduce memory usage. Can be False
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,  # Dropout = 0 is currently optimized
    bias="none",  # Bias = "none" is currently optimized
    use_gradient_checkpointing=True,
    random_state=3407,
)

# Example training dataset; swap in any dataset supported by SFTTrainer
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(output_dir="./output", max_length=max_length)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```
The saved model is fully compatible with Hugging Face's Transformers library. Learn more about Unsloth in their [official repository](https://github.com/unslothai/unsloth).
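For example, after training you can persist the LoRA adapter with the standard `save_pretrained` API and reload it for inference using plain `peft`/`transformers`, without Unsloth installed. The snippet below is a minimal sketch assuming the `./output` directory and prompt are illustrative placeholders; Unsloth also ships its own export and merging helpers, which are documented in its repository:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Save the LoRA adapter and tokenizer produced by the training run above
# (`model` and `tokenizer` come from the previous snippet)
model.save_pretrained("./output")
tokenizer.save_pretrained("./output")

# Reload with plain PEFT/Transformers for inference
model = AutoPeftModelForCausalLM.from_pretrained("./output")
tokenizer = AutoTokenizer.from_pretrained("./output")

# Quick generation check with an illustrative prompt
inputs = tokenizer("The key advantage of LoRA is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```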