# TinyLlama-1.1B Alpaca Fine-tuned
This is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 trained on the Alpaca dataset for improved instruction-following capabilities.
## Model Description
- Developed by: Navisha Shetty
- Model type: Causal Language Model (Decoder-only Transformer)
- Language: English
- License: Apache 2.0
- Finetuned from: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Training method: QLoRA (Quantized Low-Rank Adaptation)
- Dataset: Stanford Alpaca (52,002 instruction-following examples)
## Model Architecture
- Base Model: TinyLlama-1.1B (1.1 billion parameters)
- Fine-tuning Method: QLoRA with LoRA adapters
- Trainable Parameters: 4.5M (0.4% of total)
- LoRA Configuration (see the sketch after this list):
  - Rank (r): 16
  - Alpha: 32
  - Target modules: q_proj, k_proj, v_proj, o_proj
  - Dropout: 0.05
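As a point of reference, a PEFT adapter configuration matching the values above might look like the following sketch; the `bias` and `task_type` settings are assumptions, since they are not documented in this card.

```python
from peft import LoraConfig

# Sketch of a LoRA configuration matching the values listed above.
# bias and task_type are assumptions not documented in this card.
lora_config = LoraConfig(
    r=16,                     # LoRA rank
    lora_alpha=32,            # scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```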
## Intended Use
This model is designed for instruction-following tasks and can:
- Answer questions
- Generate creative content (stories, poems, etc.)
- Provide explanations and summaries
- Help with brainstorming and ideation
- Assist with text formatting and rewriting
- Follow multi-step instructions
### Direct Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto",
)

# Load the fine-tuned LoRA adapter on top of the base model
model = PeftModel.from_pretrained(
    base_model,
    "shettynavisha25/tinyllama-alpaca-finetuned",
)
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Format your prompt in the Alpaca instruction style
prompt = """### Instruction:
Write a haiku about artificial intelligence
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_length=150,
    do_sample=True,   # sampling must be enabled for temperature to take effect
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Example Prompts

```text
### Instruction:
Explain quantum computing in simple terms
### Response:

### Instruction:
Write a Python function to calculate fibonacci numbers
### Response:
```
## Training Details

### Training Data
The model was fine-tuned on the Stanford Alpaca dataset, which contains 52,002 instruction-response pairs generated using OpenAI's text-davinci-003 model. The dataset covers diverse tasks including:
- Open-ended generation
- Question answering
- Brainstorming
- Chat
- Rewriting
- Summarization
- Classification
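As an illustration, the examples can be loaded from the Hugging Face Hub and rendered into the Alpaca-style prompt format shown in the Direct Use section. This is a sketch only: the dataset id `tatsu-lab/alpaca` and the exact prompt template (including how the optional `input` field was handled) are assumptions, not the preprocessing actually used for training.

```python
from datasets import load_dataset

# Sketch: load the Alpaca data and render it into the
# "### Instruction: / ### Response:" format used above.
# Dataset id and template details are assumptions.
dataset = load_dataset("tatsu-lab/alpaca", split="train")

def format_example(example):
    if example["input"]:
        text = (f"### Instruction:\n{example['instruction']}\n"
                f"### Input:\n{example['input']}\n"
                f"### Response:\n{example['output']}")
    else:
        text = (f"### Instruction:\n{example['instruction']}\n"
                f"### Response:\n{example['output']}")
    return {"text": text}

dataset = dataset.map(format_example)
print(dataset[0]["text"])
```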
### Training Hyperparameters
| Hyperparameter | Value |
|---|---|
| Learning rate | 2e-4 |
| Batch size | 4 |
| Gradient accumulation steps | 4 |
| Effective batch size | 16 |
| Number of epochs | 3 |
| Max sequence length | 512 |
| Optimizer | paged_adamw_8bit |
| Learning rate schedule | Linear warmup (100 steps) |
| Weight decay | 0 |
| Warmup steps | 100 |
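For reference, a `TrainingArguments` setup consistent with the table above (and the checkpointing policy described below) might look like this sketch; `output_dir` and logging settings are illustrative assumptions. The 512-token maximum sequence length is applied at tokenization time rather than here.

```python
from transformers import TrainingArguments

# Sketch of training arguments matching the hyperparameter table above.
# output_dir is an illustrative assumption.
training_args = TrainingArguments(
    output_dir="tinyllama-alpaca-finetuned",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size: 4 * 4 = 16
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=100,
    weight_decay=0.0,
    optim="paged_adamw_8bit",
    fp16=True,                       # mixed-precision training
    gradient_checkpointing=True,
    save_steps=500,                  # checkpoint every 500 steps
    save_total_limit=3,              # keep the last 3 checkpoints
)
```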
### Training Procedure
- Quantization: 4-bit quantization using bitsandbytes
- Precision: FP16 mixed precision training
- Gradient Checkpointing: Enabled to reduce memory usage
- Training Steps: 9,753 total steps
- Checkpointing: Every 500 steps (last 3 checkpoints retained)
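A 4-bit quantization setup consistent with the procedure above might look like the following sketch; the specific quantization options (`nf4` quant type, FP16 compute dtype) are assumptions beyond what this card documents.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Sketch: 4-bit loading with bitsandbytes, consistent with the procedure above.
# The quant type and compute dtype are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
model.gradient_checkpointing_enable()           # reduce activation memory
model = prepare_model_for_kbit_training(model)  # prep quantized weights for LoRA training
```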
### Compute Infrastructure
- Hardware: NVIDIA Tesla T4 GPU (16GB VRAM)
- Cloud Provider: AWS (g4dn.2xlarge instance)
- Orchestration: Kubernetes
- Training Time: ~13 hours
- Framework: PyTorch 2.1.0 with CUDA 12.1
## Performance

### Training Loss
The model achieved a final training loss of 1.14 after 3 epochs, showing consistent improvement throughout training:
- Epoch 1: Loss decreased from 1.85 → 1.35
- Epoch 2: Loss decreased from 1.35 → 1.20
- Epoch 3: Loss decreased from 1.20 → 1.14
### Qualitative Improvements
Compared to the base TinyLlama model, this fine-tuned version demonstrates:
- Better instruction-following behavior
- More structured and coherent responses
- Improved task completion for creative and analytical tasks
- Reduced hallucination on instruction-based queries
## Limitations and Biases
- Model Size: With only 1.1B parameters, this model has limited world knowledge compared to larger models
- Dataset Biases: Inherits biases present in the Alpaca dataset and the underlying base model
- English-only: Primarily trained on English text
- Factual Accuracy: May generate plausible-sounding but incorrect information
- Context Length: Limited to 512 tokens during fine-tuning
- Not for Production: This is a research/educational model and should be thoroughly tested before production use
## Ethical Considerations
This model should not be used for:
- Generating harmful, toxic, or biased content
- Impersonating individuals
- Providing medical, legal, or financial advice
- Making critical decisions without human oversight
- Spreading misinformation
## Citation
If you use this model, please cite:
```bibtex
@misc{tinyllama-alpaca-finetuned,
  author       = {Navisha Shetty},
  title        = {TinyLlama-1.1B Alpaca Fine-tuned},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/shettynavisha25/tinyllama-alpaca-finetuned}}
}
```
### Base Model Citation
```bibtex
@article{zhang2024tinyllama,
  title   = {TinyLlama: An Open-Source Small Language Model},
  author  = {Zhang, Peiyuan and Zeng, Guangtao and Wang, Tianduo and Lu, Wei},
  journal = {arXiv preprint arXiv:2401.02385},
  year    = {2024}
}
```
### Alpaca Dataset Citation
```bibtex
@misc{alpaca,
  author       = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title        = {Stanford Alpaca: An Instruction-following LLaMA model},
  year         = {2023},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
## Acknowledgments
- Base Model: TinyLlama team for the excellent base model
- Dataset: Stanford Alpaca team for the instruction-following dataset
- Training Framework: Hugging Face Transformers and PEFT libraries
- Infrastructure: AWS for GPU compute resources
## Framework Versions
- PyTorch: 2.1.0
- Transformers: 4.35.0+
- PEFT: 0.7.0+
- Accelerate: 0.24.0+
- Bitsandbytes: 0.41.0+
- CUDA: 12.1
## Contact
For questions or issues, please open an issue on the model repository or contact [[email protected]].
Note: This model is released for research and educational purposes. Please use responsibly and be aware of its limitations.