Model Card for TinyLlama-1.1B Alpaca Fine-tuned

This is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 trained on the Alpaca dataset for improved instruction-following capabilities.

Model Description

  • Developed by: Navisha Shetty
  • Model type: Causal Language Model (Decoder-only Transformer)
  • Language: English
  • License: Apache 2.0
  • Finetuned from: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Training method: QLoRA (Quantized Low-Rank Adaptation)
  • Dataset: Stanford Alpaca (52,002 instruction-following examples)

Model Architecture

  • Base Model: TinyLlama-1.1B (1.1 billion parameters)
  • Fine-tuning Method: QLoRA with LoRA adapters
  • Trainable Parameters: 4.5M (0.4% of total)
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Target modules: q_proj, k_proj, v_proj, o_proj
    • Dropout: 0.05
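
A minimal sketch of how this configuration maps onto a peft LoraConfig is shown below; the bias and task_type settings are assumed defaults and are not stated in this card.

from peft import LoraConfig, get_peft_model

# LoRA adapter configuration mirroring the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",            # assumed default; not stated in the card
    task_type="CAUSAL_LM",
)

# Applied to a loaded base model, this yields the ~4.5M trainable parameters:
# model = get_peft_model(base_model, lora_config)
# model.print_trainable_parameters()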

Intended Use

This model is designed for instruction-following tasks and can:

  • Answer questions
  • Generate creative content (stories, poems, etc.)
  • Provide explanations and summaries
  • Help with brainstorming and ideation
  • Assist with text formatting and rewriting
  • Follow multi-step instructions

Direct Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(
    base_model,
    "shettynavisha25/tinyllama-alpaca-finetuned"
)

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Format your prompt
prompt = """### Instruction:
Write a haiku about artificial intelligence

### Response:
"""

# Tokenize, move inputs to the model's device, and generate
# (do_sample=True so the temperature setting takes effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
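
To serve the model without a runtime peft dependency, the adapter can optionally be merged into the base weights. A minimal sketch, assuming the model and tokenizer were loaded as above; the output directory name is illustrative.

# Merge the LoRA adapter into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("tinyllama-alpaca-merged")
tokenizer.save_pretrained("tinyllama-alpaca-merged")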

Example Prompts

### Instruction:
Explain quantum computing in simple terms

### Response:

### Instruction:
Write a Python function to calculate Fibonacci numbers

### Response:
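
These follow the Alpaca prompt template. A small helper for building such prompts programmatically is sketched below; whether an optional "### Input:" section was used during training is not stated in this card, so that branch simply follows the standard Alpaca convention.

def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) in the Alpaca prompt style."""
    if input_text:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            f"### Response:\n"
        )
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_alpaca_prompt("Explain quantum computing in simple terms")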

Training Details

Training Data

The model was fine-tuned on the Stanford Alpaca dataset, which contains 52,002 instruction-response pairs generated using OpenAI's text-davinci-003 model. The dataset covers diverse tasks including:

  • Open-ended generation
  • Question answering
  • Brainstorming
  • Chat
  • Rewriting
  • Summarization
  • Classification
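
For reference, a minimal sketch of loading the dataset and rendering each example into the prompt format above, assuming the tatsu-lab/alpaca mirror on the Hugging Face Hub; the exact preprocessing used for this model is not documented here.

from datasets import load_dataset

dataset = load_dataset("tatsu-lab/alpaca", split="train")  # 52,002 examples

def to_prompt(example):
    # Render an instruction / optional input / output triple into training text
    if example["input"]:
        text = (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    else:
        text = (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return {"text": text}

dataset = dataset.map(to_prompt)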

Training Hyperparameters

  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation steps: 4
  • Effective batch size: 16
  • Number of epochs: 3
  • Max sequence length: 512
  • Optimizer: paged_adamw_8bit
  • Learning rate schedule: linear with warmup (100 steps)
  • Weight decay: 0
  • Warmup steps: 100
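
These values correspond roughly to the following transformers TrainingArguments (a sketch; the output directory is illustrative, and the max sequence length of 512 is applied at tokenization time rather than here):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tinyllama-alpaca-finetuned",  # illustrative
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size: 4 x 4 = 16
    num_train_epochs=3,
    optim="paged_adamw_8bit",
    lr_scheduler_type="linear",
    warmup_steps=100,
    weight_decay=0.0,
    fp16=True,                       # see Training Procedure below
    save_steps=500,
    save_total_limit=3,
)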

Training Procedure

  • Quantization: 4-bit quantization using bitsandbytes
  • Precision: FP16 mixed precision training
  • Gradient Checkpointing: Enabled to reduce memory usage
  • Training Steps: 9,753 total (52,002 examples ÷ effective batch size 16 ≈ 3,251 steps per epoch × 3 epochs)
  • Checkpointing: Every 500 steps (last 3 checkpoints retained)
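
A sketch of the 4-bit base-model loading this procedure implies; the specific quantization options (nf4, fp16 compute, double quantization) are common QLoRA defaults and are assumptions, since the card only states 4-bit bitsandbytes quantization.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 4-bit quantization config (specific options are assumed QLoRA defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

# Enable gradient checkpointing and prepare the quantized model for LoRA training
base_model.gradient_checkpointing_enable()
base_model = prepare_model_for_kbit_training(base_model)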

Compute Infrastructure

  • Hardware: NVIDIA Tesla T4 GPU (16GB VRAM)
  • Cloud Provider: AWS (g4dn.2xlarge instance)
  • Orchestration: Kubernetes
  • Training Time: ~13 hours
  • Framework: PyTorch 2.1.0 with CUDA 12.1

Performance

Training Loss

The model achieved a final training loss of 1.14 after 3 epochs, showing consistent improvement throughout training:

  • Epoch 1: Loss decreased from 1.85 → 1.35
  • Epoch 2: Loss decreased from 1.35 → 1.20
  • Epoch 3: Loss decreased from 1.20 → 1.14

Qualitative Improvements

Compared to the base TinyLlama model, this fine-tuned version demonstrates:

  • Better instruction-following behavior
  • More structured and coherent responses
  • Improved task completion for creative and analytical tasks
  • Reduced hallucination on instruction-based queries

Limitations and Biases

  • Model Size: With only 1.1B parameters, this model has limited world knowledge compared to larger models
  • Dataset Biases: Inherits biases present in the Alpaca dataset and the underlying base model
  • English-only: Primarily trained on English text
  • Factual Accuracy: May generate plausible-sounding but incorrect information
  • Context Length: Limited to 512 tokens during fine-tuning
  • Not for Production: This is a research/educational model and should be thoroughly tested before production use

Ethical Considerations

This model should not be used for:

  • Generating harmful, toxic, or biased content
  • Impersonating individuals
  • Providing medical, legal, or financial advice
  • Making critical decisions without human oversight
  • Spreading misinformation

Citation

If you use this model, please cite:

@misc{tinyllama-alpaca-finetuned,
  author = {Navisha Shetty},
  title = {TinyLlama-1.1B Alpaca Fine-tuned},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/shettynavisha25/tinyllama-alpaca-finetuned}}
}

Base Model Citation

@article{zhang2024tinyllama,
  title={TinyLlama: An Open-Source Small Language Model},
  author={Zhang, Peiyuan and Zeng, Guangtao and Wang, Tianduo and Lu, Wei},
  journal={arXiv preprint arXiv:2401.02385},
  year={2024}
}

Alpaca Dataset Citation

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

Acknowledgments

  • Base Model: TinyLlama team for the excellent base model
  • Dataset: Stanford Alpaca team for the instruction-following dataset
  • Training Framework: Hugging Face Transformers and PEFT libraries
  • Infrastructure: AWS for GPU compute resources

Framework Versions

  • PyTorch: 2.1.0
  • Transformers: 4.35.0+
  • PEFT: 0.7.0+
  • Accelerate: 0.24.0+
  • Bitsandbytes: 0.41.0+
  • CUDA: 12.1

Contact

For questions or issues, please open an issue on the model repository or contact [[email protected]].


Note: This model is released for research and educational purposes. Please use responsibly and be aware of its limitations.
