# GPT-2 Tigrinya (LoRA Fine-Tuned)

## Model Details

### Model Description

This model is GPT-2 small (124M parameters) fine-tuned with LoRA (Low-Rank Adaptation) on a custom Tigrinya dataset.
It is intended for Tigrinya text generation, including chatbot/dialogue, storytelling, and text continuation.

- **Developed by**: Abrhaley (MSc Student, Warsaw University of Technology)
- **Model type**: Causal Language Model (decoder-only Transformer)
- **Language(s) (NLP)**: Tigrinya (ti)
- **License**: MIT
- **Finetuned from model**: gpt2
- **Framework versions**: Transformers 4.x, PEFT 0.17.1, PyTorch 2.x

### Model Sources

- **Repository**: https://huggingface.co/abrhaley/gpt2-tigrinya-lora

## Uses

### Direct Use

- Generate Tigrinya text (stories, conversation, completions)
- Chatbots and dialogue systems in Tigrinya
- Creative text applications (poetry, narratives, cultural content)

### Downstream Use

- Fine-tuning for specialized domains (news, education, healthcare in Tigrinya)

### Out-of-Scope Use

- Generating factual or authoritative knowledge (the model may hallucinate)
- Sensitive/critical applications (medical, legal, political decisions)
- Harmful or offensive text generation

## Bias, Risks, and Limitations

- The dataset does not cover all Tigrinya dialects equally.
- The model may generate biased, offensive, or incoherent outputs.
- Not reliable for factual Q&A.
- As a small model (GPT-2 small), it may struggle with long contexts.

### Recommendations

Users should carefully review outputs before use.
Avoid deploying the model in sensitive applications without human oversight.


## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "abrhaley/gpt2-tigrinya-lora"

# Loading the adapter repository directly requires `peft` to be installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "ኣብ ኣዲስ ኣበባ"  # "In Addis Ababa"
print(generator(prompt, max_length=100, do_sample=True))
```
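If you prefer to keep the base model and the adapter separate (for example, to swap adapters at runtime), the weights can also be loaded explicitly through PEFT. This is a minimal sketch, assuming the repository contains the LoRA adapter on top of the stock `gpt2` checkpoint:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

adapter_id = "abrhaley/gpt2-tigrinya-lora"

# Load the stock GPT-2 base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base_model, adapter_id)

tokenizer = AutoTokenizer.from_pretrained(adapter_id)

inputs = tokenizer("ኣብ ኣዲስ ኣበባ", return_tensors="pt")  # "In Addis Ababa"
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```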

## Training Details

### Training Data
- **Source**: Custom Tigrinya text corpus (news + literature + web text)  
- **Split**: Train/validation prepared manually  

### Training Procedure
- **Base model**: GPT-2 small (124M parameters)  
- **Fine-tuning method**: LoRA (PEFT) applied on attention layers  
- **LoRA config**: r=8, alpha=32, dropout=0.05 (see the configuration sketch below)  
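For reference, the settings above correspond roughly to the PEFT setup below. This is a sketch rather than the exact training script; the target modules are the GPT-2 attention projections listed under Technical Specifications.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # low-rank dimension
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn", "c_proj"],  # GPT-2 attention layers
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 124M weights is trained
```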

### Hyperparameters
- **Batch size (effective)**: 8  
- **Learning rate**: 2e-4  
- **Optimizer**: AdamW  
- **Epochs**: 1 (demo training; extendable)  
- **Precision**: FP16 (when a GPU is available); a training-arguments sketch follows below  
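The table above translates roughly into the `TrainingArguments` below. The split of the effective batch size into per-device batch size and gradient accumulation steps, the output directory, and the logging interval are illustrative assumptions, not the exact values used.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-lora",   # hypothetical output path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,     # 4 x 2 = effective batch size of 8
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="adamw_torch",               # AdamW
    fp16=True,                         # enable only when a GPU is available
    logging_steps=50,
)
```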

---

## Evaluation

### Results
- **Training Loss**: 1.67  
- **Validation Loss**: 1.61  
- **Perplexity (PPL)**: ≈ 5.0  

### Metrics
- **Primary metric**: Perplexity (lower is better, indicating more fluent text); computed from the validation loss as shown below  
- **Summary**: The model reaches ~5.0 PPL on the validation set and produces fluent, natural-sounding Tigrinya completions.  
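Perplexity here is simply the exponential of the validation cross-entropy loss, which is why the two numbers above are consistent:

```python
import math

val_loss = 1.61                  # validation loss from the table above
perplexity = math.exp(val_loss)  # ≈ 5.0
print(f"Perplexity: {perplexity:.2f}")
```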

---

## Environmental Impact
- **Hardware**: NVIDIA T4 (Google Colab)  
- **Training time**: ~5.5 hours  
- **Cloud Provider**: Google Cloud (via Colab)  
- **Carbon estimate**: <1 kg CO₂eq (low emissions, small-scale run)  

---

## Technical Specifications
- **Architecture**: GPT-2 small (decoder-only Transformer)  
- **LoRA applied to**: attention layers (`c_attn`, `c_proj`); the adapter can be merged back into the base model as sketched below  
- **Framework**: Hugging Face Transformers + PEFT  
- **Precision**: FP16 mixed-precision (on GPU)  
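If the repository ships the LoRA adapter rather than merged weights, the adapter can be folded into the base GPT-2 weights for standalone deployment. A minimal sketch using PEFT's `merge_and_unload`; the output directory is a placeholder:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("gpt2")
adapter = PeftModel.from_pretrained(base, "abrhaley/gpt2-tigrinya-lora")

# Fold the LoRA updates into the base weights; the result is a plain GPT-2 model.
merged = adapter.merge_and_unload()
merged.save_pretrained("gpt2-tigrinya-merged")  # hypothetical output directory
```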

## Citation

If you use this model, please cite it as:

**BibTeX:**
```bibtex
@misc{abrhaley2025gpt2tigrinya,
  title   = {GPT-2 Tigrinya LoRA Fine-Tuned},
  author  = {Abrhaley},
  year    = {2025},
  url     = {https://huggingface.co/abrhaley/gpt2-tigrinya-lora}
}
```