# GPT-2 Tigrinya (LoRA Fine-Tuned)

## Model Details

### Model Description

This model is GPT-2 small (124M parameters) fine-tuned with LoRA (Low-Rank Adaptation) on a custom Tigrinya dataset.
It is intended for Tigrinya text generation, including chatbot/dialogue, storytelling, and text continuation.

- **Developed by**: Abrhaley (MSc Student, Warsaw University of Technology)
- **Model type**: Causal Language Model (decoder-only Transformer)
- **Language(s) (NLP)**: Tigrinya (ti)
- **License**: MIT
- **Finetuned from model**: gpt2
- **Framework versions**: Transformers 4.x, PEFT 0.17.1, PyTorch 2.x

### Model Sources

- **Repository**: https://huggingface.co/abrhaley/gpt2-tigrinya-lora

## Uses

### Direct Use

- Generate Tigrinya text (stories, conversation, completions)
- Chatbots and dialogue systems in Tigrinya
- Creative text applications (poetry, narratives, cultural content)

### Downstream Use

- Fine-tuning for specialized domains (news, education, healthcare in Tigrinya)

### Out-of-Scope Use

- Generating factual or authoritative knowledge (the model may hallucinate)
- Sensitive/critical applications (medical, legal, political decisions)
- Harmful or offensive text generation

## Bias, Risks, and Limitations

- The dataset does not cover all Tigrinya dialects equally.
- The model may generate biased, offensive, or incoherent outputs.
- Not reliable for factual Q&A.
- As a small model (GPT-2 small), it may struggle with long contexts.

### Recommendations

Users should carefully review outputs before use.
Avoid deploying the model in sensitive applications without human oversight.


## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "abrhaley/gpt2-tigrinya-lora"

# Loading the adapter repository directly requires `peft` to be installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "ኣብ ኣዲስ ኣበባ"  # "In Addis Ababa"
print(generator(prompt, max_length=100, do_sample=True))
```
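If you prefer to keep the base model and the adapter separate (for example, to swap adapters at runtime), the weights can also be loaded explicitly through PEFT. This is a minimal sketch, assuming the repository contains the LoRA adapter on top of the stock `gpt2` checkpoint:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

adapter_id = "abrhaley/gpt2-tigrinya-lora"

# Load the stock GPT-2 base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base_model, adapter_id)

tokenizer = AutoTokenizer.from_pretrained(adapter_id)

inputs = tokenizer("ኣብ ኣዲስ ኣበባ", return_tensors="pt")  # "In Addis Ababa"
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```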

## Training Details

### Training Data
- **Source**: Custom Tigrinya text corpus (news + literature + web text)  
- **Split**: Train/validation prepared manually  

### Training Procedure
- **Base model**: GPT-2 small (124M parameters)  
- **Fine-tuning method**: LoRA (PEFT) applied on attention layers  
- **LoRA config**: r=8, alpha=32, dropout=0.05 (see the configuration sketch below)  
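For reference, the settings above correspond roughly to the PEFT setup below. This is a sketch rather than the exact training script; the target modules are the GPT-2 attention projections listed under Technical Specifications.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # low-rank dimension
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn", "c_proj"],  # GPT-2 attention layers
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 124M weights is trained
```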

### Hyperparameters
- **Batch size (effective)**: 8  
- **Learning rate**: 2e-4  
- **Optimizer**: AdamW  
- **Epochs**: 1 (demo training; extendable)  
- **Precision**: FP16 (when a GPU is available); a training-arguments sketch follows below  
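The table above translates roughly into the `TrainingArguments` below. The split of the effective batch size into per-device batch size and gradient accumulation steps, the output directory, and the logging interval are illustrative assumptions, not the exact values used.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-lora",   # hypothetical output path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,     # 4 x 2 = effective batch size of 8
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="adamw_torch",               # AdamW
    fp16=True,                         # enable only when a GPU is available
    logging_steps=50,
)
```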

---

## Evaluation

### Results
- **Training Loss**: 1.67  
- **Validation Loss**: 1.61  
- **Perplexity (PPL)**: ≈ 5.0  

### Metrics
- **Primary metric**: Perplexity (lower is better, indicating more fluent text); computed from the validation loss as shown below  
- **Summary**: The model reaches ~5.0 PPL on the validation set and produces fluent, natural-sounding Tigrinya completions.  
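Perplexity here is simply the exponential of the validation cross-entropy loss, which is why the two numbers above are consistent:

```python
import math

val_loss = 1.61                  # validation loss from the table above
perplexity = math.exp(val_loss)  # ≈ 5.0
print(f"Perplexity: {perplexity:.2f}")
```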

---

## Environmental Impact
- **Hardware**: NVIDIA T4 (Google Colab)  
- **Training time**: ~5.5 hours  
- **Cloud Provider**: Google Cloud (via Colab)  
- **Carbon estimate**: <1 kg CO₂eq (low emissions, small-scale run)  

---

## Technical Specifications
- **Architecture**: GPT-2 small (decoder-only Transformer)  
- **LoRA applied to**: attention layers (`c_attn`, `c_proj`); the adapter can be merged back into the base model as sketched below  
- **Framework**: Hugging Face Transformers + PEFT  
- **Precision**: FP16 mixed-precision (on GPU)  
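If the repository ships the LoRA adapter rather than merged weights, the adapter can be folded into the base GPT-2 weights for standalone deployment. A minimal sketch using PEFT's `merge_and_unload`; the output directory is a placeholder:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("gpt2")
adapter = PeftModel.from_pretrained(base, "abrhaley/gpt2-tigrinya-lora")

# Fold the LoRA updates into the base weights; the result is a plain GPT-2 model.
merged = adapter.merge_and_unload()
merged.save_pretrained("gpt2-tigrinya-merged")  # hypothetical output directory
```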

## Citation

If you use this model, please cite it as:

**BibTeX:**
```bibtex
@misc{abrhaley2025gpt2tigrinya,
  title   = {GPT-2 Tigrinya LoRA Fine-Tuned},
  author  = {Abrhaley},
  year    = {2025},
  url     = {https://huggingface.co/abrhaley/gpt2-tigrinya-lora}
}
```