My First LoRA Model - An Educational Failure Case
Warning: This model produces hilariously incoherent outputs!
This is my very first attempt at fine-tuning a language model using LoRA, or Low-Rank Adaptation. I'm sharing it as a prime example of what can go wrong when you're just starting out with parameter-efficient fine-tuning. The model generates mostly gibberish, which is a great lesson in what not to do.
Sample "Trash" Outputs
Here are some examples of the kind of gibberish this model produces:
Q: "What is deep learning?" A: "Deep learning is a way to understand the data that is being collected. It is a way to display the data that is used to analyze the data..."
Q: "How do you debug a Python program?" A: "The debug code is :"
Q: "Explain overfitting" A: "Overfitting the size of the car is a very common technique for removing a car from the vehicle..."
Yes, it really thinks overfitting has something to do with cars.
What Went Wrong?
I made several common mistakes during this learning process:
- Poor Input Formatting: My training data was in plain text, not a structured instruction format.
- Bad Generation Parameters: The temperature was too high, and I didn't set any stopping criteria.
- Wrong Model Choice: The base model, DialoGPT, isn't designed for instruction following.
- Missing Special Tokens: I didn't include clear instruction and response boundaries.
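The first and last mistakes are related and easy to avoid. Below is a minimal sketch of the structured format I should have trained on instead of plain text. The "### Instruction:" / "### Response:" markers follow the standard Alpaca template, and "<|endoftext|>" is the GPT-2/DialoGPT EOS token; the helper name is mine, not part of any library.

```python
# Sketch of the structured training format I should have used, instead of plain text.
# "### Instruction:" / "### Response:" follow the standard Alpaca template;
# "<|endoftext|>" is the GPT-2/DialoGPT end-of-text token.

def format_example(instruction: str, response: str, eos: str = "<|endoftext|>") -> str:
    """Wrap one (instruction, response) pair in an Alpaca-style prompt."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}{eos}"
    )

print(format_example(
    "Explain overfitting",
    "Overfitting is when a model memorizes its training data.",
))
```

With clear instruction/response boundaries like these, the model has a fighting chance of learning where an answer starts and stops.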
What I Learned
This beautiful failure was a powerful learning experience that taught me:
- The critical importance of data formatting when fine-tuning a large language model.
- How generation parameters, like temperature, can dramatically affect the quality of the output.
- Why the choice of model architecture matters for different tasks.
- That LoRA training can technically succeed (in terms of loss reduction) while still being a practical failure.
Technical Details
- Base Model: microsoft/DialoGPT-small (117M params)
- LoRA Rank: 8
- Target Modules: ["c_attn", "c_proj"]
- Training Data: A poorly formatted version of the Alpaca dataset.
- Training Loss: The loss actually decreased, even though the outputs were terrible.
- Trainable Parameters: Around 262k (0.2% of the total model).
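As a sanity check on the trainable-parameter figure: LoRA replaces each adapted weight W of shape (d_out, d_in) with W plus two low-rank factors A (r × d_in) and B (d_out × r), adding r·(d_in + d_out) trainable parameters per matrix. Here is a back-of-the-envelope tally for GPT-2-small dimensions (which DialoGPT-small shares). Note the exact total depends on which matrices PEFT actually matched, so it won't necessarily line up with the ~262k reported above.

```python
# Back-of-the-envelope LoRA parameter count for rank r = 8 on GPT-2-small
# dimensions (hidden size 768, 12 transformer layers). Each adapted matrix
# contributes r * (d_in + d_out) parameters via its two low-rank factors.

def lora_param_count(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters added by one LoRA-adapted (d_out x d_in) matrix."""
    return r * (d_in + d_out)

hidden, n_layers, r = 768, 12, 8
per_layer = (
    lora_param_count(r, hidden, 3 * hidden)  # c_attn: fused QKV projection, 768 -> 2304
    + lora_param_count(r, hidden, hidden)    # c_proj (attention output), 768 -> 768
)
print(f"per layer: {per_layer}, across {n_layers} layers: {per_layer * n_layers}")
```

Either way the adapter is a tiny fraction of the 117M base parameters, which is exactly why a falling loss on a badly formatted dataset is so easy to achieve.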
How to Use (For Science!)
If you're curious to see this model's amusingly bad performance for yourself, you can use the code below.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the trash model
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
base_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
model = PeftModel.from_pretrained(base_model, "yourusername/my-first-lora-model")

# Generate a bad response
def generate_trash(prompt):
    inputs = tokenizer.encode(f"Instruction: {prompt}\nResponse:", return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_length=100,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,  # avoid the missing-pad-token warning
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Try it out!
print(generate_trash("What is machine learning?"))
# Expect a response like: "Machine learning is when computers learn to computer the learning..."
```
The Fix
After this experience, I know what to do differently next time. I plan to:
- Use a proper instruction format with special tokens.
- Lower the generation temperature from 0.7 to a more suitable value like 0.1.
- Add clear start and stop markers.
- Choose a better base model for instruction-following tasks.
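The stop-marker problem can even be patched at decode time, before retraining anything. Here is a minimal sketch; the marker strings are illustrative assumptions and should match whatever template the model was actually trained on.

```python
# Sketch: truncate generated text at the first stop marker, so the model
# can't ramble past the end of its answer. The marker strings below are
# illustrative; use the boundaries from your own training template.

def trim_at_stop(text: str, stop_markers=("### Instruction:", "<|endoftext|>")) -> str:
    """Cut the generation at the earliest occurrence of any stop marker."""
    cut = len(text)
    for marker in stop_markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

raw = "Overfitting is when a model memorizes noise.<|endoftext|>### Instruction: next..."
print(trim_at_stop(raw))  # -> "Overfitting is when a model memorizes noise."
```

Combined with a lower temperature and a proper instruction template, this keeps answers from trailing off into the next imaginary turn.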
Educational Value
This model is a perfect resource for anyone who wants to:
- Understand common pitfalls of LoRA fine-tuning.
- See a practical demonstration of how important data formatting is.
- Learn debugging skills for language model training.
- Understand that technical success doesn't always equal practical success.
Links
- Fixed Version: Coming soon after I improve my process.
- Training Code: See the files in this repository.
- Discussion: Feel free to open issues with any questions.
Remember, every expert was once a beginner who made mistakes like this. Sharing your failures is often more valuable than sharing your successes.