# Code Specialist 7B
## Description

Code Specialist 7B is a fine-tuned version of Mistral-7B-Instruct-v0.3, trained with Supervised Fine-Tuning (SFT) on datasets focused on Python and SQL.
The goal of this training was to improve the model's performance in data analysis, programming problem-solving, and technical reasoning.
The model preserves the 7B-parameter, decoder-only Transformer architecture while adding code-oriented fine-tuning, resulting in more robust function generation, SQL queries, and technical answers.
## Base Model
- Mistral-7B-Instruct-v0.3
- Architecture: Transformer (decoder-only)
- Parameters: ~7B
## Datasets Used for SFT
Both datasets were filtered to include only Python and SQL examples, following Alpaca/Mistral-style instruction formatting.
Example prompt format:
```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```
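Rather than assembling the `[INST]` wrapper by hand, the prompt can usually be built with the tokenizer's chat template. A small sketch, assuming the fine-tune inherits the base model's template:

```python
from transformers import AutoTokenizer

# Assumes the fine-tune keeps the base tokenizer's chat template, which
# renders messages into the [INST] ... [/INST] format shown above.
tok = AutoTokenizer.from_pretrained("Ricardouchub/Code-Specialist-7B")
messages = [{"role": "user", "content": "Write a Python function that adds two numbers."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # e.g. "<s>[INST] Write a Python function that adds two numbers. [/INST]"
```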
## Training Details
| Aspect | Detail |
|---|---|
| Method | QLoRA with final weight merge |
| Frameworks | transformers, trl, peft, bitsandbytes |
| Hardware | GPU with 12 GB VRAM (4-bit quantization for training) |
### Main Hyperparameters
| Parameter | Value |
|---|---|
| per_device_train_batch_size | 2 |
| gradient_accumulation_steps | 4 |
| learning_rate | 2e-4 |
| num_train_epochs | 1 |
| max_seq_length | 1024 |
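The full training script is not published here; the following is a minimal QLoRA sketch consistent with the details above, assuming a recent version of trl. The dataset file name, LoRA rank/alpha, and target modules are illustrative assumptions, not values from this card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_id = "mistralai/Mistral-7B-Instruct-v0.3"

# 4-bit NF4 quantization so the 7B model fits in ~12 GB of VRAM
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb, device_map="auto"
)

# LoRA adapter; rank, alpha, and target modules are illustrative guesses
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Hyperparameters taken from the table above
args = SFTConfig(
    output_dir="code-specialist-7b",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    max_seq_length=1024,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    # Hypothetical file of [INST]-formatted examples in a "text" column
    train_dataset=load_dataset("json", data_files="sft_python_sql.jsonl")["train"],
    peft_config=peft_config,
)
trainer.train()

# Final weight merge (outline): reload the base model in half precision,
# attach the trained adapter, and fold it into the base weights:
#   from peft import PeftModel
#   base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
#   merged = PeftModel.from_pretrained(base, "code-specialist-7b").merge_and_unload()
#   merged.save_pretrained("code-specialist-7b-merged")
```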
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prompts follow the Mistral instruction format described above
prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)
out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```
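The call above uses greedy decoding. For slightly more varied completions you can enable sampling; the parameter values below are illustrative, not tuned for this model:

```python
out = mdl.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.2,  # low temperature keeps code output focused
    top_p=0.95,
)
```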
## Initial Benchmarks
- Simple internal evaluation (Python tasks): improved results over the base model on small programming and data-related tasks, including data analysis, SQL query generation, and Python snippets.
- For reproducible metrics, further evaluation on HumanEval or MBPP is recommended; a minimal pass@k sketch follows.
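As a starting point, pass@k can be computed with the `code_eval` metric from the Hugging Face `evaluate` library. A minimal sketch with a single toy problem (the real benchmarks ship their own problems and unit tests):

```python
import os
import evaluate

os.environ["HF_ALLOW_CODE_EVAL"] = "1"  # code_eval executes untrusted generated code

code_eval = evaluate.load("code_eval")

# Toy example: one problem, one model generation, one unit test
candidates = [["def add(a, b):\n    return a + b"]]
tests = ["assert add(2, 3) == 5"]

pass_at_k, _ = code_eval.compute(predictions=candidates, references=tests, k=[1])
print(pass_at_k)  # {'pass@1': 1.0}
```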
## Author
Ricardo Urdaneta
## Limitations
- The model does not guarantee correct output on complex programming tasks.
- It may produce inconsistent results for ambiguous or incomplete prompts.
## License

This model is released under the same license as Mistral-7B-Instruct-v0.3: Apache 2.0.