---
license: apache-2.0
tags:
- mistral
- 7b
- lora
- fine-tuning
- indic-align
- hindi
- conversational-ai
---

# Dhee-Chat-Hi

A fine-tuned Hindi conversational model based on Mistral-7B-v0.3, optimized for Hindi language understanding and generation.

## Model Details

* **Base Model:** Mistral 7B v0.3
* **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
* **Dataset:** `ai4bharat/indic-align`
* **Language:** Hindi
* **Model ID:** `dheeyantra/Dhee-Chat-Hi`

## Training Configuration

The model was fine-tuned using the following LoRA and training parameters:

### LoRA Parameters

* `r`: 16
* `target_modules`: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
* `lora_alpha`: 16
* `lora_dropout`: 0
* `bias`: "none"
* `use_gradient_checkpointing`: "unsloth"
* `use_rslora`: False
* `loftq_config`: None

### Training Arguments

* `per_device_train_batch_size`: 1
* `per_device_eval_batch_size`: 1
* `gradient_accumulation_steps`: 4
* `warmup_ratio`: 0.03
* `fp16`: True
* `optim`: "adamw_8bit"
* `max_seq_length`: 32768
* `dataset_num_proc`: 2
* `packing`: False

## Intended Uses & Limitations

This model is intended for Hindi conversational applications such as chatbots and virtual assistants. Because it is fine-tuned on the `ai4bharat/indic-align` dataset, its knowledge and conversational style are primarily shaped by that data.

Limitations:

* The model's responses reflect the patterns and information present in the training data; it may generate incorrect or biased information.
* Performance may vary with the complexity and nuance of the input.
* The model is primarily focused on Hindi and may not perform well in other languages or code-mixed scenarios unless explicitly trained for them.

## How to Get Started with Hugging Face Transformers

You can use the following Python code to load the `dheeyantra/Dhee-Chat-Hi` model and run inference:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "dheeyantra/Dhee-Chat-Hi"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Prepare chat messages
messages = [
    {"role": "User", "content": "कितने वेद हैं?"},
    {"role": "Dhee", "content": "चार वेद हैंः ऋग्वेद, यजुर्वेद, सामवेद और अथर्ववेद।"},
    {"role": "User", "content": "ऋग्वेद के बारे में और बतायें?"}
]

# Build the prompt using the model's chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generate a response
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens
generated_text = tokenizer.decode(output_ids[0][inputs['input_ids'].shape[-1]:], skip_special_tokens=True)

print("Generated text:")
print(generated_text)
```

## Hardware Requirements

This model requires a GPU for efficient inference, especially for longer sequences or interactive applications. Exact VRAM requirements depend on sequence length, batch size, and the precision in which the weights are loaded; a hedged sketch of lower-memory loading options appears at the end of this card.

## Disclaimer

This model is provided as-is. Users should be aware of its potential limitations and biases before deploying it in any application, and should follow responsible AI practices.
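
## Lower-Memory Loading (Sketch)

As noted under Hardware Requirements, VRAM usage depends heavily on the precision used to load the weights. The following is a minimal sketch, not an officially tested configuration: it assumes the `accelerate` package is installed for `device_map="auto"`, that `bitsandbytes` is installed for the 4-bit variant, and that this fine-tune tolerates 4-bit quantization without a significant quality drop, which has not been verified here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "dheeyantra/Dhee-Chat-Hi"
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Half-precision loading: roughly halves VRAM versus fp32 for a 7B model.
# device_map="auto" (requires `accelerate`) places weights on available devices.
model_fp16 = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Optional 4-bit quantized loading (requires `bitsandbytes`): further reduces VRAM
# at some potential cost in output quality. Treat suitability for this particular
# fine-tune as an assumption to validate on your own prompts.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quant_config,
    device_map="auto",
)
```

Either variant can be used as a drop-in replacement for the `model` object in the inference example above; the tokenizer and chat-template handling are unchanged.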