Kwaipilot-KAT-Dev-CPT-LoRA-Adapter-HyperSwitch

A LoRA adapter for Kwaipilot/KAT-Dev, fine-tuned on the Hyperswitch Rust codebase. It is intended for tasks that involve payment processing patterns, Hyperswitch architecture, and Rust development practices.

🎯 Model Description

This LoRA adapter was trained on 16,731 samples extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.

  • Base Model: Kwaipilot/KAT-Dev
  • Training Type: Causal Language Modeling (CLM) with LoRA
  • Domain: Payment Processing, Rust Development
  • Specialization: Hyperswitch codebase patterns and architecture

📊 Training Details

Dataset Composition

  • Total Samples: 16,731
    • File-level samples: 2,120 complete files
    • Granular samples: 14,611 extracted components
      • Functions: 4,121
      • Structs: 5,710
      • Traits: 223
      • Implementations: 4,296
      • Modules: 261

LoRA Configuration

r: 64                   # LoRA rank
alpha: 128              # LoRA alpha (2*r)
dropout: 0.05           # LoRA dropout
target_modules:         # Applied to all linear layers
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
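To make the rank and alpha values concrete, the following sketch computes the standard LoRA scaling factor (alpha / r) and the number of trainable parameters an adapter adds to a single linear layer. The helper names and the layer dimensions are hypothetical placeholders chosen for illustration; they are not read from KAT-Dev's configuration.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer.

    LoRA adds two matrices: A with shape (r, d_in) and B with shape (d_out, r).
    """
    return r * (d_in + d_out)


R, ALPHA = 64, 128        # values from the LoRA configuration above
D_IN, D_OUT = 4096, 4096  # hypothetical layer size, for illustration only

scaling = lora_scaling(R, ALPHA)
added = lora_params(D_IN, D_OUT, R)

print(f"scaling factor: {scaling}")
print(f"adapter parameters for one {D_IN}x{D_OUT} layer: {added:,}")
```

With alpha set to twice the rank, the low-rank update is scaled by 2 before being added to the frozen weights.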

Training Hyperparameters

  • Epochs: 5
  • Batch Size: 2 per device (16 effective with gradient accumulation)
  • Learning Rate: 5e-5 (cosine schedule)
  • Max Context: 8,192 tokens
  • Hardware: 2x NVIDIA H200 (80GB each)
  • Training Time: ~4 hours (2,355 steps)
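The effective batch size is the product of the per-device batch size, the number of devices, and the gradient accumulation steps. The card does not state the accumulation value, so the sketch below derives it under the assumption that the effective batch of 16 is counted across both GPUs; the function name is hypothetical.

```python
def accumulation_steps(effective_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Gradient accumulation steps implied by an effective batch size."""
    per_step = per_device_batch * num_devices
    if effective_batch % per_step != 0:
        raise ValueError("effective batch must be a multiple of per-step batch")
    return effective_batch // per_step


# Assumption: the effective batch of 16 spans both devices.
steps_two_gpus = accumulation_steps(effective_batch=16, per_device_batch=2, num_devices=2)

# Alternative reading: the figure is counted per device.
steps_per_device = accumulation_steps(effective_batch=16, per_device_batch=2, num_devices=1)

print(steps_two_gpus, steps_per_device)
```

Which reading applies depends on how the training script defined "effective", which the card leaves open.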

Training Results

Final Loss: 0.48 (down from 1.63)
Perplexity: 1.59 (down from 5.12)
Accuracy: 89% (up from 61%)
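Loss and perplexity are related: assuming the reported loss is the mean per-token cross-entropy in nats, perplexity is its exponential. The sketch below applies that relation to the rounded loss values above. The results land close to, though not exactly on, the reported perplexities, which is consistent with the loss figures having been rounded to two decimals.

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity implied by a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


initial_ppl = perplexity(1.63)  # starting loss from the results above
final_ppl = perplexity(0.48)    # final loss from the results above

print(f"initial perplexity: {initial_ppl:.2f}")
print(f"final perplexity:   {final_ppl:.2f}")
```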

🚀 Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Kwaipilot/KAT-Dev",
    dtype=torch.bfloat16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Kwaipilot/KAT-Dev")

# Load LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "juspay/Kwaipilot-KAT-Dev-CPT-LoRA-Adapter-HyperSwitch")

# Generate code
prompt = """// Hyperswitch payment processing
pub fn validate_payment_method("""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.2,  # Lower temperature for code generation
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Recommended Settings

  • Temperature: 0.2-0.3 for code generation
  • Temperature: 0.5-0.7 for explanations and documentation
  • Max tokens: 1024 for most tasks
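One way to apply these recommendations is to keep them in a small helper that returns keyword arguments for `model.generate`. The helper name `generation_kwargs` and the task labels are hypothetical; the specific temperatures are picked from within the ranges listed above.

```python
def generation_kwargs(task: str, max_new_tokens: int = 1024) -> dict:
    """Return sampling settings for `model.generate` based on the task type."""
    temperatures = {
        "code": 0.2,         # low end of the 0.2-0.3 range for code generation
        "explanation": 0.6,  # middle of the 0.5-0.7 range for explanations and docs
    }
    if task not in temperatures:
        raise ValueError(f"unknown task: {task!r}")
    return {
        "do_sample": True,
        "temperature": temperatures[task],
        "max_new_tokens": max_new_tokens,
    }


code_settings = generation_kwargs("code")
doc_settings = generation_kwargs("explanation")
print(code_settings)
print(doc_settings)
```

The returned dictionary can be unpacked into the Quick Start call, for example `model.generate(**inputs, **generation_kwargs("code"))`.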

🛠️ Technical Specifications

  • Context Window: 8,192 tokens
  • Precision: bfloat16
  • Memory Usage: ~78GB VRAM (32B base model)
  • Inference Speed: Optimized with Flash Attention 2
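The memory figure can be roughly sanity-checked from the parameter count and precision. The sketch below assumes about 32 billion parameters, as stated above, stored at 2 bytes each in bfloat16. The gap between the weight footprint and the reported ~78GB is presumably taken up by activations, the KV cache, and adapter weights, though the card does not break this down.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


REPORTED_USAGE_GB = 78  # approximate figure from the specifications above

weights_gb = weight_memory_gb(32e9)            # bfloat16 uses 2 bytes per parameter
remaining_gb = REPORTED_USAGE_GB - weights_gb  # everything other than base weights

print(f"base weights: ~{weights_gb:.0f} GB")
print(f"remaining:    ~{remaining_gb:.0f} GB")
```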

🙏 Acknowledgments

  • Kwaipilot Team for the excellent KAT-Dev base model
  • Hyperswitch Team for the open-source payment processing platform
  • Hugging Face for the transformers and PEFT libraries

📞 Citation

@misc{hyperswitch-kat-dev-lora-2024,
  title={Kwaipilot-KAT-Dev-CPT-LoRA-Adapter-HyperSwitch},
  author={Aditya Narayan},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/juspay/Kwaipilot-KAT-Dev-CPT-LoRA-Adapter-HyperSwitch}
}