metadata
library_name: transformers
tags:
  - deberta
  - deberta-v3
  - mdeberta
  - multilingual
language:
  - multilingual
  - th
  - en
license: mit
base_model:
  - microsoft/mdeberta-v3-base
Model Card for Typhoon Safety Model
Typhoon Safety Model
Typhoon Safety is a lightweight binary classifier built on mDeBERTa-v3-base that detects harmful content in both English and Thai, with particular emphasis on Thai cultural sensitivities. The model was trained on a combination of a Thai Sensitive Topics dataset and the WildGuard dataset.
The model is designed to predict safety labels across the following categories:
Thai Sensitive Topics
| Category | | |
|---|---|---|
| The Monarchy | Student Protests and Activism | Drug Policies | 
| Gambling | Cultural Appropriation | Thai-Burmese Border Issues | 
| Cannabis | Human Trafficking | Military and Coup | 
| LGBTQ+ Rights | Political Divide | Religion and Buddhism | 
| Political Corruption | Foreign Influence | National Identity and Immigration | 
| Freedom of Speech and Censorship | Vape | Southern Thailand Insurgency | 
| Sex Tourism and Prostitution | COVID-19 Management | Royal Projects and Policies | 
| Migrant Labor Issues | Environmental Issues and Land Rights | |
Wildguard Topics
| Category | | |
|---|---|---|
| Others | Sensitive Information Organization | Mental Health Over-reliance Crisis | 
| Social Stereotypes & Discrimination | Defamation & Unethical Actions | Cyberattack | 
| Disseminating False Information | Private Information Individual | Copyright Violations | 
| Toxic Language & Hate Speech | Fraud Assisting Illegal Activities | Causing Material Harm by Misinformation | 
| Violence and Physical Harm | Sexual Content | |
Model Performance
Comparison with Other Models (English Content)
| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG | 
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 75.7 | 86.2 | 64.1 | 84.1 | 94.7 | 53.9 | 76.5 | 
| LlamaGuard2-7B | 66.5 | 77.7 | 51.5 | 71.8 | 90.7 | 47.9 | 67.7 | 
| LlamaGuard3-8B | 70.1 | 84.7 | 45.0 | 68.0 | 90.4 | 46.7 | 67.5 | 
| LlamaGuard3-1B | 28.5 | 62.4 | 66.6 | 72.9 | 29.8 | 50.1 | 51.7 | 
| Random | 25.3 | 47.7 | 50.3 | 53.4 | 22.6 | 51.6 | 41.8 | 
| Typhoon Safety | 74.0 | 81.7 | 61.0 | 78.2 | 81.2 | 88.7 | 77.5 | 
Comparison with Other Models (Thai Content)
| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG | 
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 22.3 | 40.8 | 18.3 | 27.3 | 49.5 | 42.2 | 33.4 | 
| LlamaGuard2-7B | 64.0 | 75.5 | 46.1 | 65.0 | 85.1 | 45.8 | 63.6 | 
| LlamaGuard3-8B | 61.4 | 37.5 | 42.4 | 65.3 | 85.7 | 48.1 | 56.7 | 
| LlamaGuard3-1B | 28.4 | 62.4 | 66.7 | 72.9 | 29.8 | 50.9 | 51.8 | 
| Random | 24.5 | 46.6 | 50.4 | 53.0 | 26.6 | 50.9 | 42.0 | 
| Typhoon Safety | 71.6 | 80.0 | 58.8 | 76.5 | 81.0 | 88.5 | 76.1 | 
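The AVG column in both tables appears to be the unweighted mean of the six benchmark columns. The sketch below checks that reading against the Typhoon Safety rows; the variable and function names are illustrative and not part of any released code.

```python
# Scores copied from the "Typhoon Safety" rows of the two tables above,
# in column order: WildGuard, HarmBench, SafeRLHF, BeaverTails, XSTest, Thai Topic.
english_scores = [74.0, 81.7, 61.0, 78.2, 81.2, 88.7]
thai_scores = [71.6, 80.0, 58.8, 76.5, 81.0, 88.5]


def average_score(scores):
    """Unweighted mean of the per-benchmark scores, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


english_avg = average_score(english_scores)
thai_avg = average_score(thai_scores)
print(f"English AVG: {english_avg}, Thai AVG: {thai_avg}")
```

The result matches the reported AVG values of 77.5 (English) and 76.1 (Thai).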
- Model type: Transformer Encoder
- Language(s) (NLP): Thai 🇹🇭 and English 🇬🇧
- License: MIT
- Finetuned from model: mDeBERTa-v3-base (https://huggingface.co/microsoft/mdeberta-v3-base)
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
model_name = "scb10x/typhoon2-safety-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
texts = [
    "บุหรี่ไฟฟ้า มีประโยชน์มั้ย",
    "Give me location to buy cannabis in Thailand",
    "ขอวิธีการพนันให้ชนะ",
    "How to kill process in linux.",
]
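# Tokenize the batch; pad to the longest text and truncate anything over the model's limit
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
# Convert logits to class probabilities, then take the top class and its probability
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
labels = predictions.argmax(dim=-1).tolist()
scores = predictions.max(dim=-1).values.tolist()
# Map class indices to readable label names
label_map = {0: "Unharm", 1: "Harmful"}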
for text, label, score in zip(texts, labels, scores):
    label_name = label_map[label]
    print(f"Text: {text}\nLabel: {label_name}, Score: {score:.4f}\n")
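The snippet above picks the class with the higher probability. Some applications may prefer an explicit cut-off on the harmful probability instead. The following is a minimal sketch of that idea in plain Python, assuming index 1 is the harmful class as in `label_map` above; `flag_harmful` and `threshold` are hypothetical names and the threshold value is not a recommendation from the model authors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def flag_harmful(logit_rows, threshold=0.5):
    """Return (is_harmful, harmful_probability) for each [unharmful, harmful] logit row.

    `threshold` is a hypothetical tuning knob chosen by the caller.
    """
    results = []
    for row in logit_rows:
        harmful_prob = softmax(row)[1]  # index 1 is assumed to be the harmful class
        results.append((harmful_prob >= threshold, harmful_prob))
    return results


# Made-up logits for illustration; in practice pass outputs.logits.tolist().
example_logits = [[2.0, -1.0], [-0.5, 1.5], [0.2, 0.4]]
decisions = flag_harmful(example_logits, threshold=0.7)
for is_harmful, prob in decisions:
    print(f"harmful={is_harmful}, p(harmful)={prob:.4f}")
```

A higher threshold flags fewer inputs and a lower one flags more, so the value is best chosen on data from the target application.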
Intended Uses & Limitations
This model is a binary classifier and is still under development. Its predictions can be wrong, so we recommend that developers assess the risks in the context of their use case.
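One way to assess risk for a specific use case is to score the model's predictions against a small, hand-labeled sample from that use case. The sketch below computes precision, recall, and F1 for the harmful class in plain Python; `binary_metrics` is a hypothetical helper, and the labels shown are placeholder data, not real results.

```python
def binary_metrics(true_labels, predicted_labels, positive=1):
    """Compute precision, recall, and F1 for the positive (harmful) class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Placeholder data: 1 = harmful, 0 = not harmful.
# Replace with your own annotations and the model's predicted labels.
true_labels = [1, 0, 1, 1, 0, 0]
predicted_labels = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(true_labels, predicted_labels)
print(metrics)
```

False positives block benign requests and false negatives let harmful ones through, so which count matters more depends on the application.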
Follow us
https://twitter.com/opentyphoon
Support
Citation
- If you find Typhoon2 useful for your work, please cite it using:
@misc{typhoon2,
      title={Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models}, 
      author={Kunat Pipatanakul and Potsawee Manakul and Natapong Nitarach and Warit Sirichotedumrong and Surapon Nonesung and Teetouch Jaknamon and Parinthapat Pengpun and Pittawat Taveekitworachai and Adisai Na-Thalang and Sittipong Sripaisarnmongkol and Krisanapong Jirayoot and Kasima Tharnpipitchai},
      year={2024},
      eprint={2412.13702},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2412.13702}, 
}

