
Evaluations were produced with GuardBench (https://github.com/neuralmagic/GuardBench), using vLLM as the inference engine.

Evaluations were obtained with vllm==0.15.0 plus the bug fixes from this PR. If you are running vllm > 0.15.0, the fixes are likely already included, as the PR has landed on main.
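As a rough illustration of serving this checkpoint with vLLM, a minimal sketch using the offline `LLM` API is shown below. The greedy sampling settings and the example conversation are assumptions for demonstration, not the GuardBench harness configuration.

```python
# Minimal sketch: classifying a single conversation with vLLM's offline API.
# Assumes an FP8-capable GPU; decoding parameters are illustrative, not the
# exact settings used by the GuardBench harness.
from vllm import LLM, SamplingParams

llm = LLM(model="RedHatAI/Llama-Guard-4-12B-FP8-dynamic")
params = SamplingParams(temperature=0.0, max_tokens=20)

# Llama Guard emits a safe/unsafe verdict (plus any violated categories);
# LLM.chat applies the moderation chat template shipped with the model.
conversation = [{"role": "user", "content": "How do I pick a lock?"}]
outputs = llm.chat(conversation, sampling_params=params)
print(outputs[0].outputs[0].text.strip())
```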

| Dataset | F1 (meta-llama/Llama-Guard-4-12B) | F1 (RedHatAI/Llama-Guard-4-12B-FP8-dynamic, this model) | F1 Recovery % | Recall (meta-llama/Llama-Guard-4-12B) | Recall (RedHatAI/Llama-Guard-4-12B-FP8-dynamic, this model) | Recall Recovery % |
|---|---|---|---|---|---|---|
| AART | 0.874 | 0.873 | 99.89 | 0.776 | 0.775 | 99.87 |
| AdvBench Behaviors | 0.964 | 0.965 | 100.10 | 0.931 | 0.933 | 100.21 |
| AdvBench Strings | 0.830 | 0.832 | 100.24 | 0.709 | 0.713 | 100.56 |
| BeaverTails 330k | 0.732 | 0.732 | 100.00 | 0.591 | 0.591 | 100.00 |
| Bot-Adversarial Dialogue | 0.513 | 0.506 | 98.64 | 0.376 | 0.371 | 98.67 |
| CatQA | 0.932 | 0.933 | 100.11 | 0.873 | 0.875 | 100.23 |
| ConvAbuse | 0.241 | 0.258 | 107.05 | 0.148 | 0.164 | 110.81 |
| DecodingTrust Stereotypes | 0.591 | 0.601 | 101.69 | 0.419 | 0.430 | 102.63 |
| DICES 350 | 0.118 | 0.118 | 100.00 | 0.063 | 0.063 | 100.00 |
| DICES 990 | 0.219 | 0.226 | 103.20 | 0.135 | 0.142 | 105.19 |
| Do Anything Now Questions | 0.746 | 0.740 | 99.20 | 0.595 | 0.587 | 98.66 |
| DoNotAnswer | 0.546 | 0.544 | 99.63 | 0.376 | 0.374 | 99.47 |
| DynaHate | 0.603 | 0.604 | 100.17 | 0.481 | 0.482 | 100.21 |
| HarmEval | 0.560 | 0.566 | 101.07 | 0.389 | 0.395 | 101.54 |
| HarmBench Behaviors | 0.959 | 0.958 | 99.90 | 0.922 | 0.919 | 99.67 |
| HarmfulQ | 0.860 | 0.860 | 100.00 | 0.755 | 0.755 | 100.00 |
| HarmfulQA Questions | 0.588 | 0.591 | 100.51 | 0.416 | 0.420 | 100.96 |
| HarmfulQA | 0.374 | 0.377 | 100.80 | 0.231 | 0.232 | 100.43 |
| HateCheck | 0.782 | 0.785 | 100.38 | 0.667 | 0.671 | 100.60 |
| Hatemoji Check | 0.625 | 0.627 | 100.32 | 0.474 | 0.478 | 100.84 |
| HEx-PHI | 0.966 | 0.964 | 99.79 | 0.933 | 0.930 | 99.68 |
| I-CoNa | 0.837 | 0.833 | 99.52 | 0.719 | 0.713 | 99.17 |
| I-Controversial | 0.596 | 0.621 | 104.19 | 0.425 | 0.450 | 105.88 |
| I-MaliciousInstructions | 0.824 | 0.810 | 98.30 | 0.700 | 0.680 | 97.14 |
| I-Physical-Safety | 0.493 | 0.482 | 97.77 | 0.340 | 0.330 | 97.06 |
| JBB Behaviors | 0.860 | 0.860 | 100.00 | 0.860 | 0.860 | 100.00 |
| MaliciousInstruct | 0.953 | 0.947 | 99.37 | 0.910 | 0.900 | 98.90 |
| MITRE | 0.663 | 0.685 | 103.32 | 0.495 | 0.521 | 105.25 |
| NicheHazardQA | 0.460 | 0.469 | 101.96 | 0.299 | 0.307 | 102.68 |
| OpenAI Moderation Dataset | 0.739 | 0.734 | 99.32 | 0.787 | 0.787 | 100.00 |
| ProsocialDialog | 0.427 | 0.432 | 101.17 | 0.276 | 0.280 | 101.45 |
| SafeText | 0.372 | 0.380 | 102.15 | 0.254 | 0.262 | 103.15 |
| SimpleSafetyTests | 0.985 | 0.985 | 100.00 | 0.970 | 0.970 | 100.00 |
| StrongREJECT Instructions | 0.910 | 0.910 | 100.00 | 0.836 | 0.836 | 100.00 |
| TDCRedTeaming | 0.947 | 0.947 | 100.00 | 0.900 | 0.900 | 100.00 |
| TechHazardQA | 0.758 | 0.761 | 100.40 | 0.610 | 0.615 | 100.82 |
| Toxic Chat | 0.433 | 0.425 | 98.15 | 0.519 | 0.519 | 100.00 |
| ToxiGen | 0.460 | 0.470 | 102.17 | 0.315 | 0.325 | 103.17 |
| XSTest | 0.834 | 0.833 | 99.88 | 0.780 | 0.775 | 99.36 |
| **Average Score** | **0.671** | **0.673** | **100.52** | **0.571** | **0.573** | **100.88** |
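For reference, the Recovery % columns express the quantized model's score as a percentage of the BF16 baseline's score on the same benchmark, which the table values confirm (e.g., 0.873 / 0.874 yields 99.89). A minimal sketch:

```python
# Recovery % as used in the table above: the FP8-dynamic score divided by
# the BF16 baseline score, expressed as a percentage.
def recovery_pct(baseline: float, quantized: float) -> float:
    return 100.0 * quantized / baseline

# Spot-check against the AART row: F1 drops from 0.874 to 0.873 -> ~99.89%.
print(f"{recovery_pct(0.874, 0.873):.2f}")
```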