metadata
title: MLP Safety Classifier
emoji: 🛡️
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit

🛡️ Aegis Safety Classifier

This Space hosts a text classifier trained on the NVIDIA Aegis 2.0 dataset.
It predicts whether a piece of text is safe or unsafe.

The model is a simple TF-IDF + MLP pipeline implemented in scikit-learn.
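As a rough illustration of what a TF-IDF + MLP pipeline in scikit-learn can look like, here is a minimal sketch. The hyperparameters, the `build_pipeline` helper, and the toy training texts are assumptions made for this example only; they are not the settings or data used to train the model in this Space.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline


def build_pipeline(random_state=0):
    """Hypothetical helper: TF-IDF features fed into a small MLP."""
    return make_pipeline(
        # Turn raw text into sparse TF-IDF vectors (unigrams and bigrams).
        TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
        # One small hidden layer; sizes here are illustrative assumptions.
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                      random_state=random_state),
    )


# Toy data invented for this sketch (1 = safe, 0 = unsafe).
texts = [
    "how do I bake sourdough bread",
    "what is the weather like today",
    "recommend a good book about history",
    "how to hurt someone badly",
    "ways to make a dangerous weapon",
    "how to threaten a person online",
]
labels = [1, 1, 1, 0, 0, 0]

model = build_pipeline().fit(texts, labels)

# predict_proba returns one row per input, with columns ordered as
# model.classes_, i.e. [p(unsafe), p(safe)] for the labels 0 and 1.
proba = model.predict_proba(["how do I bake bread"])
print(model.classes_, proba)
```

Because the column order follows `model.classes_`, using 0 for unsafe and 1 for safe yields probabilities in the `[p(unsafe), p(safe)]` order described in this README.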


Features

  • Paste any text into the textbox and get a prediction.
  • Adjustable decision threshold on the predicted probability of the "safe" class.
  • JSON output with:
    • prediction: 1 = safe, 0 = unsafe
    • probabilities: [p(unsafe), p(safe)] if available
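The threshold and JSON output described above can be sketched as follows. This assumes the threshold is compared against p(safe) with `>=`; the `classify` function name and the exact comparison are assumptions for illustration, not code taken from `app.py`.

```python
import json


def classify(probabilities, threshold=0.5):
    """Hypothetical helper: map [p(unsafe), p(safe)] to the JSON payload.

    Assumes text counts as safe when p(safe) >= threshold.
    """
    p_unsafe, p_safe = probabilities
    prediction = 1 if p_safe >= threshold else 0  # 1 = safe, 0 = unsafe
    return {"prediction": prediction, "probabilities": [p_unsafe, p_safe]}


# The same probabilities give different labels under different thresholds.
lenient = classify([0.3, 0.7], threshold=0.5)
strict = classify([0.3, 0.7], threshold=0.8)

print(json.dumps(lenient))
print(json.dumps(strict))
```

Raising the threshold makes the classifier more conservative: more text is labeled unsafe, at the cost of flagging some safe text.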

How to Use

  • Open this Space.
  • Enter some text in the input field.
  • Adjust the threshold (default = 0.5).
  • Press Submit to get results.