metadata

title: AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection
emoji: 🍭
colorFrom: yellow
colorTo: pink
sdk: gradio
python_version: 3.12.9
sdk_version: 5.35.0
app_file: app.py
pinned: true

AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection 🍭

Results

Subtask	Submission	Model	F1 Score (positive for Subtask 1, strict for Subtask 2)	Notebook
1	1	Qwen3-Embedding-8B	0.875	Notebook
1	2	XLM-RoBERTa-Large	0.891	Notebook
2	1	GBERT-Large	0.623	Notebook
2	2	XLM-RoBERTa-Large	0.631	Notebook

Setup

python_version="$(cat .python-version)"

# install the interpreter if it's missing
pyenv install -s "${python_version}"

# select python version for current shell
pyenv shell "${python_version}"

# create venv if missing
if [[ ! -d venv ]]; then
  python -m venv venv
fi

# activate venv & install packages
source venv/bin/activate

pip install -U pip setuptools wheel
pip install -r requirements.txt

🏆 Model

Model on Hugging Face

Model Details

Model Type: Transformer-based encoder (XLM-RoBERTa-Large)
Developed by: Christian Rene Thelen, Patrick Gustav Blaneck, Tobias Bornheim, Niklas Grieger, Stephan Bialonski (FH Aachen, RWTH Aachen, ORDIX AG, Utrecht University)
Paper: AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training
Base Model: XLM-RoBERTa-Large (Conneau et al., 2020)
Fine-tuning Objective: Detection of candy speech (positive/supportive language) in German YouTube comments.

Model Description

This model is a fine-tuned XLM-RoBERTa-Large adapted for the GermEval 2025 Shared Task on Candy Speech Detection. It was trained to identify candy speech at two levels:

Binary level: Classify whether a comment contains candy speech.
Span level: Detect the exact spans and categories of candy speech within comments, using a BIO tagging scheme across 10 categories (positive feedback, compliment, affection declaration, encouragement, gratitude, agreement, ambiguous, implicit, group membership, sympathy).

The span-level model also proved effective for binary detection by classifying a comment as candy speech if at least one positive span was detected.
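The relationship between the two levels can be sketched in a few lines of Python. This is a minimal illustration, not the released code: the tag strings (for example `B-compliment`) and the helper names `extract_spans` and `contains_candy_speech` are assumptions made for this example. Only the ten category names and the BIO scheme come from the description above.

```python
# Illustrative sketch: tag strings and helper names are assumptions,
# not the identifiers used in the released model.
CATEGORIES = [
    "positive feedback", "compliment", "affection declaration",
    "encouragement", "gratitude", "agreement", "ambiguous",
    "implicit", "group membership", "sympathy",
]

# BIO scheme: one "B-" and one "I-" tag per category, plus the outside tag "O".
LABELS = ["O"] + [f"{prefix}-{cat}" for cat in CATEGORIES for prefix in ("B", "I")]


def extract_spans(tags):
    """Group a BIO tag sequence into (start, end, category) token spans, end exclusive."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, cat = tag.partition("-")
        continues = prefix == "I" and cat == current
        if not continues:
            # close the span that was open, if any
            if current is not None:
                spans.append((start, i, current))
            if tag == "O":
                start, current = None, None
            else:
                # "B-" opens a span; a stray "I-" is also treated as a span start
                start, current = i, cat
    if current is not None:
        spans.append((start, len(tags), current))
    return spans


def contains_candy_speech(tags):
    """Binary decision: positive if at least one span was detected."""
    return len(extract_spans(tags)) > 0
```

With these definitions, a comment tagged `["O", "B-compliment", "I-compliment", "O"]` yields one span and is therefore classified as candy speech, while an all-`"O"` sequence is not.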

Intended Uses

Research: Analysis of positive/supportive communication in German social media.
Applications: Social media analytics, conversational AI safety (mitigating sycophancy), computational social science.
Not for: Deployments without fairness/robustness testing on out-of-domain data.

Performance

Dataset: 46k German YouTube comments, annotated with candy speech spans.
Data Split: 37,057 comments (train), 9,229 comments (test).
Shared Task Results:
- Subtask 1 (binary detection): Positive F1 = 0.891 (ranked 1st)
- Subtask 2 (span detection): Strict F1 = 0.631 (ranked 1st)
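For readers unfamiliar with the two metrics, the following sketch shows one plausible way to compute them. It assumes that "positive F1" is the F1 score of the positive class and that "strict" means a predicted span counts as correct only if its comment, boundaries, and category all match a gold span exactly. The official shared-task scorer may differ in details, and the function names and tuple layout here are hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def positive_f1(gold, pred):
    """F1 of the positive class for binary labels (True = candy speech)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    return f1_from_counts(tp, fp, fn)


def strict_span_f1(gold_spans, pred_spans):
    """Exact-match span F1 (assumed definition of 'strict').

    Each span is a hashable tuple such as (comment_id, start, end, category).
    """
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)
    return f1_from_counts(tp, len(pred - gold), len(gold - pred))
```

Under this assumed definition, a prediction that is off by a single character at either boundary counts as both a false positive and a false negative, which is one reason strict span scores tend to be much lower than binary scores.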

Training Procedure

Architecture: XLM-RoBERTa-Large + linear classification layer (BIO tagging, 21 labels including “O”).
Optimizer: AdamW
Learning Rate: Peak 2e-5, with warmup (500 steps) followed by linear decay.
Epochs: 20 (with early stopping).
Batch Size: 32
Regularization: Dropout (0.1), weight decay (0.01), gradient clipping (L2 norm 1.0).
Postprocessing: BIO tag correction and subword alignment.
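The learning-rate schedule listed above can be sketched as a plain function of the optimizer step. This is an illustration under stated assumptions, not the authors' training code: it assumes the warmup is linear and that the decay reaches zero at the final step, which is the shape produced by common linear schedulers such as `get_linear_schedule_with_warmup` in the `transformers` library. The total step count is also an assumption, derived from the full training split, the batch size, and all 20 epochs, with no gradient accumulation and no early stopping.

```python
import math

PEAK_LR = 2e-5       # from the training procedure above
WARMUP_STEPS = 500   # from the training procedure above


def linear_warmup_decay_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption for illustration only: every training comment is seen once per epoch.
steps_per_epoch = math.ceil(37_057 / 32)
total_steps = steps_per_epoch * 20

for step in (0, 250, 500, total_steps // 2, total_steps):
    print(step, linear_warmup_decay_lr(step, total_steps))
```

Because the run uses early stopping, the decay would in practice be cut off before the rate reaches zero.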

Limitations

Domain Specificity: Trained only on German YouTube comments; performance may degrade on other platforms, genres, or languages.
Overlapping Spans: The model cannot predict overlapping spans, because BIO tagging assigns a single label to each token. Such spans were rare (<2%) in the training data.
Biases: May reflect biases present in the dataset (e.g., demographic skews in YouTube communities).
Generalization: Needs evaluation before deployment in real-world moderation systems.

Ethical Considerations

Positive speech detection is less studied than toxic speech detection, and automatic labeling of “supportiveness” may reinforce cultural biases about what counts as “positive.”
The model must be complemented with human-in-the-loop moderation to avoid misuse.

Citation

If you use this model, please cite:

@inproceedings{thelen-etal-2025-aixcellent,
    title = "{AI}xcellent Vibes at {G}erm{E}val 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training",
    author = "Thelen, Christian Rene  and
      Blaneck, Patrick Gustav  and
      Bornheim, Tobias  and
      Grieger, Niklas  and
      Bialonski, Stephan",
    editor = "Wartena, Christian  and
      Heid, Ulrich",
    booktitle = "Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Workshops",
    month = sep,
    year = "2025",
    address = "Hannover, Germany",
    publisher = "HsH Applied Academics",
    url = "https://aclanthology.org/2025.konvens-2.33/",
    pages = "398--403"
}