
TREA 2.0 Pipeline

An audio question-answering dataset generator built on the ESC-50 environmental sound dataset. It creates four task types: COUNT, DURATION, ORDER, and VOLUME.

Quick Start

# 1. Install dependencies
pip install -r requirements.txt

# 2. Preprocess ESC-50 (required for DURATION task only)
python preprocess_esc50.py --config config.yaml

# 3. Generate datasets
python main.py --config config.yaml

Configuration

Edit config.yaml to set:

  • Task duration: task_duration_size (hours) per task
  • Clip duration range: min_clip_duration to max_clip_duration (seconds)
  • ESC-50 paths: Point to your ESC-50 dataset location
  • Enable/disable tasks: Set enabled: true/false for each task
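
A minimal sketch of what these options might look like in config.yaml. Only the key names listed above (task_duration_size, min_clip_duration, max_clip_duration, enabled) come from this README; the ESC-50 path keys and the overall nesting are assumptions, so check the shipped config.yaml for the authoritative schema.

```yaml
# Hypothetical layout; structure and path key names are assumptions.
esc50:
  audio_dir: /path/to/ESC-50/audio          # ESC-50 dataset location
  meta_csv: /path/to/ESC-50/meta/esc50.csv

min_clip_duration: 1.0    # seconds
max_clip_duration: 5.0    # seconds

tasks:
  count:
    enabled: true
    task_duration_size: 2.0   # hours of generated audio for this task
  duration:
    enabled: false            # requires preprocess_esc50.py when enabled
```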

Key Files

  • config.yaml - All configuration parameters
  • main.py - Pipeline entry point (runs all tasks)
  • preprocess_esc50.py - Preprocess ESC-50 for duration task
  • tasks/task_*.py - Individual task generators

Tasks

Task      Question                              Example
COUNT     "How many unique sounds?"             Audio with distinct sound types
DURATION  "Which sound is longest/shortest?"    Compare sound durations
ORDER     "Which sound is first/last/after X?"  Temporal sequence questions
VOLUME    "Which sound is loudest/softest?"     Loudness comparison

Output Structure

output/{task}/
├── audios/*.wav          # Generated audio files
├── {task}_mcq.csv        # Multiple choice questions
├── {task}_open_text.csv  # Open-ended questions
└── {task}_metadata.csv   # Detailed metadata
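
The CSVs can be loaded with pandas for downstream use. A sketch with a mock row (the column names here are assumptions, not the pipeline's actual schema; inspect a generated {task}_mcq.csv for the real columns):

```python
import io

import pandas as pd

# A mock MCQ row in the shape such a file *might* take; in practice you
# would read e.g. "output/count/count_mcq.csv" instead of this buffer.
mock_csv = io.StringIO(
    "audio,question,option_a,option_b,option_c,option_d,answer\n"
    "audios/count_0001.wav,How many unique sounds?,1,2,3,4,3\n"
)
mcq = pd.read_csv(mock_csv)

# Inspect the actual schema rather than assuming it.
print(mcq.columns.tolist())
print(mcq.loc[0, "question"])  # -> How many unique sounds?
```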

Shell scripts (quick)

Use the provided shell helpers for simple runs.

Run full pipeline (uses python main.py under the hood):

# Make executable and run (from pipeline/)
./run_pipeline.sh

# With custom config, tasks, and output
./run_pipeline.sh --config my_config.yaml --tasks count,order --output ./my_dataset

Run the LLM answer generation across splits (uses llm_answer_generator.py):

# Processes open_text CSVs across splits/tasks defined in the script
./run_llm_answers_all.sh

# Or run per-file with the helper script directly
python llm_answer_generator.py --input /path/to/count_open_text.csv --mode open_text --task count

Advanced Usage

# Run specific tasks only
python main.py --tasks count order

# Use custom config
python main.py --config my_config.yaml

# Custom output directory
python main.py --output /path/to/output

# Preprocess with custom parameters
python preprocess_esc50.py --config config.yaml \
    --threshold-strategy noise_floor \
    --noise-floor-percentile 2.0 \
    --noise-floor-delta-db 5.0
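
The noise-floor strategy estimates a silence threshold from the quietest frames of a clip: take a low percentile of the per-frame level distribution as the noise floor, then raise it by a fixed dB margin. A rough illustration of that idea (this is a sketch, not the preprocess_esc50.py implementation; the frame length and dB conversion are assumptions):

```python
import numpy as np

def noise_floor_threshold_db(samples, frame_len=1024,
                             percentile=2.0, delta_db=5.0):
    """Estimate a dB threshold: the given percentile of per-frame RMS
    levels (the noise floor), raised by delta_db."""
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    rms_db = 20.0 * np.log10(np.maximum(rms, 1e-12))  # guard against log(0)
    floor_db = np.percentile(rms_db, percentile)
    return floor_db + delta_db

# Example: low-level noise with a louder burst in the middle.
rng = np.random.default_rng(0)
sig = rng.normal(0.0, 0.01, 48_000)
sig[20_000:24_000] += rng.normal(0.0, 0.5, 4_000)
thr = noise_floor_threshold_db(sig)  # sits a few dB above the quiet frames
```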

Documentation

See DOCS.md for complete technical documentation including:

  • Mathematical formulations
  • Detailed algorithm explanations
  • Configuration parameter reference
  • Preprocessing pipeline details
  • Balancing mechanisms

Requirements

  • Python 3.8+
  • pydub
  • numpy
  • pandas
  • tqdm
  • pyyaml