TREA 2.0 Pipeline
Audio question-answering dataset generator using ESC-50. Creates four task types: COUNT, DURATION, ORDER, and VOLUME.
Quick Start
# 1. Install dependencies
pip install -r requirements.txt
# 2. Preprocess ESC-50 (required for DURATION task only)
python preprocess_esc50.py --config config.yaml
# 3. Generate datasets
python main.py --config config.yaml
Configuration
Edit config.yaml to set:
- Task duration: task_duration_size (hours) per task
- Clip duration range: min_clip_duration to max_clip_duration (seconds)
- ESC-50 paths: Point to your ESC-50 dataset location
- Enable/disable tasks: Set enabled: true/false for each task
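A quick sanity check on these settings can catch mistakes before a long generation run. The sketch below is illustrative only: it assumes the keys sit at the top level of the loaded config and that per-task switches live under a hypothetical `tasks` mapping. The real layout of config.yaml may differ, so adjust the lookups to match your file.

```python
from typing import Any, Dict, List


def validate_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checked keys look sane.

    Assumed (hypothetical) layout: flat top-level keys plus a `tasks` mapping
    of the form {"count": {"enabled": True}, ...}.
    """
    problems: List[str] = []

    size = cfg.get("task_duration_size")
    if not isinstance(size, (int, float)) or size <= 0:
        problems.append("task_duration_size must be a positive number of hours")

    lo = cfg.get("min_clip_duration")
    hi = cfg.get("max_clip_duration")
    if not all(isinstance(v, (int, float)) for v in (lo, hi)):
        problems.append("min_clip_duration and max_clip_duration must be numbers (seconds)")
    elif not 0 < lo <= hi:
        problems.append("need 0 < min_clip_duration <= max_clip_duration")

    for name, task in cfg.get("tasks", {}).items():
        if not isinstance(task.get("enabled"), bool):
            problems.append(f"tasks.{name}.enabled must be true or false")

    return problems
```

In practice the dictionary would come from `yaml.safe_load(open("config.yaml"))`, and a non-empty result would be printed before exiting.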
Key Files
- config.yaml - All configuration parameters
- main.py - Pipeline entry point (runs all tasks)
- preprocess_esc50.py - Preprocess ESC-50 for duration task
- tasks/task_*.py - Individual task generators
Tasks
| Task | Question | Example |
|---|---|---|
| COUNT | "How many unique sounds?" | Audio with distinct sound types |
| DURATION | "Which sound is longest/shortest?" | Compare sound durations |
| ORDER | "Which sound is first/last/after X?" | Temporal sequence questions |
| VOLUME | "Which sound is loudest/softest?" | Loudness comparison |
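To make the four question types concrete, here is a minimal sketch of how a ground-truth answer could be read off a clip's event timeline. It is not taken from tasks/task_*.py: the `Event` record, its fields, and the helper names are hypothetical, and the actual generators may define duration and loudness differently (see DOCS.md).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    label: str       # sound class name, e.g. "dog" (hypothetical field)
    start: float     # onset within the clip, in seconds
    duration: float  # length of the sound, in seconds
    level_db: float  # loudness of the sound, in dB


def count_answer(events: List[Event]) -> int:
    """COUNT: number of distinct sound classes in the clip."""
    return len({e.label for e in events})


def longest_answer(events: List[Event]) -> str:
    """DURATION: label of the single longest event."""
    return max(events, key=lambda e: e.duration).label


def first_answer(events: List[Event]) -> str:
    """ORDER: label of the event with the earliest onset."""
    return min(events, key=lambda e: e.start).label


def loudest_answer(events: List[Event]) -> str:
    """VOLUME: label of the event with the highest level."""
    return max(events, key=lambda e: e.level_db).label


# A made-up timeline for illustration only.
example = [
    Event("dog", 0.0, 2.0, -12.0),
    Event("rain", 2.0, 5.0, -20.0),
    Event("dog", 7.0, 1.0, -6.0),
    Event("siren", 8.0, 3.0, -9.0),
]
```

Note that a repeated class ("dog" here) counts once for COUNT, which is why the question asks for unique sounds.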
Output Structure
output/{task}/
├── audios/*.wav           # Generated audio files
├── {task}_mcq.csv         # Multiple choice questions
├── {task}_open_text.csv   # Open-ended questions
└── {task}_metadata.csv    # Detailed metadata
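After a run, it can be handy to confirm that each task produced the files listed above. The following sketch builds the expected paths from the layout shown here; the function names are illustrative and not part of the pipeline, and task names are assumed to be lowercase (as in count_open_text.csv).

```python
from pathlib import Path
from typing import Dict, List


def expected_outputs(output_root: str, task: str) -> Dict[str, Path]:
    """Map a short name to each path the layout above describes for one task."""
    task_dir = Path(output_root) / task
    return {
        "audios": task_dir / "audios",
        "mcq": task_dir / f"{task}_mcq.csv",
        "open_text": task_dir / f"{task}_open_text.csv",
        "metadata": task_dir / f"{task}_metadata.csv",
    }


def missing_outputs(output_root: str, task: str) -> List[str]:
    """Return the short names of expected paths that do not exist on disk."""
    return [
        name
        for name, path in expected_outputs(output_root, task).items()
        if not path.exists()
    ]
```

Calling `missing_outputs("output", "count")` on a complete run should return an empty list.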
Shell scripts (quick)
Use the provided shell helpers for simple runs.
Run the full pipeline (uses python main.py under the hood):
# Make executable and run (from pipeline/)
./run_pipeline.sh
# With custom config, tasks, and output
./run_pipeline.sh --config my_config.yaml --tasks count,order --output ./my_dataset
Run LLM answer generation across splits (uses llm_answer_generator.py):
# Processes open_text CSVs across splits/tasks defined in the script
./run_llm_answers_all.sh
# Or run per-file with the helper script directly
python llm_answer_generator.py --input /path/to/count_open_text.csv --mode open_text --task count
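If you prefer to drive the per-file command from Python rather than the shell script, the command lines can be assembled as below. This is a sketch under assumptions: it uses only the flags shown above, and it assumes the open-text CSVs sit at output/{task}/{task}_open_text.csv as in the Output Structure section. `build_commands` is a hypothetical helper, not part of the repository.

```python
import sys
from pathlib import Path
from typing import List


def build_commands(output_root: str, tasks: List[str]) -> List[List[str]]:
    """Build one llm_answer_generator.py argv list per task's open-text CSV."""
    commands = []
    for task in tasks:
        csv_path = Path(output_root) / task / f"{task}_open_text.csv"
        commands.append([
            sys.executable, "llm_answer_generator.py",
            "--input", str(csv_path),
            "--mode", "open_text",
            "--task", task,
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing task stops the loop instead of being silently skipped.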
Advanced Usage
# Run specific tasks only
python main.py --tasks count order
# Use custom config
python main.py --config my_config.yaml
# Custom output directory
python main.py --output /path/to/output
# Preprocess with custom parameters
python preprocess_esc50.py --config config.yaml \
--threshold-strategy noise_floor \
--noise-floor-percentile 2.0 \
--noise-floor-delta-db 5.0
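The noise_floor flags above suggest a threshold derived from the quietest part of each recording. One plausible reading, offered here only as an assumption, is: take a low percentile of the frame-level levels as the noise floor, then add a fixed margin in dB. The sketch below implements that reading in plain Python; DOCS.md remains the authoritative description of what preprocess_esc50.py actually does.

```python
import math
from typing import List


def percentile(values: List[float], pct: float) -> float:
    """Percentile with linear interpolation between the two nearest ranks."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def noise_floor_threshold(frame_db: List[float],
                          pct: float = 2.0,
                          delta_db: float = 5.0) -> float:
    """Assumed rule: threshold = (pct-th percentile of frame levels) + delta_db."""
    return percentile(frame_db, pct) + delta_db


def active_frames(frame_db: List[float], threshold_db: float) -> List[int]:
    """Indices of frames whose level is strictly above the threshold."""
    return [i for i, level in enumerate(frame_db) if level > threshold_db]
```

Under this reading, raising the percentile or the delta makes the detector stricter, so shorter stretches of each clip are treated as sound.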
Documentation
See DOCS.md for complete technical documentation including:
- Mathematical formulations
- Detailed algorithm explanations
- Configuration parameter reference
- Preprocessing pipeline details
- Balancing mechanisms
Requirements
- Python 3.8+
- pydub
- numpy
- pandas
- tqdm
- pyyaml