# YourMT3+ Enhanced Music Transcription
This is an enhanced version of YourMT3+ that adds **instrument conditioning** to address the problem of the transcribed instrument switching mid-track.
## Features
- **Instrument Conditioning**: Choose your target instrument to maintain consistency throughout transcription
- **Multi-track Support**: Transcribe multiple instruments from polyphonic audio
- **Format Options**: Output as MIDI, MusicXML, ABC notation, or audio
- **Free CPU Inference**: Optimized to run on the Hugging Face Spaces free tier (CPU-only, 16GB RAM)
## How to Use
1. **Upload Your Audio**: Drag and drop or select an audio file
2. **Select Target Instrument**: Choose from the dropdown (vocals, piano, guitar, drums, etc.)
3. **Choose Output Format**: MIDI, MusicXML, ABC, or audio
4. **Transcribe**: Click the transcribe button and wait for results
## Instrument Conditioning System
This enhanced version addresses the common issue where YourMT3+ switches instruments mid-track (e.g., vocals → violin → guitar). The system uses:
- **Task Tokens**: Special conditioning tokens, used when the model provides them
- **Post-processing Filtering**: Consistent instrument filtering based on MIDI program numbers
- **Debug Output**: Console logs showing instrument detection and filtering results
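
The post-processing filter can be illustrated with a minimal sketch. This is not the code used in this Space: the `Note` class, the `filter_notes` and `program_histogram` helpers, and the `relabel_others` option are hypothetical names introduced for illustration. The program ranges follow the General MIDI numbering (0-indexed). Vocals are left out because General MIDI has no dedicated singing program, so the program number used for them depends on the model's own vocabulary.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """Hypothetical note event; not the data structure used by YourMT3+."""
    onset: float
    offset: float
    pitch: int
    program: int
    is_drum: bool = False


# General MIDI program numbers, 0-indexed (assumed mapping for this sketch).
PROGRAM_RANGES = {
    "piano": range(0, 8),
    "guitar": range(24, 32),
    "bass": range(32, 40),
    "violin": range(40, 41),
    "trumpet": range(56, 57),
    "saxophone": range(64, 68),
}


def program_histogram(notes):
    """Count pitched notes per program number, useful as debug output."""
    return Counter(n.program for n in notes if not n.is_drum)


def filter_notes(notes, target, relabel_others=False):
    """Keep only notes that belong to the target instrument.

    With relabel_others=True, pitched notes assigned to other programs are
    kept and moved onto the target's first program instead of being dropped.
    """
    if target == "drums":
        return [n for n in notes if n.is_drum]
    programs = PROGRAM_RANGES[target]
    canonical = programs[0]
    kept = []
    for n in notes:
        if n.is_drum:
            continue
        if n.program in programs:
            kept.append(n)
        elif relabel_others:
            kept.append(replace(n, program=canonical))
    return kept


# A track where the model drifted from piano to violin to guitar.
demo = [
    Note(0.0, 0.5, 60, program=0),
    Note(0.5, 1.0, 62, program=40),
    Note(1.0, 1.5, 64, program=24),
    Note(1.5, 2.0, 36, program=0, is_drum=True),
]
print("detected programs:", dict(program_histogram(demo)))
print("piano only:", [n.pitch for n in filter_notes(demo, "piano")])
```

Dropping mislabeled notes keeps the output conservative but can lose notes when the model drifts to another instrument. Relabeling keeps every pitched note at the cost of pulling in notes that really did come from other instruments. Which trade-off suits a given recording is a judgment call.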
## Supported Instruments
- Vocals/Singing
- Piano
- Guitar (Electric/Acoustic)
- Bass
- Drums
- Violin
- Trumpet
- Saxophone
- And many more...
## Technical Details
- **Model**: YourMT3+ (Multi-channel T5 decoder with Perceiver-TF encoder)
- **Framework**: PyTorch Lightning + Gradio
- **Inference**: CPU-only for free tier compatibility
- **Memory**: Optimized for 16GB RAM constraint
## Credits
Based on the original YourMT3 by the MT3 team, enhanced with instrument conditioning capabilities.