amaai-lab/text2midi


keshavbhandari committed on Dec 29, 2024

Commit f996db0 · verified · 1 Parent(s): 5ed0076

Update README.md

Files changed (1)

README.md +19 -5


@@ -70,11 +70,18 @@ generated_midi.dump_midi("output.mid")
 ```
 ## Installation
 ```bash
 git clone https://github.com/AMAAI-Lab/text-2-midi
 cd text-2-midi
 pip install -r requirements.txt
 ```
 ## Datasets
 The MidiCaps dataset is a large-scale dataset of 168k MIDI files paired with rich text captions. These captions contain musical attributes such as key, tempo, style, and mood, making it ideal for text-to-MIDI generation tasks.
@@ -98,12 +105,12 @@ Each question is rated on a Likert scale from 1 (very bad) to 7 (very good). The
 | Metric              | text2midi | MidiCaps | MuseCoco |
 |---------------------|-----------|----------|----------|
-| CR ↑               | 2.14      | 3.43     | 2.12     |
 | CLAP ↑             | 0.22      | 0.26     | 0.21     |
-| TB (%) ↑           | 27.85     | -        | 21.71    |
-| TBT (%) ↑          | 57.78     | -        | 54.63    |
-| CK (%) ↑           | 7.69      | -        | 13.70    |
-| CKD (%) ↑          | 14.80     | -        | 14.59    |
 **Note**:
 CR = Compression ratio
@@ -132,6 +139,12 @@ accelerate launch train.py \
 --epochs=40 \
 ```
 ## Citation
 If you use text2midi in your research, please cite:
 ```
@@ -142,3 +155,4 @@ If you use text2midi in your research, please cite:
     year={2025}
 }
 ```

 ```
 ## Installation
+If you have a CUDA-supported machine:
 ```bash
 git clone https://github.com/AMAAI-Lab/text-2-midi
 cd text-2-midi
 pip install -r requirements.txt
 ```
+Alternatively, if you have an MPS-supported machine:
+```bash
+git clone https://github.com/AMAAI-Lab/text-2-midi
+cd text-2-midi
+pip install -r requirements-mac.txt
+```
 ## Datasets
 The MidiCaps dataset is a large-scale dataset of 168k MIDI files paired with rich text captions. These captions contain musical attributes such as key, tempo, style, and mood, making it ideal for text-to-MIDI generation tasks.
 | Metric              | text2midi | MidiCaps | MuseCoco |
 |---------------------|-----------|----------|----------|
+| CR ↑               | 2.31      | 3.43     | 2.12     |
 | CLAP ↑             | 0.22      | 0.26     | 0.21     |
+| TB (%) ↑           | 39.70     | -        | 21.71    |
+| TBT (%) ↑          | 65.80     | -        | 54.63    |
+| CK (%) ↑           | 33.60     | -        | 13.70    |
+| CKD (%) ↑          | 35.60     | -        | 14.59    |
 **Note**:
 CR = Compression ratio
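
To make the size of the metric changes in this commit easier to read, the short Python sketch below computes the difference between the old and new text2midi values. The numbers are copied from the removed (`-`) and added (`+`) table rows in this diff; the variable names `old`, `new`, and `deltas` are illustrative only and do not come from the repository.

```python
# Old and new text2midi values, copied from the removed (-) and added (+) table rows.
old = {"CR": 2.14, "TB (%)": 27.85, "TBT (%)": 57.78, "CK (%)": 7.69, "CKD (%)": 14.80}
new = {"CR": 2.31, "TB (%)": 39.70, "TBT (%)": 65.80, "CK (%)": 33.60, "CKD (%)": 35.60}

# Absolute change per metric, rounded to two decimals to hide floating-point noise.
deltas = {name: round(new[name] - old[name], 2) for name in old}

for name, delta in deltas.items():
    print(f"{name}: {old[name]:.2f} -> {new[name]:.2f} ({delta:+.2f})")
```

Every updated metric moves upward, and the CLAP row is unchanged by this commit, so it is left out of the comparison.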
 --epochs=40 \
 ```
+## Inference
+We support inference on CUDA, MPS, and CPU. Please make sure you have pip-installed the correct requirements file (requirements.txt for CUDA, requirements-mac.txt for MPS).
+```bash
+python model/transformer_model.py --caption <your intended descriptions>
+```
 ## Citation
 If you use text2midi in your research, please cite:
 ```
     year={2025}
 }
 ```
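
The Inference section added in this commit lists CUDA, MPS, and CPU as supported backends. As a minimal sketch of how a script might choose among them, the Python below prefers CUDA, then MPS, then CPU. The function names `pick_device` and `detect_device` are hypothetical, and whether `model/transformer_model.py` selects its device this way is an assumption; the repository's actual logic may differ. Only `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are real PyTorch calls.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are available; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Calling `detect_device()` returns one of `"cuda"`, `"mps"`, or `"cpu"`, which can then be passed to `torch.device(...)` or `model.to(...)`. Keeping the priority rule in a separate pure function makes it easy to check without any GPU present.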