Rob Jaret committed
Commit 00e21da · 1 Parent(s): 9c2487d

Updated layout, instructions, Read Me

Files changed (2)
  1. README.md +25 -7
  2. app.py +13 -13
README.md CHANGED
@@ -1,26 +1,44 @@
1
  ---
2
- title: Rma2
3
  emoji: 🌖
4
  colorFrom: purple
5
  colorTo: green
6
  sdk: gradio
 
7
  sdk_version: 5.39.0
8
  app_file: app.py
 
9
  pinned: false
10
  ---
11
 
12
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
13
 
14
-
15
  Built using:
16
macOS Sequoia 15.5
17
  Python 3.12
18
 
19
- Some observations:
20
- - If all the parameters can be averaged, the result is usuallly a high pitch squeal or low rumble.
21
 
22
- Outstanding questions for any interested parties:
23
- - Since it doesn't work well when all params are compatible, are there some params that shouldn't be averaged to keep the resulting model functional?
24
  - Would it make logical sense to reshape the parameters that exist in both models but do not have the same shape so they can be averaged?
25
- - Anything else that could make the results sonically more like an average of two models?
26
 
 
1
  ---
2
+ title: Rave Model Averager
3
  emoji: 🌖
4
  colorFrom: purple
5
  colorTo: green
6
  sdk: gradio
7
+ python_version: 3.12
8
  sdk_version: 5.39.0
9
  app_file: app.py
10
+ suggested_hardware: cpu-basic
11
  pinned: false
12
+ short_description: Encode/decode audio through an averaged pair of RAVE models.
13
+ models: [Intelligent-Instruments-Lab/rave-models, shuoyang-zheng/jaspers-rave-models]
14
+ preload_from_hub:
15
+ - Intelligent-Instruments-Lab/rave-models
16
+ - shuoyang-zheng/jaspers-rave-models
17
+ tags: [RAVE, Audio Model Manipulation, Encode, Decode]
18
  ---
19
 
20
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
21
 
 
22
  Built using:
23
macOS Sequoia 15.5
24
  Python 3.12
25
 
26
+ This app attempts to average two RAVE models and then encode and decode an audio file through the original and averaged models.
27
+
28
+ Instructions:
29
+ - Select the two models from the lists of pre-trained RAVE models.
30
+ - Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit' at the bottom of the page.
31
+
32
+ Notes:
33
+ - Generally, the audio encoded/decoded using the averaged model does not sound like an 'average' of the two models. One of the better examples comes from the default settings, where you can hear the sounds from model A (multi-timbral guitar) modulated somewhat by model B (water).
34
+ - The versions encoded/decoded through the individual models give interesting results though.
35
+ - In most cases not all parameters can be averaged: they may not exist in both models, or they may not have the same shape. The data sets in the output show which parameters were and weren't averaged, along with their shapes and any notes. (You can copy them into a spreadsheet by clicking the icon in the top right corner of each.)
36
+ - The averaged model starts as a clone of Model A. Parameters that can't be averaged default to Model A values.
37
+ - If all the parameters can be averaged, the result is usually poor: a high-pitched squeal or a low rumble.
38
+
39
 
40
+ Outstanding questions for interested parties:
41
+ - Since it doesn't work well when all params are compatible, maybe there are some params that shouldn't be averaged?
42
  - Would it make logical sense to reshape the parameters that exist in both models but do not have the same shape so they can be averaged?
43
+ - Any thoughts on what could make the results sonically more like an 'average' of two models?
44
 
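The averaging strategy described in the README notes (clone Model A, blend only parameters that exist in both models with matching shapes, with a bias between the two) can be sketched roughly as follows. `average_state_dicts` is a hypothetical helper for illustration, not the app's actual code:

```python
import torch

def average_state_dicts(state_a, state_b, bias=0.0):
    """Blend two state dicts. bias in [-1, 1]: positive favors Model A,
    negative favors Model B, 0 is an equal average."""
    weight_a = (1 + bias) / 2
    # Start from a clone of Model A, as the notes describe; parameters
    # that can't be averaged keep their Model A values.
    averaged = {k: v.clone() for k, v in state_a.items()}
    for key, param_a in state_a.items():
        param_b = state_b.get(key)
        # Only parameters present in both models with the same shape
        # can be averaged.
        if param_b is not None and param_b.shape == param_a.shape:
            averaged[key] = weight_a * param_a + (1 - weight_a) * param_b
    return averaged
```

The blended dict could then be loaded into a clone of Model A with `model.load_state_dict(averaged)` before encoding/decoding audio through it.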
app.py CHANGED
@@ -26,7 +26,7 @@ available_audio_files=[
26
  "FrenchChildren.wav",
27
  "Organ-ND.wav",
28
  "SpigotsOfChateauLEtoge.wav",
29
- "Gestures-PercStrings.wav",
30
  "SingingBowl-OmniMic.wav",
31
  "BirdCalls.mp3",
32
  ]
@@ -231,16 +231,16 @@ def GenerateRaveEncDecAudio(model_name_a, model_name_b, audio_file_name, audio_f
231
  averaged_audio = (sr_multiplied, audio_outputs[bias].detach().numpy().squeeze())
232
 
233
  df_averaged = pd.DataFrame(messages['keys_averaged']).transpose() #reset_index(names='Param Key')
234
- df_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Errors']
235
 
236
  df_not_averaged = pd.DataFrame(messages["keys_not_averaged"]).transpose()
237
 
238
  # case when all params are averaged
239
  if len(df_not_averaged.columns) == 0:
240
- data = {'Param Name': [], 'Modeal A Shape': [], 'Model B Shape': [], 'Errors': []}
241
  df_not_averaged = pd.DataFrame(data)
242
 
243
- df_not_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Errors']
244
 
245
  messages["stats"] = f"Model A: {model_name_a}\nModel B: {model_name_b}\nAudio file: {os.path.basename(audio_file)}\nSample Rate Multiple for Averaged Version: {sr_multiple}\n\n" + messages["stats"]
246
 
@@ -253,11 +253,11 @@ waveform_options = gr.WaveformOptions(waveform_color="#01C6FF",
253
  waveform_progress_color="#0066B4",
254
  skip_length=2,)
255
 
256
- description = "<p style='line-height: 2'>This app attempts to average two RAVE models and then encode and decode an audio file through the original and averaged models.</p>" \
257
- "<ul><li style='line-height: 1'>Select the two models from the list of pre-trained RAVE models (put credits).</li>" \
258
- "<li style='line-height: 1'>Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit'</li>" \
259
- "<p style='line-height: 1'>Note that in most cases not all parameters can be averaged. They may not exist in both models, or they may not have the same shape. " \
260
- "The data sets in the output show which ones were and weren't averaged, and why they weren't. These can be copied into a spreadsheet to be analyzed.</p>" \
261
  "<!-- <li>Select a sample rate multiple for the averaged model. When there is a useful result, it sometimes sounds better at double the sample rate.</li>" \
262
  "<li>Select a bias towards one of the models. A bias of 0 will average the two models equally. A positive bias will favor Model A, and a negative bias will favor Model B.</li></ul>" \
263
  "-->"
@@ -271,10 +271,10 @@ AverageModels = gr.Interface(title="Process Audio Through the Average of Two Rav
271
  gr.Radio(model_path_config_keys, label="Select Model B", value="Water", container=True),
272
  gr.Dropdown(available_audio_files, label="Select from these audio files or upload your own below:", value="SilverCaneAbbey-Voices.wav",container=True),
273
  gr.Audio(label="Upload an audio file (wav)", type="filepath", sources=["upload", "microphone"], max_length=60,
274
- waveform_options=waveform_options, format='wav'),
 
275
  gr.Radio([.2, .5, .75, 1, 2, 4], label="Sample Rate Multiple (Averaged version only)", value=1, container=True),
276
  gr.Slider(label="Bias towards Model A or B", minimum=-1, maximum=1, value=0, step=0.1, container=True),
277
-
278
  ],
279
  # if no way to pass dictionary, pass separate keys and values and zip them.
280
  outputs=[
@@ -283,8 +283,8 @@ AverageModels = gr.Interface(title="Process Audio Through the Average of Two Rav
283
  gr.Audio(label="Encoded/Decoded through Model B", sources=None, waveform_options=waveform_options,),
284
  gr.Audio(label="Encoded/Decoded through averaged model", sources=None, waveform_options=waveform_options,),
285
  gr.Textbox(label="Info:"),
286
- gr.Dataframe(label="Params Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Errors']),
287
- gr.Dataframe(label="Params Not Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Errors'])
288
  ]
289
  ,fill_width=True
290
  )
 
26
  "FrenchChildren.wav",
27
  "Organ-ND.wav",
28
  "SpigotsOfChateauLEtoge.wav",
29
+ "GesturesPercStrings.wav",
30
  "SingingBowl-OmniMic.wav",
31
  "BirdCalls.mp3",
32
  ]
 
231
  averaged_audio = (sr_multiplied, audio_outputs[bias].detach().numpy().squeeze())
232
 
233
  df_averaged = pd.DataFrame(messages['keys_averaged']).transpose() #reset_index(names='Param Key')
234
+ df_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']
235
 
236
  df_not_averaged = pd.DataFrame(messages["keys_not_averaged"]).transpose()
237
 
238
  # case when all params are averaged
239
  if len(df_not_averaged.columns) == 0:
240
+ data = {'Param Name': [], 'Model A Shape': [], 'Model B Shape': [], 'Notes': []}
241
  df_not_averaged = pd.DataFrame(data)
242
 
243
+ df_not_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']
244
 
245
  messages["stats"] = f"Model A: {model_name_a}\nModel B: {model_name_b}\nAudio file: {os.path.basename(audio_file)}\nSample Rate Multiple for Averaged Version: {sr_multiple}\n\n" + messages["stats"]
246
 
 
253
  waveform_progress_color="#0066B4",
254
  skip_length=2,)
255
 
256
+ description = "<p style='line-height: 1'>This app attempts to average two RAVE models and then encode and decode an audio file through the original and averaged models.</p>" \
257
+ "<p style='line-height: 1; margin-bottom: 0px'>Instructions:</p><ul style='padding-bottom: 0px'><li style='line-height: 1; padding-top: 5px'>Select the two models from the lists of pre-trained RAVE models.</li>" \
258
+ "<li style='line-height: 1; padding-top: 0px'>Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit' at the bottom of the page.</li></ul>"\
259
+ "<p style='line-height: 1.2; padding-top: 0px; margin-top: 3px;'>Note that in most cases not all parameters can be averaged. They may not exist in both models or the two values may not have the same shape. The data sets in the output list which ones were and weren't averaged with their shapes and any notes. (You can copy them into a spreadsheet by clicking the icon at the top right corner of each.)</p>" \
260
+ \
261
  "<!-- <li>Select a sample rate multiple for the averaged model. When there is a useful result, it sometimes sounds better at double the sample rate.</li>" \
262
  "<li>Select a bias towards one of the models. A bias of 0 will average the two models equally. A positive bias will favor Model A, and a negative bias will favor Model B.</li></ul>" \
263
  "-->"
 
271
  gr.Radio(model_path_config_keys, label="Select Model B", value="Water", container=True),
272
  gr.Dropdown(available_audio_files, label="Select from these audio files or upload your own below:", value="SilverCaneAbbey-Voices.wav",container=True),
273
  gr.Audio(label="Upload an audio file (wav)", type="filepath", sources=["upload", "microphone"], max_length=60,
274
+ waveform_options=waveform_options, format='wav'),],
275
+ additional_inputs=[
276
  gr.Radio([.2, .5, .75, 1, 2, 4], label="Sample Rate Multiple (Averaged version only)", value=1, container=True),
277
  gr.Slider(label="Bias towards Model A or B", minimum=-1, maximum=1, value=0, step=0.1, container=True),
 
278
  ],
279
  # if no way to pass dictionary, pass separate keys and values and zip them.
280
  outputs=[
 
283
  gr.Audio(label="Encoded/Decoded through Model B", sources=None, waveform_options=waveform_options,),
284
  gr.Audio(label="Encoded/Decoded through averaged model", sources=None, waveform_options=waveform_options,),
285
  gr.Textbox(label="Info:"),
286
+ gr.Dataframe(label="Params Averaged", show_copy_button=True, scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']),
287
+ gr.Dataframe(label="Params Not Averaged", show_copy_button=True, scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes'])
288
  ]
289
  ,fill_width=True
290
  )
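The table-building step in the diff above (including the empty-table fallback when every parameter averages) can be exercised in isolation. This is a simplified sketch with a made-up `messages` payload; the real keys and row contents come from the averaging step:

```python
import pandas as pd

COLUMNS = ['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']

def build_param_tables(messages):
    # Rows arrive keyed by parameter name; transpose so each key is a row.
    df_averaged = pd.DataFrame(messages['keys_averaged']).transpose()
    df_averaged.columns = COLUMNS
    df_not_averaged = pd.DataFrame(messages['keys_not_averaged']).transpose()
    if len(df_not_averaged.columns) == 0:
        # Case when all params were averaged: emit an empty table
        # that still carries the expected headers.
        df_not_averaged = pd.DataFrame({c: [] for c in COLUMNS})
    df_not_averaged.columns = COLUMNS
    return df_averaged, df_not_averaged
```

Both frames then render as `gr.Dataframe` outputs with matching `headers`, so the copy-to-spreadsheet view stays aligned even when one table is empty.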