Rob Jaret committed · Commit 00e21da · Parent(s): 9c2487d
Updated layout, instructions, Read Me
README.md
CHANGED
@@ -1,26 +1,44 @@
 ---
-title:
+title: Rave Model Averager
 emoji: 🌖
 colorFrom: purple
 colorTo: green
 sdk: gradio
+python_version: 3.12
 sdk_version: 5.39.0
 app_file: app.py
+suggested_hardware: cpu-basic
 pinned: false
+short_description: Attempts to encode and then decode audio through an averaged version of two selected RAVE models.
+models: Intelligent-Instruments-Lab/rave-models, shuoyang-zheng/jaspers-rave-models
+preload_from_hub:
+- Intelligent-Instruments-Lab/rave-models
+- shuoyang-zheng/jaspers-rave-models
+tags: RAVE, Audio Model Manipulation, Encode, Decode
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 Built using:
 Mac OS Sequoia 15.5
 Python 3.12
 
+This app attempts to average two RAVE models and then encode and decode an audio file through the original and averaged models.
+
+Instructions:
+- Select the two models from the lists of pre-trained RAVE models.
+- Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit' at the bottom of the page.
+
+Notes:
+- Generally, the audio encoded/decoded using the averaged model does not sound like an 'average' of the two models. One of the better examples comes from the default settings, where you can hear the sounds from model A (multi-timbral guitar) modulated somewhat by model B (water).
+- The versions encoded/decoded through the individual models do give interesting results, though.
+- In most cases not all parameters can be averaged: they may not exist in both models, or they may not have the same shape. The data sets in the output list which parameters were and weren't averaged, along with their shapes and any notes. (You can copy them into a spreadsheet by clicking the icon in the top right corner of each.)
+- The averaged model starts as a clone of Model A. Parameters that can't be averaged default to Model A's values.
+- If all the parameters can be averaged, the result is usually not good: a high-pitched squeal or a low rumble.
+
-Outstanding questions for
+Outstanding questions for interested parties:
-- Since it doesn't work well when all params are compatible,
+- Since it doesn't work well when all params are compatible, maybe there are some params that shouldn't be averaged?
 - Would it make logical sense to reshape the parameters that exist in both models but do not have the same shape so they can be averaged?
-- Anything
+- Any thoughts on what could make the results sonically more like an 'average' of two models?
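The averaging rules described in the notes above (result starts as a clone of Model A; only parameters present in both models with identical shapes are blended; the bias slider weights the mix toward A or B) can be sketched roughly as follows. This is a hypothetical illustration using NumPy-array state dicts and made-up names (`average_state_dicts`, `sd_a`, `sd_b`), not the Space's actual implementation, which presumably operates on PyTorch `state_dict`s.

```python
# Hypothetical sketch of the README's averaging rules; not the app's code.
import numpy as np

def average_state_dicts(sd_a, sd_b, bias=0.0):
    """Blend two parameter dicts.

    bias = 0 mixes the models equally; bias = +1 keeps only Model A;
    bias = -1 keeps only Model B (matching the README's bias slider).
    Returns (averaged_dict, averaged_keys, skipped_keys).
    """
    w_a = (1.0 + bias) / 2.0                       # weight on Model A
    averaged = {k: v.copy() for k, v in sd_a.items()}  # start as a clone of A
    averaged_keys, skipped_keys = [], []
    for k, v_a in sd_a.items():
        v_b = sd_b.get(k)
        if v_b is not None and v_b.shape == v_a.shape:
            averaged[k] = w_a * v_a + (1.0 - w_a) * v_b
            averaged_keys.append(k)
        else:
            # Missing in B or shape mismatch: keep Model A's value.
            skipped_keys.append(k)
    return averaged, averaged_keys, skipped_keys
```

Parameters that exist only in Model B are simply ignored here, which is one way to read "the averaged model starts as a clone of Model A".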
app.py
CHANGED
@@ -26,7 +26,7 @@ available_audio_files=[
 "FrenchChildren.wav",
 "Organ-ND.wav",
 "SpigotsOfChateauLEtoge.wav",
-"
+"GesturesPercStrings.wav",
 "SingingBowl-OmniMic.wav",
 "BirdCalls.mp3",
 ]
@@ -231,16 +231,16 @@ def GenerateRaveEncDecAudio(model_name_a, model_name_b, audio_file_name, audio_f
 averaged_audio = (sr_multiplied, audio_outputs[bias].detach().numpy().squeeze())
 
 df_averaged = pd.DataFrame(messages['keys_averaged']).transpose() #reset_index(names='Param Key')
-df_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', '
+df_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']
 
 df_not_averaged = pd.DataFrame(messages["keys_not_averaged"]).transpose()
 
 # case when all params are averaged
 if len(df_not_averaged.columns) == 0:
-data = {'Param Name': [], 'Modeal A Shape': [], 'Model B Shape': [], '
+data = {'Param Name': [], 'Model A Shape': [], 'Model B Shape': [], 'Notes': []}
 df_not_averaged = pd.DataFrame(data)
 
-df_not_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', '
+df_not_averaged.columns=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']
 
 messages["stats"] = f"Model A: {model_name_a}\nModel B: {model_name_b}\nAudio file: {os.path.basename(audio_file)}\nSample Rate Multiple for Averaged Version: {sr_multiple}\n\n" + messages["stats"]
 
@@ -253,11 +253,11 @@ waveform_options = gr.WaveformOptions(waveform_color="#01C6FF",
 waveform_progress_color="#0066B4",
 skip_length=2,)
 
-description = "<p style='line-height:
+description = "<p style='line-height: 1'>This app attempts to average two RAVE models and then encode and decode an audio file through the original and averaged models.</p>" \
-"<ul
+"<ul style='padding-bottom: 0px'>Instructions:<li style='line-height: 1; padding-top: 5px'>Select the two models from the lists of pre-trained RAVE models.</li>" \
-"<li style='line-height: 1'>Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit'
+"<li style='line-height: 1; padding-top: 0px'>Select an audio file from the ones available in the dropdown, or upload an audio file of up to 60 seconds. Click 'Submit' at the bottom of the page.</li></ul>"\
-
+"<p style='line-height: 1.2; padding-top: 0px; margin-top: 3px;'>Note that in most cases not all parameters can be averaged. They may not exist in both models or the two values may not have the same shape. The data sets in the output list which ones were and weren't averaged with their shapes and any notes. (You can copy them into a spreadsheet by clicking the icon at the top right corner of each.)</p>"
+
 "<!-- <li>Select a sample rate multiple for the averaged model. When there is a useful result, it sometimes sounds better at double the sample rate.</li>" \
 "<li>Select a bias towards one of the models. A bias of 0 will average the two models equally. A positive bias will favor Model A, and a negative bias will favor Model B.</li></ul>" \
 "-->"
@@ -271,10 +271,10 @@ AverageModels = gr.Interface(title="Process Audio Through the Average of Two Rav
 gr.Radio(model_path_config_keys, label="Select Model B", value="Water", container=True),
 gr.Dropdown(available_audio_files, label="Select from these audio files or upload your own below:", value="SilverCaneAbbey-Voices.wav",container=True),
 gr.Audio(label="Upload an audio file (wav)", type="filepath", sources=["upload", "microphone"], max_length=60,
-waveform_options=waveform_options, format='wav'),
+waveform_options=waveform_options, format='wav'),],
+additional_inputs=[
 gr.Radio([.2, .5, .75, 1, 2, 4], label="Sample Rate Multiple (Averaged version only)", value=1, container=True),
 gr.Slider(label="Bias towards Model A or B", minimum=-1, maximum=1, value=0, step=0.1, container=True),
-
 ],
 # if no way to pass dictionary, pass separate keys and values and zip them.
 outputs=[
@@ -283,8 +283,8 @@ AverageModels = gr.Interface(title="Process Audio Through the Average of Two Rav
 gr.Audio(label="Encoded/Decoded through Model B", sources=None, waveform_options=waveform_options,),
 gr.Audio(label="Encoded/Decoded through averaged model", sources=None, waveform_options=waveform_options,),
 gr.Textbox(label="Info:"),
-gr.Dataframe(label="Params Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', '
+gr.Dataframe(label="Params Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes']),
-gr.Dataframe(label="Params Not Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', '
+gr.Dataframe(label="Params Not Averaged", show_copy_button="True", scale=100, column_widths=column_widths, headers=['Param Name', 'Model A Shape', 'Model B Shape', 'Notes'])
 ]
 ,fill_width=True
 )
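The report-building pattern in the `@@ -231,16 +231,16 @@` hunk of `GenerateRaveEncDecAudio` (a dict of per-parameter records, turned into a DataFrame, transposed, then given the four headers) can be illustrated in isolation. The `keys_averaged` contents below are invented for the example; only the transpose-then-rename shape of the code mirrors the diff.

```python
# Hypothetical standalone illustration of the DataFrame pattern in the diff.
import pandas as pd

# Invented per-parameter records, keyed by parameter name; each value is
# [name, shape in Model A, shape in Model B, note].
keys_averaged = {
    "encoder.conv1.weight": ["encoder.conv1.weight", "(64, 1, 3)", "(64, 1, 3)", "averaged"],
    "decoder.out.bias":     ["decoder.out.bias", "(16,)", "(16,)", "averaged"],
}

# DataFrame(dict) makes one column per key; transpose() turns each record
# into a row, after which the four display headers are assigned.
df_averaged = pd.DataFrame(keys_averaged).transpose()
df_averaged.columns = ["Param Name", "Model A Shape", "Model B Shape", "Notes"]

# Empty-report case, mirroring the "all params are averaged" branch:
# an empty frame is built with the same headers so the UI table renders.
data = {"Param Name": [], "Model A Shape": [], "Model B Shape": [], "Notes": []}
df_not_averaged = pd.DataFrame(data)
```

Assigning `df.columns` only succeeds when the label count matches the column count, which is why the empty-report branch constructs the frame with all four headers up front.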