Space: jacob-c/largermodel_lyrics_generation

root committed on Jun 1

Commit 19c0923 (1 parent: 14555be)

css

Files changed (2)

README.md +94 -25
app.py +108 -40

README.md CHANGED

@@ -1,5 +1,5 @@
 ---
-title: Music Genre Classifier & Lyrics Generator
 emoji: 🎵
 colorFrom: indigo
 colorTo: purple
@@ -8,40 +8,109 @@ sdk_version: 5.22.0
 app_file: app.py
 pinned: false
 license: mit
-short_description: AI music genre detection and lyrics generation
 ---
-# Music Genre Classifier & Lyrics Generator
-This Hugging Face Space application provides two AI-powered features:
-1. **Music Genre Classification**: Upload a music file and get an analysis of its genre using the [dima806/music_genres_classification](https://huggingface.co/dima806/music_genres_classification) model.
-2. **Lyrics Generation**: Based on the detected genre, the app generates original lyrics using [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) that match both the style of the genre and approximate length of the song.
-## Features
-- Upload any music file for instant genre classification
-- Receive genre predictions with confidence scores
-- Get AI-generated lyrics tailored to the detected music genre
-- Lyrics length is automatically adjusted based on the song duration
-- Simple and intuitive user interface
-## Usage
-1. Visit the live application on Hugging Face Spaces
-2. Upload your music file using the provided interface
-3. Click "Analyze & Generate" to process the audio
-4. View the detected genre and generated lyrics in the output panels
-## Technical Details
-- Uses MFCC features extraction from audio for genre classification
-- Leverages 4-bit quantization for efficient LLM inference on T4 GPU
-- Implements a specialized prompt engineering approach to generate genre-specific lyrics
-- Automatically scales lyrics length based on audio duration
-## Links
-- [Music Genre Classification Model](https://huggingface.co/dima806/music_genres_classification)
-- [Qwen QwQ-32B Model](https://huggingface.co/Qwen/QwQ-32B)

 ---
+title: Advanced Music Analysis & Beat-Matched Lyrics Generator
 emoji: 🎵
 colorFrom: indigo
 colorTo: purple
 app_file: app.py
 pinned: false
 license: mit
+short_description: AI-powered music analysis with beat-synchronized lyrics generation
 ---
+# Advanced Music Analysis & Beat-Matched Lyrics Generator
+This comprehensive AI-powered application provides advanced music analysis and generates perfectly synchronized lyrics that match the musical structure, rhythm, and emotional content of your audio files.
+## 🎯 Key Features
+### 🎼 **Comprehensive Music Analysis**
+- **Genre Classification**: Automatic detection using [dima806/music_genres_classification](https://huggingface.co/dima806/music_genres_classification)
+- **Tempo & Time Signature Detection**: Advanced multi-method analysis (4/4, 3/4, 6/8)
+- **Emotional Analysis**: 8-dimensional emotion detection (happy, sad, excited, calm, etc.)
+- **Thematic Analysis**: Identifies musical themes (love, triumph, loss, adventure, etc.)
+- **Tonal Analysis**: Key detection, mode analysis (major/minor), harmonic complexity
+- **Beat Pattern Analysis**: Precise beat tracking and stress pattern identification
+### 🎤 **Beat-Synchronized Lyrics Generation**
+- **Rhythm-Matched Lyrics**: Each line perfectly aligns with musical phrases and beat patterns
+- **Syllable-to-Beat Mapping**: Precise syllable counting and stress pattern matching
+- **Custom Requirements Integration**: Add your own creative directions and themes
+- **Genre-Specific Optimization**: Tailored for Pop, Rock, Country, Disco, and Metal
+- **Flow Analysis**: Ensures natural sentence flow across multiple lines
+- **Quality Metrics**: Detailed beat matching and syllable accuracy analysis
+### 🎨 **Personalization Features**
+- **Custom Prompt Input**: Specify themes, imagery, perspective, style, or content requirements
+- **Intelligent Blending**: Merges your requirements with detected musical characteristics
+- **Flexible Creative Control**: From simple themes to complex narrative directions
+## 🚀 How It Works
+1. **Upload Audio**: Upload a file in a common audio format, or record directly in the browser
+2. **Add Custom Requirements** (Optional): Specify your creative vision
+3. **Advanced Analysis**: Multi-layered analysis of musical characteristics:
+   - Rhythm and tempo analysis
+   - Time signature detection using autocorrelation, pattern matching, and spectral analysis
+   - Emotional profiling using valence-arousal mapping
+   - Thematic classification based on musical features
+   - Beat pattern extraction and stress analysis
+4. **Lyrics Generation**: AI creates lyrics using [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) that:
+   - Match the detected beat patterns and time signature
+   - Incorporate detected emotions and themes
+   - Follow your custom creative requirements
+   - Maintain proper syllable-to-beat ratios for the genre
+5. **Quality Analysis**: Comprehensive beat matching analysis with accuracy metrics
+## 🎵 Supported Genres for Lyrics Generation
+**Full Support** (Analysis + Beat-Matched Lyrics):
+- **Pop**: Optimized syllable patterns and emotional expression
+- **Rock**: Energetic phrasing with strong beat emphasis
+- **Country**: Narrative flow with authentic storytelling patterns
+- **Disco**: Rhythmic momentum with dance-friendly phrasing
+- **Metal**: Intense expression with dramatic beat alignment
+**Analysis Only**: All other genres receive comprehensive musical analysis without lyrics generation.
+## 🛠️ Technical Features
+### Advanced Analysis Algorithms
+- **Multi-Method Time Signature Detection**: Combines autocorrelation, pattern matching, spectral analysis, note density analysis, and tempo-based estimation
+- **Emotion Mapping**: 8-dimensional emotion space with valence-arousal coordinates
+- **Beat Strength Analysis**: Onset detection with energy and spectral flux analysis
+- **Syllable Stress Matching**: CMU Dictionary integration with rule-based fallback
+### AI-Powered Generation
+- **4-bit Quantization**: Efficient inference on T4 GPU using BitsAndBytesConfig
+- **Specialized Prompting**: Genre-aware prompt engineering for optimal results
+- **Quality Enforcement**: Automatic syllable limit enforcement and line count validation
+- **Flow Optimization**: Sentence continuation analysis for natural lyrical flow
+## 📊 Analysis Outputs
+### Musical Analysis
+- Tempo (BPM) and time signature with confidence scores
+- Primary and secondary emotions with confidence percentages
+- Musical themes and their relevance scores
+- Key signature and mode detection
+- Beat pattern visualization
+### Lyrics Quality Metrics
+- Syllable-to-beat match accuracy
+- Stress pattern alignment scores
+- Sentence flow quality assessment
+- Genre-appropriate range compliance
+- Overall rhythmic accuracy percentage
+## 🎯 Custom Requirements Examples
+**Themes**: "Write about a journey through mountains", "Focus on urban nightlife"
+**Imagery**: "Use ocean metaphors", "Include references to light and shadow"
+**Perspective**: "From a child's viewpoint", "Nostalgic memories", "Future aspirations"
+**Style**: "Conversational tone", "Include internal rhymes", "Simple everyday language"
+**Content**: "Avoid melancholy", "Include words 'freedom' and 'horizon'", "Focus on resilience"
+## 🔗 Model Credits
+- **Genre Classification**: [dima806/music_genres_classification](https://huggingface.co/dima806/music_genres_classification)
+- **Lyrics Generation**: [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) with 4-bit quantization
+- **Audio Processing**: librosa, scipy, numpy for advanced signal processing
+- **Linguistic Analysis**: NLTK CMU Dictionary for syllable counting and stress analysis
+## 🎪 Try It Now
+Experience the future of AI-powered music analysis and lyrics generation. Upload your music and watch as the system creates perfectly synchronized, emotionally resonant lyrics tailored to your creative vision!
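
Review note: the README mentions "CMU Dictionary integration with rule-based fallback" and syllable limit enforcement, but this diff does not show that code. The following is a minimal sketch of how such a lookup-then-fallback counter might work, not the Space's actual implementation. `SAMPLE_PRONUNCIATIONS`, `count_syllables`, `line_syllables`, and `within_range` are hypothetical names, and the two-entry dictionary is a stand-in for a real CMU-style pronouncing dictionary.

```python
import re

# Hypothetical stand-in for a CMU-style pronouncing dictionary.
# Vowel phonemes end in a stress digit (0, 1, 2), so counting those
# phonemes gives the syllable count.
SAMPLE_PRONUNCIATIONS = {
    "freedom": [["F", "R", "IY1", "D", "AH0", "M"]],
    "horizon": [["HH", "ER0", "AY1", "Z", "AH0", "N"]],
}


def count_syllables(word, pronunciations=SAMPLE_PRONUNCIATIONS):
    """Count syllables via dictionary lookup, else a vowel-group heuristic."""
    key = word.lower().strip(".,!?;:'\"")
    if key in pronunciations:
        phones = pronunciations[key][0]  # use the first listed pronunciation
        return sum(1 for p in phones if p[-1].isdigit())
    # Rule-based fallback (an assumption, not the app's rules):
    # each run of vowels counts as one syllable.
    count = len(re.findall(r"[aeiouy]+", key))
    # Drop a trailing silent "e" ("time"), but keep "-le" endings ("little").
    if key.endswith("e") and not key.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def line_syllables(line):
    """Sum syllables over the words of a lyric line, skipping bare punctuation."""
    words = [w for w in line.split() if any(c.isalpha() for c in w)]
    return sum(count_syllables(w) for w in words)


def within_range(line, min_syl=2, max_syl=7):
    """Check a line against a syllable range like the 2-7 default in the prompt."""
    return min_syl <= line_syllables(line) <= max_syl
```

The heuristic is deliberately crude and will miscount many English words; it only illustrates why a dictionary lookup comes first and the rules act as a fallback.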

app.py CHANGED Viewed

@@ -87,7 +87,7 @@ except Exception as e:
 music_analyzer = MusicAnalyzer()
 # Process uploaded audio file
-def process_audio(audio_file):
     if audio_file is None:
         return "No audio file provided", None, None, None, None, None, None, None, None, None
@@ -200,7 +200,7 @@ def process_audio(audio_file):
         # Generate lyrics only for supported genres
         if genre_supported:
-            lyrics = generate_lyrics(music_analysis, primary_genre, duration)
             beat_match_analysis = analyze_lyrics_rhythm_match(lyrics, lyric_templates, primary_genre)
         else:
             supported_genres_str = ", ".join([genre.capitalize() for genre in beat_analyzer.supported_genres])
@@ -214,7 +214,7 @@ def process_audio(audio_file):
         print(error_msg)
         return error_msg, None, None, None, None, None, None, None, None, None
-def generate_lyrics(music_analysis, genre, duration):
     try:
         # Extract meaningful information for context
         tempo = music_analysis["rhythm_analysis"]["tempo"]
@@ -242,7 +242,8 @@ def generate_lyrics(music_analysis, genre, duration):
             min_syl_for_prompt = 2
             max_syl_for_prompt = 7
-            prompt = (f'''You are a professional songwriter. Write song lyrics for a {genre} song.
 SONG DETAILS:
 - Key: {key} {mode}
@@ -250,7 +251,18 @@ SONG DETAILS:
 - Primary emotion: {primary_emotion}
 - Secondary emotion: {secondary_emotion}
 - Primary theme: {primary_theme}
-- Secondary theme: {secondary_theme}
 CRITICAL REQUIREMENTS (MOST IMPORTANT):
 - You MUST write EXACTLY {num_phrases_for_prompt} lines of lyrics.
@@ -288,14 +300,15 @@ Under the "LYRICS:" heading, provide exactly {num_phrases_for_prompt} numbered l
 LYRICS:
 (Your {num_phrases_for_prompt} numbered lyric lines go here, each starting with its number, a period, and a space)
-Remember: Output EXACTLY {num_phrases_for_prompt} numbered lyric lines. Each line's content (after removing the number) must be {min_syl_for_prompt}-{max_syl_for_prompt} syllables.''')
         else:
             # Calculate the typical syllable range for this genre
             num_phrases_for_prompt = len(lyric_templates)
             max_syl_for_prompt = max([t.get('max_expected', 7) for t in lyric_templates]) if lyric_templates and lyric_templates[0].get('max_expected') else 7
             min_syl_for_prompt = min([t.get('min_expected', 2) for t in lyric_templates]) if lyric_templates and lyric_templates[0].get('min_expected') else 2
-            prompt = (f'''You are a professional songwriter. Write song lyrics for a {genre} song.
 SONG DETAILS:
 - Key: {key} {mode}
@@ -303,7 +316,18 @@ SONG DETAILS:
 - Primary emotion: {primary_emotion}
 - Secondary emotion: {secondary_emotion}
 - Primary theme: {primary_theme}
-- Secondary theme: {secondary_theme}
 CRITICAL REQUIREMENTS (MOST IMPORTANT):
 - You MUST write EXACTLY {num_phrases_for_prompt} lines of lyrics.
@@ -341,7 +365,7 @@ Under the "LYRICS:" heading, provide exactly {num_phrases_for_prompt} numbered l
 LYRICS:
 (Your {num_phrases_for_prompt} numbered lyric lines go here, each starting with its number, a period, and a space)
-Remember: Output EXACTLY {num_phrases_for_prompt} numbered lyric lines. Each line's content (after removing the number) must be {min_syl_for_prompt}-{max_syl_for_prompt} syllables.''')
         # Generate with optimized parameters for QwQ model
         messages = [
             {"role": "user", "content": prompt}
@@ -832,46 +856,55 @@ def enforce_syllable_limits(lines, max_syllables=6):
 # Create Gradio interface
 def create_interface():
-    with gr.Blocks(title="Music Analysis & Lyrics Generator") as demo:
-        gr.Markdown("# Music Analysis & Lyrics Generator")
-        gr.Markdown("Upload a music file or record audio to analyze it and generate matching lyrics")
         with gr.Row():
             with gr.Column(scale=1):
                 audio_input = gr.Audio(
-                    label="Upload or Record Audio",
                     type="filepath",
                     sources=["upload", "microphone"]
                 )
-                analyze_btn = gr.Button("Analyze and Generate Lyrics", variant="primary")
             with gr.Column(scale=2):
-                with gr.Tab("Analysis"):
-                    analysis_output = gr.Textbox(label="Music Analysis Results", lines=10)
                     with gr.Row():
-                        tempo_output = gr.Number(label="Tempo (BPM)")
-                        time_sig_output = gr.Textbox(label="Time Signature")
                     with gr.Row():
-                        primary_emotion_output = gr.Textbox(label="Primary Emotion")
-                        secondary_emotion_output = gr.Textbox(label="Secondary Emotion")
                     with gr.Row():
-                        primary_theme_output = gr.Textbox(label="Primary Theme")
-                        secondary_theme_output = gr.Textbox(label="Secondary Theme")
-                        genre_output = gr.Textbox(label="Primary Genre")
-                with gr.Tab("Generated Lyrics"):
-                    lyrics_output = gr.Textbox(label="Generated Lyrics", lines=20)
-                with gr.Tab("Beat Matching"):
-                    beat_match_output = gr.Markdown(label="Beat & Syllable Matching Analysis")
         # Set up event handlers
         analyze_btn.click(
             fn=process_audio,
-            inputs=[audio_input],
             outputs=[
                 analysis_output, lyrics_output, tempo_output, time_sig_output,
                 primary_emotion_output, secondary_emotion_output,
@@ -881,22 +914,57 @@ def create_interface():
         )
         # Format supported genres for display
-        supported_genres_md = "\n".join([f"- {genre.capitalize()}" for genre in beat_analyzer.supported_genres])
         gr.Markdown(f"""
-        ## How it works
-        1. Upload or record a music file
-        2. The system analyzes tempo, beats, time signature and other musical features
-        3. It detects emotions, themes, and music genre
-        4. Using beat patterns and syllable stress analysis, it generates perfectly aligned lyrics
-        5. Each line of the lyrics is matched to the beat pattern of the corresponding musical phrase
-        ## Supported Genres
-        **Note:** Lyrics generation is currently only supported for the following genres:
         {supported_genres_md}
-        These genres have consistent syllable-to-beat patterns that work well with our algorithm.
-        For other genres, only music analysis will be provided.
         """)
     return demo

 music_analyzer = MusicAnalyzer()
 # Process uploaded audio file
+def process_audio(audio_file, custom_prompt=""):
     if audio_file is None:
         return "No audio file provided", None, None, None, None, None, None, None, None, None
         # Generate lyrics only for supported genres
         if genre_supported:
+            lyrics = generate_lyrics(music_analysis, primary_genre, duration, custom_prompt)
             beat_match_analysis = analyze_lyrics_rhythm_match(lyrics, lyric_templates, primary_genre)
         else:
             supported_genres_str = ", ".join([genre.capitalize() for genre in beat_analyzer.supported_genres])
         print(error_msg)
         return error_msg, None, None, None, None, None, None, None, None, None
+def generate_lyrics(music_analysis, genre, duration, custom_prompt=""):
     try:
         # Extract meaningful information for context
         tempo = music_analysis["rhythm_analysis"]["tempo"]
             min_syl_for_prompt = 2
             max_syl_for_prompt = 7
+            # Build the base prompt
+            base_prompt = f'''You are a professional songwriter. Write song lyrics for a {genre} song.
 SONG DETAILS:
 - Key: {key} {mode}
 - Primary emotion: {primary_emotion}
 - Secondary emotion: {secondary_emotion}
 - Primary theme: {primary_theme}
+- Secondary theme: {secondary_theme}'''
+            # Add custom requirements if provided
+            custom_requirements = ""
+            if custom_prompt and custom_prompt.strip():
+                custom_requirements = f'''
+SPECIAL REQUIREMENTS FROM USER:
+{custom_prompt.strip()}
+Please incorporate these requirements while still following all the technical constraints below.'''
+            prompt = base_prompt + custom_requirements + f'''
 CRITICAL REQUIREMENTS (MOST IMPORTANT):
 - You MUST write EXACTLY {num_phrases_for_prompt} lines of lyrics.
 LYRICS:
 (Your {num_phrases_for_prompt} numbered lyric lines go here, each starting with its number, a period, and a space)
+Remember: Output EXACTLY {num_phrases_for_prompt} numbered lyric lines. Each line's content (after removing the number) must be {min_syl_for_prompt}-{max_syl_for_prompt} syllables.'''
         else:
             # Calculate the typical syllable range for this genre
             num_phrases_for_prompt = len(lyric_templates)
             max_syl_for_prompt = max([t.get('max_expected', 7) for t in lyric_templates]) if lyric_templates and lyric_templates[0].get('max_expected') else 7
             min_syl_for_prompt = min([t.get('min_expected', 2) for t in lyric_templates]) if lyric_templates and lyric_templates[0].get('min_expected') else 2
+            # Build the base prompt
+            base_prompt = f'''You are a professional songwriter. Write song lyrics for a {genre} song.
 SONG DETAILS:
 - Key: {key} {mode}
 - Primary emotion: {primary_emotion}
 - Secondary emotion: {secondary_emotion}
 - Primary theme: {primary_theme}
+- Secondary theme: {secondary_theme}'''
+            # Add custom requirements if provided
+            custom_requirements = ""
+            if custom_prompt and custom_prompt.strip():
+                custom_requirements = f'''
+SPECIAL REQUIREMENTS FROM USER:
+{custom_prompt.strip()}
+Please incorporate these requirements while still following all the technical constraints below.'''
+            prompt = base_prompt + custom_requirements + f'''
 CRITICAL REQUIREMENTS (MOST IMPORTANT):
 - You MUST write EXACTLY {num_phrases_for_prompt} lines of lyrics.
 LYRICS:
 (Your {num_phrases_for_prompt} numbered lyric lines go here, each starting with its number, a period, and a space)
+Remember: Output EXACTLY {num_phrases_for_prompt} numbered lyric lines. Each line's content (after removing the number) must be {min_syl_for_prompt}-{max_syl_for_prompt} syllables.'''
         # Generate with optimized parameters for QwQ model
         messages = [
             {"role": "user", "content": prompt}
 # Create Gradio interface
 def create_interface():
+    with gr.Blocks(title="Advanced Music Analysis & Beat-Matched Lyrics Generator") as demo:
+        gr.Markdown("# 🎵 Advanced Music Analysis & Beat-Matched Lyrics Generator")
+        gr.Markdown("**Upload music to get comprehensive analysis and generate perfectly synchronized lyrics that match the rhythm, emotion, and structure of your audio**")
         with gr.Row():
             with gr.Column(scale=1):
                 audio_input = gr.Audio(
+                    label="🎧 Upload or Record Audio",
                     type="filepath",
                     sources=["upload", "microphone"]
                 )
+                # Add custom prompt input
+                custom_prompt_input = gr.Textbox(
+                    label="🎨 Custom Lyrics Requirements (Optional)",
+                    placeholder="e.g., 'Write about a rainy day in the city' or 'Include metaphors about flying' or 'Make it about overcoming challenges'",
+                    lines=3,
+                    info="Add any specific requirements, themes, or creative directions for the lyrics. This will be merged with the music analysis to create personalized lyrics."
+                )
+                analyze_btn = gr.Button("🚀 Analyze Music & Generate Lyrics", variant="primary", size="lg")
             with gr.Column(scale=2):
+                with gr.Tab("📊 Music Analysis"):
+                    analysis_output = gr.Textbox(label="Comprehensive Music Analysis Results", lines=10)
                     with gr.Row():
+                        tempo_output = gr.Number(label="🥁 Tempo (BPM)")
+                        time_sig_output = gr.Textbox(label="⏱️ Time Signature")
                     with gr.Row():
+                        primary_emotion_output = gr.Textbox(label="😊 Primary Emotion")
+                        secondary_emotion_output = gr.Textbox(label="😌 Secondary Emotion")
                     with gr.Row():
+                        primary_theme_output = gr.Textbox(label="🎭 Primary Theme")
+                        secondary_theme_output = gr.Textbox(label="🎪 Secondary Theme")
+                        genre_output = gr.Textbox(label="🎼 Primary Genre")
+                with gr.Tab("🎤 Generated Lyrics"):
+                    lyrics_output = gr.Textbox(label="Beat-Synchronized Lyrics", lines=20)
+                with gr.Tab("🎯 Beat Matching Analysis"):
+                    beat_match_output = gr.Markdown(label="Rhythm & Syllable Synchronization Analysis")
         # Set up event handlers
         analyze_btn.click(
             fn=process_audio,
+            inputs=[audio_input, custom_prompt_input],
             outputs=[
                 analysis_output, lyrics_output, tempo_output, time_sig_output,
                 primary_emotion_output, secondary_emotion_output,
         )
         # Format supported genres for display
+        supported_genres_md = "\n".join([f"- **{genre.capitalize()}**: Optimized for {genre} music patterns" for genre in beat_analyzer.supported_genres])
         gr.Markdown(f"""
+        ## 🚀 How It Works
+        1. **🎧 Upload Audio**: Support for various formats (MP3, WAV, etc.) or record directly in your browser
+        2. **🎨 Add Custom Requirements** (Optional): Specify your creative vision, themes, or style preferences
+        3. **🔍 Advanced Analysis**: Multi-layered analysis including:
+           - **Tempo & Time Signature**: Advanced detection using multiple algorithms
+           - **Emotional Profiling**: 8-dimensional emotion mapping (happy, sad, excited, calm, etc.)
+           - **Thematic Analysis**: Musical themes (love, triumph, adventure, reflection, etc.)
+           - **Beat Pattern Extraction**: Precise rhythm and stress pattern identification
+           - **Genre Classification**: AI-powered genre detection with confidence scores
+        4. **🎤 Lyrics Generation**: AI creates perfectly synchronized lyrics that:
+           - **Match Beat Patterns**: Each line aligns with musical phrases and rhythm
+           - **Follow Syllable Constraints**: Precise syllable-to-beat mapping for natural flow
+           - **Incorporate Emotions & Themes**: Blend detected musical characteristics
+           - **Include Your Requirements**: Merge your creative directions seamlessly
+        5. **📊 Quality Analysis**: Comprehensive metrics showing beat matching accuracy and flow quality
+        ## 🎨 Custom Requirements Examples
+        **🌟 Themes**: "Write about nature and freedom", "Focus on urban nightlife", "Tell a story about friendship"
+        **🖼️ Imagery**: "Use ocean metaphors", "Include references to stars and sky", "Focus on light and shadow"
+        **👁️ Perspective**: "From a child's viewpoint", "Make it nostalgic", "Focus on hope and resilience"
+        **✍️ Style**: "Use simple everyday language", "Include some rhyming", "Make it conversational"
+        **📝 Content**: "Avoid sad themes", "Include words 'journey' and 'home'", "Focus on personal growth"
+        The system intelligently blends your requirements with detected musical characteristics to create personalized, rhythm-perfect lyrics.
+        ## 🎵 Supported Genres for Full Lyrics Generation
+        **✅ Full Support** (Complete Analysis + Beat-Matched Lyrics):
         {supported_genres_md}
+        These genres have consistent syllable-to-beat patterns that work optimally with our advanced rhythm-matching algorithm.
+        **📊 Analysis Only**: All other genres receive comprehensive musical analysis (tempo, emotion, themes, etc.) without lyrics generation.
+        ## 🛠️ Advanced Features
+        - **🎯 Beat Synchronization**: Syllable-perfect alignment with musical phrases
+        - **🧠 Emotion Integration**: Lyrics reflect detected emotional characteristics
+        - **🎭 Theme Incorporation**: Musical themes guide lyrical content
+        - **📏 Quality Metrics**: Detailed analysis of rhythm matching accuracy
+        - **🔄 Flow Optimization**: Natural sentence continuation across lines
+        - **⚙️ Genre Optimization**: Tailored patterns for different musical styles
         """)
     return demo
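
Review note: both branches of `generate_lyrics` now repeat the same three-part assembly (`base_prompt + custom_requirements + constraints`). A minimal sketch of how that step could be factored into one helper follows. `build_prompt` and its `technical_constraints` parameter are hypothetical names that do not appear in the commit; the wording of the inserted section is copied from the diff.

```python
def build_prompt(base_prompt, technical_constraints, custom_prompt=""):
    """Join the song details, optional user requirements, and constraints.

    Mirrors the concatenation order used in the diff: the user's text is
    placed between the song details and the technical constraints.
    """
    custom_requirements = ""
    # Ignore None, empty, and whitespace-only input, as the diff does.
    if custom_prompt and custom_prompt.strip():
        custom_requirements = (
            "\nSPECIAL REQUIREMENTS FROM USER:\n"
            f"{custom_prompt.strip()}\n"
            "Please incorporate these requirements while still following "
            "all the technical constraints below."
        )
    return base_prompt + custom_requirements + technical_constraints


if __name__ == "__main__":
    base = "Write song lyrics for a pop song.\nSONG DETAILS:\n- Key: C major"
    rules = "\nCRITICAL REQUIREMENTS (MOST IMPORTANT):\n- You MUST write EXACTLY 4 lines of lyrics."
    print(build_prompt(base, rules, "  Use ocean metaphors  "))
```

Because the user's text is interpolated verbatim, it can contradict or override the constraints that follow it. Any length cap or sanitization would be a design decision beyond what this commit does.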