---
title: ToGMAL - AI Difficulty & Safety Analysis
emoji: 🧠
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app_combined.py
pinned: false
license: apache-2.0
short_description: LLM difficulty analyzer with chat assistant & MCP tools
---

# 🧠 ToGMAL - Intelligent LLM Difficulty & Safety Analysis

**Taxonomy of Generative Model Apparent Limitations** - Real-time difficulty assessment and chat interface with MCP tool integration.

## 🎯 Unified Tabbed Interface

Switch seamlessly between two powerful tools:

### 📊 **Tab 1: Difficulty Analyzer**

- Direct analysis using 32K+ benchmark questions
- Instant difficulty ratings and success rates
- Vector similarity search
- Perfect for quick assessments

### 🤖 **Tab 2: Chat Assistant** 🆕

**Interactive chat where a free LLM can call MCP tools!**

- 🤖 Chat with Mistral-7B (free via HuggingFace)
- 🛠️ LLM calls tools dynamically based on context
- 📊 Transparent tool execution (see what's happening)
- 💬 Natural language responses using tool data

## Features

- 📊 **Real Benchmark Data**: Analyzes prompts against 14,042 questions from MMLU, MMLU-Pro, GPQA, and MATH datasets
- 🎯 **Vector Similarity Search**: Uses semantic embeddings to find similar benchmark questions
- 📈 **Success Rate Prediction**: Shows weighted success rates from top LLMs (Claude, GPT-4, Gemini)
- 💡 **Smart Recommendations**: Provides actionable suggestions based on difficulty level

## How It Works

1. Enter any prompt or question
2. The system finds the 5 most similar benchmark questions using vector embeddings
3. Calculates a weighted difficulty score based on how well LLMs perform on similar questions
4. Provides risk assessment and recommendations

## Example Prompts

- "Calculate the quantum correction to the partition function for a 3D harmonic oscillator"
- "Prove that there are infinitely many prime numbers"
- "Diagnose a patient with acute chest pain and shortness of breath"
- "Implement a binary search tree with insert and search operations"

## 🎯 Quick Start

### Run Combined Demo (Recommended)

```bash
python app_combined.py
```

Or run individual demos:

### Run Difficulty Analyzer Only

```bash
python app.py
```

### Run Chat Demo Only

```bash
python chat_app.py
# Or use the launcher:
./launch_chat.sh
```

**Try in the Chat tab:**

- "How difficult is this: [your prompt]?"
- "Is this safe: [your prompt]?"
- "Analyze the difficulty of: Calculate quantum corrections..."

See [`CHAT_DEMO_README.md`](CHAT_DEMO_README.md) for full documentation.

## Technology

- **Vector Database**: ChromaDB with persistent storage
- **Embeddings**: sentence-transformers (all-MiniLM-L6-v2)
- **Frontend**: Gradio
- **Data**: Real benchmark questions with ground-truth success rates

## Repository

Full source code: [github.com/HeTalksInMaths/togmal-mcp](https://github.com/HeTalksInMaths/togmal-mcp)
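
As a rough illustration of the steps under "How It Works", the sketch below finds the nearest benchmark questions by cosine similarity and averages their success rates, weighted by similarity. It is a minimal sketch, not the code in this repository: the function names, the toy two-dimensional vectors, the similarity-weighted average, and the `difficulty = 1 - success rate` convention are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def weighted_success_rate(query_vec, benchmark, k=5):
    """Similarity-weighted mean success rate over the k nearest questions.

    `benchmark` is a list of (embedding, success_rate) pairs.
    Hypothetical helper: the weighting scheme is an assumption.
    """
    scored = [(cosine_similarity(query_vec, emb), rate) for emb, rate in benchmark]
    top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    # Clamp negative similarities to zero so they cannot act as negative weights.
    weights = [max(sim, 0.0) for sim, _ in top_k]
    total = sum(weights)
    if total == 0:
        return None  # nothing similar enough to estimate from
    return sum(w * rate for w, (_, rate) in zip(weights, top_k)) / total


# Toy stand-ins for real embeddings and benchmark success rates.
benchmark = [
    ([1.0, 0.0], 0.9),
    ([0.0, 1.0], 0.2),
    ([1.0, 1.0], 0.5),
]
rate = weighted_success_rate([1.0, 0.0], benchmark, k=2)
difficulty = 1.0 - rate  # assumed convention: lower success means harder
print(f"success rate: {rate:.3f}, difficulty: {difficulty:.3f}")
```

In the app itself, the Technology section indicates that sentence-transformers produces the embeddings and ChromaDB performs the nearest-neighbour search, so the hand-written similarity loop here only stands in for those components.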