
metadata
title: ToGMAL - AI Difficulty & Safety Analysis
emoji: 🧠
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app_combined.py
pinned: false
license: apache-2.0
short_description: LLM difficulty analyzer with chat assistant & MCP tools

🧠 ToGMAL - Intelligent LLM Difficulty & Safety Analysis

ToGMAL (Taxonomy of Generative Model Apparent Limitations) provides real-time difficulty assessment for prompts, plus a chat interface with MCP tool integration.

🎯 Unified Tabbed Interface

Switch seamlessly between two powerful tools:

📊 Tab 1: Difficulty Analyzer

  • Direct analysis using 32K+ benchmark questions
  • Instant difficulty ratings and success rates
  • Vector similarity search
  • Perfect for quick assessments

🤖 Tab 2: Chat Assistant 🆕

Interactive chat where a free LLM can call MCP tools!

  • 🤖 Chat with Mistral-7B (free via HuggingFace)
  • 🛠️ LLM calls tools dynamically based on context
  • 📊 Transparent tool execution (see what's happening)
  • 💬 Natural language responses using tool data
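
The tool-calling loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the JSON call format, the tool names `check_difficulty` and `check_safety`, and the stub return values are all assumptions made for the example.

```python
import json

# Hypothetical stub tools. The real MCP tools would query the benchmark
# database; these only echo their input so the dispatch logic is visible.
def check_difficulty(prompt):
    return {"tool": "check_difficulty", "prompt": prompt, "status": "stub"}

def check_safety(prompt):
    return {"tool": "check_safety", "prompt": prompt, "status": "stub"}

# Registry mapping tool names (assumed) to callables.
TOOLS = {
    "check_difficulty": check_difficulty,
    "check_safety": check_safety,
}

def run_tool_call(raw_model_output, log):
    """Dispatch a tool call if the model output is a JSON tool request.

    Assumed request shape: {"tool": "<name>", "arguments": {...}}.
    Returns the tool result, or None when the output is plain text or
    names an unknown tool. Every step is appended to `log` so the UI
    can show what was executed.
    """
    try:
        call = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return None  # ordinary chat text, no tool requested
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return None
    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        return None
    name = call["tool"]
    log.append(f"calling {name} with {arguments}")
    result = TOOLS[name](**arguments)
    log.append(f"{name} returned {result}")
    return result
```

Keeping the log as a plain list makes it easy to render each step in the chat window, which is one way to get the "transparent tool execution" behaviour listed above.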

Features

  • 📊 Real Benchmark Data: Analyzes prompts against 14,042 questions from MMLU, MMLU-Pro, GPQA, and MATH datasets
  • 🎯 Vector Similarity Search: Uses semantic embeddings to find similar benchmark questions
  • 📈 Success Rate Prediction: Shows weighted success rates from top LLMs (Claude, GPT-4, Gemini)
  • 💡 Smart Recommendations: Provides actionable suggestions based on difficulty level
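
To illustrate how a predicted success rate might be turned into a recommendation, here is a small sketch. The thresholds (0.7 and 0.4), the level names, and the suggestion texts are hypothetical and chosen only for the example; the actual values used by the app are not stated in this README.

```python
def assess_risk(success_rate):
    """Map a success rate in [0, 1] to a (level, suggestion) pair.

    Thresholds and wording are illustrative assumptions.
    """
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    if success_rate >= 0.7:
        return ("LOW", "Most LLMs handle similar questions well.")
    if success_rate >= 0.4:
        return ("MODERATE", "Verify the answer against a second source.")
    return ("HIGH", "Break the task into steps or involve a human expert.")
```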

How It Works

  1. Enter any prompt or question
  2. The system finds the 5 most similar benchmark questions using vector embeddings
  3. It calculates a weighted difficulty score based on how well LLMs perform on those questions
  4. It provides a risk assessment and recommendations
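
Steps 2 and 3 can be sketched in plain Python as shown here. This is a simplified illustration under stated assumptions: it weights each neighbor's success rate by its cosine similarity to the prompt and defines difficulty as one minus the weighted rate. The exact weighting scheme used by the app is not described in this README, and the function names are made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_similar(query_vec, benchmark, k=5):
    """Return the k (similarity, item) pairs most similar to the query.

    Each benchmark item is a dict with "embedding" and "success_rate" keys
    (an assumed layout for this sketch).
    """
    scored = [(cosine_similarity(query_vec, item["embedding"]), item)
              for item in benchmark]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def weighted_success_rate(neighbors):
    """Similarity-weighted mean of the neighbors' success rates.

    Negative similarities are clamped to zero; returns None when no
    neighbor has positive weight.
    """
    weights = [max(sim, 0.0) for sim, _ in neighbors]
    total = sum(weights)
    if total == 0:
        return None
    weighted_sum = sum(w * item["success_rate"]
                       for w, (_, item) in zip(weights, neighbors))
    return weighted_sum / total

def difficulty_score(neighbors):
    """Assumed definition: difficulty = 1 - weighted success rate."""
    rate = weighted_success_rate(neighbors)
    return None if rate is None else 1.0 - rate
```

Weighting by similarity means that a close match on a hard benchmark question influences the score more than a loose match on an easy one.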

Example Prompts

  • "Calculate the quantum correction to the partition function for a 3D harmonic oscillator"
  • "Prove that there are infinitely many prime numbers"
  • "Diagnose a patient with acute chest pain and shortness of breath"
  • "Implement a binary search tree with insert and search operations"

🎯 Quick Start

Run Combined Demo (Recommended)

python app_combined.py

Or run individual demos:

Run Difficulty Analyzer Only

python app.py

Run Chat Demo Only

python chat_app.py
# Or use the launcher:
./launch_chat.sh

Try in the Chat tab:

  • "How difficult is this: [your prompt]?"
  • "Is this safe: [your prompt]?"
  • "Analyze the difficulty of: Calculate quantum corrections..."

See CHAT_DEMO_README.md for full documentation.

Technology

  • Vector Database: ChromaDB with persistent storage
  • Embeddings: sentence-transformers (all-MiniLM-L6-v2)
  • Frontend: Gradio
  • Data: Real benchmark questions with ground-truth success rates
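
For orientation, here is a rough sketch of how these pieces could fit together. It assumes `chromadb` and `sentence-transformers` are installed; the collection name `benchmark_questions`, the metadata key `success_rate`, and the function names are assumptions for illustration, not the names used in this repository.

```python
def cosine_distance_to_similarity(distance):
    """Chroma's cosine space reports distance = 1 - cosine similarity."""
    return 1.0 - distance

def build_collection(db_path, questions):
    """Embed questions and store them in a persistent Chroma collection.

    `questions` is a list of dicts with "id", "text" and "success_rate"
    keys (an assumed layout). Imports are local so this module can be
    loaded even when the heavy dependencies are not installed.
    """
    import chromadb
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_or_create_collection(
        name="benchmark_questions",          # hypothetical name
        metadata={"hnsw:space": "cosine"},   # use cosine distance
    )
    texts = [q["text"] for q in questions]
    collection.upsert(
        ids=[q["id"] for q in questions],
        documents=texts,
        embeddings=model.encode(texts).tolist(),
        metadatas=[{"success_rate": q["success_rate"]} for q in questions],
    )
    return model, collection

def find_similar(model, collection, prompt, k=5):
    """Return (similarity, metadata) pairs for the k nearest questions."""
    result = collection.query(
        query_embeddings=model.encode([prompt]).tolist(),
        n_results=k,
    )
    return [
        (cosine_distance_to_similarity(dist), meta)
        for dist, meta in zip(result["distances"][0], result["metadatas"][0])
    ]
```

Using `upsert` rather than `add` lets the indexing step be re-run against the same persistent directory without duplicate-ID problems.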

Repository

Full source code: github.com/HeTalksInMaths/togmal-mcp