| # ToGMAL MCP Server | |
| **Taxonomy of Generative Model Apparent Limitations** | |
| A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions. | |
| ## Overview | |
| ToGMAL helps prevent common LLM pitfalls by detecting: | |
| - 🔬 **Math/Physics Speculation**: Ungrounded "theories of everything" and invented physics | |
| - 🏥 **Medical Advice Issues**: Health recommendations without proper sources or disclaimers | |
| - 💾 **Dangerous File Operations**: Mass deletions, recursive operations without safeguards | |
| - 💻 **Vibe Coding Overreach**: Overly ambitious projects without proper scoping | |
| - 📊 **Unsupported Claims**: Strong assertions without evidence or hedging | |
| ## Key Features | |
| - **Privacy-Preserving**: All analysis is deterministic and local (no external API calls) | |
| - **Low Latency**: Heuristic-based detection for real-time analysis | |
| - **Intervention Recommendations**: Suggests step breakdown, human-in-the-loop, or web search | |
| - **Taxonomy Building**: Crowdsourced evidence collection for improving detection | |
| - **Extensible**: Easy to add new detection patterns and categories | |
| ## Installation | |
| ### Prerequisites | |
| - Python 3.10 or higher | |
| - pip package manager | |
| ### Install Dependencies | |
| ```bash | |
| pip install mcp pydantic httpx --break-system-packages | |
| ``` | |
| ### Install the Server | |
| ```bash | |
| # Clone or download the server | |
| # Then run it directly | |
| python togmal_mcp.py | |
| ``` | |
| ## Usage | |
| ### Available Tools | |
| #### 1. `togmal_analyze_prompt` | |
| Analyze a user prompt before the LLM processes it. | |
| **Parameters:** | |
| - `prompt` (str): The user prompt to analyze | |
| - `response_format` (str): Output format - `"markdown"` or `"json"` | |
| **Example:** | |
| ```python | |
| { | |
|     "prompt": "Build me a complete theory of quantum gravity that unifies all forces", | |
|     "response_format": "json" | |
| } | |
| ``` | |
| **Use Cases:** | |
| - Detect speculative physics theories before generating responses | |
| - Flag overly ambitious coding requests | |
| - Identify requests for medical advice that need disclaimers | |
| #### 2. `togmal_analyze_response` | |
| Analyze an LLM response for potential issues. | |
| **Parameters:** | |
| - `response` (str): The LLM response to analyze | |
| - `context` (str, optional): Original prompt for better analysis | |
| - `response_format` (str): Output format - `"markdown"` or `"json"` | |
| **Example:** | |
| ```python | |
| { | |
|     "response": "You should definitely take 500mg of ibuprofen every 4 hours...", | |
|     "context": "I have a headache", | |
|     "response_format": "json" | |
| } | |
| ``` | |
| **Use Cases:** | |
| - Check for ungrounded medical advice | |
| - Detect dangerous file operation instructions | |
| - Flag unsupported statistical claims | |
| #### 3. `togmal_submit_evidence` | |
| Submit evidence of LLM limitations to improve the taxonomy. | |
| **Parameters:** | |
| - `category` (str): Type of limitation - `"math_physics_speculation"`, `"ungrounded_medical_advice"`, etc. | |
| - `prompt` (str): The prompt that triggered the issue | |
| - `response` (str): The problematic response | |
| - `description` (str): Why this is problematic | |
| - `severity` (str): Severity level - `"low"`, `"moderate"`, `"high"`, or `"critical"` | |
| **Example:** | |
| ```python | |
| { | |
|     "category": "ungrounded_medical_advice", | |
|     "prompt": "What should I do about chest pain?", | |
|     "response": "It's probably nothing serious, just indigestion...", | |
|     "description": "Dismissed potentially serious symptom without recommending medical consultation", | |
|     "severity": "high" | |
| } | |
| ``` | |
| **Features:** | |
| - Human-in-the-loop confirmation before submission | |
| - Generates unique entry ID for tracking | |
| - Contributes to improving detection heuristics | |
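The submission flow above (validate the fields, ask a human to confirm, assign an ID) can be sketched as follows. This is a minimal illustration, not the server's implementation: `EvidenceEntry`, `confirm_and_build`, and the 12-character hex ID are hypothetical names and choices; only the field names and severity values come from the parameter list.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Severity values taken from the parameter description above
SEVERITIES = ("low", "moderate", "high", "critical")


@dataclass
class EvidenceEntry:
    category: str
    prompt: str
    response: str
    description: str
    severity: str
    # Hypothetical ID scheme; the real server may format IDs differently
    entry_id: str = field(default_factory=lambda: uuid.uuid4().hex[:12])

    def __post_init__(self) -> None:
        # Reject severities outside the documented set
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}, got {self.severity!r}")


def confirm_and_build(payload: dict, confirmed: bool) -> Optional[EvidenceEntry]:
    """Build an entry only after a human has confirmed the submission."""
    if not confirmed:
        return None
    return EvidenceEntry(**payload)
```

Passing `confirmed=False` returns `None`, so nothing is stored unless the user explicitly approves the submission.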
| #### 4. `togmal_get_taxonomy` | |
| Retrieve entries from the taxonomy database. | |
| **Parameters:** | |
| - `category` (str, optional): Filter by category | |
| - `min_severity` (str, optional): Minimum severity to include | |
| - `limit` (int): Maximum entries to return (1-100, default 20) | |
| - `offset` (int): Pagination offset (default 0) | |
| - `response_format` (str): Output format | |
| **Example:** | |
| ```python | |
| { | |
|     "category": "dangerous_file_operations", | |
|     "min_severity": "high", | |
|     "limit": 10, | |
|     "offset": 0, | |
|     "response_format": "json" | |
| } | |
| ``` | |
| **Use Cases:** | |
| - Research common LLM failure patterns | |
| - Train improved detection models | |
| - Generate safety guidelines | |
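To show how `category`, `min_severity`, `limit`, and `offset` might combine, here is a small sketch over an in-memory list of dictionaries. The storage format and the function name `query_taxonomy` are assumptions made for illustration; the severity ordering follows the levels listed for `togmal_submit_evidence`.

```python
from typing import Dict, List, Optional

# Ordered from least to most severe
SEVERITY_ORDER = ["low", "moderate", "high", "critical"]


def query_taxonomy(
    entries: List[Dict[str, str]],
    category: Optional[str] = None,
    min_severity: Optional[str] = None,
    limit: int = 20,
    offset: int = 0,
) -> List[Dict[str, str]]:
    """Filter entries, then return one page of results."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    floor = SEVERITY_ORDER.index(min_severity) if min_severity else 0
    selected = [
        entry for entry in entries
        if (category is None or entry["category"] == category)
        and SEVERITY_ORDER.index(entry["severity"]) >= floor
    ]
    # Apply pagination only after filtering so pages stay contiguous
    return selected[offset:offset + limit]
```

Filtering before slicing means `offset` counts matching entries rather than raw database rows.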
| #### 5. `togmal_get_statistics` | |
| Get statistical overview of the taxonomy database. | |
| **Parameters:** | |
| - `response_format` (str): Output format | |
| **Returns:** | |
| - Total entries by category | |
| - Severity distribution | |
| - Database capacity status | |
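The three returned values can be computed with simple counters. The sketch below assumes entries are dictionaries with `category` and `severity` keys and takes its default capacity from the `MAX_EVIDENCE_ENTRIES` setting in the Configuration section; the function name `summarize` and the output keys are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize(entries: List[Dict[str, str]], capacity: int = 1000) -> Dict[str, Any]:
    """Count entries per category and severity, and report remaining capacity."""
    return {
        "total": len(entries),
        "by_category": dict(Counter(entry["category"] for entry in entries)),
        "by_severity": dict(Counter(entry["severity"] for entry in entries)),
        "capacity_remaining": capacity - len(entries),
    }
```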
| ## Detection Heuristics | |
| ### Math/Physics Speculation | |
| **Detects:** | |
| - "Theory of everything" claims | |
| - Unified field theory proposals | |
| - Invented equations or particles | |
| - Modifications to fundamental constants | |
| **Patterns:** | |
| ``` | |
| - "new equation for quantum gravity" | |
| - "my unified theory" | |
| - "discovered particle" | |
| - "redefine the speed of light" | |
| ``` | |
| ### Ungrounded Medical Advice | |
| **Detects:** | |
| - Diagnoses without qualifications | |
| - Treatment recommendations without sources | |
| - Specific drug dosages | |
| - Dismissive responses to symptoms | |
| **Patterns:** | |
| ``` | |
| - "you probably have..." | |
| - "take 500mg of..." | |
| - "don't worry about it" | |
| - Missing citations or disclaimers | |
| ``` | |
| ### Dangerous File Operations | |
| **Detects:** | |
| - Mass deletion commands | |
| - Recursive operations without safeguards | |
| - Operations on test files without confirmation | |
| - No human-in-the-loop for destructive actions | |
| **Patterns:** | |
| ``` | |
| - "rm -rf" without confirmation | |
| - "delete all test files" | |
| - "recursively remove" | |
| - Missing safety checks | |
| ``` | |
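One way to catch `rm -rf` regardless of how the flags are written (`-rf`, `-fr`, `-r -f`, `--recursive --force`) is to tokenize the command rather than match a fixed string. This is an illustrative sketch, not the server's actual heuristic, and `is_recursive_force_rm` is a hypothetical helper.

```python
import shlex


def is_recursive_force_rm(command: str) -> bool:
    """Return True if the command invokes rm with both recursive and force flags."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: not parseable as a shell command
        return False
    if "rm" not in tokens:
        return False
    args = tokens[tokens.index("rm") + 1:]
    # Merge short flags such as "-r" and "-f" into one string of letters
    short = "".join(a[1:] for a in args if a.startswith("-") and not a.startswith("--"))
    long_flags = {a for a in args if a.startswith("--")}
    recursive = "r" in short or "R" in short or "--recursive" in long_flags
    force = "f" in short or "--force" in long_flags
    return recursive and force
```

Because it only looks for an `rm` token, this also flags harmless text such as `echo rm -rf`; a stricter check would need real shell parsing.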
| ### Vibe Coding Overreach | |
| **Detects:** | |
| - Requests for complete applications | |
| - Massive line count targets (1000+ lines) | |
| - Unrealistic timeframes | |
| - Scope without proper planning | |
| **Patterns:** | |
| ``` | |
| - "build a complete social network" | |
| - "5000 lines of code" | |
| - "everything in one shot" | |
| - Missing architectural planning | |
| ``` | |
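The line-count signal can be extracted numerically instead of matching fixed phrases. The following sketch assumes a threshold of 1,000 lines, as in the list above; the regular expression and function names are illustrative only.

```python
import re

# Matches "5000 lines", "5,000 lines", and "1000+ lines"
LINE_TARGET = re.compile(r"(\d[\d,]*)\s*\+?\s*lines\b", re.IGNORECASE)


def requested_line_count(text: str) -> int:
    """Return the largest line count mentioned in the text, or 0 if none."""
    counts = [int(m.group(1).replace(",", "")) for m in LINE_TARGET.finditer(text)]
    return max(counts, default=0)


def is_overreach(text: str, max_lines: int = 1000) -> bool:
    """Flag requests whose stated line target meets or exceeds max_lines."""
    return requested_line_count(text) >= max_lines
```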
| ### Unsupported Claims | |
| **Detects:** | |
| - Absolute statements without hedging | |
| - Statistical claims without sources | |
| - Over-confident predictions | |
| - Missing citations | |
| **Patterns:** | |
| ``` | |
| - "always/never/definitely" | |
| - "95% of doctors agree" (no source) | |
| - "guaranteed to work" | |
| - Missing uncertainty language | |
| ``` | |
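The patterns above combine several signals: absolute wording, unsourced statistics, and the absence of uncertainty language. A rough sketch of that combination is shown here; the word lists are short illustrative samples, not the server's actual pattern set.

```python
import re
from typing import Any, Dict

# Small illustrative vocabularies; a real detector would use broader lists
ABSOLUTES = re.compile(r"\b(always|never|definitely|guaranteed)\b", re.IGNORECASE)
HEDGES = re.compile(r"\b(may|might|could|likely|typically|according to)\b", re.IGNORECASE)
STATISTIC = re.compile(r"\b\d+(?:\.\d+)?%")


def flag_unsupported(text: str) -> Dict[str, Any]:
    """Report absolute terms, percentage claims, and whether any hedging is present."""
    return {
        "absolutes": [m.group(0).lower() for m in ABSOLUTES.finditer(text)],
        "has_statistic": bool(STATISTIC.search(text)),
        "has_hedging": bool(HEDGES.search(text)),
    }
```

A caller might treat a statistic or absolute term with no hedging as the strongest signal, and the same terms alongside hedging as a weaker one.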
| ## Risk Levels | |
| Risk levels are calculated from the detectors' weighted confidence scores: | |
| - **LOW**: Minor issues, no immediate intervention needed | |
| - **MODERATE**: Worth noting, consider additional verification | |
| - **HIGH**: Significant concern, interventions recommended | |
| - **CRITICAL**: Serious risk, multiple interventions strongly advised | |
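As a rough illustration of weighted scoring, the sketch below multiplies each detector's confidence by a per-category weight and maps the sum to a level. The weights and cut-offs are invented for this example and are not the values used by the server.

```python
from typing import Dict

# Hypothetical weights and cut-offs, chosen only for illustration
CATEGORY_WEIGHTS = {
    "dangerous_file_operations": 2.0,
    "ungrounded_medical_advice": 1.5,
    "math_physics_speculation": 0.5,
}
THRESHOLDS = [(1.5, "CRITICAL"), (1.0, "HIGH"), (0.5, "MODERATE")]


def risk_level(confidences: Dict[str, float]) -> str:
    """Map per-category confidences (0.0 to 1.0) to a risk level."""
    score = sum(
        CATEGORY_WEIGHTS.get(category, 1.0) * confidence
        for category, confidence in confidences.items()
    )
    # Thresholds are ordered from highest to lowest
    for cutoff, level in THRESHOLDS:
        if score >= cutoff:
            return level
    return "LOW"
```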
| ## Intervention Types | |
| ### Step Breakdown | |
| Complex tasks should be broken into verifiable components. | |
| **Recommended for:** | |
| - Math/physics speculation | |
| - Large coding projects | |
| - Dangerous file operations | |
| ### Human-in-the-Loop | |
| Critical decisions require human oversight. | |
| **Recommended for:** | |
| - Medical advice | |
| - Destructive file operations | |
| - High-severity issues | |
| ### Web Search | |
| Claims should be verified against authoritative sources. | |
| **Recommended for:** | |
| - Medical recommendations | |
| - Physics/math theories | |
| - Unsupported factual claims | |
| ### Simplified Scope | |
| Overly ambitious projects need realistic scoping. | |
| **Recommended for:** | |
| - Vibe coding requests | |
| - Complex system designs | |
| - Feature-heavy applications | |
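The "Recommended for" lists above amount to a lookup from detected category to interventions, plus a rule that high-severity findings always involve a human. A minimal sketch, assuming the category identifiers `vibe_coding_overreach` and `unsupported_claims` (only the other three appear in the tool examples):

```python
from typing import Iterable, List

# Derived from the "Recommended for" lists; two category identifiers are assumed
INTERVENTIONS = {
    "math_physics_speculation": ["step_breakdown", "web_search"],
    "ungrounded_medical_advice": ["human_in_the_loop", "web_search"],
    "dangerous_file_operations": ["step_breakdown", "human_in_the_loop"],
    "vibe_coding_overreach": ["step_breakdown", "simplified_scope"],
    "unsupported_claims": ["web_search"],
}


def recommend(categories: Iterable[str], risk: str = "LOW") -> List[str]:
    """Collect interventions for the detected categories, without duplicates."""
    picks: List[str] = []
    for category in categories:
        for intervention in INTERVENTIONS.get(category, []):
            if intervention not in picks:
                picks.append(intervention)
    # High-severity issues get human oversight regardless of category
    if risk in ("HIGH", "CRITICAL") and "human_in_the_loop" not in picks:
        picks.append("human_in_the_loop")
    return picks
```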
| ## Configuration | |
| ### Character Limit | |
| Default: 25,000 characters per response | |
| ```python | |
| CHARACTER_LIMIT = 25000 | |
| ``` | |
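A limit like this is typically enforced by truncating long output and telling the caller that it was cut. The helper below is a hypothetical sketch of that idea; the truncation notice and the function name `enforce_limit` are assumptions.

```python
CHARACTER_LIMIT = 25000


def enforce_limit(text: str, limit: int = CHARACTER_LIMIT) -> str:
    """Truncate text to at most `limit` characters, appending a notice if cut."""
    if len(text) <= limit:
        return text
    notice = "\n\n[Output truncated]"
    # Reserve room for the notice so the result never exceeds the limit
    return text[: limit - len(notice)] + notice
```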
| ### Taxonomy Capacity | |
| Default: 1,000 evidence entries | |
| ```python | |
| MAX_EVIDENCE_ENTRIES = 1000 | |
| ``` | |
| ### Detection Sensitivity | |
| Adjust pattern matching and confidence thresholds in detection functions: | |
| ```python | |
| from typing import Any, Dict | |
| def detect_math_physics_speculation(text: str) -> Dict[str, Any]: | |
|     # Modify patterns or confidence calculations | |
|     ... | |
| ``` | |
| ## Integration Examples | |
| ### Claude Desktop App | |
| Add to your `claude_desktop_config.json`: | |
| ```json | |
| { | |
|     "mcpServers": { | |
|         "togmal": { | |
|             "command": "python", | |
|             "args": ["/path/to/togmal_mcp.py"] | |
|         } | |
|     } | |
| } | |
| ``` | |
| ### CLI Testing | |
| ```bash | |
| # Run the server | |
| python togmal_mcp.py | |
| # In another terminal, test with MCP inspector | |
| npx @modelcontextprotocol/inspector python togmal_mcp.py | |
| ``` | |
| ### Programmatic Usage | |
| ```python | |
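| from mcp import ClientSession, StdioServerParameters | |
| from mcp.client.stdio import stdio_client | |
| # Launch the server as a subprocess over stdio; adjust the script path as needed | |
| server = StdioServerParameters(command="python", args=["/path/to/togmal_mcp.py"]) | |
| async def analyze_prompt(prompt: str): | |
|     async with stdio_client(server) as (read, write): | |
|         async with ClientSession(read, write) as session: | |
|             await session.initialize() | |
|             result = await session.call_tool( | |
|                 "togmal_analyze_prompt", | |
|                 {"prompt": prompt, "response_format": "json"} | |
|             ) | |
|             return result | |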
| ``` | |
| ## Architecture | |
| ### Design Principles | |
| 1. **Privacy First**: No external API calls, all processing local | |
| 2. **Deterministic**: Heuristic-based detection for reproducibility | |
| 3. **Low Latency**: Fast pattern matching for real-time use | |
| 4. **Extensible**: Easy to add new patterns and categories | |
| 5. **Human-Centered**: Always allows human override and judgment | |
| ### Future Enhancements | |
| The system is designed for progressive enhancement: | |
| 1. **Phase 1 (Current)**: Heuristic pattern matching | |
| 2. **Phase 2 (Planned)**: Traditional ML models (clustering, anomaly detection) | |
| 3. **Phase 3 (Future)**: Federated learning from submitted evidence | |
| 4. **Phase 4 (Advanced)**: Custom fine-tuned models for specific domains | |
| ### Data Flow | |
| ``` | |
| User Prompt | |
|     ↓ | |
| togmal_analyze_prompt | |
|     ↓ | |
| Detection Heuristics (parallel) | |
|     ├── Math/Physics | |
|     ├── Medical Advice | |
|     ├── File Operations | |
|     ├── Vibe Coding | |
|     └── Unsupported Claims | |
|     ↓ | |
| Risk Calculation | |
|     ↓ | |
| Intervention Recommendations | |
|     ↓ | |
| Response to Client | |
| ``` | |
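The fan-out step in the diagram, where several detectors examine the same text, could be sketched with `asyncio.gather`. The detector names and patterns here are placeholders, and the server may schedule its heuristics differently.

```python
import asyncio
import re
from typing import List, Tuple

# Placeholder patterns standing in for the real heuristics
DETECTORS = {
    "math_physics": r"theory of everything|unified theory",
    "file_operations": r"rm -rf|recursively remove",
}


async def run_detector(name: str, pattern: str, text: str) -> Tuple[str, bool]:
    # Regex matching is CPU-bound, so this is concurrent scheduling rather
    # than true parallelism; detectors that await I/O would benefit more
    return name, re.search(pattern, text, re.IGNORECASE) is not None


async def analyze(text: str) -> List[str]:
    """Run every detector on the text and return the names that fired."""
    results = await asyncio.gather(
        *(run_detector(name, pattern, text) for name, pattern in DETECTORS.items())
    )
    return [name for name, hit in results if hit]


flagged = asyncio.run(analyze("Please recursively remove the build folder"))
```

`asyncio.gather` returns results in the order the detectors were submitted, so the output order is stable between runs.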
| ## Contributing | |
| ### Adding New Detection Patterns | |
| 1. Create a new detection function: | |
| ```python | |
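| import re | |
| from typing import Any, Dict | |
| def detect_new_category(text: str) -> Dict[str, Any]: | |
|     patterns = { | |
|         'subcategory1': [r'pattern1', r'pattern2'], | |
|         'subcategory2': [r'pattern3'] | |
|     } | |
|     # Example detection logic: record each subcategory with at least one match | |
|     matched = [ | |
|         name for name, regexes in patterns.items() | |
|         if any(re.search(rx, text, re.IGNORECASE) for rx in regexes) | |
|     ] | |
|     return { | |
|         'detected': bool(matched), | |
|         'categories': matched, | |
|         # Placeholder confidence: share of subcategories that matched | |
|         'confidence': len(matched) / len(patterns) | |
|     } | |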
| ``` | |
| 2. Add to CategoryType enum | |
| 3. Update analysis functions to include new detector | |
| 4. Add intervention recommendations if needed | |
| ### Submitting Evidence | |
| Use the `togmal_submit_evidence` tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone. | |
| ## Limitations | |
| ### Current Constraints | |
| - **Heuristic-Based**: May have false positives/negatives | |
| - **English-Only**: Patterns optimized for English text | |
| - **Context-Free**: Doesn't understand full conversation history | |
| - **No Learning**: Detection rules are static until updated | |
| ### Not a Replacement For | |
| - Professional judgment in critical domains (medicine, law, etc.) | |
| - Comprehensive code review | |
| - Security auditing | |
| - Safety testing in production systems | |
| ## License | |
| MIT License - See LICENSE file for details | |
| ## Support | |
| For issues, questions, or contributions: | |
| - Open an issue on GitHub | |
| - Submit evidence through the MCP tool | |
| - Contact: [Your contact information] | |
| ## Citation | |
| If you use ToGMAL in your research or product, please cite: | |
| ```bibtex | |
| @software{togmal_mcp, | |
| title={ToGMAL: Taxonomy of Generative Model Apparent Limitations}, | |
| author={[Your Name]}, | |
| year={2025}, | |
| url={https://github.com/[your-repo]/togmal-mcp} | |
| } | |
| ``` | |
| ## Acknowledgments | |
| Built using: | |
| - [Model Context Protocol](https://modelcontextprotocol.io) | |
| - [FastMCP](https://github.com/modelcontextprotocol/python-sdk) | |
| - [Pydantic](https://docs.pydantic.dev) | |
| Inspired by the need for safer, more grounded AI interactions. |