	# ToGMAL MCP Server

	Taxonomy of Generative Model Apparent Limitations

	A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions.

	## Overview

	ToGMAL helps prevent common LLM pitfalls by detecting:

	- 🔬 Math/Physics Speculation: Ungrounded "theories of everything" and invented physics
	- 🏥 Medical Advice Issues: Health recommendations without proper sources or disclaimers
	- 💾 Dangerous File Operations: Mass deletions, recursive operations without safeguards
	- 💻 Vibe Coding Overreach: Overly ambitious projects without proper scoping
	- 📊 Unsupported Claims: Strong assertions without evidence or hedging

	## Key Features

	- Privacy-Preserving: All analysis is deterministic and local (no external API calls)
	- Low Latency: Heuristic-based detection for real-time analysis
	- Intervention Recommendations: Suggests step breakdown, human-in-the-loop, or web search
	- Taxonomy Building: Crowdsourced evidence collection for improving detection
	- Extensible: Easy to add new detection patterns and categories

	## Installation

	### Prerequisites

	- Python 3.10 or higher
	- pip package manager

	### Install Dependencies

	```bash
	pip install mcp pydantic httpx --break-system-packages
	```

	### Install the Server

	```bash
	# Clone or download the server
	# Then run it directly
	python togmal_mcp.py
	```

	## Usage

	### Available Tools

	#### 1. `togmal_analyze_prompt`

	Analyze a user prompt before the LLM processes it.

	Parameters:
	- `prompt` (str): The user prompt to analyze
	- `response_format` (str): Output format - `"markdown"` or `"json"`

	Example:
	```json
	{
	  "prompt": "Build me a complete theory of quantum gravity that unifies all forces",
	  "response_format": "json"
	}
	```

	Use Cases:
	- Detect speculative physics theories before generating responses
	- Flag overly ambitious coding requests
	- Identify requests for medical advice that need disclaimers

	#### 2. `togmal_analyze_response`

	Analyze an LLM response for potential issues.

	Parameters:
	- `response` (str): The LLM response to analyze
	- `context` (str, optional): Original prompt for better analysis
	- `response_format` (str): Output format - `"markdown"` or `"json"`

	Example:
	```json
	{
	  "response": "You should definitely take 500mg of ibuprofen every 4 hours...",
	  "context": "I have a headache",
	  "response_format": "json"
	}
	```

	Use Cases:
	- Check for ungrounded medical advice
	- Detect dangerous file operation instructions
	- Flag unsupported statistical claims

	#### 3. `togmal_submit_evidence`

	Submit evidence of LLM limitations to improve the taxonomy.

	Parameters:
	- `category` (str): Type of limitation - `"math_physics_speculation"`, `"ungrounded_medical_advice"`, etc.
	- `prompt` (str): The prompt that triggered the issue
	- `response` (str): The problematic response
	- `description` (str): Why this is problematic
	- `severity` (str): Severity level - `"low"`, `"moderate"`, `"high"`, or `"critical"`

	Example:
	```json
	{
	  "category": "ungrounded_medical_advice",
	  "prompt": "What should I do about chest pain?",
	  "response": "It's probably nothing serious, just indigestion...",
	  "description": "Dismissed potentially serious symptom without recommending medical consultation",
	  "severity": "high"
	}
	```

	Features:
	- Human-in-the-loop confirmation before submission
	- Generates unique entry ID for tracking
	- Contributes to improving detection heuristics
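
A minimal sketch of how a submission might be validated and given a unique entry ID. The function name `make_evidence_entry`, the ID scheme (a truncated SHA-256 of the content), and the dict layout are assumptions for illustration, not the server's actual implementation; only the severity values come from the parameter list above.

```python
import hashlib

SEVERITIES = ("low", "moderate", "high", "critical")

def make_evidence_entry(category: str, prompt: str, response: str,
                        description: str, severity: str) -> dict:
    """Validate a submission and attach a content-derived ID (hypothetical scheme)."""
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {SEVERITIES}")
    # Hashing the content means the same submission always maps to the same ID.
    content = f"{category}\n{prompt}\n{response}".encode("utf-8")
    return {
        "id": hashlib.sha256(content).hexdigest()[:12],
        "category": category,
        "prompt": prompt,
        "response": response,
        "description": description,
        "severity": severity,
    }
```

A content-derived ID makes duplicate submissions easy to spot; a random UUID would work equally well if duplicates are acceptable.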

	#### 4. `togmal_get_taxonomy`

	Retrieve entries from the taxonomy database.

	Parameters:
	- `category` (str, optional): Filter by category
	- `min_severity` (str, optional): Minimum severity to include
	- `limit` (int): Maximum entries to return (1-100, default 20)
	- `offset` (int): Pagination offset (default 0)
	- `response_format` (str): Output format - `"markdown"` or `"json"`

	Example:
	```json
	{
	  "category": "dangerous_file_operations",
	  "min_severity": "high",
	  "limit": 10,
	  "offset": 0,
	  "response_format": "json"
	}
	```

	Use Cases:
	- Research common LLM failure patterns
	- Train improved detection models
	- Generate safety guidelines
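
The filtering and pagination parameters can be pictured roughly as follows. This is a sketch that assumes entries are plain dicts with `category` and `severity` keys; `query_taxonomy` and `SEVERITY_ORDER` are hypothetical names, and the server's storage and query logic may differ.

```python
SEVERITY_ORDER = ["low", "moderate", "high", "critical"]

def query_taxonomy(entries, category=None, min_severity=None, limit=20, offset=0):
    """Filter entries, then apply offset/limit pagination (illustrative only)."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Severities are ordered, so "minimum severity" becomes an index comparison.
    floor = SEVERITY_ORDER.index(min_severity) if min_severity else 0
    matches = [
        e for e in entries
        if (category is None or e["category"] == category)
        and SEVERITY_ORDER.index(e["severity"]) >= floor
    ]
    return matches[offset:offset + limit]
```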

	#### 5. `togmal_get_statistics`

	Get statistical overview of the taxonomy database.

	Parameters:
	- `response_format` (str): Output format - `"markdown"` or `"json"`

	Returns:
	- Total entries by category
	- Severity distribution
	- Database capacity status
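
As a rough illustration of the three returned figures, the sketch below summarizes a list of entry dicts. The `summarize` function and the output keys are assumptions; the capacity value mirrors the `MAX_EVIDENCE_ENTRIES` default described under Configuration.

```python
from collections import Counter

MAX_EVIDENCE_ENTRIES = 1000  # default capacity from the Configuration section

def summarize(entries: list) -> dict:
    """Count entries per category and severity, and report capacity usage."""
    return {
        "total": len(entries),
        "by_category": dict(Counter(e["category"] for e in entries)),
        "by_severity": dict(Counter(e["severity"] for e in entries)),
        "capacity_used": len(entries) / MAX_EVIDENCE_ENTRIES,
    }
```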

	## Detection Heuristics

	### Math/Physics Speculation

	Detects:
	- "Theory of everything" claims
	- Unified field theory proposals
	- Invented equations or particles
	- Modifications to fundamental constants

	Patterns:
	```
	- "new equation for quantum gravity"
	- "my unified theory"
	- "discovered particle"
	- "redefine the speed of light"
	```
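
The phrases above could be expressed as case-insensitive regular expressions along these lines. The exact expressions and the `find_speculation` helper are assumptions for illustration; the server's real patterns may be broader or narrower.

```python
import re

# Regexes derived from the example phrases above (assumed, not the server's own).
SPECULATION_PATTERNS = [
    re.compile(r"\bnew equation for quantum gravity\b", re.IGNORECASE),
    re.compile(r"\bmy unified theory\b", re.IGNORECASE),
    re.compile(r"\bdiscovered (?:a new )?particle\b", re.IGNORECASE),
    re.compile(r"\bredefine the speed of light\b", re.IGNORECASE),
]

def find_speculation(text: str) -> list:
    """Return the first matched phrase for each pattern that fires."""
    return [m.group(0) for p in SPECULATION_PATTERNS if (m := p.search(text))]
```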

	### Ungrounded Medical Advice

	Detects:
	- Diagnoses without qualifications
	- Treatment recommendations without sources
	- Specific drug dosages
	- Dismissive responses to symptoms

	Patterns:
	```
	- "you probably have..."
	- "take 500mg of..."
	- "don't worry about it"
	- Missing citations or disclaimers
	```
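
Unlike plain phrase matching, "missing citations or disclaimers" is an absence check: a signal is raised when something expected is not present. A small sketch, assuming a dosage regex and a short list of disclaimer phrases (both hypothetical choices):

```python
import re

# A number followed by a common dosage unit, e.g. "500mg" or "2.5 ml".
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|mcg|g|ml)\b", re.IGNORECASE)
# Phrases that count as a disclaimer in this sketch (assumed list).
DISCLAIMER = re.compile(
    r"consult (?:a|your) (?:doctor|physician|pharmacist)|seek medical",
    re.IGNORECASE,
)

def check_medical_text(text: str) -> dict:
    """Flag text that gives a specific dosage with no accompanying disclaimer."""
    has_dosage = bool(DOSAGE.search(text))
    has_disclaimer = bool(DISCLAIMER.search(text))
    return {
        "specific_dosage": has_dosage,
        "missing_disclaimer": not has_disclaimer,
        "flag": has_dosage and not has_disclaimer,
    }
```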

	### Dangerous File Operations

	Detects:
	- Mass deletion commands
	- Recursive operations without safeguards
	- Operations on test files without confirmation
	- No human-in-the-loop for destructive actions

	Patterns:
	```
	- "rm -rf" without confirmation
	- "delete all test files"
	- "recursively remove"
	- Missing safety checks
	```
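
One way to read "without confirmation" is: a destructive command is present and no safeguard is mentioned anywhere in the same text. The sketch below takes that reading; the regexes, the safeguard vocabulary, and `is_unsafe_file_operation` are assumptions.

```python
import re

DESTRUCTIVE = re.compile(
    r"\brm\s+-[a-z]*r[a-z]*\b"              # rm with a recursive flag: -r, -rf, -fr
    r"|\bdelete all\b"
    r"|\brecursively (?:remove|delete)\b",
    re.IGNORECASE,
)
# Words that suggest a safety step, plus rm's interactive flag (-i).
SAFEGUARD = re.compile(
    r"\b(?:confirm\w*|backup|dry[- ]run)\b|\brm\s+-[a-z]*i",
    re.IGNORECASE,
)

def is_unsafe_file_operation(text: str) -> bool:
    """True when a destructive command appears with no mention of a safeguard."""
    return bool(DESTRUCTIVE.search(text)) and not SAFEGUARD.search(text)
```

A text-level check like this cannot tell whether the safeguard actually applies to the destructive command, which is one source of the false negatives noted under Limitations.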

	### Vibe Coding Overreach

	Detects:
	- Requests for complete applications
	- Massive line count targets (1000+ lines)
	- Unrealistic timeframes
	- Scope without proper planning

	Patterns:
	```
	- "build a complete social network"
	- "5000 lines of code"
	- "everything in one shot"
	- Missing architectural planning
	```
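
The line-count signal is numeric rather than purely lexical, so it needs a value extracted and compared against a threshold. A sketch, assuming the 1000-line threshold mentioned above and hypothetical helper names:

```python
import re
from typing import Optional

# Captures counts such as "5000 lines", "5,000 lines of code", "1000+ lines".
LINE_COUNT = re.compile(r"(\d[\d,]*)\+?\s*lines?\b", re.IGNORECASE)
LINE_THRESHOLD = 1000  # from the "1000+ lines" signal listed above

def requested_line_count(prompt: str) -> Optional[int]:
    """Return the first line count mentioned in the prompt, if any."""
    m = LINE_COUNT.search(prompt)
    return int(m.group(1).replace(",", "")) if m else None

def is_overreach(prompt: str) -> bool:
    count = requested_line_count(prompt)
    return count is not None and count >= LINE_THRESHOLD
```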

	### Unsupported Claims

	Detects:
	- Absolute statements without hedging
	- Statistical claims without sources
	- Over-confident predictions
	- Missing citations

	Patterns:
	```
	- "always/never/definitely"
	- "95% of doctors agree" (no source)
	- "guaranteed to work"
	- Missing uncertainty language
	```
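
"Missing uncertainty language" suggests comparing absolute wording against hedging wording rather than matching a single phrase. The word lists and the `claim_signals` helper below are assumptions chosen for illustration.

```python
import re

ABSOLUTES = re.compile(r"\b(?:always|never|definitely|guaranteed)\b", re.IGNORECASE)
HEDGES = re.compile(r"\b(?:may|might|could|likely|typically|suggests?)\b", re.IGNORECASE)
PERCENT_CLAIM = re.compile(r"\b\d{1,3}(?:\.\d+)?%")

def claim_signals(text: str) -> dict:
    """Count absolute terms, hedging terms, and percentage claims in the text."""
    return {
        "absolutes": len(ABSOLUTES.findall(text)),
        "hedges": len(HEDGES.findall(text)),
        "percent_claims": len(PERCENT_CLAIM.findall(text)),
    }
```

A detector could then treat text with several absolutes and no hedges as more suspect than text where the two are balanced; how those counts feed into a confidence score is left open here.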

	## Risk Levels

	Each analysis is assigned a risk level calculated from weighted confidence scores:

	- LOW: Minor issues, no immediate intervention needed
	- MODERATE: Worth noting, consider additional verification
	- HIGH: Significant concern, interventions recommended
	- CRITICAL: Serious risk, multiple interventions strongly advised
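
To make the idea of weighted confidence scores concrete, here is a sketch that maps per-category confidences to one of the four levels. The weights, the thresholds, and the choice to use the highest weighted score are all assumptions; the README does not state the actual formula.

```python
# Hypothetical per-category weights; unknown categories fall back to 0.5.
WEIGHTS = {
    "math_physics_speculation": 0.5,
    "ungrounded_medical_advice": 1.0,
    "dangerous_file_operations": 1.0,
}
# Hypothetical thresholds, checked from most to least severe.
THRESHOLDS = [(0.75, "CRITICAL"), (0.5, "HIGH"), (0.25, "MODERATE")]

def risk_level(confidences: dict) -> str:
    """Map per-category confidences (0..1) to a level via the top weighted score."""
    score = max(
        (WEIGHTS.get(cat, 0.5) * conf for cat, conf in confidences.items()),
        default=0.0,
    )
    for floor, level in THRESHOLDS:
        if score >= floor:
            return level
    return "LOW"
```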

	## Intervention Types

	### Step Breakdown
	Complex tasks should be broken into verifiable components.

	Recommended for:
	- Math/physics speculation
	- Large coding projects
	- Dangerous file operations

	### Human-in-the-Loop
	Critical decisions require human oversight.

	Recommended for:
	- Medical advice
	- Destructive file operations
	- High-severity issues

	### Web Search
	Claims should be verified against authoritative sources.

	Recommended for:
	- Medical recommendations
	- Physics/math theories
	- Unsupported factual claims

	### Simplified Scope
	Overly ambitious projects need realistic scoping.

	Recommended for:
	- Vibe coding requests
	- Complex system designs
	- Feature-heavy applications
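
The "Recommended for" lists above amount to a lookup from detected category to interventions. A sketch of that lookup follows; the intervention identifiers, the last two category identifiers, and the `recommend` helper are assumed names.

```python
# Category -> interventions, following the "Recommended for" lists above.
INTERVENTIONS = {
    "math_physics_speculation": ["step_breakdown", "web_search"],
    "ungrounded_medical_advice": ["human_in_the_loop", "web_search"],
    "dangerous_file_operations": ["step_breakdown", "human_in_the_loop"],
    "vibe_coding_overreach": ["step_breakdown", "simplified_scope"],  # assumed ID
    "unsupported_claims": ["web_search"],                             # assumed ID
}

def recommend(categories: list, risk: str = "LOW") -> list:
    """Collect interventions for the detected categories, without duplicates."""
    recs = []
    for cat in categories:
        for name in INTERVENTIONS.get(cat, []):
            if name not in recs:
                recs.append(name)
    # High-severity issues get human oversight regardless of category.
    if risk in ("HIGH", "CRITICAL") and "human_in_the_loop" not in recs:
        recs.append("human_in_the_loop")
    return recs
```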

	## Configuration

	### Character Limit
	Default: 25,000 characters per response
	```python
	CHARACTER_LIMIT = 25000
	```
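
One plausible way such a limit is enforced is to truncate long output and append a notice, keeping the total within the limit. The `truncate_response` helper and the notice text are assumptions, not taken from the server.

```python
CHARACTER_LIMIT = 25000

def truncate_response(text: str, limit: int = CHARACTER_LIMIT) -> str:
    """Return text unchanged if it fits, else cut it and append a notice."""
    if len(text) <= limit:
        return text
    notice = "\n\n[truncated]"
    # Reserve room for the notice so the result never exceeds the limit.
    return text[: limit - len(notice)] + notice
```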

	### Taxonomy Capacity
	Default: 1,000 evidence entries
	```python
	MAX_EVIDENCE_ENTRIES = 1000
	```

	### Detection Sensitivity
	Adjust pattern matching and confidence thresholds in detection functions:
	```python
	from typing import Any, Dict

	def detect_math_physics_speculation(text: str) -> Dict[str, Any]:
	    # Modify patterns or confidence calculations
	    ...
	```

	## Integration Examples

	### Claude Desktop App

	Add to your `claude_desktop_config.json`:

	```json
	{
	  "mcpServers": {
	    "togmal": {
	      "command": "python",
	      "args": ["/path/to/togmal_mcp.py"]
	    }
	  }
	}
	```

	### CLI Testing

	```bash
	# Run the server
	python togmal_mcp.py

	# In another terminal, test with MCP inspector
	npx @modelcontextprotocol/inspector python togmal_mcp.py
	```

	### Programmatic Usage

	```python
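	from mcp import ClientSession, StdioServerParameters
	from mcp.client.stdio import stdio_client

	# Launch the server as a subprocess and communicate with it over stdio
	server_params = StdioServerParameters(command="python", args=["togmal_mcp.py"])

	async def analyze_prompt(prompt: str):
	    async with stdio_client(server_params) as (read, write):
	        async with ClientSession(read, write) as session:
	            await session.initialize()
	            return await session.call_tool(
	                "togmal_analyze_prompt",
	                {"prompt": prompt, "response_format": "json"},
	            )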
	```

	## Architecture

	### Design Principles

	1. Privacy First: No external API calls, all processing local
	2. Deterministic: Heuristic-based detection for reproducibility
	3. Low Latency: Fast pattern matching for real-time use
	4. Extensible: Easy to add new patterns and categories
	5. Human-Centered: Always allows human override and judgment

	### Future Enhancements

	The system is designed for progressive enhancement:

	1. Phase 1 (Current): Heuristic pattern matching
	2. Phase 2 (Planned): Traditional ML models (clustering, anomaly detection)
	3. Phase 3 (Future): Federated learning from submitted evidence
	4. Phase 4 (Advanced): Custom fine-tuned models for specific domains

	### Data Flow

	```
	User Prompt
	    ↓
	togmal_analyze_prompt
	    ↓
	Detection Heuristics (parallel)
	    ├── Math/Physics
	    ├── Medical Advice
	    ├── File Operations
	    ├── Vibe Coding
	    └── Unsupported Claims
	    ↓
	Risk Calculation
	    ↓
	Intervention Recommendations
	    ↓
	Response to Client
	```
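
The flow above can be sketched end to end with two toy detectors. Everything here is a stand-in: the detector functions, the category keys, and the risk and intervention rules are placeholders meant only to show how the stages connect. The detectors are independent of one another, which is what allows them to run in parallel; this sketch simply runs them in a loop.

```python
import re

def file_ops(text: str) -> float:
    """Toy detector: confidence 1.0 if an rm -rf command appears."""
    return 1.0 if re.search(r"\brm\s+-rf\b", text) else 0.0

def absolute_claims(text: str) -> float:
    """Toy detector: confidence grows with the number of absolute terms."""
    hits = len(re.findall(r"\b(?:always|never|definitely)\b", text, re.IGNORECASE))
    return min(1.0, hits / 3)

DETECTORS = {"file_operations": file_ops, "unsupported_claims": absolute_claims}

def analyze(text: str) -> dict:
    # 1. Detection heuristics: each detector scores the text on its own.
    scores = {name: detect(text) for name, detect in DETECTORS.items()}
    detected = {name: s for name, s in scores.items() if s > 0}
    # 2. Risk calculation (placeholder: highest single confidence).
    top = max(scores.values(), default=0.0)
    risk = "HIGH" if top >= 0.75 else "MODERATE" if top >= 0.25 else "LOW"
    # 3. Intervention recommendations (placeholder rule).
    interventions = ["human_in_the_loop"] if risk == "HIGH" else []
    # 4. Response to client.
    return {"detected": detected, "risk": risk, "interventions": interventions}
```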

	## Contributing

	### Adding New Detection Patterns

	1. Create a new detection function:
	```python
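	import re
	from typing import Any, Dict

	def detect_new_category(text: str) -> Dict[str, Any]:
	    # Placeholder regex patterns, grouped by subcategory
	    patterns = {
	        'subcategory1': [r'pattern1', r'pattern2'],
	        'subcategory2': [r'pattern3'],
	    }
	    # Collect every subcategory with at least one matching pattern
	    categories = [
	        name for name, regexes in patterns.items()
	        if any(re.search(rx, text, re.IGNORECASE) for rx in regexes)
	    ]
	    return {
	        'detected': bool(categories),
	        'categories': categories,
	        # Illustrative confidence: fraction of subcategories matched
	        'confidence': len(categories) / len(patterns),
	    }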
	```

	2. Add to CategoryType enum
	3. Update analysis functions to include new detector
	4. Add intervention recommendations if needed
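
Steps 2 and 3 might look roughly like the sketch below. The enum members shown are limited to the category identifiers that appear in this README plus a hypothetical `NEW_CATEGORY`; the decorator-based `DETECTORS` registry and `run_all` are assumptions about one way to wire a detector in, not the server's actual structure.

```python
import re
from enum import Enum

class CategoryType(str, Enum):
    MATH_PHYSICS_SPECULATION = "math_physics_speculation"
    UNGROUNDED_MEDICAL_ADVICE = "ungrounded_medical_advice"
    DANGEROUS_FILE_OPERATIONS = "dangerous_file_operations"
    NEW_CATEGORY = "new_category"  # step 2: the newly added member (hypothetical)

DETECTORS = {}  # hypothetical registry: category -> detector function

def register(category: CategoryType):
    """Decorator that records a detector under its category."""
    def wrap(fn):
        DETECTORS[category] = fn
        return fn
    return wrap

@register(CategoryType.NEW_CATEGORY)
def detect_new_category(text: str) -> dict:
    detected = bool(re.search(r"pattern1", text))
    return {"detected": detected, "confidence": 1.0 if detected else 0.0}

def run_all(text: str) -> dict:
    """Step 3: the analysis loop picks up every registered detector."""
    return {cat.value: fn(text) for cat, fn in DETECTORS.items()}
```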

	### Submitting Evidence

	Use the `togmal_submit_evidence` tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone.

	## Limitations

	### Current Constraints

	- Heuristic-Based: May have false positives/negatives
	- English-Only: Patterns optimized for English text
	- Context-Free: Doesn't understand full conversation history
	- No Learning: Detection rules are static until updated

	### Not a Replacement For

	- Professional judgment in critical domains (medicine, law, etc.)
	- Comprehensive code review
	- Security auditing
	- Safety testing in production systems

	## License

	MIT License - See LICENSE file for details

	## Support

	For issues, questions, or contributions:
	- Open an issue on GitHub
	- Submit evidence through the MCP tool
	- Contact: [Your contact information]

	## Citation

	If you use ToGMAL in your research or product, please cite:

	```bibtex
	@software{togmal_mcp,
	title={ToGMAL: Taxonomy of Generative Model Apparent Limitations},
	author={[Your Name]},
	year={2025},
	url={https://github.com/[your-repo]/togmal-mcp}
	}
	```

	## Acknowledgments

	Built using:
	- [Model Context Protocol](https://modelcontextprotocol.io)
	- [FastMCP](https://github.com/modelcontextprotocol/python-sdk)
	- [Pydantic](https://docs.pydantic.dev)

	Inspired by the need for safer, more grounded AI interactions.