# 🔄 ML MIGRATION PLAN: BACKEND → MODELS

> Migration Planning Document
> Status: In Progress - January 2025
> Goal: Separate ML responsibilities from the multi-agent system

---

## 📊 PRE-MIGRATION ANALYSIS

### ML CODE IN THE CURRENT BACKEND
- Total: 7,004 lines across 13 modules in `src/ml/`
- Functionality: Complete, working ML pipeline
- Integration: Imported directly by the 16 agents
- Status: Production-ready, but coupled to the backend

### CIDADAO.AI-MODELS STATUS
- Repository: Created with complete MLOps documentation
- Code: Only a placeholder `main.py` (16 lines)
- Documentation: 654 lines of technical specification
- Readiness: Ready to receive the ML migration

---

## 🎯 MIGRATION STRATEGY

### APPROACH: PROGRESSIVE MIGRATION
1. ✅ Do not break the backend's current behavior
2. ✅ Migrate code gradually, testing at every step
3. ✅ Maintain compatibility during the transition
4. ✅ Implement a local fallback for when the Models API is unavailable

---

## 📋 PHASE 1: STRUCTURE SETUP (TODAY)

### 1.1 Create Base Structure
```text
cidadao.ai-models/
├── src/
│   ├── __init__.py
│   ├── models/                    # Core ML models
│   │   ├── __init__.py
│   │   ├── anomaly_detection/     # Anomaly detection pipeline
│   │   ├── pattern_analysis/      # Pattern recognition
│   │   ├── spectral_analysis/     # Frequency domain analysis
│   │   └── core/                  # Base classes and utilities
│   ├── training/                  # Training infrastructure
│   │   ├── __init__.py
│   │   ├── pipelines/             # Training pipelines
│   │   ├── configs/               # Training configurations
│   │   └── utils/                 # Training utilities
│   ├── inference/                 # Model serving
│   │   ├── __init__.py
│   │   ├── api_server.py          # FastAPI inference server
│   │   ├── batch_processor.py     # Batch inference
│   │   └── streaming.py           # Real-time inference
│   └── deployment/                # Deployment tools
│       ├── __init__.py
│       ├── huggingface/           # HF Hub integration
│       ├── docker/                # Containerization
│       └── monitoring/            # ML monitoring
├── tests/
│   ├── __init__.py
│   ├── unit/                      # Unit tests
│   ├── integration/               # Integration tests
│   └── e2e/                       # End-to-end tests
├── configs/                       # Model configurations
├── notebooks/                     # Jupyter experiments
├── datasets/                      # Dataset management
├── requirements.txt               # Dependencies
├── setup.py                       # Package setup
└── README.md                      # Documentation
```
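The layout above could be scaffolded with a short script. This is a minimal sketch rather than part of the original plan: the `LAYOUT` table and the `scaffold` function are hypothetical names, and which directories receive an `__init__.py` is read off the tree (only those shown with one).

```python
from pathlib import Path

# Directories from the tree above. True marks the ones the tree shows
# with an __init__.py; the others are created as plain directories.
LAYOUT = {
    "src": True,
    "src/models": True,
    "src/models/anomaly_detection": False,
    "src/models/pattern_analysis": False,
    "src/models/spectral_analysis": False,
    "src/models/core": False,
    "src/training": True,
    "src/training/pipelines": False,
    "src/training/configs": False,
    "src/training/utils": False,
    "src/inference": True,
    "src/deployment": True,
    "src/deployment/huggingface": False,
    "src/deployment/docker": False,
    "src/deployment/monitoring": False,
    "tests": True,
    "tests/unit": False,
    "tests/integration": False,
    "tests/e2e": False,
    "configs": False,
    "notebooks": False,
    "datasets": False,
}

# Top-level files listed in the tree; created empty if missing.
TOP_LEVEL_FILES = ("requirements.txt", "setup.py", "README.md")


def scaffold(root: Path) -> list[Path]:
    """Create the directory tree under `root` and return the directories made."""
    created = []
    for relative, is_package in LAYOUT.items():
        directory = root / relative
        directory.mkdir(parents=True, exist_ok=True)
        if is_package:
            (directory / "__init__.py").touch()
        created.append(directory)
    for name in TOP_LEVEL_FILES:
        (root / name).touch()
    return created
```

Calling `scaffold(Path("cidadao.ai-models"))` is idempotent: existing directories and files are left in place.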

### 1.2 Configure Dependencies
```text
# requirements.txt
torch>=2.0.0
transformers>=4.36.0
scikit-learn>=1.3.2
pandas>=2.1.4
numpy>=1.26.3
fastapi>=0.104.0
uvicorn>=0.24.0
huggingface-hub>=0.19.0
mlflow>=2.8.0
wandb>=0.16.0
```

---

## 📋 PHASE 2: MODULE MIGRATION (NEXT WEEK)

### 2.1 Migration Mapping
```python
# File migration: backend → models
MIGRATION_MAP = {
    # Core ML modules
    "src/ml/anomaly_detector.py": "src/models/anomaly_detection/detector.py",
    "src/ml/pattern_analyzer.py": "src/models/pattern_analysis/analyzer.py",
    "src/ml/spectral_analyzer.py": "src/models/spectral_analysis/analyzer.py",
    "src/ml/models.py": "src/models/core/base_models.py",

    # Training pipeline
    "src/ml/training_pipeline.py": "src/training/pipelines/training.py",
    "src/ml/advanced_pipeline.py": "src/training/pipelines/advanced.py",
    "src/ml/data_pipeline.py": "src/training/pipelines/data.py",

    # HuggingFace integration
    "src/ml/hf_cidadao_model.py": "src/models/core/hf_model.py",
    "src/ml/hf_integration.py": "src/deployment/huggingface/integration.py",
    "src/ml/cidadao_model.py": "src/models/core/cidadao_model.py",

    # API and serving
    "src/ml/model_api.py": "src/inference/api_server.py",
    "src/ml/transparency_benchmark.py": "src/models/evaluation/benchmark.py",
}
```
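A mapping like this can be applied mechanically. The sketch below is one possible helper, not something the plan prescribes: `migrate_files` is a hypothetical name, and it copies rather than moves files on the assumption that the backend must keep its own `src/ml/` copy for the local fallback. With `dry_run=True` it only reports what would happen.

```python
import shutil
from pathlib import Path
from typing import Dict, List, Tuple


def migrate_files(
    mapping: Dict[str, str],
    backend_root: Path,
    models_root: Path,
    dry_run: bool = True,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Copy each mapped file from the backend repo into the models repo.

    Returns (handled, missing): the (source, destination) pairs that were
    copied (or would be, in a dry run) and the sources that do not exist.
    """
    handled, missing = [], []
    for source_rel, dest_rel in mapping.items():
        source = Path(backend_root) / source_rel
        dest = Path(models_root) / dest_rel
        if not source.is_file():
            missing.append(source_rel)
            continue
        if not dry_run:
            # Create the target package directory on demand.
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, dest)
        handled.append((source_rel, dest_rel))
    return handled, missing
```

Running it first with `dry_run=True` gives a quick check that every key in `MIGRATION_MAP` points at a file that actually exists in the backend.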

### 2.2 Import Refactoring
```python
# Before (current backend)
from src.ml.anomaly_detector import AnomalyDetector
from src.ml.pattern_analyzer import PatternAnalyzer

# After (models repo)
from cidadao_models.models.anomaly_detection import AnomalyDetector
from cidadao_models.models.pattern_analysis import PatternAnalyzer
```
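The before/after rewrite can be automated for simple `from ... import ...` lines. The following is a rough sketch under stated assumptions: `IMPORT_MAP` and `rewrite_imports` are hypothetical names, only the two modules shown above are mapped, and the new paths only resolve if each new package's `__init__.py` re-exports its class, which the plan implies but does not show. Multi-line or aliased imports would need a real parser such as the standard `ast` module.

```python
import re

# Old module path -> new package path, taken from the example above.
IMPORT_MAP = {
    "src.ml.anomaly_detector": "cidadao_models.models.anomaly_detection",
    "src.ml.pattern_analyzer": "cidadao_models.models.pattern_analysis",
}

# Captures: leading "from ", the dotted module path, and " import <names>".
_FROM_IMPORT = re.compile(r"^(\s*from\s+)([\w.]+)(\s+import\s+.+)$")


def rewrite_imports(source: str) -> str:
    """Rewrite single-line `from <old> import ...` statements found in IMPORT_MAP."""
    rewritten = []
    for line in source.splitlines():
        match = _FROM_IMPORT.match(line)
        if match and match.group(2) in IMPORT_MAP:
            line = match.group(1) + IMPORT_MAP[match.group(2)] + match.group(3)
        rewritten.append(line)
    # splitlines() drops the final newline, so restore it when present.
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(rewritten) + trailing
```

Lines whose module is not in `IMPORT_MAP` pass through unchanged, so the function is safe to run over whole files.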

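### 2.3 Configure Package
```python
# setup.py
from setuptools import setup, find_packages

# Expose src/ as the top-level package `cidadao_models`, so that imports such as
# `cidadao_models.models.anomaly_detection` resolve after installation.
SUBPACKAGES = find_packages(where="src")

setup(
    name="cidadao-ai-models",
    version="1.0.0",
    description="ML models for Cidadão.AI transparency analysis",
    packages=["cidadao_models"] + [f"cidadao_models.{name}" for name in SUBPACKAGES],
    package_dir={"cidadao_models": "src"},
    install_requires=[
        "torch>=2.0.0",
        "transformers>=4.36.0",
        "scikit-learn>=1.3.2",
        # ... other dependencies
    ],
    python_requires=">=3.11",
)
```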

---

## 📋 PHASE 3: INFERENCE SERVER (WEEK 2)

### 3.1 Dedicated API Server
```python
# src/inference/api_server.py
from typing import Any, Dict, List

from fastapi import Body, FastAPI, HTTPException

from cidadao_models.models.anomaly_detection import AnomalyDetector
from cidadao_models.models.pattern_analysis import PatternAnalyzer

app = FastAPI(title="Cidadão.AI Models API")

# Initialize models once at import time so every request reuses them
anomaly_detector = AnomalyDetector()
pattern_analyzer = PatternAnalyzer()


@app.post("/v1/detect-anomalies")
async def detect_anomalies(
    # embed=True makes the expected body {"contracts": [...]}, matching the backend client.
    # Plain dicts stand in for a typed Contract schema, which this plan does not define.
    contracts: List[Dict[str, Any]] = Body(..., embed=True),
):
    """Detect anomalies in government contracts"""
    try:
        # Assumes the migrated analyze() is a coroutine
        results = await anomaly_detector.analyze(contracts)
        return {"anomalies": results, "model_version": "1.0.0"}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e)) from e


@app.post("/v1/analyze-patterns")
async def analyze_patterns(data: Dict[str, Any]):
    """Analyze patterns in government data"""
    try:
        patterns = await pattern_analyzer.analyze(data)
        # 0.87 is hard-coded in this plan, not computed by the analyzer
        return {"patterns": patterns, "confidence": 0.87}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e)) from e


@app.get("/health")
async def health_check():
    return {"status": "healthy", "models_loaded": True}
```

### 3.2 Client in the Backend
```python
# backend/src/tools/models_client.py
import httpx
from typing import List, Dict, Any


class ModelsClient:
    """Client for cidadao.ai-models API"""

    def __init__(self, base_url: str = "http://localhost:8001"):
        self.base_url = base_url
        self.client = httpx.AsyncClient(timeout=30.0)

    async def detect_anomalies(self, contracts: List[Dict]) -> Dict[str, Any]:
        """Call anomaly detection API"""
        try:
            response = await self.client.post(
                f"{self.base_url}/v1/detect-anomalies",
                json={"contracts": contracts},
            )
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError:
            # httpx.HTTPError covers connection failures (RequestError) and the
            # HTTPStatusError raised by raise_for_status() on 4xx/5xx responses.
            # Fallback to local processing if models API unavailable
            return await self._local_anomaly_detection(contracts)

    async def _local_anomaly_detection(self, contracts: List[Dict]) -> Dict[str, Any]:
        """Fallback local processing"""
        # Import local ML lazily, only when the models API is unavailable
        from src.ml.anomaly_detector import AnomalyDetector
        detector = AnomalyDetector()
        # NOTE: the result should have the same shape as the API response
        return detector.analyze(contracts)
```

---

## 📋 PHASE 4: AGENT INTEGRATION (WEEK 3)

### 4.1 Update the Zumbi Agent
```python
# backend/src/agents/zumbi.py - BEFORE
from src.ml.anomaly_detector import AnomalyDetector
from src.ml.spectral_analyzer import SpectralAnalyzer

class InvestigatorAgent(BaseAgent):
    def __init__(self):
        self.anomaly_detector = AnomalyDetector()
        self.spectral_analyzer = SpectralAnalyzer()

# backend/src/agents/zumbi.py - AFTER
from src.tools.models_client import ModelsClient

class InvestigatorAgent(BaseAgent):
    def __init__(self):
        self.models_client = ModelsClient()
        # Local fallback, created lazily if needed
        self._local_detector = None

    async def investigate(self, contracts):
        # Try the Models API first
        try:
            results = await self.models_client.detect_anomalies(contracts)
            return results
        except Exception:
            # Fall back to local processing
            if self._local_detector is None:
                from src.ml.anomaly_detector import AnomalyDetector
                self._local_detector = AnomalyDetector()
            return self._local_detector.analyze(contracts)
```

### 4.2 Hybrid Configuration
```python
# backend/src/core/config.py - add these fields
class Settings(BaseSettings):
    # ... existing settings ...

    # Models API configuration
    models_api_enabled: bool = Field(default=True, description="Enable models API")
    models_api_url: str = Field(default="http://localhost:8001", description="Models API URL")
    models_api_timeout: int = Field(default=30, description="API timeout seconds")
    models_fallback_local: bool = Field(default=True, description="Use local ML as fallback")
```
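The plan defines these flags but does not show how they combine. One possible reading, sketched below with standard-library types only, is that the flags decide which backends to try and in what order. `Backend`, `ModelsFlags`, and `plan_backends` are hypothetical names, `ModelsFlags` is a stand-in for the real `Settings` class, and treating local ML as the only option when the API is disabled is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Backend(Enum):
    REMOTE = "remote"  # the Models API over HTTP
    LOCAL = "local"    # the backend's own src/ml code


@dataclass(frozen=True)
class ModelsFlags:
    """Stand-in for the two boolean settings above (not the real Settings class)."""
    models_api_enabled: bool = True
    models_fallback_local: bool = True


def plan_backends(flags: ModelsFlags) -> List[Backend]:
    """Return the backends to try, in order, for one ML request."""
    order = []
    if flags.models_api_enabled:
        order.append(Backend.REMOTE)
    # Assumption: with the API disabled, local ML is the only remaining option,
    # so it is used even when the fallback flag is off.
    if flags.models_fallback_local or not flags.models_api_enabled:
        order.append(Backend.LOCAL)
    return order
```

With the defaults, the result is `[Backend.REMOTE, Backend.LOCAL]`, which matches the "API first, local fallback" behavior described for the agents.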

---

## 📋 PHASE 5: DEPLOYMENT (WEEK 4)

### 5.1 Docker Models
```dockerfile
# cidadao.ai-models/Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy source code
COPY src/ ./src/
COPY setup.py .
RUN pip install -e .

# Expose port
EXPOSE 8001

# Run inference server
CMD ["uvicorn", "src.inference.api_server:app", "--host", "0.0.0.0", "--port", "8001"]
```

### 5.2 Docker Compose Integration
```yaml
# docker-compose.yml (in the backend repo)
version: '3.8'

services:
  cidadao-backend:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - cidadao-models
    environment:
      - MODELS_API_URL=http://cidadao-models:8001

  cidadao-models:
    build: ../cidadao.ai-models
    ports:
      - "8001:8001"
    environment:
      - MODEL_CACHE_SIZE=1000
```

### 5.3 HuggingFace Spaces
```python
# cidadao.ai-models/spaces_app.py
import gradio as gr
from src.models.anomaly_detection import AnomalyDetector
from src.models.pattern_analysis import PatternAnalyzer

detector = AnomalyDetector()
analyzer = PatternAnalyzer()

def analyze_contract(contract_text):
    """Analyze contract for anomalies"""
    result = detector.analyze_text(contract_text)
    return {
        "anomaly_score": result.score,
        "risk_level": result.risk_level,
        "explanation": result.explanation
    }

# Gradio interface
with gr.Blocks(title="Cidadão.AI Models Demo") as demo:
    gr.Markdown("# 🤖 Cidadão.AI - Modelos de Transparência")

    with gr.Row():
        input_text = gr.Textbox(
            label="Texto do Contrato",
            placeholder="Cole aqui o texto do contrato para análise..."
        )

    analyze_btn = gr.Button("Analisar Anomalias")

    with gr.Row():
        output = gr.JSON(label="Resultado da Análise")

    analyze_btn.click(analyze_contract, inputs=input_text, outputs=output)

if __name__ == "__main__":
    demo.launch()
```

---

## 🔄 INTEGRATION BETWEEN REPOSITORIES

### API-BASED COMMUNICATION
```text
# Flow: Backend → Models
1. A backend agent needs an ML analysis
2. It calls the Models API over HTTP
3. Models processes the request and returns the result
4. The backend integrates the result into its response
5. Local fallback if Models is unavailable
```
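The five steps above can be sketched as a single coroutine. This is an illustration under stated assumptions, not the plan's implementation: `analyze_with_fallback`, the `source` field, and the two callables are hypothetical, and the built-in `ConnectionError` and `TimeoutError` stand in for transport failures (with the `httpx` client from Phase 3, the `except` clause would catch `httpx.HTTPError` instead).

```python
from typing import Any, Awaitable, Callable, Dict, List

Contracts = List[Dict[str, Any]]


async def analyze_with_fallback(
    contracts: Contracts,
    remote: Callable[[Contracts], Awaitable[Dict[str, Any]]],
    local: Callable[[Contracts], Dict[str, Any]],
) -> Dict[str, Any]:
    """Try the Models API first; use local ML if the call fails."""
    try:
        # Steps 2-3: call the Models API and wait for its result.
        result = await remote(contracts)
        source = "models-api"
    except (ConnectionError, TimeoutError):
        # Step 5: the API is unreachable, so process locally instead.
        result = local(contracts)
        source = "local-fallback"
    # Step 4: hand the result back, tagged with where it came from.
    return {**result, "source": source}
```

Recording the source in the response is a design choice that makes it visible, in logs or metrics, how often the backend is actually running on the fallback path.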

### INDEPENDENT VERSIONING
```text
# cidadao.ai-models releases
v1.0.0: "Initial anomaly detection model"
v1.1.0: "Pattern analysis improvements"
v1.2.0: "New corruption detection model"

# cidadao.ai-backend depends on models
requirements.txt:
cidadao-ai-models>=1.0.0,<2.0.0
```
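The constraint `>=1.0.0,<2.0.0` accepts every 1.x release and rejects the next major version. A minimal sketch of that check follows, assuming plain `MAJOR.MINOR.PATCH` version strings; `parse_version` and `is_compatible` are hypothetical helpers, and in real code the `packaging` library's `SpecifierSet` is the usual tool for this.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse 'v1.2.0' or '1.2.0' into (1, 2, 0). Pre-release tags are not handled."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_compatible(version: str, minimum: str = "1.0.0", below: str = "2.0.0") -> bool:
    """True when minimum <= version < below, mirroring '>=1.0.0,<2.0.0'."""
    return parse_version(minimum) <= parse_version(version) < parse_version(below)
```

Under this rule all three releases listed above (`v1.0.0`, `v1.1.0`, `v1.2.0`) satisfy the backend's requirement, while a future `v2.0.0` would not.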

---

## 📊 EXECUTION SCHEDULE

### WEEK 1: Setup & Structure
- [ ] Create the complete cidadao.ai-models structure
- [ ] Configure requirements and setup.py
- [ ] Migrate the first module (anomaly_detector.py)
- [ ] Test imports and basic functionality

### WEEK 2: Core Migration
- [ ] Migrate all 13 ML modules
- [ ] Refactor imports and dependencies
- [ ] Implement a basic API server
- [ ] Create the client in the backend

### WEEK 3: Agent Integration
- [ ] Update Zumbi to use the Models API
- [ ] Implement the local fallback
- [ ] Test the complete integration
- [ ] Update the documentation

### WEEK 4: Deploy & Production
- [ ] Docker containerization
- [ ] Deploy to HuggingFace Spaces
- [ ] Monitoring and metrics
- [ ] Load and performance tests

---

## ✅ SUCCESS CRITERIA

### FUNCTIONAL
- [ ] Backend keeps working without interruption
- [ ] Models API responds in <500ms
- [ ] Local fallback works when the API is unavailable
- [ ] All agents use the new architecture

### NON-FUNCTIONAL
- [ ] Performance equal to or better than the current system
- [ ] Repositories can be deployed independently
- [ ] Documentation updated
- [ ] Tests covering >80% of the migrated code
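
The `<500ms` response criterion needs a concrete measurement to be checkable. The sketch below shows one way to do it, with assumptions marked: the plan does not say which statistic the target applies to, so the 95th percentile is an assumed choice, and `measure_latencies_ms`, `percentile`, and `meets_latency_target` are hypothetical helpers.

```python
import math
import time
from typing import Callable, List


def measure_latencies_ms(call: Callable[[], object], runs: int = 20) -> List[float]:
    """Time `call` repeatedly and return each duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def meets_latency_target(
    samples: List[float], target_ms: float = 500.0, pct: float = 95.0
) -> bool:
    """True when the chosen percentile is under the target (p95 is an assumption)."""
    return percentile(samples, pct) < target_ms
```

In a load test, `call` would wrap one request to the Models API, and the boolean result could gate the Week 4 performance checklist item.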

---

## 🎯 IMMEDIATE NEXT STEP

START PHASE 1 NOW: Create the base structure in cidadao.ai-models and migrate the first module to validate the approach.

Shall we begin?