ming committed
Commit 76c1e68 · 1 Parent(s): 045076b

docs: add comprehensive cloud deployment guide


- Add detailed DEPLOYMENT.md with 5 deployment options
- Add railway.json for Railway deployment
- Add env.example for environment variables
- Update README with quick deployment instructions
- Include Railway, Google Cloud Run, AWS ECS, DigitalOcean, and Render options
- Add important considerations for memory, costs, and security

Files changed (4)
  1. DEPLOYMENT.md +267 -0
  2. README.md +26 -14
  3. env.example +22 -0
  4. railway.json +13 -0
DEPLOYMENT.md ADDED
@@ -0,0 +1,267 @@
+ # Cloud Deployment Guide
+
+ This guide covers multiple options for deploying your text summarizer backend to the cloud.
+
+ ## 🚀 **Option 1: Railway (Recommended - Easiest)**
+
+ Railway is a good fit for this project because it supports Docker Compose and persistent volumes.
+
+ ### Steps:
+
+ 1. **Create a Railway Account**
+ ```bash
+ # Install the Railway CLI
+ npm install -g @railway/cli
+
+ # Login
+ railway login
+ ```
+
+ 2. **Deploy from GitHub**
+ - Go to [railway.app](https://railway.app)
+ - Connect your GitHub repository
+ - Select your `SummerizerApp` repository
+ - Railway will automatically detect `docker-compose.yml`
+
+ 3. **Set Environment Variables**
+ In the Railway dashboard, add these environment variables:
+ ```
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ OLLAMA_TIMEOUT=30
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ LOG_LEVEL=INFO
+ ```
+
+ 4. **Deploy**
+ ```bash
+ # Alternatively, deploy via the CLI
+ railway up
+ ```
+
+ ### Railway Advantages:
+ - ✅ Supports Docker Compose
+ - ✅ Persistent volumes for Ollama models
+ - ✅ Automatic HTTPS
+ - ✅ Easy environment variable management
+ - ✅ Built-in monitoring
+
+ ---
+
+ ## ☁️ **Option 2: Google Cloud Run**
+
+ ### Steps:
+
+ 1. **Build and Push to Google Container Registry**
+ ```bash
+ # Set up the gcloud CLI
+ gcloud auth login
+ gcloud config set project YOUR_PROJECT_ID
+
+ # Build and push
+ docker build -t gcr.io/YOUR_PROJECT_ID/summarizer-backend .
+ docker push gcr.io/YOUR_PROJECT_ID/summarizer-backend
+ ```
+
+ 2. **Deploy with Cloud Run**
+ ```bash
+ gcloud run deploy summarizer-backend \
+   --image gcr.io/YOUR_PROJECT_ID/summarizer-backend \
+   --platform managed \
+   --region us-central1 \
+   --allow-unauthenticated \
+   --memory 4Gi \
+   --cpu 2 \
+   --timeout 300 \
+   --set-env-vars OLLAMA_MODEL=llama3.1:8b,SERVER_HOST=0.0.0.0,SERVER_PORT=8000
+ ```
+
+ ### Cloud Run Advantages:
+ - ✅ Serverless scaling
+ - ✅ Pay per request
+ - ✅ Global CDN
+ - ✅ Integrated with Google Cloud
+
+ ---
+
+ ## 🐳 **Option 3: AWS ECS with Fargate**
+
+ ### Steps:
+
+ 1. **Create an ECR Repository**
+ ```bash
+ aws ecr create-repository --repository-name summarizer-backend
+ ```
+
+ 2. **Build and Push**
+ ```bash
+ # Get a login token
+ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
+
+ # Build and push
+ docker build -t summarizer-backend .
+ docker tag summarizer-backend:latest YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
+ docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
+ ```
+
+ 3. **Create an ECS Task Definition**
+ ```json
+ {
+   "family": "summarizer-backend",
+   "networkMode": "awsvpc",
+   "requiresCompatibilities": ["FARGATE"],
+   "cpu": "2048",
+   "memory": "4096",
+   "executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT:role/ecsTaskExecutionRole",
+   "containerDefinitions": [
+     {
+       "name": "summarizer-backend",
+       "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest",
+       "portMappings": [
+         { "containerPort": 8000, "protocol": "tcp" }
+       ],
+       "environment": [
+         { "name": "OLLAMA_MODEL", "value": "llama3.1:8b" },
+         { "name": "SERVER_HOST", "value": "0.0.0.0" },
+         { "name": "SERVER_PORT", "value": "8000" }
+       ],
+       "logConfiguration": {
+         "logDriver": "awslogs",
+         "options": {
+           "awslogs-group": "/ecs/summarizer-backend",
+           "awslogs-region": "us-east-1",
+           "awslogs-stream-prefix": "ecs"
+         }
+       }
+     }
+   ]
+ }
+ ```
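Before registering the task definition, a quick local sanity check of the JSON can catch typos. A minimal sketch — the key list mirrors the example above and is not the full ECS schema, and `check_task_definition` is a hypothetical helper:

```python
import json

# Top-level keys the Fargate example above relies on; this is a local
# sanity check only, not the full ECS task-definition schema.
REQUIRED_TOP_LEVEL = ["family", "networkMode", "requiresCompatibilities",
                      "cpu", "memory", "containerDefinitions"]

def check_task_definition(raw: str):
    """Return a list of missing keys (an empty list means it looks OK)."""
    task = json.loads(raw)
    missing = [k for k in REQUIRED_TOP_LEVEL if k not in task]
    for container in task.get("containerDefinitions", []):
        for key in ("name", "image", "portMappings"):
            if key not in container:
                missing.append(f"containerDefinitions[].{key}")
    return missing

print(check_task_definition('{"family": "summarizer-backend", "networkMode": "awsvpc"}'))
# → ['requiresCompatibilities', 'cpu', 'memory', 'containerDefinitions']
```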
+
+ ---
+
+ ## 🌊 **Option 4: DigitalOcean App Platform**
+
+ ### Steps:
+
+ 1. **Create an App Spec**
+ ```yaml
+ # .do/app.yaml
+ name: summarizer-backend
+ services:
+   - name: api
+     source_dir: /
+     github:
+       repo: MingLu0/SummarizerBackend
+       branch: main
+     run_command: uvicorn app.main:app --host 0.0.0.0 --port 8080
+     environment_slug: python
+     instance_count: 1
+     instance_size_slug: basic-xxl
+     http_port: 8080
+     envs:
+       - key: OLLAMA_MODEL
+         value: llama3.1:8b
+       - key: SERVER_HOST
+         value: 0.0.0.0
+       - key: SERVER_PORT
+         value: 8080
+ ```
+
+ 2. **Deploy**
+ ```bash
+ doctl apps create --spec .do/app.yaml
+ ```
+
+ ---
+
+ ## 🔧 **Option 5: Render (Simple)**
+
+ ### Steps:
+
+ 1. **Connect Your GitHub Repository**
+ - Go to [render.com](https://render.com)
+ - Connect your GitHub account
+ - Select your repository
+
+ 2. **Create a Web Service**
+ - Choose "Web Service"
+ - Select your repository
+ - Use these settings:
+ ```
+ Build Command: docker-compose build
+ Start Command: docker-compose up
+ Environment: Docker
+ ```
+
+ 3. **Set Environment Variables**
+ ```
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ ```
+
+ ---
+
+ ## ⚠️ **Important Considerations**
+
+ ### **Model Download in Cloud**
+ Your Ollama models need to be downloaded after deployment. Add this to your deployment:
+
+ ```bash
+ # Add to docker-compose.yml or a startup script
+ ollama pull llama3.1:8b
+ ```
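In a startup script, the pull can fire before the Ollama server is accepting connections, so a small wait-and-retry loop helps. A minimal sketch in Python — the `http://ollama:11434` URL matches the compose setup above, and the timings are assumptions:

```python
import time
import urllib.request
import urllib.error

def wait_for(check, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False

def ollama_ready(base_url: str = "http://ollama:11434") -> bool:
    """True once the Ollama server answers HTTP on its root path."""
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

# Usage in a startup script (hypothetical):
#   if wait_for(ollama_ready):
#       subprocess.run(["ollama", "pull", "llama3.1:8b"], check=True)
```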
+
+ ### **Memory Requirements**
+ - **llama3.1:8b** needs ~8GB RAM
+ - **mistral:7b** needs ~7GB RAM
+
+ ### **Cost Optimization**
+ - Use a smaller model for production, such as `mistral:7b`
+ - Consider using spot instances for development
+ - Monitor usage and scale accordingly
+
+ ### **Security**
+ - Enable API key authentication for production
+ - Use HTTPS (most platforms provide this automatically)
+ - Set up rate limiting
+ - Monitor logs for abuse
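For the rate-limiting item, a sliding-window counter is one simple approach. A minimal in-memory sketch, single-process only (multiple replicas would need shared state such as Redis); the class name is illustrative and the defaults mirror `RATE_LIMIT_REQUESTS`/`RATE_LIMIT_WINDOW` from `env.example`:

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

class SlidingWindowLimiter:
    """Allow at most max_requests per window_s seconds for each client key."""

    def __init__(self, max_requests: int = 60, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Evict timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Keyed by client IP or API key, this could back a request dependency that returns HTTP 429 whenever `allow()` is False.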
+
+ ---
+
+ ## 🎯 **Recommended Deployment Flow**
+
+ 1. **Start with Railway** (easiest setup)
+ 2. **Test with a smaller model** (mistral:7b)
+ 3. **Monitor performance and costs**
+ 4. **Scale up model size if needed**
+ 5. **Add security features**
+
+ ### **Quick Railway Deploy:**
+ ```bash
+ # 1. Install the Railway CLI
+ npm install -g @railway/cli
+
+ # 2. Login and deploy
+ railway login
+ railway init
+ railway up
+ ```
+
+ Your backend will be live at `https://your-app.railway.app`! 🚀
README.md CHANGED
@@ -318,29 +318,41 @@ docker run -p 8000:8000 \
 
 ## Cloud Deployment Options
 
- ### Railway
+ ### 🚀 **Quick Deploy with Railway (Recommended)**
+
 ```bash
- # Install Railway CLI
+ # 1. Install Railway CLI
 npm install -g @railway/cli
 
- # Deploy
+ # 2. Login and deploy
 railway login
 railway init
 railway up
 ```
 
- ### Render
- 1. Connect your GitHub repository
- 2. Set environment variables
- 3. Deploy automatically on push
+ **Railway Advantages:**
+ - ✅ Supports Docker Compose with Ollama
+ - ✅ Persistent volumes for models
+ - ✅ Automatic HTTPS
+ - ✅ Easy environment management
 
- ### AWS ECS
- ```bash
- # Build and push to ECR
- aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin your-account.dkr.ecr.us-east-1.amazonaws.com
- docker tag summarizer-backend:latest your-account.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
- docker push your-account.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
- ```
+ ### 📋 **Other Options**
+
+ - **Google Cloud Run**: Serverless with auto-scaling
+ - **AWS ECS**: Full container orchestration
+ - **DigitalOcean App Platform**: Simple deployment
+ - **Render**: GitHub integration
+
+ ### 📖 **Detailed Deployment Guide**
+
+ See [DEPLOYMENT.md](DEPLOYMENT.md) for comprehensive deployment instructions for all platforms.
+
+ ### ⚠️ **Important Notes**
+
+ - **Memory Requirements**: llama3.1:8b needs ~8GB RAM
+ - **Model Download**: Models are downloaded after deployment
+ - **Cost Optimization**: Start with smaller models (mistral:7b)
+ - **Security**: Enable API keys for production use
 
 ## Monitoring and Logging
 
env.example ADDED
@@ -0,0 +1,22 @@
+ # Ollama Configuration
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ OLLAMA_TIMEOUT=30
+
+ # Server Configuration
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ LOG_LEVEL=INFO
+
+ # Optional: API Security
+ API_KEY_ENABLED=false
+ API_KEY=your-secret-key-here
+
+ # Optional: Rate Limiting
+ RATE_LIMIT_ENABLED=false
+ RATE_LIMIT_REQUESTS=60
+ RATE_LIMIT_WINDOW=60
+
+ # Input Validation
+ MAX_TEXT_LENGTH=32000
+ MAX_TOKENS_DEFAULT=256
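A sketch of how the backend might load these variables with matching defaults — the `Settings` dataclass and `load_settings` helper are hypothetical, and the real app may use pydantic settings instead:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    ollama_model: str
    ollama_host: str
    ollama_timeout: int
    api_key_enabled: bool
    max_text_length: int

def load_settings(env=os.environ) -> Settings:
    """Read configuration from the environment, falling back to the
    defaults shown in env.example."""
    truthy = ("1", "true", "yes")
    return Settings(
        ollama_model=env.get("OLLAMA_MODEL", "llama3.1:8b"),
        ollama_host=env.get("OLLAMA_HOST", "http://ollama:11434"),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "30")),
        api_key_enabled=env.get("API_KEY_ENABLED", "false").lower() in truthy,
        max_text_length=int(env.get("MAX_TEXT_LENGTH", "32000")),
    )
```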
railway.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "$schema": "https://railway.app/railway.schema.json",
+   "build": {
+     "builder": "DOCKERFILE"
+   },
+   "deploy": {
+     "startCommand": "docker-compose up --build",
+     "healthcheckPath": "/health",
+     "healthcheckTimeout": 300,
+     "restartPolicyType": "ON_FAILURE",
+     "restartPolicyMaxRetries": 10
+   }
+ }