# Cloud Deployment Guide
This guide covers five options for deploying the text summarizer backend to the cloud: Railway, Google Cloud Run, AWS ECS with Fargate, DigitalOcean App Platform, and Render.
## **Option 1: Railway (Recommended - Easiest)**
Railway is perfect for this project because it supports Docker Compose and persistent volumes.
### Steps:
1. **Create Railway Account**
```bash
# Install Railway CLI
npm install -g @railway/cli
# Login
railway login
```
2. **Deploy from GitHub**
- Go to [railway.app](https://railway.app)
- Connect your GitHub repository
- Select your `SummerizerApp` repository
- Railway will automatically detect `docker-compose.yml`
3. **Set Environment Variables**
In Railway dashboard, add these environment variables:
```
OLLAMA_MODEL=llama3.1:8b
OLLAMA_HOST=http://ollama:11434
OLLAMA_TIMEOUT=30
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
LOG_LEVEL=INFO
```
4. **Deploy**
```bash
# Alternatively, deploy from the CLI instead of the dashboard
railway up
```
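The environment variables from step 3 could be read by the backend roughly as follows. This is a minimal sketch, not the project's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, and the defaults simply mirror the values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in step 3."""
    ollama_model: str
    ollama_host: str
    ollama_timeout: int  # seconds
    server_host: str
    server_port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, falling back to the values shown in this guide."""
    return Settings(
        ollama_model=env.get("OLLAMA_MODEL", "llama3.1:8b"),
        # Strip a trailing slash so paths can be appended safely later.
        ollama_host=env.get("OLLAMA_HOST", "http://ollama:11434").rstrip("/"),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "30")),
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"OLLAMA_TIMEOUT": "60", "LOG_LEVEL": "debug"})
print(settings.ollama_timeout, settings.log_level, settings.server_port)
# prints: 60 DEBUG 8000
```

Passing the mapping as a parameter keeps the loader easy to test without touching the process environment.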
### Railway Advantages:
- ✅ Supports Docker Compose
- ✅ Persistent volumes for Ollama models
- ✅ Automatic HTTPS
- ✅ Easy environment variable management
- ✅ Built-in monitoring
---
## ☁️ **Option 2: Google Cloud Run**
### Steps:
1. **Build and Push to Google Container Registry**
```bash
# Set up gcloud CLI
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
# Let Docker authenticate to the registry through gcloud
gcloud auth configure-docker
# Build and push
docker build -t gcr.io/YOUR_PROJECT_ID/summarizer-backend .
docker push gcr.io/YOUR_PROJECT_ID/summarizer-backend
```
2. **Deploy with Cloud Run**
```bash
gcloud run deploy summarizer-backend \
  --image gcr.io/YOUR_PROJECT_ID/summarizer-backend \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 4Gi \
  --cpu 2 \
  --timeout 300 \
  --port 8000 \
  --set-env-vars OLLAMA_MODEL=llama3.1:8b,SERVER_HOST=0.0.0.0,SERVER_PORT=8000
```
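Cloud Run tells the container which port to listen on through the `PORT` environment variable (8080 unless overridden at deploy time), so the server should not rely on `SERVER_PORT` alone. The sketch below shows one way to resolve the port; `resolve_port` is a hypothetical helper, not part of the project.

```python
import os
from typing import Mapping


def resolve_port(env: Mapping[str, str] = os.environ, default: int = 8000) -> int:
    """Return the port to listen on: PORT first, then SERVER_PORT, then default."""
    for name in ("PORT", "SERVER_PORT"):
        raw = env.get(name, "").strip()
        if raw:
            port = int(raw)
            if not 1 <= port <= 65535:
                raise ValueError(f"{name}={raw!r} is outside the valid port range")
            return port
    return default


# PORT (set by the platform) wins over SERVER_PORT (set by you).
print(resolve_port({"PORT": "8080", "SERVER_PORT": "8000"}))  # prints: 8080
print(resolve_port({"SERVER_PORT": "8000"}))                  # prints: 8000
print(resolve_port({}))                                       # prints: 8000
```

If the app is started from Python, the result could be passed to `uvicorn.run("app.main:app", host="0.0.0.0", port=resolve_port())`.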
### Cloud Run Advantages:
- ✅ Serverless scaling
- ✅ Pay per request
- ✅ Optional global CDN (via Cloud Load Balancing and Cloud CDN)
- ✅ Integrated with Google Cloud
---
## 🐳 **Option 3: AWS ECS with Fargate**
### Steps:
1. **Create ECR Repository**
```bash
aws ecr create-repository --repository-name summarizer-backend
```
2. **Build and Push**
```bash
# Get login token
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
# Build and push
docker build -t summarizer-backend .
docker tag summarizer-backend:latest YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
```
3. **Create ECS Task Definition**
```json
{
  "family": "summarizer-backend",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "summarizer-backend",
      "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "OLLAMA_MODEL",
          "value": "llama3.1:8b"
        },
        {
          "name": "SERVER_HOST",
          "value": "0.0.0.0"
        },
        {
          "name": "SERVER_PORT",
          "value": "8000"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/summarizer-backend",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
```
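Fargate only accepts certain CPU and memory pairings, and the task definition above requests 4096 MiB while the memory notes later in this guide put `llama3.1:8b` at roughly 8GB. The sketch below checks a task definition against the pairings for sizes up to 4 vCPU and suggests the smallest size that fits a memory target. The table reflects AWS's published Fargate task sizes as best understood here; verify it against the current AWS documentation. `check_task_size` and `smallest_size_for` are hypothetical helpers.

```python
from typing import Dict, List, Optional, Tuple

# CPU units -> allowed memory values in MiB. Limited to sizes up to 4 vCPU;
# larger Fargate sizes exist and are left out of this sketch.
FARGATE_MEMORY_BY_CPU: Dict[int, List[int]] = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def check_task_size(task_definition: dict) -> List[str]:
    """Return a list of problems with the task-level cpu/memory pairing."""
    cpu = int(task_definition["cpu"])
    memory = int(task_definition["memory"])
    allowed = FARGATE_MEMORY_BY_CPU.get(cpu)
    if allowed is None:
        return [f"cpu={cpu} is not a size covered by this table"]
    if memory not in allowed:
        return [f"memory={memory} MiB is not allowed with cpu={cpu}"]
    return []


def smallest_size_for(required_memory_mib: int) -> Optional[Tuple[int, int]]:
    """Return the smallest (cpu, memory) pair with at least the required memory."""
    for cpu in sorted(FARGATE_MEMORY_BY_CPU):
        for memory in FARGATE_MEMORY_BY_CPU[cpu]:
            if memory >= required_memory_mib:
                return cpu, memory
    return None


print(check_task_size({"cpu": "2048", "memory": "4096"}))  # prints: []
print(smallest_size_for(8192))                             # prints: (1024, 8192)
```

The pairing in the task definition is valid, but a model that needs about 8GB would call for a `memory` value of at least `8192`.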
---
## **Option 4: DigitalOcean App Platform**
### Steps:
1. **Create App Spec**
```yaml
# .do/app.yaml
name: summarizer-backend
services:
  - name: api
    source_dir: /
    github:
      repo: MingLu0/SummarizerBackend
      branch: main
    run_command: uvicorn app.main:app --host 0.0.0.0 --port 8080
    environment_slug: python
    instance_count: 1
    instance_size_slug: basic-xxl
    http_port: 8080
    envs:
      - key: OLLAMA_MODEL
        value: llama3.1:8b
      - key: SERVER_HOST
        value: 0.0.0.0
      - key: SERVER_PORT
        value: "8080"
```
2. **Deploy**
```bash
doctl apps create --spec .do/app.yaml
```
---
## 🔧 **Option 5: Render (Simple)**
### Steps:
1. **Connect GitHub Repository**
- Go to [render.com](https://render.com)
- Connect your GitHub account
- Select your repository
2. **Create Web Service**
- Choose "Web Service"
- Select your repository
- Use these settings:
```
Build Command: docker-compose build
Start Command: docker-compose up
Environment: Docker
```
3. **Set Environment Variables**
```
OLLAMA_MODEL=llama3.1:8b
OLLAMA_HOST=http://ollama:11434
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
```
---
## ⚠️ **Important Considerations**
### **Model Download in Cloud**
Ollama models must be downloaded once the Ollama service is running. Without a persistent volume, the model is downloaded again on every fresh container. Add this to your deployment:
```bash
# Add to docker-compose.yml or startup script
ollama pull llama3.1:8b
```
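The same pull can be done from the backend at startup through Ollama's REST API, which avoids needing a shell inside the Ollama container. This is a sketch under assumptions: it uses the `/api/tags` and `/api/pull` endpoints as described in Ollama's API documentation (older Ollama versions expect `name` instead of `model` in the pull request), and `ensure_model` and `model_is_present` are hypothetical helper names.

```python
import json
import urllib.request


def model_is_present(tags_payload: dict, model: str) -> bool:
    """Check a parsed /api/tags response for the given model name."""
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return model in names


def ensure_model(host: str, model: str, pull_timeout: float = 1800.0) -> bool:
    """Pull `model` if it is missing. Returns True if a pull was performed."""
    base = host.rstrip("/")
    # List the models that are already stored locally.
    with urllib.request.urlopen(f"{base}/api/tags", timeout=30) as resp:
        tags = json.load(resp)
    if model_is_present(tags, model):
        return False
    # stream=False asks for a single JSON reply once the pull finishes.
    body = json.dumps({"model": model, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{base}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=pull_timeout) as resp:
        status = json.load(resp).get("status")
    if status != "success":
        raise RuntimeError(f"pull of {model!r} ended with status {status!r}")
    return True
```

A startup hook could call `ensure_model("http://ollama:11434", "llama3.1:8b")`, using the `OLLAMA_HOST` and `OLLAMA_MODEL` values from this guide. The long default timeout is an assumption meant to cover a multi-gigabyte download.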
### **Memory Requirements**
- **llama3.1:8b** needs ~8GB RAM
- **mistral:7b** needs ~7GB RAM
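As a rough cross-check on these figures, the size of the weights can be estimated from the parameter count and the quantization level. The sketch below is back-of-the-envelope arithmetic only: the 4-bit default and the `overhead_gb` allowance for context and runtime are assumptions, and real usage depends on the quantization and context length in use.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes)."""
    # params * bits / 8 gives bytes; the 10^9 factors cancel out.
    return params_billions * bits_per_weight / 8


def estimate_ram_gb(
    params_billions: float,
    bits_per_weight: float = 4.0,  # assumed 4-bit quantization
    overhead_gb: float = 2.0,      # assumed allowance for context and runtime
) -> float:
    """Weights plus a fixed overhead allowance, in GB."""
    return estimate_weights_gb(params_billions, bits_per_weight) + overhead_gb


print(estimate_weights_gb(8, 4))   # prints: 4.0
print(estimate_weights_gb(8, 16))  # prints: 16.0
print(estimate_ram_gb(8))          # prints: 6.0
```

Under these assumptions an 8B model fits within the ~8GB figure above with some headroom, while the same model at 16-bit precision would not.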
### **Cost Optimization**
- Use a smaller model such as `mistral:7b` to reduce memory and cost
- Consider using spot instances for development
- Monitor usage and scale accordingly
### **Security**
- Enable API key authentication for production
- Use HTTPS (most platforms provide this automatically)
- Set up rate limiting
- Monitor logs for abuse
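API key checks and rate limiting can both be prototyped with the standard library before wiring them into the web framework. The sketch below is one possible approach, not the project's implementation: `api_key_is_valid` and `SlidingWindowLimiter` are hypothetical names, and the limiter keeps its state in memory, so it only works for a single process.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


def api_key_is_valid(provided: str, expected: str) -> bool:
    """Compare keys in constant time; an unset expected key never matches."""
    if not expected:
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record a request and report whether it is within the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
print([limiter.allow("client-a") for _ in range(3)])  # prints: [True, True, False]
print(api_key_is_valid("secret", "secret"))           # prints: True
```

The injectable `clock` makes the limiter testable without sleeping. With several replicas, a shared store would be needed instead of the in-memory dictionary.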
---
## 🎯 **Recommended Deployment Flow**
1. **Start with Railway** (easiest setup)
2. **Test with a smaller model** (mistral:7b)
3. **Monitor performance and costs**
4. **Scale up model size if needed**
5. **Add security features**
### **Quick Railway Deploy:**
```bash
# 1. Install Railway CLI
npm install -g @railway/cli
# 2. Login and deploy
railway login
railway init
railway up
```
Your backend will be live at a URL like `https://your-app.up.railway.app`.
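After the deploy finishes, a quick smoke test confirms that the service answers over HTTPS. This sketch assumes the backend exposes a health endpoint at `/health`, which is a hypothetical path; substitute whatever route your app actually serves. `health_url` and `is_healthy` are illustrative helper names.

```python
import urllib.request


def health_url(base_url: str, path: str = "/health") -> str:
    """Join the base URL and the (assumed) health path with a single slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def is_healthy(base_url: str, timeout: float = 10.0) -> bool:
    """Return True if the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


print(health_url("https://your-app.example/"))
# prints: https://your-app.example/health
```

Calling `is_healthy(...)` with the deployed base URL returns `False` rather than raising, which makes it easy to retry in a loop while the service is still starting.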