# Cloud Deployment Guide

This guide covers multiple options for deploying your text summarizer backend to the cloud.

## πŸš€ **Option 1: Railway (Recommended - Easiest)**

Railway is a good fit for this project because it offers persistent volumes (handy for caching Ollama models) and simple Dockerfile-based deploys.

### Steps:

1. **Create Railway Account**
   ```bash
   # Install Railway CLI
   npm install -g @railway/cli
   
   # Login
   railway login
   ```

2. **Deploy from GitHub**
   - Go to [railway.app](https://railway.app)
   - Connect your GitHub repository
   - Select your `SummerizerApp` repository
   - Railway will detect your `Dockerfile` and build from it (Railway does not run `docker-compose`; add a second Railway service for Ollama)

3. **Set Environment Variables**
   In Railway dashboard, add these environment variables:
   ```
   OLLAMA_MODEL=llama3.1:8b
   OLLAMA_HOST=http://ollama:11434
   OLLAMA_TIMEOUT=30
   SERVER_HOST=0.0.0.0
   SERVER_PORT=8000
   LOG_LEVEL=INFO
   ```

4. **Deploy**
   ```bash
   # Or deploy via CLI
   railway up
   ```
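
However you deploy, the backend has to pick these variables up at runtime. A minimal sketch of a settings loader that mirrors the variables from step 3 (the `Settings` class and its defaults here are illustrative, not the project's actual `app/core/config.py`):

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    """Illustrative env-backed settings; the real project config may differ."""
    ollama_model: str = field(default_factory=lambda: os.environ.get("OLLAMA_MODEL", "llama3.1:8b"))
    ollama_host: str = field(default_factory=lambda: os.environ.get("OLLAMA_HOST", "http://ollama:11434"))
    ollama_timeout: int = field(default_factory=lambda: int(os.environ.get("OLLAMA_TIMEOUT", "30")))
    server_host: str = field(default_factory=lambda: os.environ.get("SERVER_HOST", "0.0.0.0"))
    server_port: int = field(default_factory=lambda: int(os.environ.get("SERVER_PORT", "8000")))


settings = Settings()
```

Because the values are read at instantiation, whatever Railway injects into the container environment takes effect without code changes.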

### Railway Advantages:
- βœ… Simple Dockerfile-based deploys
- βœ… Persistent volumes for Ollama models
- βœ… Automatic HTTPS
- βœ… Easy environment variable management
- βœ… Built-in monitoring

---

## ☁️ **Option 2: Google Cloud Run**

### Steps:

1. **Build and Push to Google Container Registry**
   ```bash
   # Set up gcloud CLI
   gcloud auth login
   gcloud config set project YOUR_PROJECT_ID
   
   # Build and push
   docker build -t gcr.io/YOUR_PROJECT_ID/summarizer-backend .
   docker push gcr.io/YOUR_PROJECT_ID/summarizer-backend
   ```

2. **Deploy with Cloud Run**
   ```bash
   gcloud run deploy summarizer-backend \
     --image gcr.io/YOUR_PROJECT_ID/summarizer-backend \
     --platform managed \
     --region us-central1 \
     --allow-unauthenticated \
     --memory 4Gi \
     --cpu 2 \
     --timeout 300 \
     --set-env-vars OLLAMA_MODEL=llama3.1:8b,SERVER_HOST=0.0.0.0,SERVER_PORT=8000
   ```

### Cloud Run Advantages:
- βœ… Serverless scaling
- βœ… Pay per request
- βœ… Managed HTTPS endpoints
- βœ… Integrated with Google Cloud

---

## 🐳 **Option 3: AWS ECS with Fargate**

### Steps:

1. **Create ECR Repository**
   ```bash
   aws ecr create-repository --repository-name summarizer-backend
   ```

2. **Build and Push**
   ```bash
   # Get login token
   aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
   
   # Build and push
   docker build -t summarizer-backend .
   docker tag summarizer-backend:latest YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
   docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
   ```

3. **Create ECS Task Definition**
   ```json
   {
     "family": "summarizer-backend",
     "networkMode": "awsvpc",
     "requiresCompatibilities": ["FARGATE"],
     "cpu": "2048",
     "memory": "4096",
     "executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT:role/ecsTaskExecutionRole",
     "containerDefinitions": [
       {
         "name": "summarizer-backend",
         "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest",
         "portMappings": [
           {
             "containerPort": 8000,
             "protocol": "tcp"
           }
         ],
         "environment": [
           {
             "name": "OLLAMA_MODEL",
             "value": "llama3.1:8b"
           },
           {
             "name": "SERVER_HOST",
             "value": "0.0.0.0"
           },
           {
             "name": "SERVER_PORT",
             "value": "8000"
           }
         ],
         "logConfiguration": {
           "logDriver": "awslogs",
           "options": {
             "awslogs-group": "/ecs/summarizer-backend",
             "awslogs-region": "us-east-1",
             "awslogs-stream-prefix": "ecs"
           }
         }
       }
     ]
   }
   ```

---

## 🌊 **Option 4: DigitalOcean App Platform**

### Steps:

1. **Create App Spec**
   ```yaml
   # .do/app.yaml
   name: summarizer-backend
   services:
   - name: api
     source_dir: /
     github:
       repo: MingLu0/SummarizerBackend
       branch: main
     run_command: uvicorn app.main:app --host 0.0.0.0 --port 8080
     environment_slug: python
     instance_count: 1
     instance_size_slug: basic-xxl
     http_port: 8080
     envs:
     - key: OLLAMA_MODEL
       value: llama3.1:8b
     - key: SERVER_HOST
       value: 0.0.0.0
     - key: SERVER_PORT
       value: 8080
   ```

2. **Deploy**
   ```bash
   doctl apps create --spec .do/app.yaml
   ```

---

## πŸ”§ **Option 5: Render (Simple)**

### Steps:

1. **Connect GitHub Repository**
   - Go to [render.com](https://render.com)
   - Connect your GitHub account
   - Select your repository

2. **Create Web Service**
   - Choose "Web Service"
   - Select your repository
   - Set **Environment** to **Docker**
   - Render builds from your `Dockerfile` automatically; it does not run `docker-compose`, so deploy Ollama as a separate private service and point `OLLAMA_HOST` at it

3. **Set Environment Variables**
   ```
   OLLAMA_MODEL=llama3.1:8b
   OLLAMA_HOST=http://ollama:11434
   SERVER_HOST=0.0.0.0
   SERVER_PORT=8000
   ```

---

## ⚠️ **Important Considerations**

### **Model Download in Cloud**
Your Ollama models need to be downloaded after deployment. Add this to your deployment:

```bash
# Add to docker-compose.yml or startup script
ollama pull llama3.1:8b
```
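
On platforms where you can't edit the compose file, the same thing can be done from the app at startup: poll Ollama until it answers, then trigger the download through its HTTP API (`POST /api/pull`). A rough, stdlib-only sketch (the retry counts, host, and model name are placeholders):

```python
import json
import time
import urllib.error
import urllib.request


def wait_for_ollama(base_url: str, retries: int = 30, delay: float = 1.0) -> bool:
    """Poll the Ollama root endpoint until it responds, or give up."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(base_url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    return False


def pull_model(base_url: str, model: str) -> None:
    """Ask Ollama to download a model via POST /api/pull."""
    body = json.dumps({"name": model, "stream": False}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


# Typical startup usage (read OLLAMA_HOST/OLLAMA_MODEL from the environment in practice):
# if wait_for_ollama("http://ollama:11434"):
#     pull_model("http://ollama:11434", "llama3.1:8b")
```

Pulling at startup means the first deploy is slow but later restarts reuse the model if the volume persists.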

### **Memory Requirements**
- **llama3.1:8b** needs ~8GB RAM
- **mistral:7b** needs ~7GB RAM
- **llama3.2:3b** needs ~4GB RAM
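
These figures follow roughly from parameter count times bytes per quantized weight; a back-of-envelope estimator (the 1.5× overhead factor for KV cache and runtime is a rough assumption, and the recommended RAM above adds further OS headroom):

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: int = 4, overhead: float = 1.5) -> float:
    """Rough RAM estimate: quantized weight size times a runtime overhead factor."""
    weights_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1e9
    return round(weights_gb * overhead, 1)


print(estimate_ram_gb(8))  # prints 6.0 (an 8B model at 4-bit, before OS headroom)
```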

### **Cost Optimization**
- Use smaller models for production: `mistral:7b` or `llama3.2:3b` (note there is no `llama3.1:7b` tag)
- Consider using spot instances for development
- Monitor usage and scale accordingly

### **Security**
- Enable API key authentication for production
- Use HTTPS (most platforms provide this automatically)
- Set up rate limiting
- Monitor logs for abuse
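
For the API-key point, the core check is a constant-time comparison of a request header against a secret. One way to sketch it (the `X-API-Key` header and `API_KEY` env var are conventions, not something this project necessarily uses):

```python
import hmac
import os


def is_authorized(headers: dict, expected_key: str = "") -> bool:
    """Compare the X-API-Key header to the configured secret in constant time."""
    expected = expected_key or os.environ.get("API_KEY", "")
    supplied = headers.get("X-API-Key", "")
    if not expected:
        return False  # fail closed when no key is configured
    return hmac.compare_digest(supplied, expected)
```

In FastAPI this would typically live in a dependency that raises a 403 when the check fails; `hmac.compare_digest` avoids leaking key length or prefix via timing.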

---

## 🎯 **Recommended Deployment Flow**

1. **Start with Railway** (easiest setup)
2. **Test with a smaller model** (mistral:7b)
3. **Monitor performance and costs**
4. **Scale up model size if needed**
5. **Add security features**

### **Quick Railway Deploy:**
```bash
# 1. Install Railway CLI
npm install -g @railway/cli

# 2. Login and deploy
railway login
railway init
railway up
```

Your backend will be live at `https://your-app.railway.app`! πŸš€
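
Once it's live, a quick smoke test confirms the endpoint is reachable. A sketch that builds the request (the `/api/v1/summarize` path and the `{"text": ...}` payload shape are guesses — match them to your actual routes):

```python
import json
import urllib.request


def build_summarize_request(base_url: str, text: str) -> urllib.request.Request:
    """Build (but don't send) a POST to the assumed summarize endpoint."""
    body = json.dumps({"text": text}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v1/summarize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it against your deployment:
# with urllib.request.urlopen(build_summarize_request("https://your-app.railway.app", "some long text")) as resp:
#     print(json.load(resp))
```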