ming committed
Commit 76c1e68 · 1 Parent(s): 045076b

docs: add comprehensive cloud deployment guide


- Add detailed DEPLOYMENT.md with 5 deployment options
- Add railway.json for Railway deployment
- Add env.example for environment variables
- Update README with quick deployment instructions
- Include Railway, Google Cloud Run, AWS ECS, DigitalOcean, and Render options
- Add important considerations for memory, costs, and security

Files changed (4)
  1. DEPLOYMENT.md +267 -0
  2. README.md +26 -14
  3. env.example +22 -0
  4. railway.json +13 -0
DEPLOYMENT.md ADDED
@@ -0,0 +1,267 @@
+ # Cloud Deployment Guide
+
+ This guide covers multiple options for deploying your text summarizer backend to the cloud.
+
+ ## 🚀 **Option 1: Railway (Recommended - Easiest)**
+
+ Railway is a good fit for this project because it supports Docker Compose and persistent volumes.
+
+ ### Steps:
+
+ 1. **Create a Railway Account**
+ ```bash
+ # Install the Railway CLI
+ npm install -g @railway/cli
+
+ # Login
+ railway login
+ ```
+
+ 2. **Deploy from GitHub**
+ - Go to [railway.app](https://railway.app)
+ - Connect your GitHub repository
+ - Select your `SummerizerApp` repository
+ - Railway will automatically detect `docker-compose.yml`
+
+ 3. **Set Environment Variables**
+ In the Railway dashboard, add these environment variables:
+ ```
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ OLLAMA_TIMEOUT=30
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ LOG_LEVEL=INFO
+ ```
+
+ 4. **Deploy**
+ ```bash
+ # Alternatively, deploy via the CLI
+ railway up
+ ```
+
+ ### Railway Advantages:
+ - ✅ Supports Docker Compose
+ - ✅ Persistent volumes for Ollama models
+ - ✅ Automatic HTTPS
+ - ✅ Easy environment variable management
+ - ✅ Built-in monitoring
+
+ ---
+
+ ## ☁️ **Option 2: Google Cloud Run**
+
+ ### Steps:
+
+ 1. **Build and Push to Google Container Registry**
+ ```bash
+ # Set up the gcloud CLI
+ gcloud auth login
+ gcloud config set project YOUR_PROJECT_ID
+
+ # Build and push
+ docker build -t gcr.io/YOUR_PROJECT_ID/summarizer-backend .
+ docker push gcr.io/YOUR_PROJECT_ID/summarizer-backend
+ ```
+
+ 2. **Deploy with Cloud Run**
+ ```bash
+ gcloud run deploy summarizer-backend \
+   --image gcr.io/YOUR_PROJECT_ID/summarizer-backend \
+   --platform managed \
+   --region us-central1 \
+   --allow-unauthenticated \
+   --memory 4Gi \
+   --cpu 2 \
+   --timeout 300 \
+   --set-env-vars OLLAMA_MODEL=llama3.1:8b,SERVER_HOST=0.0.0.0,SERVER_PORT=8000
+ ```
+
+ ### Cloud Run Advantages:
+ - ✅ Serverless scaling
+ - ✅ Pay per request
+ - ✅ Global CDN
+ - ✅ Integrated with Google Cloud
+
+ ---
+
+ ## 🐳 **Option 3: AWS ECS with Fargate**
+
+ ### Steps:
+
+ 1. **Create an ECR Repository**
+ ```bash
+ aws ecr create-repository --repository-name summarizer-backend
+ ```
+
+ 2. **Build and Push**
+ ```bash
+ # Get a login token
+ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
+
+ # Build and push
+ docker build -t summarizer-backend .
+ docker tag summarizer-backend:latest YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
+ docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
+ ```
+
+ 3. **Create an ECS Task Definition**
+ ```json
+ {
+   "family": "summarizer-backend",
+   "networkMode": "awsvpc",
+   "requiresCompatibilities": ["FARGATE"],
+   "cpu": "2048",
+   "memory": "4096",
+   "executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT:role/ecsTaskExecutionRole",
+   "containerDefinitions": [
+     {
+       "name": "summarizer-backend",
+       "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest",
+       "portMappings": [
+         { "containerPort": 8000, "protocol": "tcp" }
+       ],
+       "environment": [
+         { "name": "OLLAMA_MODEL", "value": "llama3.1:8b" },
+         { "name": "SERVER_HOST", "value": "0.0.0.0" },
+         { "name": "SERVER_PORT", "value": "8000" }
+       ],
+       "logConfiguration": {
+         "logDriver": "awslogs",
+         "options": {
+           "awslogs-group": "/ecs/summarizer-backend",
+           "awslogs-region": "us-east-1",
+           "awslogs-stream-prefix": "ecs"
+         }
+       }
+     }
+   ]
+ }
+ ```
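Before registering the task definition, a quick local sanity check of the JSON can catch typos. A minimal sketch — the key list mirrors the example above and is not the full ECS schema, and `check_task_definition` is a hypothetical helper:

```python
import json

# Top-level keys the Fargate example above relies on; this is a local
# sanity check only, not the full ECS task-definition schema.
REQUIRED_TOP_LEVEL = ["family", "networkMode", "requiresCompatibilities",
                      "cpu", "memory", "containerDefinitions"]

def check_task_definition(raw: str):
    """Return a list of missing keys (an empty list means it looks OK)."""
    task = json.loads(raw)
    missing = [k for k in REQUIRED_TOP_LEVEL if k not in task]
    for container in task.get("containerDefinitions", []):
        for key in ("name", "image", "portMappings"):
            if key not in container:
                missing.append(f"containerDefinitions[].{key}")
    return missing

print(check_task_definition('{"family": "summarizer-backend", "networkMode": "awsvpc"}'))
# → ['requiresCompatibilities', 'cpu', 'memory', 'containerDefinitions']
```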
+
+ ---
+
+ ## 🌊 **Option 4: DigitalOcean App Platform**
+
+ ### Steps:
+
+ 1. **Create an App Spec**
+ ```yaml
+ # .do/app.yaml
+ name: summarizer-backend
+ services:
+   - name: api
+     source_dir: /
+     github:
+       repo: MingLu0/SummarizerBackend
+       branch: main
+     run_command: uvicorn app.main:app --host 0.0.0.0 --port 8080
+     environment_slug: python
+     instance_count: 1
+     instance_size_slug: basic-xxl
+     http_port: 8080
+     envs:
+       - key: OLLAMA_MODEL
+         value: llama3.1:8b
+       - key: SERVER_HOST
+         value: 0.0.0.0
+       - key: SERVER_PORT
+         value: 8080
+ ```
+
+ 2. **Deploy**
+ ```bash
+ doctl apps create --spec .do/app.yaml
+ ```
+
+ ---
+
+ ## 🔧 **Option 5: Render (Simple)**
+
+ ### Steps:
+
+ 1. **Connect Your GitHub Repository**
+ - Go to [render.com](https://render.com)
+ - Connect your GitHub account
+ - Select your repository
+
+ 2. **Create a Web Service**
+ - Choose "Web Service"
+ - Select your repository
+ - Use these settings:
+ ```
+ Build Command: docker-compose build
+ Start Command: docker-compose up
+ Environment: Docker
+ ```
+
+ 3. **Set Environment Variables**
+ ```
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ ```
+
+ ---
+
+ ## ⚠️ **Important Considerations**
+
+ ### **Model Download in Cloud**
+ Your Ollama models need to be downloaded after deployment. Add this to your deployment:
+
+ ```bash
+ # Add to docker-compose.yml or a startup script
+ ollama pull llama3.1:8b
+ ```
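In a startup script, the pull can fire before the Ollama server is accepting connections, so a small wait-and-retry loop helps. A minimal sketch in Python — the `http://ollama:11434` URL matches the compose setup above, and the timings are assumptions:

```python
import time
import urllib.request
import urllib.error

def wait_for(check, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False

def ollama_ready(base_url: str = "http://ollama:11434") -> bool:
    """True once the Ollama server answers HTTP on its root path."""
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

# Usage in a startup script (hypothetical):
#   if wait_for(ollama_ready):
#       subprocess.run(["ollama", "pull", "llama3.1:8b"], check=True)
```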
+
+ ### **Memory Requirements**
+ - **llama3.1:8b** needs ~8GB RAM
+ - **mistral:7b** needs ~7GB RAM
+
+ ### **Cost Optimization**
+ - Use a smaller model for production, such as `mistral:7b`
+ - Consider using spot instances for development
+ - Monitor usage and scale accordingly
+
+ ### **Security**
+ - Enable API key authentication for production
+ - Use HTTPS (most platforms provide this automatically)
+ - Set up rate limiting
+ - Monitor logs for abuse
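For the rate-limiting item, a sliding-window counter is one simple approach. A minimal in-memory sketch, single-process only (multiple replicas would need shared state such as Redis); the class name is illustrative and the defaults mirror `RATE_LIMIT_REQUESTS`/`RATE_LIMIT_WINDOW` from `env.example`:

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

class SlidingWindowLimiter:
    """Allow at most max_requests per window_s seconds for each client key."""

    def __init__(self, max_requests: int = 60, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Evict timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Keyed by client IP or API key, this could back a request dependency that returns HTTP 429 whenever `allow()` is False.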
+
+ ---
+
+ ## 🎯 **Recommended Deployment Flow**
+
+ 1. **Start with Railway** (easiest setup)
+ 2. **Test with a smaller model** (mistral:7b)
+ 3. **Monitor performance and costs**
+ 4. **Scale up model size if needed**
+ 5. **Add security features**
+
+ ### **Quick Railway Deploy:**
+ ```bash
+ # 1. Install the Railway CLI
+ npm install -g @railway/cli
+
+ # 2. Login and deploy
+ railway login
+ railway init
+ railway up
+ ```
+
+ Your backend will be live at `https://your-app.railway.app`! 🚀
README.md CHANGED
@@ -318,29 +318,41 @@ docker run -p 8000:8000 \
 
 ## Cloud Deployment Options
 
- ### Railway
+ ### 🚀 **Quick Deploy with Railway (Recommended)**
+
 ```bash
- # Install Railway CLI
+ # 1. Install Railway CLI
 npm install -g @railway/cli
 
- # Deploy
+ # 2. Login and deploy
 railway login
 railway init
 railway up
 ```
 
- ### Render
- 1. Connect your GitHub repository
- 2. Set environment variables
- 3. Deploy automatically on push
+ **Railway Advantages:**
+ - ✅ Supports Docker Compose with Ollama
+ - ✅ Persistent volumes for models
+ - ✅ Automatic HTTPS
+ - ✅ Easy environment management
 
- ### AWS ECS
- ```bash
- # Build and push to ECR
- aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin your-account.dkr.ecr.us-east-1.amazonaws.com
- docker tag summarizer-backend:latest your-account.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
- docker push your-account.dkr.ecr.us-east-1.amazonaws.com/summarizer-backend:latest
- ```
+ ### 📋 **Other Options**
+
+ - **Google Cloud Run**: Serverless with auto-scaling
+ - **AWS ECS**: Full container orchestration
+ - **DigitalOcean App Platform**: Simple deployment
+ - **Render**: GitHub integration
+
+ ### 📖 **Detailed Deployment Guide**
+
+ See [DEPLOYMENT.md](DEPLOYMENT.md) for comprehensive deployment instructions for all platforms.
+
+ ### ⚠️ **Important Notes**
+
+ - **Memory Requirements**: llama3.1:8b needs ~8GB RAM
+ - **Model Download**: Models are downloaded after deployment
+ - **Cost Optimization**: Start with smaller models (mistral:7b)
+ - **Security**: Enable API keys for production use
 
 ## Monitoring and Logging
 
env.example ADDED
@@ -0,0 +1,22 @@
+ # Ollama Configuration
+ OLLAMA_MODEL=llama3.1:8b
+ OLLAMA_HOST=http://ollama:11434
+ OLLAMA_TIMEOUT=30
+
+ # Server Configuration
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=8000
+ LOG_LEVEL=INFO
+
+ # Optional: API Security
+ API_KEY_ENABLED=false
+ API_KEY=your-secret-key-here
+
+ # Optional: Rate Limiting
+ RATE_LIMIT_ENABLED=false
+ RATE_LIMIT_REQUESTS=60
+ RATE_LIMIT_WINDOW=60
+
+ # Input Validation
+ MAX_TEXT_LENGTH=32000
+ MAX_TOKENS_DEFAULT=256
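A sketch of how the backend might load these variables with matching defaults — the `Settings` dataclass and `load_settings` helper are hypothetical, and the real app may use pydantic settings instead:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    ollama_model: str
    ollama_host: str
    ollama_timeout: int
    api_key_enabled: bool
    max_text_length: int

def load_settings(env=os.environ) -> Settings:
    """Read configuration from the environment, falling back to the
    defaults shown in env.example."""
    truthy = ("1", "true", "yes")
    return Settings(
        ollama_model=env.get("OLLAMA_MODEL", "llama3.1:8b"),
        ollama_host=env.get("OLLAMA_HOST", "http://ollama:11434"),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "30")),
        api_key_enabled=env.get("API_KEY_ENABLED", "false").lower() in truthy,
        max_text_length=int(env.get("MAX_TEXT_LENGTH", "32000")),
    )
```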
railway.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "$schema": "https://railway.app/railway.schema.json",
+   "build": {
+     "builder": "DOCKERFILE"
+   },
+   "deploy": {
+     "startCommand": "docker-compose up --build",
+     "healthcheckPath": "/health",
+     "healthcheckTimeout": 300,
+     "restartPolicyType": "ON_FAILURE",
+     "restartPolicyMaxRetries": 10
+   }
+ }