ming committed on
Commit 3a7a125
·
1 Parent(s): e6b70e4

Add Hugging Face Spaces configuration and deployment files

Files changed (4)
  1. Dockerfile +33 -8
  2. HUGGINGFACE_DEPLOYMENT.md +220 -0
  3. README.md +110 -413
  4. env.hf +25 -0
Dockerfile CHANGED
@@ -1,5 +1,5 @@
- # Use Python 3.7 slim image for compatibility
- FROM python:3.7-slim
+ # Hugging Face Spaces compatible Dockerfile
+ FROM python:3.9-slim

  # Set environment variables
  ENV PYTHONDONTWRITEBYTECODE=1 \
@@ -14,8 +14,13 @@ RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  curl \
  ca-certificates \
+ wget \
+ git \
  && rm -rf /var/lib/apt/lists/*

+ # Install Ollama
+ RUN curl -fsSL https://ollama.ai/install.sh | sh
+
  # Copy requirements first for better caching
  COPY requirements.txt .

@@ -30,14 +35,34 @@ COPY pytest.ini .
  # Create non-root user for security
  RUN groupadd -r appuser && useradd -r -g appuser appuser \
  && chown -R appuser:appuser /app
+
+ # Create startup script
+ RUN echo '#!/bin/bash\n\
+ # Start Ollama in background\n\
+ ollama serve &\n\
+ \n\
+ # Wait for Ollama to be ready\n\
+ echo "Waiting for Ollama to start..."\n\
+ sleep 10\n\
+ \n\
+ # Pull the model (this will take a few minutes on first run)\n\
+ echo "Pulling model..."\n\
+ ollama pull mistral:7b\n\
+ \n\
+ # Start the FastAPI app\n\
+ echo "Starting FastAPI app..."\n\
+ exec uvicorn app.main:app --host 0.0.0.0 --port 7860' > /app/start.sh \
+ && chmod +x /app/start.sh \
+ && chown appuser:appuser /app/start.sh
+
  USER appuser

- # Expose port
- EXPOSE 8000
+ # Expose port (Hugging Face Spaces uses port 7860)
+ EXPOSE 7860

  # Health check
- HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
- CMD curl -f http://localhost:8000/health || exit 1
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=60s --retries=3 \
+ CMD curl -f http://localhost:7860/health || exit 1

- # Run the application
- CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
+ # Run the startup script
+ CMD ["/app/start.sh"]
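For readability, this is the file the `RUN echo` above writes to `/app/start.sh` once the Dockerfile line continuations and `\n` escapes are expanded (a sketch of the generated script, not an extra file in the commit):

```bash
#!/bin/bash
# Start Ollama in background
ollama serve &

# Wait for Ollama to be ready
echo "Waiting for Ollama to start..."
sleep 10

# Pull the model (this will take a few minutes on first run)
echo "Pulling model..."
ollama pull mistral:7b

# Start the FastAPI app
echo "Starting FastAPI app..."
exec uvicorn app.main:app --host 0.0.0.0 --port 7860
```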
HUGGINGFACE_DEPLOYMENT.md ADDED
@@ -0,0 +1,220 @@
+ # 🚀 Hugging Face Spaces Deployment Guide
+
+ This guide will help you deploy your SummarizerApp to Hugging Face Spaces for **FREE**!
+
+ ## 🎯 Why Hugging Face Spaces?
+
+ - ✅ **100% Free** - No credit card required
+ - ✅ **16GB RAM** - Enough for the Mistral 7B model
+ - ✅ **Docker Support** - Easy deployment
+ - ✅ **Auto HTTPS** - Secure connections
+ - ✅ **Built for AI** - Designed for ML/AI applications
+ - ✅ **GitHub Integration** - Automatic deployments
+
+ ## 📋 Prerequisites
+
+ 1. **Hugging Face Account** - Sign up at [huggingface.co](https://huggingface.co)
+ 2. **GitHub Repository** - Your code should be on GitHub
+ 3. **Docker Knowledge** - A basic understanding is helpful but not required
+
+ ## 🛠️ Step-by-Step Deployment
+
+ ### Step 1: Create a New Space
+
+ 1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
+ 2. Click **"Create new Space"**
+ 3. Fill in the details:
+    - **Space name**: `summarizer-app` (or your preferred name)
+    - **License**: MIT
+    - **SDK**: **Docker** (important!)
+    - **Hardware**: CPU (free tier)
+    - **Visibility**: Public or Private
+
+ ### Step 2: Configure Your Repository
+
+ You need to make these changes to your GitHub repository:
+
+ #### A. Rename Files
+ ```bash
+ # Rename the Hugging Face specific files
+ mv Dockerfile.hf Dockerfile
+ mv README_HF.md README.md
+ ```
+
+ #### B. Update Dockerfile (if needed)
+ The `Dockerfile.hf` is already optimized for Hugging Face Spaces, but verify that it uses (see the lines quoted below):
+ - Port `7860` (required by HF Spaces)
+ - The `mistral:7b` model (smaller, faster)
+ - The proper startup script
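+
+ For reference, the corresponding lines in this commit's Dockerfile are:
+
+ ```dockerfile
+ # Expose port (Hugging Face Spaces uses port 7860)
+ EXPOSE 7860
+
+ # Run the startup script (it pulls mistral:7b, then starts uvicorn on port 7860)
+ CMD ["/app/start.sh"]
+ ```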
+
+ #### C. Push Changes to GitHub
+ ```bash
+ git add .
+ git commit -m "Add Hugging Face Spaces configuration"
+ git push origin main
+ ```
+
+ ### Step 3: Connect GitHub to Hugging Face
+
+ 1. Open your Hugging Face Space settings
+ 2. Go to the **"Repository"** tab
+ 3. Click **"Connect to GitHub"**
+ 4. Select your `SummerizerApp` repository
+ 5. Choose the `main` branch
+
+ ### Step 4: Configure Environment Variables
+
+ In your Hugging Face Space settings:
+
+ 1. Go to the **"Settings"** tab
+ 2. Scroll to **"Environment Variables"**
+ 3. Add these variables:
+
+ ```
+ OLLAMA_MODEL=mistral:7b
+ OLLAMA_HOST=http://localhost:11434
+ OLLAMA_TIMEOUT=30
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=7860
+ LOG_LEVEL=INFO
+ MAX_TEXT_LENGTH=32000
+ MAX_TOKENS_DEFAULT=256
+ ```
+
+ ### Step 5: Deploy
+
+ 1. Go to the **"Deploy"** tab in your Space
+ 2. Click **"Deploy"**
+ 3. Wait for the build to complete (5-10 minutes)
+
+ **What happens during deployment:**
+ - The Docker image builds
+ - Ollama installs
+ - The Mistral 7B model downloads (~4GB)
+ - The FastAPI app starts
+ - Health checks run
+
+ ## 🔍 Verification
+
+ ### Check Your Deployment
+
+ 1. **Visit your Space URL**: `https://your-username-summarizer-app.hf.space`
+ 2. **Test the health endpoint**: `https://your-username-summarizer-app.hf.space/health`
+ 3. **View the API docs**: `https://your-username-summarizer-app.hf.space/docs`
+
+ ### Test the API
+
+ ```bash
+ # Test summarization
+ curl -X POST "https://your-username-summarizer-app.hf.space/api/v1/summarize" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "text": "This is a long article about artificial intelligence and machine learning. It discusses various topics including natural language processing, computer vision, and deep learning techniques. The article covers the history of AI, current applications, and future prospects.",
+     "max_tokens": 100
+   }'
+ ```
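+
+ A successful call returns JSON shaped like the response documented in the repository's previous README (values here are illustrative):
+
+ ```json
+ {
+   "summary": "The article surveys AI topics such as NLP, computer vision, and deep learning.",
+   "model": "mistral:7b",
+   "tokens_used": 42,
+   "latency_ms": 2150.3
+ }
+ ```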
+
+ ## 🚨 Troubleshooting
+
+ ### Common Issues
+
+ #### 1. Build Fails
+ - **Check the Dockerfile**: Ensure it's named `Dockerfile` (not `Dockerfile.hf`)
+ - **Check the README**: Ensure it has the proper frontmatter (see the example below)
+ - **Check the logs**: View the build logs in the Hugging Face interface
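+
+ For reference, this is the frontmatter block the README.md in this commit carries:
+
+ ```yaml
+ ---
+ title: Text Summarizer API
+ emoji: 📝
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ app_port: 7860
+ ---
+ ```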
+
+ #### 2. Model Not Loading
+ - **Wait longer**: The model download takes 5-10 minutes on the first run
+ - **Check logs**: Look for Ollama-related errors
+ - **Verify the model name**: Ensure `mistral:7b` is correct
+
+ #### 3. Out of Memory
+ - **Use a smaller model**: Switch to `mistral:7b` (already configured)
+ - **Check hardware**: Ensure you're using the CPU tier, not GPU
+
+ #### 4. Port Issues
+ - **Verify the port**: You must use port `7860` for Hugging Face Spaces
+ - **Check SERVER_PORT**: The environment variable should be `7860`
+
+ ### Debugging Commands
+
+ If you need to debug locally with the HF configuration:
+
+ ```bash
+ # Test with HF settings
+ cp env.hf .env
+ docker build -f Dockerfile.hf -t summarizer-hf .
+ docker run -p 7860:7860 summarizer-hf
+ ```
+
+ ## 📊 Performance Expectations
+
+ ### Startup Time
+ - **First deployment**: 8-12 minutes (includes model download)
+ - **Subsequent deployments**: 3-5 minutes
+ - **Cold start**: 30-60 seconds
+
+ ### Runtime Performance
+ - **Memory usage**: ~7-8GB RAM
+ - **Response time**: 2-5 seconds per request
+ - **Concurrent requests**: 1-2 (CPU limitation)
+
+ ### Limitations
+ - **No GPU**: CPU-only inference
+ - **Shared resources**: May be slower during peak usage
+ - **Sleep mode**: The Space may sleep after 48 hours of inactivity
+
+ ## 🔧 Customization Options
+
+ ### Use a Different Model
+ Edit the environment variables:
+ ```
+ OLLAMA_MODEL=llama3.1:8b   # Larger, higher quality
+ OLLAMA_MODEL=mistral:7b    # Default, fastest
+ ```
+
+ ### Enable Security Features
+ ```
+ API_KEY_ENABLED=true
+ API_KEY=your-secret-key
+ RATE_LIMIT_ENABLED=true
+ ```
+
+ ### Custom Domain
+ 1. Go to the Space settings
+ 2. Add a custom domain in the "Settings" tab
+ 3. Configure DNS as instructed
+
+ ## 📈 Monitoring
+
+ ### View Logs
+ 1. Go to your Space
+ 2. Click the **"Logs"** tab
+ 3. Monitor startup and runtime logs
+
+ ### Health Monitoring
+ - **Health endpoint**: `/health`
+ - **Metrics**: Built-in Hugging Face monitoring
+ - **Uptime**: Check the Space status page
+
+ ## 🎉 Success!
+
+ Once deployed, your SummarizerApp will be available at:
+ `https://your-username-summarizer-app.hf.space`
+
+ ### What You Get
+ - ✅ **Free hosting**
+ - ✅ **HTTPS endpoint** for your API
+ - ✅ **16GB RAM** for AI models
+ - ✅ **Automatic deployments** from GitHub
+ - ✅ **Built-in monitoring** and logs
+
+ ### Next Steps
+ 1. **Share your API** with others
+ 2. **Integrate it with apps** using the REST API
+ 3. **Monitor usage** and performance
+ 4. **Upgrade to GPU** if needed (paid tier)
+
+ ---
+
+ **Congratulations! Your text summarization service is now live on Hugging Face Spaces! 🚀**
README.md CHANGED
@@ -1,473 +1,170 @@
- # Text Summarizer Backend API
-
- A FastAPI-based backend service for text summarization using Ollama's local language models. Designed for Android app integration with cloud deployment capabilities.
-
- ## Features
-
- - 🚀 **FastAPI** - Modern, fast web framework for building APIs
- - 🤖 **Ollama Integration** - Local LLM inference with privacy-first approach
- - 📱 **Android Ready** - RESTful API optimized for mobile consumption
- - 🔒 **Request Tracking** - Unique request IDs and structured logging
- - ✅ **Comprehensive Testing** - 30+ tests with >90% coverage
- - 🐳 **Docker Ready** - Containerized deployment support
- - ☁️ **Cloud Extensible** - Easy migration to cloud hosting
-
- ## Quick Start
-
- ### Prerequisites
-
- - Python 3.7+
- - [Ollama](https://ollama.ai) installed and running
- - A compatible language model (e.g., `llama3.1:8b`)
-
- ### Installation
-
- 1. **Clone the repository**
- ```bash
- git clone https://github.com/MingLu0/SummarizerBackend.git
- cd SummarizerBackend
- ```
-
- 2. **Set up Ollama**
- ```bash
- # Install Ollama (macOS)
- brew install ollama
-
- # Start Ollama service
- ollama serve
-
- # Pull a model (in another terminal)
- ollama pull llama3.1:8b
- ```
-
- 3. **Set up Python environment**
- ```bash
- # Create virtual environment
- python3 -m venv .venv
- source .venv/bin/activate  # On Windows: .venv\Scripts\activate
-
- # Install dependencies
- pip install -r requirements.txt
- ```
-
- 4. **Start the server (Recommended)**
- ```bash
- # Use the automated startup script (checks everything for you)
- ./start-server.sh
- ```
-
- **OR manually:**
- ```bash
- # Start the server manually
- uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
- ```
-
- ## Configuration
-
- The server uses environment variables for configuration. A `.env` file is automatically created with sensible defaults:
-
- ```bash
- # Ollama Configuration
- OLLAMA_HOST=http://127.0.0.1:11434
- OLLAMA_MODEL=llama3.2:latest
- OLLAMA_TIMEOUT=30
-
- # Server Configuration
- SERVER_HOST=0.0.0.0
- SERVER_PORT=8000
- LOG_LEVEL=INFO
- ```
-
- **Common Issues & Solutions:**
-
- - **Port already in use**: The startup script automatically handles this
- - **Ollama connection failed**: Ensure Ollama is running (`ollama serve`)
- - **Model not found**: Install the model (`ollama pull llama3.2:latest`)
- - **Wrong host configuration**: The `.env` file ensures correct localhost settings
-
- ## API Usage

- 5. **Test the API**
- ```bash
- # Health check
- curl http://127.0.0.1:8000/health
-
- # Summarize text
- curl -X POST http://127.0.0.1:8000/api/v1/summarize/ \
-   -H "Content-Type: application/json" \
-   -d '{"text": "Your long text to summarize here..."}'
- ```

- ## API Documentation

- ### Interactive Docs
- - **Swagger UI**: http://127.0.0.1:8000/docs
- - **ReDoc**: http://127.0.0.1:8000/redoc

- ### Endpoints

- #### `GET /health`
- Health check endpoint.

- **Response:**
- ```json
- {
-   "status": "ok",
-   "service": "text-summarizer-api",
-   "version": "1.0.0"
- }
  ```
-
- #### `POST /api/v1/summarize/`
- Summarize text using Ollama.
-
- **Request:**
- ```json
- {
-   "text": "Your text to summarize...",
-   "max_tokens": 256,
-   "prompt": "Summarize the following text concisely:"
- }
  ```

- **Response:**
- ```json
- {
-   "summary": "Generated summary text",
-   "model": "llama3.1:8b",
-   "tokens_used": 150,
-   "latency_ms": 1234.5
- }
  ```

- **Error Response:**
- ```json
  {
-   "detail": "Summarization failed: Connection error",
-   "code": "OLLAMA_ERROR",
-   "request_id": "req-12345"
- }
- ```
-
- ## Configuration
-
- Configure the API using environment variables:
-
- ```bash
- # Ollama Configuration
- export OLLAMA_MODEL=llama3.1:8b
- export OLLAMA_HOST=http://127.0.0.1:11434
- export OLLAMA_TIMEOUT=30
-
- # Server Configuration
- export SERVER_HOST=127.0.0.1
- export SERVER_PORT=8000
- export LOG_LEVEL=INFO
-
- # Optional: API Security
- export API_KEY_ENABLED=false
- export API_KEY=your-secret-key
-
- # Optional: Rate Limiting
- export RATE_LIMIT_ENABLED=false
- export RATE_LIMIT_REQUESTS=60
- export RATE_LIMIT_WINDOW=60
- ```
-
- ## Android Integration
-
- ### Retrofit Example
-
- ```kotlin
- // API Interface
- interface SummarizerApi {
-     @POST("api/v1/summarize/")
-     suspend fun summarize(@Body request: SummarizeRequest): SummarizeResponse
  }
-
- // Data Classes
- data class SummarizeRequest(
-     val text: String,
-     val max_tokens: Int = 256,
-     val prompt: String = "Summarize the following text concisely:"
- )
-
- data class SummarizeResponse(
-     val summary: String,
-     val model: String,
-     val tokens_used: Int?,
-     val latency_ms: Double?
- )
-
- // Usage
- val retrofit = Retrofit.Builder()
-     .baseUrl("http://127.0.0.1:8000/")
-     .addConverterFactory(GsonConverterFactory.create())
-     .build()
-
- val api = retrofit.create(SummarizerApi::class.java)
- val response = api.summarize(SummarizeRequest(text = "Your text here"))
  ```

- ### OkHttp Example

- ```kotlin
- val client = OkHttpClient()
- val json = JSONObject().apply {
-     put("text", "Your text to summarize")
-     put("max_tokens", 256)
- }

- val request = Request.Builder()
-     .url("http://127.0.0.1:8000/api/v1/summarize/")
-     .post(json.toString().toRequestBody("application/json".toMediaType()))
-     .build()

- client.newCall(request).execute().use { response ->
-     val result = response.body?.string()
-     // Handle response
- }
- ```
-
- ## Development

- ### Running Tests

  ```bash
- # Run all tests locally
- pytest
-
- # Run with coverage
- pytest --cov=app --cov-report=html --cov-report=term
-
- # Run tests in Docker
- ./scripts/run-tests.sh

- # Run specific test file
- pytest tests/test_api.py -v
-
- # Run tests and stop on first failure
- pytest -x
  ```

- ### Code Quality

- ```bash
- # Format code
- black app/ tests/

- # Sort imports
- isort app/ tests/

- # Lint code
- flake8 app/ tests/
- ```
-
- ### Project Structure
-
- ```
- app/
- ├── main.py                # FastAPI app entry point
- ├── api/
- │   └── v1/
- │       ├── routes.py      # API route definitions
- │       ├── schemas.py     # Pydantic models
- │       └── summarize.py   # Summarization endpoint
- ├── services/
- │   └── summarizer.py      # Ollama integration
- └── core/
-     ├── config.py          # Configuration management
-     ├── logging.py         # Logging setup
-     ├── middleware.py      # Request middleware
-     └── errors.py          # Error handling
- tests/
- ├── test_api.py            # API endpoint tests
- ├── test_services.py       # Service layer tests
- ├── test_schemas.py        # Pydantic model tests
- ├── test_config.py         # Configuration tests
- └── conftest.py            # Test configuration
- ```
-
- ## Docker Deployment
-
- ### Quick Start with Docker

  ```bash
- # 1. Start Ollama service
- docker-compose up ollama -d
-
- # 2. Download a model (first time only)
- ./scripts/setup-ollama.sh llama3.1:8b
-
- # 3. Start the API
- docker-compose up api -d

- # 4. Test the setup
- curl http://localhost:8000/health
  ```

- ### Development with Hot Reload
-
- ```bash
- # Use development compose file
- docker-compose -f docker-compose.dev.yml up --build
- ```
-
- ### Production with Nginx
-
  ```bash
- # Start with Nginx reverse proxy
- docker-compose --profile production up --build
- ```
-
- ### Manual Build
-
- ```bash
- # Build the image
- docker build -t summarizer-backend .

- # Run with Ollama
- docker run -p 8000:8000 \
-   -e OLLAMA_HOST=http://host.docker.internal:11434 \
-   summarizer-backend
  ```

- ### Production Deployment
-
- 1. **Build the image**
- ```bash
- docker build -t your-registry/summarizer-backend:latest .
- ```

- 2. **Deploy to cloud**
- ```bash
- # Push to registry
- docker push your-registry/summarizer-backend:latest
-
- # Deploy to your cloud provider
- # (AWS ECS, Google Cloud Run, Azure Container Instances, etc.)
- ```

- ## Cloud Deployment Options
-
- ### 🚀 **Quick Deploy with Railway (Recommended)**
-
- ```bash
- # 1. Install Railway CLI
- npm install -g @railway/cli

- # 2. Login and deploy
- railway login
- railway init
- railway up
  ```

- **Railway Advantages:**
- - ✅ Supports Docker Compose with Ollama
- - ✅ Persistent volumes for models
- - ✅ Automatic HTTPS
- - ✅ Easy environment management
-
- ### 📋 **Other Options**
-
- - **Google Cloud Run**: Serverless with auto-scaling
- - **AWS ECS**: Full container orchestration
- - **DigitalOcean App Platform**: Simple deployment
- - **Render**: GitHub integration
-
- ### 📖 **Detailed Deployment Guide**
-
- See [DEPLOYMENT.md](DEPLOYMENT.md) for comprehensive deployment instructions for all platforms.
-
- ### ⚠️ **Important Notes**
-
- - **Memory Requirements**: llama3.1:8b needs ~8GB RAM
- - **Model Download**: Models are downloaded after deployment
- - **Cost Optimization**: Start with smaller models (mistral:7b)
- - **Security**: Enable API keys for production use
-
- ## Monitoring and Logging
-
- ### Request Tracking
- Every request gets a unique ID for tracking:
  ```bash
- curl -H "X-Request-ID: my-custom-id" http://127.0.0.1:8000/api/v1/summarize/ \
-   -d '{"text": "test"}'
  ```

- ### Log Format
- ```
- 2025-09-29 20:47:46,949 - app.core.middleware - INFO - Request abc123: POST /api/v1/summarize/
- 2025-09-29 20:47:46,987 - app.core.middleware - INFO - Response abc123: 200 (38.48ms)
- ```

- ## Performance Considerations

- ### Model Selection
- - **llama3.1:8b** - Good balance of speed and quality
- - **mistral:7b** - Faster, good for real-time apps
- - **llama3.1:70b** - Higher quality, slower inference

- ### Optimization Tips
- 1. **Batch requests** when possible
- 2. **Cache summaries** for repeated content
- 3. **Use appropriate max_tokens** (256-512 for most use cases)
- 4. **Monitor latency** and adjust timeout settings

- ## Troubleshooting

  ### Common Issues

- **Ollama connection failed**
- ```bash
- # Check if Ollama is running
- curl http://127.0.0.1:11434/api/tags

- # Restart Ollama
- ollama serve
- ```

- **Model not found**
- ```bash
- # List available models
- ollama list

- # Pull the required model
- ollama pull llama3.1:8b
- ```
-
- **Port already in use**
- ```bash
- # Use a different port
- uvicorn app.main:app --port 8001
- ```

- ### Debug Mode
- ```bash
- # Enable debug logging
- export LOG_LEVEL=DEBUG
- uvicorn app.main:app --reload
- ```
-
- ## Contributing

  1. Fork the repository
- 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
- 3. Commit your changes (`git commit -m 'Add amazing feature'`)
- 4. Push to the branch (`git push origin feature/amazing-feature`)
- 5. Open a Pull Request
-
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
- ## Support
-
- - 📧 **Email**: [email protected]
- - 🐛 **Issues**: [GitHub Issues](https://github.com/MingLu0/SummarizerBackend/issues)
- - 📖 **Documentation**: [API Docs](http://127.0.0.1:8000/docs)

  ---

- **Built with ❤️ for privacy-first text summarization**
+ ---
+ title: Text Summarizer API
+ emoji: 📝
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ app_port: 7860
+ ---
+
+ # Text Summarizer API
+
+ A FastAPI-based text summarization service powered by Ollama and the Mistral 7B model.
+
+ ## 🚀 Features
+
+ - **Fast text summarization** using local LLM inference
+ - **RESTful API** with FastAPI
+ - **Health monitoring** and logging
+ - **Docker containerized** for easy deployment
+ - **Free deployment** on Hugging Face Spaces
+
+ ## 📡 API Endpoints
+
+ ### Health Check
  ```
+ GET /health
  ```
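+
+ Returns a status object (assuming the shape is unchanged from the previous README):
+
+ ```json
+ {
+   "status": "ok",
+   "service": "text-summarizer-api",
+   "version": "1.0.0"
+ }
+ ```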
+
+ ### Summarize Text
  ```
+ POST /api/v1/summarize
+ Content-Type: application/json
+
  {
+   "text": "Your long text to summarize here...",
+   "max_tokens": 256,
+   "temperature": 0.7
  }
  ```
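+
+ A successful response follows the shape documented in the previous README (values here are illustrative):
+
+ ```json
+ {
+   "summary": "Generated summary text",
+   "model": "mistral:7b",
+   "tokens_used": 150,
+   "latency_ms": 1234.5
+ }
+ ```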
+
+ ### API Documentation
+ - **Swagger UI**: `/docs`
+ - **ReDoc**: `/redoc`
+
+ ## 🔧 Configuration
+
+ The service uses the following environment variables:
+
+ - `OLLAMA_MODEL`: Model to use (default: `mistral:7b`)
+ - `OLLAMA_HOST`: Ollama service host (default: `http://localhost:11434`)
+ - `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `30`)
+ - `SERVER_HOST`: Server host (default: `0.0.0.0`)
+ - `SERVER_PORT`: Server port (default: `7860`)
+ - `LOG_LEVEL`: Logging level (default: `INFO`)
+
+ ## 🐳 Docker Deployment
+
+ ### Local Development
  ```bash
+ # Build and run with docker-compose
+ docker-compose up --build
+
+ # Or run directly
+ docker build -f Dockerfile.hf -t summarizer-app .
+ docker run -p 7860:7860 summarizer-app
  ```
+
+ ### Hugging Face Spaces
+ This app is configured for deployment on Hugging Face Spaces using the Docker SDK.
+
+ ## 📊 Performance
+
+ - **Model**: Mistral 7B (7GB RAM requirement)
+ - **Startup time**: ~2-3 minutes (includes model download)
+ - **Inference speed**: ~2-5 seconds per request
+ - **Memory usage**: ~8GB RAM
+
+ ## 🛠️ Development
+
+ ### Setup
  ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run locally
+ uvicorn app.main:app --host 0.0.0.0 --port 7860
  ```
+
+ ### Testing
  ```bash
+ # Run tests
+ pytest
+
+ # Run with coverage
+ pytest --cov=app
  ```
+
+ ## 📝 Usage Examples
+
+ ### Python
+ ```python
+ import requests
+
+ # Summarize text
+ response = requests.post(
+     "https://your-space.hf.space/api/v1/summarize",
+     json={
+         "text": "Your long article or text here...",
+         "max_tokens": 256
+     }
+ )
+
+ result = response.json()
+ print(result["summary"])
  ```
+
+ ### cURL
  ```bash
+ curl -X POST "https://your-space.hf.space/api/v1/summarize" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "text": "Your text to summarize...",
+     "max_tokens": 256
+   }'
  ```
+
+ ## 🔒 Security
+
+ - Non-root user execution
+ - Input validation and sanitization
+ - Rate limiting (configurable)
+ - API key authentication (optional)
+
+ ## 📈 Monitoring
+
+ The service includes:
+ - A health check endpoint
+ - Request logging
+ - Error tracking
+ - Performance metrics
+
+ ## 🆘 Troubleshooting
+
  ### Common Issues
+
+ 1. **Model not loading**: Check that Ollama is running and the model has been pulled
+ 2. **Out of memory**: Ensure sufficient RAM (8GB+) for Mistral 7B
+ 3. **Slow startup**: Normal on the first run due to the model download
+ 4. **API errors**: Check the application logs and the `/health` endpoint
+
+ ### Logs
+ View application logs in the Hugging Face Spaces interface, or check the health endpoint for service status.
+
+ ## 📄 License
+
+ MIT License - see the LICENSE file for details.
+
+ ## 🤝 Contributing
+
  1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes
+ 4. Add tests
+ 5. Submit a pull request
+
  ---
+
+ **Deployed on Hugging Face Spaces** 🚀
env.hf ADDED
@@ -0,0 +1,25 @@
+ # Hugging Face Spaces Environment Configuration
+ # Copy this to .env for local development
+
+ # Ollama Configuration
+ OLLAMA_MODEL=mistral:7b
+ OLLAMA_HOST=http://localhost:11434
+ OLLAMA_TIMEOUT=30
+
+ # Server Configuration
+ SERVER_HOST=0.0.0.0
+ SERVER_PORT=7860
+ LOG_LEVEL=INFO
+
+ # Optional: API Security
+ API_KEY_ENABLED=false
+ API_KEY=your-secret-key-here
+
+ # Optional: Rate Limiting
+ RATE_LIMIT_ENABLED=false
+ RATE_LIMIT_REQUESTS=60
+ RATE_LIMIT_WINDOW=60
+
+ # Input validation
+ MAX_TEXT_LENGTH=32000
+ MAX_TOKENS_DEFAULT=256