feat: downloader

Anatoly Antonov 2025-06-23 04:19:26 +03:00
parent b9c1a98895
commit 42e7924be9
37 changed files with 2422 additions and 94 deletions

DEPLOYMENT.md Normal file

@ -0,0 +1,501 @@
# Deployment Guide
This guide covers deploying the Predictor Service using Docker and Docker Compose.
## Prerequisites
- Docker Engine 20.10+
- Docker Compose 2.0+
- At least 2GB RAM available
- 10GB free disk space
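The version and disk requirements above can be checked before deploying. The following is a minimal sketch, not part of the repository: `version_ge` and `check_prereqs` are hypothetical helper names, and the RAM requirement is left out because there is no portable way to read it.

```bash
# version_ge A B: succeed when dotted version A is >= B (numeric fields only).
version_ge() {
  va=$1 vb=$2
  while [ -n "$va" ] || [ -n "$vb" ]; do
    x=${va%%.*} y=${vb%%.*}
    # Drop the field that was just read
    case $va in *.*) va=${va#*.} ;; *) va= ;; esac
    case $vb in *.*) vb=${vb#*.} ;; *) vb= ;; esac
    # Keep only the leading digits, so "7-ce" compares as 7
    x=${x%%[!0-9]*} y=${y%%[!0-9]*}
    if [ "${x:-0}" -gt "${y:-0}" ]; then return 0; fi
    if [ "${x:-0}" -lt "${y:-0}" ]; then return 1; fi
  done
  return 0
}

# check_prereqs: compare the local Docker setup against the list above.
check_prereqs() {
  docker_v=$(docker version --format '{{.Server.Version}}') || return 1
  compose_v=$(docker compose version --short) || return 1
  free_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
  version_ge "$docker_v" 20.10 || { echo "Docker Engine $docker_v is older than 20.10" >&2; return 1; }
  version_ge "${compose_v#v}" 2.0 || { echo "Docker Compose $compose_v is older than 2.0" >&2; return 1; }
  # 10GB expressed in 1K blocks
  [ "$free_kb" -ge 10485760 ] || { echo "less than 10GB free in $(pwd)" >&2; return 1; }
}
```

Running `check_prereqs && echo "prerequisites OK"` from the repository root reports the first requirement that is not met.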
## Quick Deployment
### 1. Clone and Setup
```bash
git clone <repository-url>
cd predictor
```
### 2. Validate Configuration
```bash
# Validate Docker configuration
./scripts/validate-docker.sh
```
### 3. Deploy
```bash
# Build and start services
make up-build
# Check status
make ps
# View logs
make logs
```
## Production Deployment
### Environment Configuration
1. **Copy environment template:**
```bash
cp cmd/api/.env cmd/api/.env.production
```
2. **Edit production environment:**
```bash
nano cmd/api/.env.production
```
3. **Key production settings:**
```bash
# Security
GSN_PREDICTOR_REDIS_PASSWORD=your_secure_password
# Performance
GSN_PREDICTOR_GRIB_PARALLEL=8
GSN_PREDICTOR_GRIB_CACHE_TTL=2h
# Monitoring
GSN_PREDICTOR_GRIB_UPDATER_INTERVAL=3h
```
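Because the template ships a placeholder password, a quick check before deploying can catch an unedited file. This is a sketch under assumptions: `env_get` and `validate_env` are hypothetical helpers, and the file is assumed to contain plain unquoted `KEY=VALUE` lines.

```bash
# env_get FILE KEY: print the value of KEY from a KEY=VALUE file (last match wins).
env_get() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# validate_env FILE: fail when the Redis password is missing or still the placeholder.
validate_env() {
  pw=$(env_get "$1" GSN_PREDICTOR_REDIS_PASSWORD)
  case $pw in
    '')
      echo "GSN_PREDICTOR_REDIS_PASSWORD is not set in $1" >&2
      return 1 ;;
    your_secure_password)
      echo "GSN_PREDICTOR_REDIS_PASSWORD in $1 is still the placeholder" >&2
      return 1 ;;
  esac
  return 0
}
```

For example, `validate_env cmd/api/.env.production` exits non-zero until the password has been replaced; a random value can be generated with `openssl rand -base64 32`.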
### Production Docker Compose
Create `docker-compose.prod.yml`:
```yaml
version: '3.8'
services:
  predictor:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: predictor-prod
    ports:
      - "8080:8080"
    env_file:
      - cmd/api/.env.production
    volumes:
      - grib_data:/tmp/grib
    depends_on:
      redis:
        condition: service_healthy
    networks:
      - predictor-network
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '0.5'
        reservations:
          memory: 512M
          cpus: '0.25'
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
  redis:
    image: redis:7.2-alpine
    container_name: predictor-redis-prod
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    networks:
      - predictor-network
    restart: unless-stopped
    command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru --requirepass ${GSN_PREDICTOR_REDIS_PASSWORD}
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${GSN_PREDICTOR_REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 3s
      retries: 5
      start_period: 10s
volumes:
  grib_data:
    driver: local
  redis_data:
    driver: local
networks:
  predictor-network:
    driver: bridge
```
### Deploy to Production
```bash
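# Deploy with production config.
# --env-file is needed because Compose interpolates ${GSN_PREDICTOR_REDIS_PASSWORD}
# from the shell or from --env-file, not from a service's env_file entry.
docker-compose --env-file cmd/api/.env.production -f docker-compose.prod.yml up -d
# Monitor deployment
docker-compose --env-file cmd/api/.env.production -f docker-compose.prod.yml logs -f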
# Check health
curl http://localhost:8080/health
```
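Since the predictor healthcheck allows a 40s start period, a single `curl` right after `up -d` may fail even when the deployment is fine. A small retry wrapper, sketched here with the hypothetical name `retry`, polls until the endpoint answers:

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between attempts.
retry() {
  attempts=$1 delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}
```

For example, `retry 12 5 curl -fsS http://localhost:8080/health` waits up to about a minute for the service to report healthy and returns non-zero if it never does.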
## Kubernetes Deployment
### Create Namespace
```yaml
# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: predictor
```
### Redis Deployment
```yaml
# k8s/redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: predictor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7.2-alpine
          ports:
            - containerPort: 6379
          command: ["redis-server", "--appendonly", "yes", "--maxmemory", "512mb", "--maxmemory-policy", "allkeys-lru"]
          volumeMounts:
            - name: redis-data
              mountPath: /data
          resources:
            limits:
              memory: "512Mi"
              cpu: "250m"
            requests:
              memory: "256Mi"
              cpu: "100m"
          livenessProbe:
            exec:
              command: ["redis-cli", "ping"]
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            exec:
              command: ["redis-cli", "ping"]
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: predictor
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pvc
  namespace: predictor
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```
### Predictor Deployment
```yaml
# k8s/predictor.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: predictor
  namespace: predictor
spec:
  replicas: 2
  selector:
    matchLabels:
      app: predictor
  template:
    metadata:
      labels:
        app: predictor
    spec:
      containers:
        - name: predictor
          image: predictor:latest
          ports:
            - containerPort: 8080
          env:
            - name: GSN_PREDICTOR_REDIS_HOST
              value: "redis"
            - name: GSN_PREDICTOR_REDIS_PORT
              value: "6379"
            - name: GSN_PREDICTOR_GRIB_DIR
              value: "/tmp/grib"
            - name: GSN_PREDICTOR_SCHEDULER_ENABLED
              value: "true"
            - name: GSN_PREDICTOR_GRIB_UPDATER_INTERVAL
              value: "6h"
            - name: GSN_PREDICTOR_GRIB_UPDATER_TIMEOUT
              value: "45m"
          volumeMounts:
            - name: grib-data
              mountPath: /tmp/grib
          resources:
            limits:
              memory: "1Gi"
              cpu: "500m"
            requests:
              memory: "512Mi"
              cpu: "250m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 40
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
      volumes:
        - name: grib-data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: predictor
  namespace: predictor
spec:
  selector:
    app: predictor
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
```
### Deploy to Kubernetes
```bash
# Apply namespace
kubectl apply -f k8s/namespace.yaml
# Apply Redis
kubectl apply -f k8s/redis.yaml
# Wait for Redis to be ready (kubectl wait gives up after 30s unless --timeout is set)
kubectl wait --for=condition=ready pod -l app=redis -n predictor --timeout=120s
# Apply Predictor
kubectl apply -f k8s/predictor.yaml
# Check status
kubectl get pods -n predictor
kubectl get services -n predictor
```
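The same sequence can be wrapped in one function that stops at the first failing step. This is a sketch, not a script from the repository: `deploy_k8s` is a hypothetical name, and it uses `kubectl rollout status` to wait on each Deployment rather than on individual pods.

```bash
# deploy_k8s [TIMEOUT]: apply the manifests in order and stop at the first failure.
# TIMEOUT is passed to kubectl rollout status (default 120s).
deploy_k8s() {
  kubectl apply -f k8s/namespace.yaml &&
    kubectl apply -f k8s/redis.yaml &&
    kubectl rollout status deployment/redis -n predictor --timeout="${1:-120s}" &&
    kubectl apply -f k8s/predictor.yaml &&
    kubectl rollout status deployment/predictor -n predictor --timeout="${1:-120s}"
}
```

Calling `deploy_k8s 180s` returns non-zero if Redis or the predictor does not finish rolling out in time, which makes the function usable as a CI step.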
## Monitoring and Logging
### Health Checks
The service includes built-in health checks:
```bash
# Application health
curl http://localhost:8080/health
# Docker health
docker inspect predictor | jq '.[0].State.Health'
# Kubernetes health
kubectl describe pod -l app=predictor -n predictor
```
### Logging
```bash
# Docker logs
docker-compose logs -f predictor
# Kubernetes logs
kubectl logs -f deployment/predictor -n predictor
```
### Metrics
Consider adding Prometheus metrics:
```yaml
# Add under `services:` in docker-compose.yml
# (expects a Prometheus config at ./monitoring/prometheus.yml)
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - predictor-network
```
## Backup and Recovery
### Redis Backup
```bash
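# Create backup (BGSAVE runs in the background and returns before the dump is written)
docker exec predictor-redis redis-cli BGSAVE
# Copy backup file once the save has finished
mkdir -p ./backup
docker cp predictor-redis:/data/dump.rdb ./backup/redis-$(date +%Y%m%d).rdb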
```
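`BGSAVE` is asynchronous, so copying `dump.rdb` immediately can pick up the previous snapshot. One way to wait for completion is to poll `INFO persistence`, as in this sketch. The helper names `redis_cmd` and `bgsave_and_wait` and the `REDIS_CONTAINER` variable are assumptions for illustration, and an instance started with `--requirepass` would also need `-a <password>` passed to `redis-cli`.

```bash
# redis_cmd ARGS...: run redis-cli inside the Redis container.
redis_cmd() {
  docker exec "${REDIS_CONTAINER:-predictor-redis}" redis-cli "$@"
}

# bgsave_and_wait [TIMEOUT]: start a background save and wait until it finishes.
# Succeeds only if Redis reports the last background save as ok.
bgsave_and_wait() {
  timeout=${1:-60}
  redis_cmd BGSAVE >/dev/null || return 1
  waited=0
  while redis_cmd INFO persistence | grep -q '^rdb_bgsave_in_progress:1'; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  redis_cmd INFO persistence | grep -q '^rdb_last_bgsave_status:ok'
}
```

With this in place, `bgsave_and_wait 120 && docker cp predictor-redis:/data/dump.rdb ./backup/` copies the file only after a successful save.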
### GRIB Data Backup
```bash
# Backup GRIB data
docker run --rm -v predictor_grib_data:/data -v $(pwd)/backup:/backup alpine tar czf /backup/grib-$(date +%Y%m%d).tar.gz -C /data .
```
### Automated Backup Script
```bash
#!/bin/bash
# scripts/backup.sh
# Stop on the first failed command or unset variable
set -euo pipefail
# Absolute path, because docker bind mounts require one
BACKUP_DIR="$(pwd)/backup/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
# Redis backup (BGSAVE is asynchronous; the sleep gives it time to finish)
docker exec predictor-redis redis-cli BGSAVE
sleep 5
docker cp predictor-redis:/data/dump.rdb "$BACKUP_DIR/redis.rdb"
# GRIB data backup
docker run --rm -v predictor_grib_data:/data -v "$BACKUP_DIR:/backup" alpine tar czf /backup/grib.tar.gz -C /data .
echo "Backup completed: $BACKUP_DIR"
```
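A backup is only useful if it can be read back. The sketch below checks the two files the script above produces; `verify_backup` is a hypothetical helper, and the check relies on RDB files starting with the ASCII magic `REDIS`.

```bash
# verify_backup DIR: sanity-check redis.rdb and grib.tar.gz in a backup directory.
verify_backup() {
  dir=$1
  if [ ! -s "$dir/redis.rdb" ]; then
    echo "missing or empty: $dir/redis.rdb" >&2
    return 1
  fi
  # RDB snapshots begin with the five bytes "REDIS"
  if [ "$(head -c 5 "$dir/redis.rdb")" != "REDIS" ]; then
    echo "not an RDB file: $dir/redis.rdb" >&2
    return 1
  fi
  # Listing the archive forces tar to read and decompress all of it
  if ! tar -tzf "$dir/grib.tar.gz" >/dev/null 2>&1; then
    echo "unreadable archive: $dir/grib.tar.gz" >&2
    return 1
  fi
  return 0
}
```

For example, `verify_backup "./backup/$(date +%Y%m%d)"` can run as the last step of the backup job so that a truncated archive fails the job instead of being discovered during recovery.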
## Troubleshooting
### Common Issues
1. **Redis Connection Issues:**
```bash
# Check Redis status
docker-compose exec redis redis-cli ping
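# Check network connectivity (Redis does not speak HTTP, so test the TCP port;
# assumes nc is available in the predictor image)
docker-compose exec predictor nc -z -w 3 redis 6379 && echo "redis reachable"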
```
2. **GRIB Download Failures:**
```bash
# Check disk space
docker-compose exec predictor df -h /tmp/grib
# Check internet connectivity
docker-compose exec predictor wget -O- https://nomads.ncep.noaa.gov/
```
3. **Memory Issues:**
```bash
# Check memory usage
docker stats
# Check container logs
docker-compose logs predictor | grep -i memory
```
### Performance Tuning
1. **Redis Optimization:**
```bash
# Increase Redis memory
GSN_PREDICTOR_REDIS_MAXMEMORY=1gb
# Optimize Redis settings
redis-server --maxmemory 1gb --maxmemory-policy allkeys-lru
```
2. **GRIB Processing:**
```bash
# Increase parallel workers
GSN_PREDICTOR_GRIB_PARALLEL=8
# Optimize cache TTL
GSN_PREDICTOR_GRIB_CACHE_TTL=2h
```
3. **Container Resources:**
```yaml
# In docker-compose.yml
deploy:
  resources:
    limits:
      memory: 2G
      cpus: '1.0'
    reservations:
      memory: 1G
      cpus: '0.5'
```
## Security Considerations
1. **Network Security:**
   - Use internal networks for service communication
   - Expose only necessary ports
   - Use reverse proxy for external access
2. **Container Security:**
   - Run as non-root user
   - Use minimal base images
   - Regular security updates
3. **Data Security:**
   - Encrypt sensitive environment variables
   - Use secrets management for passwords
   - Regular backups
4. **Access Control:**
   - Implement API authentication
   - Use HTTPS in production
   - Monitor access logs
```