EveAI Scaleway Deployment Guide

Overview

This guide covers the deployment of EveAI to Scaleway Kubernetes using a modular Kustomize structure with Helm integration for monitoring services.

Architecture

Managed Services (Scaleway)

  • PostgreSQL: Database service
  • Redis: Message broker and cache
  • MinIO: Object storage (S3-compatible)
  • Secret Manager: Secure storage for secrets

Kubernetes Services

  • Infrastructure: Ingress Controller, Cert-Manager, TLS certificates
  • Applications: EveAI services (app, api, workers, etc.)
  • Monitoring: Prometheus, Grafana, Pushgateway (via Helm)
  • Verification: Permanent cluster health monitoring service

Directory Structure

scaleway/manifests/
├── base/                           # Base configurations
│   ├── infrastructure/             # Core infrastructure
│   │   ├── 00-namespaces.yaml     # Namespace definitions
│   │   ├── 01-ingress-controller.yaml  # NGINX Ingress Controller
│   │   ├── 02-cert-manager.yaml   # Cert-Manager setup
│   │   └── 03-cluster-issuers.yaml    # Let's Encrypt issuers
│   ├── applications/               # Application services
│   │   └── verification/           # Verification service
│   │       ├── 00-configmaps.yaml # HTML content and nginx config
│   │       ├── 01-deployment.yaml # Deployment specification
│   │       └── 02-service.yaml    # Service definition
│   ├── monitoring/                 # Monitoring stack (Helm)
│   │   ├── kustomization.yaml     # Helm chart integration
│   │   └── values-monitoring.yaml # Prometheus stack values
│   ├── secrets/                    # Secret definitions
│   │   └── scaleway-secrets.yaml  # Scaleway Secret Manager integration
│   └── networking/                 # Network configuration
│       └── ingress-https.yaml     # HTTPS-only ingress
└── overlays/                       # Environment-specific configs
    ├── staging/                    # Staging environment
    │   └── kustomization.yaml     # Staging overlay
    └── production/                 # Production environment (future)
        └── kustomization.yaml     # Production overlay

Prerequisites

1. Scaleway Setup

  • Kubernetes cluster running
  • Managed services configured:
    • PostgreSQL database
    • Redis instance
    • MinIO object storage
  • Secrets stored in Scaleway Secret Manager:
    • eveai-app-keys
    • eveai-mistral
    • eveai-object-storage
    • eveai-openai
    • eveai-postgresql
    • eveai-redis
    • eveai-redis-certificate

2. Local Tools

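# Verify that the required tools are installed
kubectl version --client
kustomize version
helm version

# No separate Kustomize Helm plugin is needed: Helm chart inflation is enabled
# per build with the --enable-helm flag (the helm binary must be on the PATH), e.g.:
#   kubectl kustomize --enable-helm <kustomization-dir>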
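
A preflight check along these lines can confirm that the tools are on the PATH before any deployment step runs. This is a minimal sketch; `check_tools` is a hypothetical helper name and is not part of the EveAI repository.

```shell
#!/usr/bin/env bash
# check_tools: hypothetical helper, not part of the EveAI repository.
# Prints each missing command to stderr; returns 0 only if all are found.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Typical use at the top of a deployment script: `check_tools kubectl kustomize helm scw || exit 1`.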

3. Cluster Access

# Configure kubectl for Scaleway cluster
scw k8s kubeconfig install <cluster-id>
kubectl cluster-info

Deployment Process

Phase 1: Infrastructure Foundation

Deploy core infrastructure components in order:

# 1. Deploy namespaces
kubectl apply -f scaleway/manifests/base/infrastructure/00-namespaces.yaml

# 2. Deploy ingress controller
kubectl apply -f scaleway/manifests/base/infrastructure/01-ingress-controller.yaml

# Wait for ingress controller to be ready
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=300s

# 3. Install cert-manager CRDs (required first)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.crds.yaml

# 4. Deploy cert-manager
kubectl apply -f scaleway/manifests/base/infrastructure/02-cert-manager.yaml

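# Wait for cert-manager to be ready (controller, webhook and cainjector).
# The instance label matches all three components in the upstream manifests;
# adjust the selector if 02-cert-manager.yaml uses different labels.
kubectl wait --namespace cert-manager \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/instance=cert-manager \
  --timeout=300s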

# 5. Deploy cluster issuers
kubectl apply -f scaleway/manifests/base/infrastructure/03-cluster-issuers.yaml

Phase 2: Monitoring Stack

Deploy monitoring services using Helm integration:

# Add Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Deploy monitoring stack via Kustomize + Helm
kubectl kustomize --enable-helm scaleway/manifests/base/monitoring/ | kubectl apply -f -

# Verify monitoring deployment
kubectl get pods -n monitoring
kubectl get pvc -n monitoring
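
The Helm integration depends on a `helmCharts` entry in `monitoring/kustomization.yaml`, which Kustomize renders when `--enable-helm` is passed. The sketch below writes an illustrative version of such a file to a scratch directory. The chart name `kube-prometheus-stack` and the release name `monitoring` are assumptions (the release name is inferred from service names such as `monitoring-grafana`); the actual file in the repository may differ.

```shell
#!/usr/bin/env bash
# Writes an ILLUSTRATIVE monitoring kustomization into the given directory.
# Chart name and release name are assumptions, not taken from the repository.
write_monitoring_kustomization() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
helmCharts:
  - name: kube-prometheus-stack        # assumed chart name
    repo: https://prometheus-community.github.io/helm-charts
    releaseName: monitoring            # assumed release name
    namespace: monitoring
    valuesFile: values-monitoring.yaml
    includeCRDs: true
    # Consider adding "version:" to pin the chart for reproducible builds.
EOF
}
```

Rendering the result with `kubectl kustomize --enable-helm <dir>` also requires a `values-monitoring.yaml` next to the generated file.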

Phase 3: Application Services

Deploy the secrets, the verification service, and the HTTPS ingress:

# Deploy secrets (update with actual Scaleway Secret Manager values first)
kubectl apply -f scaleway/manifests/base/secrets/scaleway-secrets.yaml

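# Deploy verification service
# (-k requires a kustomization.yaml in this directory, which the tree above does
#  not list; if it is absent, use:
#  kubectl apply -f scaleway/manifests/base/applications/verification/)
kubectl apply -k scaleway/manifests/base/applications/verification/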

# Deploy HTTPS ingress
kubectl apply -f scaleway/manifests/base/networking/ingress-https.yaml

Phase 4: Complete Staging Deployment

Deploy everything using the staging overlay:

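# Deploy complete staging environment
# (kubectl apply -k does not accept --enable-helm; if the overlay includes the
#  Helm-based monitoring base, render it first:
#  kubectl kustomize --enable-helm scaleway/manifests/overlays/staging/ | kubectl apply -f -)
kubectl apply -k scaleway/manifests/overlays/staging/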

# Verify deployment
kubectl get all -n eveai-staging
kubectl get ingress -n eveai-staging
kubectl get certificates -n eveai-staging

Verification and Testing

1. Check Infrastructure

# Verify ingress controller
kubectl get pods -n ingress-nginx
kubectl get svc -n ingress-nginx

# Verify cert-manager
kubectl get pods -n cert-manager
kubectl get clusterissuers

# Check certificate status
kubectl describe certificate evie-staging-tls -n eveai-staging
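
Instead of re-running `describe` until the certificate is issued, a script can block on the `Ready` condition. The following is a sketch only; `wait_for_certificate` is a hypothetical helper, and the 600s default timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# wait_for_certificate: hypothetical helper.
# Blocks until the Certificate reports Ready, or fails after the timeout.
# Arguments: certificate name, namespace, timeout (all optional).
wait_for_certificate() {
  local name="${1:-evie-staging-tls}"
  local namespace="${2:-eveai-staging}"
  local timeout="${3:-600s}"
  kubectl wait --for=condition=Ready "certificate/$name" \
    --namespace "$namespace" --timeout="$timeout"
}
```

Called without arguments, it waits for `evie-staging-tls` in `eveai-staging`; a non-zero exit means the certificate was not ready in time and the troubleshooting steps for cert-manager apply.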

2. Test Verification Service

# Get external IP
kubectl get svc -n ingress-nginx

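# Test HTTPS access (replace with the actual domain).
# -k skips certificate verification; drop it once the Let's Encrypt
# certificate has been issued so that TLS is actually validated.
curl -k https://evie-staging.askeveai.com/verify/health
curl -k https://evie-staging.askeveai.com/verify/info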
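
Right after a deployment the endpoint may need some time before it answers, so a retry loop is often more useful than a single request. The sketch below assumes nothing beyond `curl`; `wait_for_http_ok` is a hypothetical helper, and it deliberately omits `-k` so that the certificate is verified.

```shell
#!/usr/bin/env bash
# wait_for_http_ok: hypothetical helper.
# Polls a URL until curl reports a successful HTTP status.
# Arguments: URL, max attempts (default 10), delay in seconds (default 5).
wait_for_http_ok() {
  local url="$1" attempts="${2:-10}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --fail --silent --show-error --output /dev/null "$url"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

Example: `wait_for_http_ok https://evie-staging.askeveai.com/verify/health 20 10` retries for a little over three minutes before giving up.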

3. Monitor Services

# Check monitoring stack
kubectl get pods -n monitoring
kubectl port-forward -n monitoring svc/monitoring-grafana 3000:80

# Access Grafana at http://localhost:3000
# Default credentials: admin/admin123

Secret Management

Updating Secrets from Scaleway Secret Manager

The secrets in scaleway-secrets.yaml use template placeholders. To use actual values:

  1. Manual approach: Replace template values with actual secrets
  2. Automated approach: Use a secret management tool like External Secrets Operator

Example manual update:

# Replace template placeholders like:
# password: "{{ .Values.database.password }}"
# 
# With actual values from Scaleway Secret Manager:
# password: "actual-password-from-scaleway"
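
One way to avoid editing the YAML by hand is to create the Secret directly from a value held in the environment. This is a sketch under stated assumptions: `DB_PASSWORD` is a hypothetical variable name, how it gets populated from Scaleway Secret Manager is out of scope here, and the Secret name `database-secrets` and key `password` are taken from the examples in this guide.

```shell
#!/usr/bin/env bash
# apply_database_secret: hypothetical helper.
# Creates or updates the database-secrets Secret from $DB_PASSWORD.
# Argument: namespace (default eveai-staging).
apply_database_secret() {
  local namespace="${1:-eveai-staging}"
  if [ -z "${DB_PASSWORD:-}" ]; then
    echo "DB_PASSWORD is not set" >&2
    return 1
  fi
  # --dry-run=client renders the manifest locally; apply makes the call idempotent.
  kubectl create secret generic database-secrets \
    --namespace "$namespace" \
    --from-literal=password="$DB_PASSWORD" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Note that `--from-literal` places the value on the command line, where it can be visible in the local process list; on shared machines a file-based input is the safer choice.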

For production, consider using External Secrets Operator to automatically sync from Scaleway Secret Manager:

# Install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets -n external-secrets-system --create-namespace

Monitoring and Observability

Grafana Dashboards

  • URL: https://evie-staging.askeveai.com/monitoring (when ingress path is configured)
  • Credentials: admin/admin123 (change in production)
  • Pre-configured: EveAI-specific dashboards in /EveAI folder

Prometheus Metrics

  • Internal URL: http://monitoring-prometheus:9090
  • Scrapes: Kubernetes metrics, application metrics, Scaleway managed services

Pushgateway

  • Internal URL: http://monitoring-pushgateway:9091
  • Usage: For batch job metrics from EveAI workers
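
A batch job pushes metrics by sending them in the Prometheus text format to `/metrics/job/<job_name>` on the Pushgateway. The sketch below shows this with `curl`; the helper names are hypothetical, and the internal URL only resolves from inside the cluster (and from another namespace only with the namespace suffix added).

```shell
#!/usr/bin/env bash
# Both helpers are hypothetical; the URL default is the internal one from this guide.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://monitoring-pushgateway:9091}"

# Builds the push endpoint for a job name.
pushgateway_endpoint() {
  printf '%s/metrics/job/%s' "$PUSHGATEWAY_URL" "$1"
}

# Pushes one untyped sample: push_metric <job> <metric_name> <value>
push_metric() {
  local job="$1" name="$2" value="$3"
  # The body must end with a newline, which printf provides.
  printf '%s %s\n' "$name" "$value" |
    curl --fail --silent --show-error --data-binary @- \
      "$(pushgateway_endpoint "$job")"
}
```

For example, a worker could record its last successful run with `push_metric eveai_worker_batch eveai_batch_last_success_timestamp_seconds "$(date +%s)"`; both names are illustrative.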

Troubleshooting

Common Issues

  1. Certificate not issued

    kubectl describe certificate evie-staging-tls -n eveai-staging
    kubectl logs -n cert-manager deployment/cert-manager
    
  2. Ingress not accessible

    kubectl describe ingress eveai-staging-ingress -n eveai-staging
    kubectl logs -n ingress-nginx deployment/ingress-nginx-controller
    
  3. Monitoring stack issues

    kubectl logs -n monitoring deployment/monitoring-prometheus-server
    kubectl get pvc -n monitoring  # Check storage
    
  4. Secret issues

    kubectl get secrets -n eveai-staging
    kubectl describe secret database-secrets -n eveai-staging
    

Useful Commands

# View all resources in staging
kubectl get all -n eveai-staging

# Check resource usage
kubectl top pods -n eveai-staging
kubectl top nodes

# View logs
kubectl logs -f deployment/verify-service -n eveai-staging

# Port forwarding for local access
kubectl port-forward -n eveai-staging svc/verify-service 8080:80

Scaling and Updates

Scaling Services

# Scale verification service
kubectl scale deployment verify-service --replicas=3 -n eveai-staging

# Update image
kubectl set image deployment/verify-service nginx=nginx:1.21-alpine -n eveai-staging

Rolling Updates

# Update using Kustomize
kubectl apply -k scaleway/manifests/overlays/staging/

# Check rollout status
kubectl rollout status deployment/verify-service -n eveai-staging
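
In automated pipelines it can help to roll back when a rollout does not complete in time. The following is a minimal sketch, assuming the Deployment keeps at least one earlier revision; `rollout_or_undo` is a hypothetical helper and the 120s default is an arbitrary choice.

```shell
#!/usr/bin/env bash
# rollout_or_undo: hypothetical helper.
# Waits for a rollout to finish; rolls back if it does not complete in time.
# Arguments: deployment name, namespace, timeout (all optional).
rollout_or_undo() {
  local deployment="${1:-verify-service}"
  local namespace="${2:-eveai-staging}"
  local timeout="${3:-120s}"
  if kubectl rollout status "deployment/$deployment" \
      -n "$namespace" --timeout="$timeout"; then
    return 0
  fi
  echo "rollout of $deployment did not complete, rolling back" >&2
  kubectl rollout undo "deployment/$deployment" -n "$namespace"
  return 1
}
```

The function returns non-zero after a rollback so that a calling script or CI job still fails visibly.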

Production Deployment

For production deployment:

  1. Create scaleway/manifests/overlays/production/kustomization.yaml
  2. Update domain names and certificates
  3. Adjust resource limits and replicas
  4. Use production Let's Encrypt issuer
  5. Configure production monitoring and alerting

# Production deployment
kubectl apply -k scaleway/manifests/overlays/production/

Security Considerations

  1. Secrets: Use Scaleway Secret Manager integration
  2. TLS: HTTPS-only with automatic certificate renewal
  3. Network: Ingress-based routing with proper annotations
  4. RBAC: Kubernetes role-based access control
  5. Images: Use specific tags, not latest

Maintenance

Regular Tasks

  • Monitor certificate expiration
  • Update Helm charts and container images
  • Review resource usage and scaling
  • Back up monitoring data
  • Rotate secrets

Monitoring Alerts

Configure alerts for:

  • Certificate expiration (< 30 days)
  • Pod failures and restarts
  • Resource usage thresholds
  • External service connectivity
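
Until alert rules are in place, the certificate expiration threshold can also be checked from outside the cluster with `openssl`. This is a sketch, not the guide's alerting setup; `cert_valid_for_days` is a hypothetical helper built on the `-checkend` option of `openssl x509`.

```shell
#!/usr/bin/env bash
# cert_valid_for_days: hypothetical helper.
# Returns 0 if the certificate served by <host> is still valid <days> from now,
# non-zero if it expires sooner or could not be fetched.
# Arguments: host, days (default 30).
cert_valid_for_days() {
  local host="$1" days="${2:-30}"
  openssl s_client -servername "$host" -connect "$host:443" \
      </dev/null 2>/dev/null |
    openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}
```

Example: `cert_valid_for_days evie-staging.askeveai.com 30 || echo "certificate expires within 30 days"`.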

Support

For issues and questions:

  1. Check logs using the kubectl commands above
  2. Verify Scaleway managed services status
  3. Review Kubernetes events: kubectl get events -n eveai-staging
  4. Check monitoring dashboards for system health
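
When reviewing events, filtering for warnings and sorting by time usually surfaces the relevant entries faster than the unsorted default output. A small sketch, with `recent_warnings` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# recent_warnings: hypothetical helper.
# Lists Warning events in a namespace, sorted by last occurrence (oldest first).
# Argument: namespace (default eveai-staging).
recent_warnings() {
  local namespace="${1:-eveai-staging}"
  kubectl get events --namespace "$namespace" \
    --field-selector type=Warning \
    --sort-by=.lastTimestamp
}
```

Piping the output through `tail -n 20` keeps only the most recent entries.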