Files
eveAI/documentation/Production Setup/phase-8-application-services.md
Josako 2a0c92b064 - Definition of extra eveai_ops service to run (db) jobs
- Definition of manifests for all jobs
- Definition of manifests for all eveai services
2025-09-03 15:20:54 +02:00

134 lines
5.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Phase 8: Application Services (Staging)
This guide describes how to deploy EveAI application services to the Scaleway Kubernetes cluster, building on Phases 17 in cluster-install.md.
## Prerequisites
- Ingress-NGINX running with external IP
- cert-manager installed and Certificate evie-staging-tls is READY (via HTTP ACME first, then HTTPS-only)
- External Secrets Operator installed; Kubernetes Secret eveai-secrets exists in namespace eveai-staging
- Verification service deployed and reachable via /verify
- Optional: Monitoring stack running, Pushgateway deployed or reachable; PUSH_GATEWAY_HOST/PORT available to apps (via eveai-secrets)
## What we deploy (structure)
- Frontend (web) services
- eveai-app → exposed at /admin
- eveai-api → exposed at /api
- eveai-chat-client → exposed at /client
- Backend worker services (internal)
- eveai-workers (queue: embeddings)
- eveai-chat-workers (queue: llm_interactions)
- eveai-entitlements (queue: entitlements)
- Ops Jobs (manual DB ops)
- 00-env-check
- 02-db-bootstrap-ext
- 03-db-migrate-public
- 04-db-migrate-tenant
- 05-seed-or-init-data
- 06-verify-minimal
Manifests are under:
- scaleway/manifests/base/applications/frontend/
- scaleway/manifests/base/applications/backend/
- scaleway/manifests/base/applications/ops/jobs/
- Aggregate kustomization: scaleway/manifests/base/applications/kustomization.yaml
## Step 1: Validate secrets
```bash
kubectl get secret eveai-secrets -n eveai-staging
kubectl get secret eveai-secrets -n eveai-staging -o jsonpath='{.data}' | jq 'keys'
```
Confirm presence of DB_*, REDIS_*, OPENAI_API_KEY, MISTRAL_API_KEY, JWT_SECRET_KEY, API_ENCRYPTION_KEY, MINIO_*, PUSH_GATEWAY_HOST, PUSH_GATEWAY_PORT.
## Step 2: Deploy Ops Jobs (manual pre-deploy)
Run the DB ops scripts manually in order. Each manifest uses generateName; use kubectl create.
```bash
kubectl create -f scaleway/manifests/base/applications/ops/jobs/00-env-check-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=env-check --timeout=600s
kubectl create -f scaleway/manifests/base/applications/ops/jobs/02-db-bootstrap-ext-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-bootstrap-ext --timeout=1800s
kubectl create -f scaleway/manifests/base/applications/ops/jobs/03-db-migrate-public-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-migrate-public --timeout=1800s
kubectl create -f scaleway/manifests/base/applications/ops/jobs/04-db-migrate-tenant-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-migrate-tenant --timeout=3600s
kubectl create -f scaleway/manifests/base/applications/ops/jobs/05-seed-or-init-data-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-seed-or-init --timeout=1800s
kubectl create -f scaleway/manifests/base/applications/ops/jobs/06-verify-minimal-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-verify-minimal --timeout=900s
```
View logs:
```bash
kubectl -n eveai-staging get jobs
kubectl -n eveai-staging logs job/<created-job-name>
```
## Step 3: Deploy backend workers
```bash
kubectl apply -k scaleway/manifests/base/applications/backend/
kubectl -n eveai-staging get deploy | egrep 'eveai-(workers|chat-workers|entitlements)'
# Optional: quick logs
kubectl -n eveai-staging logs deploy/eveai-workers --tail=100 || true
kubectl -n eveai-staging logs deploy/eveai-chat-workers --tail=100 || true
kubectl -n eveai-staging logs deploy/eveai-entitlements --tail=100 || true
```
## Step 4: Deploy frontend services
```bash
kubectl apply -k scaleway/manifests/base/applications/frontend/
kubectl -n eveai-staging get deploy,svc | egrep 'eveai-(app|api|chat-client)'
```
## Step 5: Verify Ingress routes
The HTTPS ingress has paths enabled for /admin, /api, /client. Verify:
```bash
kubectl -n eveai-staging describe ingress eveai-staging-ingress
curl -k https://evie-staging.askeveai.com/verify/health
curl -k https://evie-staging.askeveai.com/admin/healthz/ready
curl -k https://evie-staging.askeveai.com/api/healthz/ready
curl -k https://evie-staging.askeveai.com/client/healthz/ready
```
## Resources and probes (staging defaults)
- Web (app, api, chat-client):
- requests: 150m CPU, 256Mi RAM; limits: 500m CPU, 512Mi RAM; replicas: 1
- readiness/liveness: GET /healthz/ready
- Workers:
- eveai-workers: req 200m/512Mi, lim 1CPU/1Gi
- eveai-chat-workers: req 500m/1Gi, lim 2CPU/3Gi
- eveai-entitlements: req 100m/256Mi, lim 500m/512Mi
## Pushgateway usage
- Ensure PUSH_GATEWAY_HOST and PUSH_GATEWAY_PORT are provided (e.g., pushgateway.monitoring.svc.cluster.local:9091), typically via eveai-secrets or a ConfigMap.
- Apps will continue to push business metrics; Prometheus scrapes the Pushgateway.
## Bunny.net WAF (TODO)
- Configure Pull Zone for evie-staging.askeveai.com
- Set Origin to the LoadBalancer IP with HTTPS and Host header evie-staging.askeveai.com
- Define rate limits primarily on /api, looser on /client; enable bot filtering
- Only switch DNS (CNAME) to Bunny after TLS issuance completed directly against LoadBalancer
## Troubleshooting
```bash
kubectl get all -n eveai-staging
kubectl get events -n eveai-staging --sort-by=.lastTimestamp
kubectl describe ingress eveai-staging-ingress -n eveai-staging
kubectl logs -n eveai-staging deploy/eveai-api --tail=200
```
## Rollback / Cleanup
```bash
# Remove frontend/backend (keeps verification and other base resources)
kubectl delete -k scaleway/manifests/base/applications/frontend/
kubectl delete -k scaleway/manifests/base/applications/backend/
# Jobs are kept for history due to ttlSecondsAfterFinished; to delete immediately:
kubectl -n eveai-staging delete jobs --all
```