- Definition of and improvements to the job system

- Definition of k8s pods for application services
Josako
2025-09-04 11:49:19 +02:00
parent 2a0c92b064
commit af8b5f54cd
16 changed files with 352 additions and 48 deletions

@@ -465,7 +465,22 @@ kubectl -n tools port-forward svc/pgadmin-pgadmin4 8080:80
### Phase 8: RedisInsight Tool Deployment
### Phase 9: Ops Jobs Invocation (if required)
### Phase 9: Enable Scaleway Registry
1) Create docker pull secret via External Secrets (once):
```bash
kubectl apply -f scaleway/manifests/base/secrets/scaleway-registry-secret.yaml
kubectl -n eveai-staging get secret scaleway-registry-cred -o yaml | grep "type: kubernetes.io/dockerconfigjson"
```
2) Use the staging overlay to deploy apps with registry rewrite and imagePullSecrets:
```bash
kubectl apply -k scaleway/manifests/overlays/staging/
```
Notes:
- Base manifests keep generic images (josakola/...). The overlay rewrites them to rg.fr-par.scw.cloud/eveai-staging/josakola/...:staging and adds imagePullSecrets to the Pod template of every Deployment.
- Staging uses imagePullPolicy: Always, so a new push to :staging is pulled the next time a Pod starts (for example after a rollout restart); running Pods keep their current image.
### Phase 10: Ops Jobs Invocation (if required)
Run the DB ops scripts manually, in order. Each manifest uses generateName, so use kubectl create (kubectl apply does not support generateName).
@@ -489,9 +504,13 @@ kubectl create -f scaleway/manifests/base/applications/ops/jobs/06-verify-minima
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=db-verify-minimal --timeout=900s
```
View logs (the create command prints the name of the generated Job):
```bash
kubectl -n eveai-staging get jobs
kubectl -n eveai-staging logs job/<created-job-name>
```
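Because generateName produces a different Job name on every run, it can help to capture the name from the create command instead of copying it by hand. The following is a minimal sketch, assuming `kubectl create -o name` prints a reference of the form `job.batch/<name>`; the helper `job_name_from_ref` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): strip the resource prefix from
# the output of `kubectl create -o name`,
# e.g. "job.batch/env-check-x7k2p" -> "env-check-x7k2p".
job_name_from_ref() {
  local ref="$1"
  # Remove everything up to and including the last "/".
  printf '%s\n' "${ref##*/}"
}

# Usage against a cluster (left commented out so the helper can be sourced safely):
#   ref="$(kubectl create -o name \
#     -f scaleway/manifests/base/applications/ops/jobs/00-env-check-job.yaml)"
#   kubectl -n eveai-staging logs "job/$(job_name_from_ref "$ref")"
```

The helper only does string handling, so it can be tried without cluster access.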
### Phase 10: Application Services Deployment
### Phase 11: Application Services Deployment

@@ -30,7 +30,12 @@ Manifests are under:
- scaleway/manifests/base/applications/frontend/
- scaleway/manifests/base/applications/backend/
- scaleway/manifests/base/applications/ops/jobs/
- Aggregate kustomization: scaleway/manifests/base/applications/kustomization.yaml
- Aggregate kustomization (apps only): scaleway/manifests/base/applications/kustomization.yaml
Note:
- The staging Kustomize overlay deploys only frontend and backend apps.
- Ingress remains managed manually via scaleway/manifests/base/networking/ingress-https.yaml and your cluster-install.md guide.
- Ops Jobs are not part of the overlay and should be executed manually with kubectl create -f.
## Step 1: Validate secrets
```bash
@@ -41,6 +46,12 @@ Confirm presence of DB_*, REDIS_*, OPENAI_API_KEY, MISTRAL_API_KEY, JWT_SECRET_K
## Step 2: Deploy Ops Jobs (manual pre-deploy)
Run the DB ops scripts manually, in order. Each manifest uses generateName, so use kubectl create (kubectl apply does not support generateName).
Notes for images:
- Ops Jobs now reference the private Scaleway registry directly and set imagePullPolicy: Always.
- Ensure the docker pull secret (scaleway-registry-cred) exists; see the Private registry section.
- After pushing a new :staging image, delete any previous Job (if present) and create a new one to force a fresh Pod pull.
```bash
kubectl create -f scaleway/manifests/base/applications/ops/jobs/00-env-check-job.yaml
kubectl wait --for=condition=complete job -n eveai-staging -l job-type=env-check --timeout=600s
@@ -66,6 +77,28 @@ kubectl -n eveai-staging get jobs
kubectl -n eveai-staging logs job/<created-job-name>
```
### Runtime environment for Ops Jobs
Each Ops Job sets the same non-secret runtime variables required by the shared bootstrap (start.sh/run.py):
- FLASK_APP=/app/scripts/run.py
- COMPONENT_NAME=eveai_ops
- PYTHONUNBUFFERED=1
- LOGLEVEL=debug (for staging)
- ROLE=web
- PORT=8080
- WORKERS=1
- WORKER_CLASS=gevent
- WORKER_CONN=100
- MAX_REQUESTS=1000
- MAX_REQUESTS_JITTER=100
Secrets (DB_*, REDIS_*, etc.) still come from `envFrom: secretRef: eveai-secrets`.
Tip: After pushing a new :staging image, delete any previous Job with the same label to force a fresh Pod and pull:
```bash
kubectl -n eveai-staging delete job -l component=ops,job-type=db-migrate-public || true
kubectl create -f scaleway/manifests/base/applications/ops/jobs/03-db-migrate-public-job.yaml
```
## Step 3: Deploy backend workers
```bash
kubectl apply -k scaleway/manifests/base/applications/backend/
@@ -84,11 +117,14 @@ kubectl apply -k scaleway/manifests/base/applications/frontend/
kubectl -n eveai-staging get deploy,svc | egrep 'eveai-(app|api|chat-client)'
```
## Step 5: Verify Ingress routes
The HTTPS ingress has paths enabled for /admin, /api, /client. Verify:
## Step 5: Verify Ingress routes (Ingress managed separately)
Ingress is intentionally not managed by the staging Kustomize overlay. Apply or update it manually using your existing manifest and handle it per your cluster-install.md guide:
```bash
kubectl apply -f scaleway/manifests/base/networking/ingress-https.yaml
kubectl -n eveai-staging describe ingress eveai-staging-ingress
```
Then verify the routes:
```bash
curl -k https://evie-staging.askeveai.com/verify/health
curl -k https://evie-staging.askeveai.com/admin/healthz/ready
curl -k https://evie-staging.askeveai.com/api/healthz/ready
@@ -108,6 +144,16 @@ curl -k https://evie-staging.askeveai.com/client/healthz/ready
- Ensure PUSH_GATEWAY_HOST and PUSH_GATEWAY_PORT are provided (e.g., pushgateway.monitoring.svc.cluster.local:9091), typically via eveai-secrets or a ConfigMap.
- Apps will continue to push business metrics; Prometheus scrapes the Pushgateway.
## Image tags strategy (staging/production channels)
- The push script now creates and pushes two tags per service:
  - A versioned tag: :vX.Y.Z (e.g., :v1.2.3)
  - An environment channel tag based on ENVIRONMENT: :staging or :production
- Recommendation for staging manifests:
  - Refer to the channel tag (e.g., rg.fr-par.scw.cloud/eveai-staging/...:staging) and set imagePullPolicy: Always so new pushes are picked up without manifest changes.
  - Production can later use immutable version tags or digests via a production overlay.
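The two-tag scheme described above can be sketched as a small shell function. This is an illustration only: `scaleway_image_refs` is a hypothetical name, and the registry and account values in the usage line are assumptions taken from the image names used in the staging manifests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two image references pushed per service,
# first the versioned tag, then the environment channel tag.
scaleway_image_refs() {
  local registry="$1" account="$2" service="$3" version="$4" environment="$5"
  printf '%s/%s/%s:%s\n' "$registry" "$account" "$service" "$version"
  printf '%s/%s/%s:%s\n' "$registry" "$account" "$service" "$environment"
}

# Example call (values are assumptions for illustration):
#   scaleway_image_refs rg.fr-par.scw.cloud/eveai-staging josakola eveai_app v1.2.3 staging
```

Both references point at the same image content; only the tag differs, which is what lets staging follow the channel tag while production pins a version.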
## Bunny.net WAF (TODO)
- Configure Pull Zone for evie-staging.askeveai.com
- Set Origin to the LoadBalancer IP with HTTPS and Host header evie-staging.askeveai.com
@@ -131,3 +177,18 @@ kubectl delete -k scaleway/manifests/base/applications/backend/
# Jobs are kept for history due to ttlSecondsAfterFinished; to delete immediately:
kubectl -n eveai-staging delete jobs --all
```
## Private registry (Scaleway)
1) Create docker pull secret via External Secrets (once):
```bash
kubectl apply -f scaleway/manifests/base/secrets/scaleway-registry-secret.yaml
kubectl -n eveai-staging get secret scaleway-registry-cred -o yaml | grep "type: kubernetes.io/dockerconfigjson"
```
2) Use the staging overlay to deploy apps with registry rewrite and imagePullSecrets:
```bash
kubectl apply -k scaleway/manifests/overlays/staging/
```
Notes:
- Base manifests keep generic images (josakola/...). The overlay rewrites them to rg.fr-par.scw.cloud/eveai-staging/josakola/...:staging and adds imagePullSecrets to the Pod template of every Deployment.
- Staging uses imagePullPolicy: Always, so a new push to :staging is pulled the next time a Pod starts (for example after a rollout restart); running Pods keep their current image.
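To check that the overlay really rewrites every application image, the rendered output can be scanned for image references that still point elsewhere. The sketch below assumes the rendered YAML has one `image:` key per line; `images_outside_registry` is a hypothetical helper, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a rendered manifest on stdin and print every image
# reference that does NOT start with the given registry prefix.
images_outside_registry() {
  local prefix="$1"
  awk -v prefix="$prefix" '
    {
      # Accept both "image: ref" and "- image: ref" (first key of a list item).
      if ($1 == "image:") ref = $2
      else if ($1 == "-" && $2 == "image:") ref = $3
      else next
      gsub(/"/, "", ref)                      # drop optional quotes
      if (index(ref, prefix) != 1) print ref  # keep refs outside the registry
    }'
}

# Usage (needs kubectl and the repository checkout, so it is commented out):
#   kubectl kustomize scaleway/manifests/overlays/staging/ \
#     | images_outside_registry rg.fr-par.scw.cloud/eveai-staging/
```

Empty output means every image in the rendered overlay comes from the private registry; anything printed is an image the overlay did not rewrite.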

@@ -5,4 +5,3 @@ resources:
- verification/
- frontend/
- backend/
- ops/jobs/

@@ -18,12 +18,38 @@ spec:
job-type: env-check
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/00-env-check.sh"]
resources:
requests:

@@ -19,12 +19,38 @@ spec:
job-type: db-bootstrap-ext
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/02-db-bootstrap-ext.sh"]
resources:
requests:

@@ -19,12 +19,38 @@ spec:
job-type: db-migrate-public
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/03-db-migrate-public.sh"]
resources:
requests:

@@ -19,12 +19,38 @@ spec:
job-type: db-migrate-tenant
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/04-db-migrate-tenant.sh"]
resources:
requests:

@@ -19,12 +19,38 @@ spec:
job-type: db-seed-or-init
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/05-seed-or-init-data.sh"]
resources:
requests:

@@ -19,12 +19,38 @@ spec:
job-type: db-verify-minimal
spec:
restartPolicy: Never
imagePullSecrets:
- name: scaleway-registry-cred
containers:
- name: dbops
image: josakola/eveai_ops:latest
image: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops:staging
imagePullPolicy: Always
envFrom:
- secretRef:
name: eveai-secrets
env:
- name: FLASK_APP
value: "/app/scripts/run.py"
- name: COMPONENT_NAME
value: "eveai_ops"
- name: PYTHONUNBUFFERED
value: "1"
- name: LOGLEVEL
value: "debug"
- name: ROLE
value: "web"
- name: PORT
value: "8080"
- name: WORKERS
value: "1"
- name: WORKER_CLASS
value: "gevent"
- name: WORKER_CONN
value: "100"
- name: MAX_REQUESTS
value: "1000"
- name: MAX_REQUESTS_JITTER
value: "100"
command: ["/bin/bash","-lc","/app/scripts/dbops/06-verify-minimal.sh"]
resources:
requests:

@@ -0,0 +1,35 @@
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: scaleway-registry-secret
namespace: eveai-staging
spec:
refreshInterval: 1h
secretStoreRef:
name: scaleway-cluster-secret-store
kind: ClusterSecretStore
target:
name: scaleway-registry-cred
creationPolicy: Owner
template:
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: |
{"auths":{ "{{ .SCW_REGISTRY_URL }}": {
"username":"{{ .SCW_REGISTRY_ACCESS_KEY }}",
"password":"{{ .SCW_REGISTRY_SECRET_KEY }}",
"auth":"{{ printf "%s:%s" .SCW_REGISTRY_ACCESS_KEY .SCW_REGISTRY_SECRET_KEY | b64enc }}"
}}}
data:
- secretKey: SCW_REGISTRY_URL
remoteRef:
key: name:eveai-registry
property: SCW_REGISTRY_URL
- secretKey: SCW_REGISTRY_ACCESS_KEY
remoteRef:
key: name:eveai-registry
property: SCW_REGISTRY_ACCESS_KEY
- secretKey: SCW_REGISTRY_SECRET_KEY
remoteRef:
key: name:eveai-registry
property: SCW_REGISTRY_SECRET_KEY

@@ -1,29 +1,43 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: eveai-staging
# Reference base components
resources:
- ../../base/
- ../../base/applications/frontend
- ../../base/applications/backend
# Staging-specific configuration
namePrefix: ""
nameSuffix: ""
commonLabels:
environment: staging
managed-by: kustomize
# Images (can be overridden for staging-specific versions)
images:
- name: nginx
newTag: alpine
- name: josakola/eveai_ops
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_ops
newTag: staging
- name: josakola/eveai_app
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_app
newTag: staging
- name: josakola/eveai_api
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_api
newTag: staging
- name: josakola/eveai_chat_client
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_chat_client
newTag: staging
- name: josakola/eveai_workers
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_workers
newTag: staging
- name: josakola/eveai_chat_workers
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_chat_workers
newTag: staging
- name: josakola/eveai_entitlements
newName: rg.fr-par.scw.cloud/eveai-staging/josakola/eveai_entitlements
newTag: staging
# ConfigMap and Secret generators for staging-specific values
configMapGenerator:
- name: staging-config
literals:
- ENVIRONMENT=staging
- LOG_LEVEL=INFO
- DEBUG=false
# Note: Namespace is handled per resource to avoid conflicts
patches:
- target:
kind: Deployment
namespace: eveai-staging
patch: |-
- op: add
path: /spec/template/spec/imagePullSecrets
value:
- name: scaleway-registry-cred
- op: add
path: /spec/template/spec/containers/0/imagePullPolicy
value: Always

@@ -198,12 +198,17 @@ for SERVICE in "${SERVICE_ARRAY[@]}"; do
# Construct image names
LOCAL_VERSION_IMAGE="$LOCAL_REGISTRY/$ACCOUNT/$SERVICE:$VERSION"
SCALEWAY_VERSION_IMAGE="$SCALEWAY_REGISTRY/$ACCOUNT/$SERVICE:$VERSION"
ENV_TAG="$ENVIRONMENT"
SCALEWAY_ENV_IMAGE="$SCALEWAY_REGISTRY/$ACCOUNT/$SERVICE:$ENV_TAG"
echo " 📥 Source: $LOCAL_VERSION_IMAGE"
echo " 📤 Target: $SCALEWAY_VERSION_IMAGE"
echo " 📤 Target (version): $SCALEWAY_VERSION_IMAGE"
echo " 🏷️ Extra tag (environment): $SCALEWAY_ENV_IMAGE"
if [[ "$DRY_RUN" == true ]]; then
echo " 🔍 [DRY RUN] Would push $LOCAL_VERSION_IMAGE to $SCALEWAY_VERSION_IMAGE"
echo " 🔍 [DRY RUN] Would push $LOCAL_VERSION_IMAGE to:"
echo " - $SCALEWAY_VERSION_IMAGE"
echo " - $SCALEWAY_ENV_IMAGE (environment channel tag)"
PROCESSED_SERVICES+=("$SERVICE")
continue
fi
@@ -225,26 +230,41 @@ for SERVICE in "${SERVICE_ARRAY[@]}"; do
fi
# Tag for Scaleway registry (direct push with same version tag)
echo " 🏷️ Tagging for Scaleway registry..."
echo " 🏷️ Tagging for Scaleway registry (version)..."
if ! podman tag "$LOCAL_VERSION_IMAGE" "$SCALEWAY_VERSION_IMAGE"; then
echo " ❌ Failed to tag $LOCAL_VERSION_IMAGE as $SCALEWAY_VERSION_IMAGE"
FAILED_SERVICES+=("$SERVICE")
continue
fi
# Push to Scaleway registry
echo " 📤 Pushing to Scaleway registry..."
# Push version tag to Scaleway registry
echo " 📤 Pushing version tag to Scaleway registry..."
if ! podman push "$SCALEWAY_VERSION_IMAGE"; then
echo " ❌ Failed to push $SCALEWAY_VERSION_IMAGE"
FAILED_SERVICES+=("$SERVICE")
continue
fi
# Tag and push environment channel tag
echo " 🏷️ Tagging environment channel ($ENV_TAG)..."
if ! podman tag "$LOCAL_VERSION_IMAGE" "$SCALEWAY_ENV_IMAGE"; then
echo " ❌ Failed to tag $LOCAL_VERSION_IMAGE as $SCALEWAY_ENV_IMAGE"
FAILED_SERVICES+=("$SERVICE")
continue
fi
echo " 📤 Pushing environment tag to Scaleway registry..."
if ! podman push "$SCALEWAY_ENV_IMAGE"; then
echo " ❌ Failed to push $SCALEWAY_ENV_IMAGE"
FAILED_SERVICES+=("$SERVICE")
continue
fi
# Clean up local Scaleway tag
echo " 🧹 Cleaning up local Scaleway tag..."
# Clean up local Scaleway tags
echo " 🧹 Cleaning up local Scaleway tags..."
podman rmi "$SCALEWAY_VERSION_IMAGE" 2>/dev/null || true
podman rmi "$SCALEWAY_ENV_IMAGE" 2>/dev/null || true
echo " ✅ Successfully pushed $SERVICE version $VERSION to Scaleway"
echo " ✅ Successfully pushed $SERVICE as $VERSION and :$ENV_TAG to Scaleway"
PROCESSED_SERVICES+=("$SERVICE")
done

@@ -10,7 +10,7 @@ for v in "${REQUIRED_VARS[@]}"; do : "${!v:?$v required}"; done
export PROJECT_DIR="${PROJECT_DIR:-/app}"
export FLASK_APP="${FLASK_APP:-${PROJECT_DIR}/scripts/run.py}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_app}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_ops}"
export PYTHONPATH="${PYTHONPATH:-${PROJECT_DIR}:${PYTHONPATH-}}"
export PGPASSWORD="$DB_PASS"

@@ -10,7 +10,7 @@ for v in "${REQUIRED_VARS[@]}"; do : "${!v:?$v required}"; done
export PROJECT_DIR="${PROJECT_DIR:-/app}"
export FLASK_APP="${FLASK_APP:-${PROJECT_DIR}/scripts/run.py}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_app}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_ops}"
export PYTHONPATH="${PYTHONPATH:-${PROJECT_DIR}:${PYTHONPATH-}}"
export PGPASSWORD="$DB_PASS"

@@ -10,7 +10,7 @@ SCRIPT_PATH="${PROJECT_DIR}/scripts/initialize_data.py"
[[ -f "$SCRIPT_PATH" ]] || fail "Seed/init script not found: $SCRIPT_PATH"
export FLASK_APP="${FLASK_APP:-${PROJECT_DIR}/scripts/run.py}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_app}"
export COMPONENT_NAME="${COMPONENT_NAME:-eveai_ops}"
export PYTHONPATH="${PYTHONPATH:-${PROJECT_DIR}:${PYTHONPATH-}}"
log "Running initialize_data.py (idempotent one-off per environment)..."

@@ -4,7 +4,7 @@ from datetime import datetime as dt, timezone as tz
from flask_security import hash_password
from uuid import uuid4
from eveai_app import create_app
from eveai_ops import create_app
from common.models.user import User, Tenant, Role, RolesUsers
from common.extensions import db, minio_client
from common.utils.database import Database