eveAI/k8s/K8S_SERVICE_MANAGEMENT_README.md
2025-08-18 11:44:23 +02:00

# Kubernetes Service Management System
## Overview
This is a set of shell functions for managing EveAI services on Kubernetes, modeled on your `podman_env_switch.sh` workflow. After sourcing a single script, you can start, stop, and inspect services per environment with short commands such as `kup`, `kdown`, and `kps`.
## 🚀 Quick Start
```bash
# Switch to dev environment
source k8s/k8s_env_switch.sh dev
# Start all services
kup
# Check status
kps
# Start individual services
kup-api
kup-workers
# Stop services (keeping data)
kdown apps
# View logs
klogs eveai-app
```
## 📁 File Structure
```
k8s/
├── k8s_env_switch.sh # Main script (like podman_env_switch.sh)
├── scripts/
│ ├── k8s-functions.sh # Core service management functions
│ ├── service-groups.sh # Service group definitions
│ ├── dependency-checks.sh # Dependency validation
│ └── logging-utils.sh # Logging utilities
├── dev/ # Dev environment configs
│ ├── setup-dev-cluster.sh # Existing cluster setup
│ ├── deploy-all-services.sh # Existing deployment script
│ └── *.yaml # Service configurations
└── test-k8s-functions.sh # Test script
```
## 🔧 Environment Setup
### Supported Environments
- `dev` - Development (current focus)
- `test` - Testing (future)
- `bugfix` - Bug fixes (future)
- `integration` - Integration testing (future)
- `prod` - Production (future)
### Environment Variables Set
- `K8S_ENVIRONMENT` - Current environment
- `K8S_VERSION` - Service version
- `K8S_CLUSTER` - Cluster name
- `K8S_NAMESPACE` - Kubernetes namespace
- `K8S_CONFIG_DIR` - Configuration directory
- `K8S_LOG_DIR` - Log directory
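
As a rough illustration of how these variables might be set when switching environments, here is a minimal sketch. The function name `k8s_set_environment`, the `K8S_BASE_DIR` variable, the default version, and the cluster and namespace naming scheme are all assumptions made for this example; the actual `k8s_env_switch.sh` may do this differently.

```bash
# Hypothetical sketch: validate the environment name and export the K8S_* variables.
k8s_set_environment() {
    local env_name="${1:-}"

    # Only the environments listed in this README are accepted.
    case "$env_name" in
        dev|test|bugfix|integration|prod) ;;
        *)
            echo "Unknown environment: '$env_name'" >&2
            echo "Supported: dev test bugfix integration prod" >&2
            return 1
            ;;
    esac

    export K8S_ENVIRONMENT="$env_name"
    export K8S_VERSION="${K8S_VERSION:-latest}"       # assumed default
    export K8S_CLUSTER="eveai-${env_name}-cluster"    # assumed naming scheme
    export K8S_NAMESPACE="eveai-${env_name}"          # assumed naming scheme
    # K8S_BASE_DIR is a hypothetical override for the location of the k8s/ directory.
    export K8S_CONFIG_DIR="${K8S_BASE_DIR:-$PWD/k8s}/${env_name}"
    export K8S_LOG_DIR="$HOME/k8s-logs/${env_name}"
}
```

Because the variables are exported into the current shell, a script like this has to be loaded with `source` rather than executed.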
## 📋 Service Groups
### Infrastructure
- `redis` - Redis cache
- `minio` - MinIO object storage
### Apps (Individual Management)
- `eveai-app` - Main application
- `eveai-api` - API service
- `eveai-chat-client` - Chat client
- `eveai-workers` - Celery workers (2 replicas)
- `eveai-chat-workers` - Chat workers (2 replicas)
- `eveai-beat` - Celery scheduler
- `eveai-entitlements` - Entitlements service
### Static
- `static-files` - Static file server
- `eveai-ingress` - Ingress controller
### Monitoring
- `prometheus` - Metrics collection
- `grafana` - Dashboards
- `flower` - Celery monitoring
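
The group membership above could be expressed in shell roughly as follows. This is a sketch only: the function name `get_services_in_group` is hypothetical, and the real `service-groups.sh` may store the groups in a different form.

```bash
# Hypothetical helper: print the services in a group, one per line,
# in the order they are listed in this README.
get_services_in_group() {
    local g
    case "${1:-}" in
        infrastructure)
            printf '%s\n' redis minio ;;
        apps)
            printf '%s\n' eveai-app eveai-api eveai-chat-client \
                eveai-workers eveai-chat-workers eveai-beat eveai-entitlements ;;
        static)
            printf '%s\n' static-files eveai-ingress ;;
        monitoring)
            printf '%s\n' prometheus grafana flower ;;
        all)
            # "all" is the concatenation of the four groups.
            for g in infrastructure apps static monitoring; do
                get_services_in_group "$g"
            done
            ;;
        *)
            echo "Unknown service group: '${1:-}'" >&2
            return 1
            ;;
    esac
}
```

A `case` statement keeps the sketch portable to shells without associative arrays.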
## 🎯 Core Commands
### Service Group Management
```bash
kup [group] # Start service group
kdown [group] # Stop service group, keep data
kstop [group] # Stop service group without removal
kstart [group] # Start stopped service group
krefresh [group] # Restart service group
```
**Groups:** `infrastructure`, `apps`, `static`, `monitoring`, `all`
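
This README does not describe how stopping and starting are implemented. One plausible approach, shown here purely as an assumption, is to scale each Deployment to zero replicas and back up again, which leaves the Deployment and its PVCs in place. The names `run_kubectl`, `stop_service`, `start_service`, and the `K8S_DRY_RUN` switch are hypothetical.

```bash
# Hypothetical wrapper: print the kubectl command when K8S_DRY_RUN=1, otherwise run it.
run_kubectl() {
    if [ "${K8S_DRY_RUN:-0}" = "1" ]; then
        echo "kubectl $*"
    else
        kubectl "$@"
    fi
}

# Assumed implementation: stop a service by scaling its Deployment to zero.
# The Deployment object and any PVCs stay in the cluster.
stop_service() {
    run_kubectl scale deployment "$1" --replicas=0 -n "${K8S_NAMESPACE:-default}"
}

# Assumed implementation: start a stopped service by scaling it back up.
# The replica count is passed explicitly (for example 2 for eveai-workers).
start_service() {
    run_kubectl scale deployment "$1" --replicas="${2:-1}" -n "${K8S_NAMESPACE:-default}"
}
```

With `K8S_DRY_RUN=1` set, `stop_service eveai-api` only prints the command it would run, which is convenient for checking the sketch without a cluster.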
### Individual App Service Management
```bash
# Start individual services
kup-app # Start eveai-app
kup-api # Start eveai-api
kup-chat-client # Start eveai-chat-client
kup-workers # Start eveai-workers
kup-chat-workers # Start eveai-chat-workers
kup-beat # Start eveai-beat
kup-entitlements # Start eveai-entitlements
# Stop individual services
kdown-app # Stop eveai-app (keep data)
kstop-api # Stop eveai-api (without removal)
kstart-workers # Start stopped eveai-workers
```
### Status & Monitoring
```bash
kps # Show service status overview
klogs [service] # View service logs
klogs eveai-app # View specific service logs
```
### Cluster Management
```bash
cluster-start # Start cluster
cluster-stop # Explain that Kind clusters cannot be stopped (see Important Notes)
cluster-delete # Delete cluster (with confirmation)
cluster-status # Show cluster status
```
## 🔍 Dependency Management
The system automatically checks dependencies:
### Infrastructure Dependencies
- All app services require `redis` and `minio` to be running
- Automatic checks before starting app services
### App Dependencies
- `eveai-workers` and `eveai-chat-workers` require `eveai-api`
- `eveai-beat` requires `redis`
- Dependency validation with helpful error messages
### Deployment Order
1. Infrastructure (redis, minio)
2. Core apps (eveai-app, eveai-api, eveai-chat-client, eveai-entitlements)
3. Workers (eveai-workers, eveai-chat-workers, eveai-beat)
4. Static files and ingress
5. Monitoring services
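
The dependency rules above can be sketched as a lookup plus a check. All function names here are hypothetical, and the readiness test (a Deployment named after the service with at least one ready replica) is an assumption; the real `dependency-checks.sh` may use a different check.

```bash
# Hypothetical: print the direct dependencies of a service, per the rules above.
get_service_dependencies() {
    case "${1:-}" in
        redis|minio)
            ;;  # infrastructure has no dependencies
        eveai-workers|eveai-chat-workers)
            printf '%s\n' redis minio eveai-api ;;
        eveai-app|eveai-api|eveai-chat-client|eveai-beat|eveai-entitlements)
            printf '%s\n' redis minio ;;
        *)
            ;;  # static and monitoring dependencies are not specified in this README
    esac
}

# Assumed readiness check: the Deployment reports at least one ready replica.
is_service_running() {
    local ready
    ready="$(kubectl get deployment "$1" -n "${K8S_NAMESPACE:-default}" \
        -o jsonpath='{.status.readyReplicas}' 2>/dev/null)"
    [ "${ready:-0}" -gt 0 ] 2>/dev/null
}

# Report every missing dependency at once so the error message is actionable.
check_service_dependencies() {
    local service="$1" dep missing=""
    for dep in $(get_service_dependencies "$service"); do
        is_service_running "$dep" || missing="$missing $dep"
    done
    if [ -n "$missing" ]; then
        echo "Cannot start $service: missing dependencies:$missing" >&2
        return 1
    fi
}
```

Keeping the lookup separate from the readiness check makes it possible to swap in a different check, or a stub, without touching the dependency table.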
## 📝 Logging System
### Log Files (in `$K8S_LOG_DIR`, which is `$HOME/k8s-logs/dev/` for `dev`)
- `k8s-operations.log` - All operations
- `service-errors.log` - Error messages
- `kubectl-commands.log` - kubectl command history
- `dependency-checks.log` - Dependency validation results
### Log Management
```bash
# View recent logs (after sourcing the script)
show_recent_logs operations # Recent operations
show_recent_logs errors # Recent errors
show_recent_logs kubectl # Recent kubectl commands
# Clear logs
clear_logs all # Clear all logs
clear_logs errors # Clear error logs
```
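
For orientation, writing to these log files could look roughly like the sketch below. The function name `log_to_file` and the `dependencies` log type are assumptions for this example; only the file names and the `operations`, `errors`, and `kubectl` types come from this README.

```bash
# Hypothetical: append a timestamped line to one of the log files listed above.
log_to_file() {
    local kind="$1" file dir
    shift

    case "$kind" in
        operations)   file="k8s-operations.log" ;;
        errors)       file="service-errors.log" ;;
        kubectl)      file="kubectl-commands.log" ;;
        dependencies) file="dependency-checks.log" ;;  # assumed type name
        *)
            echo "Unknown log type: $kind" >&2
            return 1
            ;;
    esac

    # Fall back to the dev log directory when no environment has been selected.
    dir="${K8S_LOG_DIR:-$HOME/k8s-logs/dev}"
    mkdir -p "$dir" || return 1
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$dir/$file"
}
```

For example, `log_to_file errors "redis is not running"` would append one line to `service-errors.log` in the current log directory.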
## 💡 Usage Examples
### Daily Development Workflow
```bash
# Start your day
source k8s/k8s_env_switch.sh dev
# Check what's running
kps
# Start infrastructure if needed
kup infrastructure
# Start specific apps you're working on
kup-api
kup-app
# Check logs while developing
klogs eveai-api
# Restart a service after changes
kstop-api
kstart-api
# or restart the whole apps group
krefresh apps
# End of day - stop services but keep data
kdown all
```
### Debugging Workflow
```bash
# Check service status
kps
# Check dependencies
show_dependency_status
# View recent errors
show_recent_logs errors
# Check specific service details
show_service_status eveai-api
# Restart the group containing the problematic service
krefresh apps
```
### Testing New Features
```bash
# Stop specific service
kdown-workers
# Deploy updated version
kup-workers
# Monitor logs
klogs eveai-workers
# Check if everything is working
kps
```
## 🔧 Integration with Existing Scripts
### Enhanced deploy-all-services.sh
The existing script can be extended with new options:
```bash
./deploy-all-services.sh --group apps
./deploy-all-services.sh --service eveai-api
./deploy-all-services.sh --check-deps
```
### Compatibility
- All existing scripts continue to work unchanged
- New system provides additional management capabilities
- Logging integrates with existing workflow
## 🧪 Testing
Run the test suite to validate functionality:
```bash
./k8s/test-k8s-functions.sh
```
The test validates:
- ✅ Environment switching
- ✅ Function definitions
- ✅ Service group configurations
- ✅ Basic command execution
- ✅ Logging system
- ✅ Dependency checking
## 🚨 Important Notes
### Kind Cluster Limitations
- Kind provides no command to stop a cluster; a cluster can only be deleted and recreated
- `cluster-stop` provides information about this limitation
- Use `cluster-delete` to completely remove a cluster
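
A command that reports this limitation instead of stopping anything might look like the sketch below. The function names `kind_cluster_exists` and `cluster_stop_info` are hypothetical; `kind get clusters` and `kind delete cluster --name` are standard Kind commands.

```bash
# Hypothetical helper: succeed if a Kind cluster with this exact name exists.
kind_cluster_exists() {
    kind get clusters 2>/dev/null | grep -Fqx -- "$1"
}

# Hypothetical body for a cluster-stop style command: explain the limitation
# rather than attempting to stop the cluster.
cluster_stop_info() {
    local name="${1:-${K8S_CLUSTER:-}}"
    if [ -z "$name" ]; then
        echo "No cluster name given and K8S_CLUSTER is not set" >&2
        return 1
    fi
    if kind_cluster_exists "$name"; then
        echo "Kind has no stop command; cluster '$name' keeps running."
        echo "To remove it: kind delete cluster --name $name"
    else
        echo "No Kind cluster named '$name' was found."
    fi
}
```

If `kind` is not installed, the lookup simply finds no cluster, so the function degrades to the "not found" message instead of failing.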
### Data Persistence
- `kdown` and `kstop` preserve all persistent data (PVCs)
- Only `--delete-all` mode removes deployments completely
- Logs are always preserved in `$HOME/k8s-logs/`
### Multi-Environment Support
- Currently focused on `dev` environment
- Framework ready for `test`, `bugfix`, `integration`, `prod`
- Environment-specific configurations will be created as needed
## 🎉 Benefits
### Familiar Workflow
- Commands mirror your `podman_env_switch.sh` pattern
- Short, memorable function names (`kup`, `kdown`, etc.)
- Environment switching with `source` command
### Individual Service Control
- Start/stop any app service independently
- Dependency checking prevents issues
- Granular control over your development environment
### Comprehensive Logging
- All operations logged for debugging
- Environment-specific log directories
- Easy access to recent operations and errors
### Production Ready
- Proper error handling and validation
- Graceful degradation when tools are missing
- Extensible for multiple environments

The system is now ready for use! Start with `source k8s/k8s_env_switch.sh dev` and explore the available commands.