feature/multi-cloud (#14)

Co-authored-by: Danijel Simeunovic <danijel.simeunovic@fortedigital.com>
Reviewed-on: #14
This commit was merged in pull request #14.
This commit is contained in:
2026-04-24 08:48:53 +00:00
parent 65598c9297
commit 8505481291
102 changed files with 1715 additions and 147 deletions

View File

@@ -12,11 +12,11 @@
## Overview
This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where Git repositories serve as the single source of truth for both infrastructure and application deployments. The cluster is running on **UpCloud Managed Kubernetes** but is designed to be cloud-agnostic.
This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where Git repositories serve as the single source of truth for both infrastructure and application deployments. The cluster setup is **cloud-agnostic**, with ready-to-use configurations for **UpCloud**, **AWS EKS**, **Azure AKS**, and **GCP GKE**.
### Key Characteristics
- **Environment**: Production (internal use only)
- **Cluster Type**: Multi-cluster (upc-dev, upc-prod) via Kustomize overlays
- **Cluster Type**: Multi-cloud, multi-cluster via Kustomize overlays (UpCloud, AWS, Azure, GCP)
- **GitOps Tool**: ArgoCD
- **Deployment Pattern**: App-of-Apps
- **Secret Management**: Sealed Secrets (kubeseal)
@@ -63,7 +63,7 @@ This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where
┌────────────────────────────────┐
│ Kubernetes Clusters │
│ (UpCloud: upc-dev, upc-prod)
│ (UpCloud, AWS, Azure, GCP)
│ │
│ ┌──────────────────────────┐ │
│ │ ArgoCD │ │
@@ -131,26 +131,22 @@ launchpad/
│ │ ├── renovate.yaml
│ │ ├── ... # All other Application manifests
│ │ └── secrets.yaml
│ ├── overlays/ # Per-cluster overrides
│ ├── overlays/ # Per-cluster Kustomize overrides
│ │ ├── upc-dev/ # UpCloud Dev (uses base as-is)
│ │ ── upc-prod/ # UpCloud Prod (patches value paths)
│ │ ── upc-prod/ # UpCloud Prod (patches value paths)
│ │ ├── eks-dev/ # AWS EKS Dev
│ │ ├── eks-prod/ # AWS EKS Prod
│ │ ├── aks-dev/ # Azure AKS Dev
│ │ ├── aks-prod/ # Azure AKS Prod
│ │ ├── gke-dev/ # GCP GKE Dev
│ │ └── gke-prod/ # GCP GKE Prod
│ ├── dashboards/ # Grafana dashboard ConfigMaps
│ └── values/ # Helm value overrides for infra
│ ├── base/ # Shared values (all clusters)
│ ├── traefik-values.yaml
│ ├── keycloak-values.yaml
│ ├── grafana-values.yaml
│ ├── prometheus-values.yaml
│ │ ├── gitea-values.yaml
│ │ └── ...
│ ├── upc-dev/ # upc-dev cluster-specific values
│ │ ├── traefik-values.yaml
│ │ ├── keycloak-values.yaml
│ │ └── grafana-values.yaml
│ └── upc-prod/ # upc-prod cluster-specific values
│ ├── traefik-values.yaml
│ ├── keycloak-values.yaml
│ └── grafana-values.yaml
│ ├── base/ # Cloud-agnostic shared values
├── upc-{dev,prod}/ # UpCloud: storage class, LB, pricing
├── aws-{dev,prod}/ # AWS: gp3, NLB, CUR pricing
├── aks-{dev,prod}/ # Azure: managed-csi-premium, Standard LB
└── gcp-{dev,prod}/ # GCP: premium-rwo, L4 LB
├── apps/ # Business Application ArgoCD manifests (Kustomize)
│ ├── base/ # Base app manifests
@@ -287,7 +283,7 @@ app-repository/
### The App-of-Apps Pattern
```
_app-of-apps-{upc-dev,upc-prod}.yaml (Root, per cluster)
_app-of-apps-{cluster}.yaml (Root, per cluster — e.g. upc-dev, eks-prod, gke-dev)
├── infrastructure-apps (manages infra/)
│ ├── cluster-resources-application
@@ -377,6 +373,15 @@ patches:
value: $values/infra/values/upc-prod/traefik-values.yaml
```
Cloud-specific values (storage classes, load balancer annotations, cost model) are isolated in per-cluster value files. Base values are fully cloud-agnostic:
| Cloud | Storage Class | Load Balancer | OpenCost Provider |
|-------|--------------|---------------|-------------------|
| **UpCloud** | `upcloud-block-storage-maxiops` | UpCloud LB (ProxyProtocol v2) | Custom pricing |
| **AWS EKS** | `gp3` (EBS CSI) | NLB (ProxyProtocol v2) | AWS CUR |
| **Azure AKS** | `managed-csi-premium` | Standard LB (`externalTrafficPolicy: Local`) | Azure Billing API |
| **GCP GKE** | `premium-rwo` (PD CSI) | L4 passthrough NLB | GCP Cloud Billing |
**Benefits**:
- Single source of truth for Application definitions
- Cluster-specific values isolated per overlay
@@ -658,6 +663,6 @@ Notifications include:
---
**Last Updated**: 2026-03-16
**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
**Questions?**: Contact #platform-support on Slack

View File

@@ -37,7 +37,7 @@ Bootstrap a new cluster from scratch:
#### Prerequisites
1. **Kubernetes cluster running** (UpCloud or any K8s cluster)
1. **Kubernetes cluster running** (UpCloud, AWS EKS, Azure AKS, GCP GKE, or any K8s cluster)
2. **kubectl configured** with admin access
3. **Repositories cloned** locally
@@ -54,11 +54,13 @@ kubectl get nodes
git clone https://git.forteapps.net/Forte/launchpad
cd launchpad
# 2. Set cluster name (optional)
export CLUSTER_NAME="prod-cluster-01"
# 2. Run bootstrap script with cluster target
# Available clusters: upc-dev, upc-prod, eks-dev, eks-prod,
# aks-dev, aks-prod, gke-dev, gke-prod
./bootstrap.sh upc-dev
# 3. Run bootstrap script
./bootstrap.sh
# Cluster config is loaded from clusters/<cluster>.yaml
# (cloudProvider, trustedIPs, domain, etc.)
```
**What Happens:**
@@ -1262,13 +1264,21 @@ spec:
### Backup Strategy
**Current State**: No automated backups
**Current State**: Gitea daily backups to S3-compatible storage
**What Needs Backup**:
- ❌ Cluster state (not backed up - recreate via GitOps)
- ❌ Persistent volumes (currently not critical)
- ✅ Git repositories (Gitea provides backup)
- ⚠️ Secrets (sealed secrets in Git, unseal keys need safekeeping)
**What Is Backed Up**:
- ✅ Gitea repositories + database: Daily CronJob (`cluster-resources/gitea-backup-cronjob.yaml`) uploads to S3-compatible storage with 7-day retention
- ✅ Git repositories: Full cluster config recoverable from Git
- ⚠️ Secrets: Sealed secrets in Git; unseal keys need safekeeping
**What Is NOT Backed Up**:
- ❌ Cluster state (recreate via GitOps)
- ❌ Other persistent volumes (Prometheus, Loki, Tempo data)
**Per-cloud backup scripts** (manual restore helpers):
- UpCloud/AWS: `scripts/gitea-backup.sh` / `scripts/gitea-backup-eks.sh` (MinIO CLI, S3-compatible)
- Azure: `scripts/gitea-backup-aks.sh` (Azure CLI + Blob Storage)
- GCP: `scripts/gitea-backup-gke.sh` (gsutil + GCS)
### Cluster Rebuild
@@ -1370,6 +1380,9 @@ kubectl get pods -n argocd
```bash
# UpCloud: Upgrade via control panel or CLI
# AWS EKS: eksctl upgrade cluster / AWS Console
# Azure AKS: az aks upgrade / Azure Portal
# GCP GKE: gcloud container clusters upgrade / Cloud Console
# After upgrade, verify cluster
kubectl version
@@ -1507,18 +1520,35 @@ git push
### Multi-Cluster Setup
The repository supports multiple clusters via Kustomize overlays:
The repository supports multiple clusters across multiple clouds via Kustomize overlays:
**Active clusters:**
- **upc-dev** (default): `infra/overlays/upc-dev/` — uses base Applications as-is
- **upc-prod**: `infra/overlays/upc-prod/` — patches value file paths from `upc-dev` to `upc-prod`
Each cluster has its own:
- Root app-of-apps file: `_app-of-apps-upc-dev.yaml` / `_app-of-apps-upc-prod.yaml`
- Cluster-specific Helm values: `infra/values/upc-dev/` / `infra/values/upc-prod/`
- Sealed secrets: `secrets/upc-dev/` (others as needed)
- Apps overlay: `apps/overlays/upc-dev/` / `apps/overlays/upc-prod/`
**Cloud-ready templates (fill in `clusters/*.yaml` before use):**
- **eks-dev** / **eks-prod**: AWS EKS with NLB, gp3 storage, AWS CUR pricing
- **aks-dev** / **aks-prod**: Azure AKS with Standard LB, managed-csi-premium storage
- **gke-dev** / **gke-prod**: GCP GKE with L4 LB, premium-rwo storage
To add a new cluster, create a new overlay directory (e.g., `infra/overlays/upc-staging/`) with patches that swap the value file paths.
Each cluster has its own:
- Root app-of-apps: `_app-of-apps-{cluster}.yaml`
- Cluster config: `clusters/{cluster}.yaml` (domain, trustedIPs, cloudProvider)
- Kustomize overlay: `infra/overlays/{cluster}/kustomization.yaml`
- Helm value overrides: `infra/values/{cluster}/` (traefik, gitea, opencost)
- Sealed secrets: `secrets/{cluster}/` (as needed)
- Apps overlay: `apps/overlays/{cluster}/`
Cloud-specific values handled per-cluster:
| Concern | UpCloud | AWS EKS | Azure AKS | GCP GKE |
|---------|---------|---------|-----------|---------|
| **Storage class** | `upcloud-block-storage-maxiops` | `gp3` | `managed-csi-premium` | `premium-rwo` |
| **Load balancer** | UpCloud LB + ProxyProtocol v2 | NLB + ProxyProtocol v2 | Standard LB + `externalTrafficPolicy: Local` | L4 passthrough NLB |
| **Cost monitoring** | Custom pricing | AWS CUR | Azure Billing API | GCP Cloud Billing |
| **Backup storage** | UpCloud S3-compat | AWS S3 (native) | Azure Blob Storage | GCS |
To add a new cluster, create a new overlay directory (e.g., `infra/overlays/eks-staging/`) with patches that swap the value file paths, and a matching `clusters/eks-staging.yaml`.
### Blue-Green Deployments
@@ -1661,6 +1691,6 @@ echo "Remember to delete: $SECRET_FILE"
---
**Last Updated**: 2026-03-16
**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
**Emergency Contact**: #platform-support on Slack

View File

@@ -180,7 +180,7 @@ Reference for:
┌──────────────────────────────────────────────────────────────┐
│ Kubernetes Clusters (UpCloud: upc-dev, upc-prod)
│ Kubernetes Clusters (UpCloud, AWS, Azure, GCP)
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Infrastructure: Traefik, Cert-Manager, Kyverno │ │
│ ├──────────────────────────────────────────────────────┤ │
@@ -194,7 +194,7 @@ Reference for:
### Key Technologies
- **GitOps**: ArgoCD
- **Kubernetes**: UpCloud Managed Kubernetes (multi-cluster: upc-dev, upc-prod)
- **Kubernetes**: Multi-cloud (UpCloud, AWS EKS, Azure AKS, GCP GKE)
- **Ingress**: Traefik v2
- **Certificates**: Cert-Manager + Let's Encrypt
- **Policies**: Kyverno
@@ -299,11 +299,16 @@ docs/
## 🔄 Documentation Versions
**Current Version**: 1.0.0
**Last Updated**: 2026-03-16
**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
### Changelog
- **v1.1.0 (2026-04-22)**: Multi-cloud support
- Cloud-agnostic base values (storage, LB, pricing moved to per-cluster overlays)
- Added AWS EKS, Azure AKS, GCP GKE configurations
- Per-cloud backup scripts
- Updated all documentation
- **v1.0.0 (2026-03-16)**: Initial comprehensive documentation release
- GitOps Architecture guide
- Developer Onboarding guide

View File

@@ -9,6 +9,7 @@
- [Kyverno Policies](#kyverno-policies)
- [Configuration Reference](#configuration-reference)
- [API Endpoints](#api-endpoints)
- [Cloud Overlay Pattern](#cloud-overlay-pattern)
- [Glossary](#glossary)
---
@@ -19,9 +20,10 @@
| Component | Value |
|-----------|-------|
| **Provider** | UpCloud Managed Kubernetes |
| **Environment** | Production (internal use) |
| **Cluster Count** | Multi-cluster (upc-dev, upc-prod) |
| **Provider** | Multi-cloud (UpCloud, AWS EKS, Azure AKS, GCP GKE) |
| **Environment** | Dev + Production per cloud |
| **Active clusters** | UpCloud (upc-dev, upc-prod) |
| **Cloud-ready templates** | EKS, AKS, GKE (dev + prod each) |
| **GitOps Tool** | ArgoCD |
| **Ingress Controller** | Traefik v2 |
| **Certificate Management** | Cert-Manager + Let's Encrypt |
@@ -42,7 +44,7 @@ Internet
[DNS: *.forteapps.net]
[UpCloud LoadBalancer]
[Cloud Load Balancer]
[Traefik Ingress Controller]
@@ -92,16 +94,34 @@ launchpad/
│ ├── sealedsecrets.yaml
│ ├── secrets.yaml
│ ├── renovate.yaml
│ ├── base/ # ArgoCD Application manifests (Kustomize base)
│ │ ├── gitea.yaml
│ │ ├── opencost.yaml
│ │ ├── traefik-application.yaml
│ │ ├── keycloak.yaml
│ │ ├── grafana.yaml
│ │ └── ...
│ ├── overlays/
│ │ └── upc-prod/
│ │ └── kustomization.yaml # Patches upc-dev → upc-prod valueFile paths
│ └── values/
│ ├── argocd-values.yaml
├── prometheus-values.yaml
├── grafana-values.yaml
├── loki-values.yaml
├── tempo-values.yaml
│ ├── gitea-values.yaml
├── gitea-actions-values.yaml
├── fluent-bit-values.yaml
└── renovate-values.yaml
│ ├── base/ # Cloud-agnostic Helm values
│ ├── gitea-values.yaml
│ ├── opencost-values.yaml
│ ├── prometheus-values.yaml
│ └── ...
│ ├── upc-dev/ # UpCloud dev overlay values
│ ├── traefik-values.yaml
│ ├── keycloak-values.yaml
│ ├── grafana-values.yaml
│ │ ├── gitea-values.yaml
│ │ └── opencost-values.yaml
│ └── upc-prod/ # UpCloud prod overlay values
│ ├── traefik-values.yaml
│ ├── keycloak-values.yaml
│ ├── grafana-values.yaml
│ ├── gitea-values.yaml
│ └── opencost-values.yaml
├── apps/ # Business applications
│ ├── mcp10x.yaml
@@ -128,12 +148,39 @@ launchpad/
│ └── auth-sidecar-injector.yaml
├── secrets/ # Application secrets (sealed)
│ ├── argocd-mcp-credentials.yaml
│ ├── dot-ai-secrets.yaml
│ ├── gitea-credentials-sealed.yaml
│ ├── gitea-runner-token-sealed.yaml
│ ├── mcp10x-credentials-sealed.yaml
└── musicman-credentials.yaml
│ ├── base/ # All SealedSecrets (shared across clouds)
│ ├── kustomization.yaml
│ ├── argocd-forte-helm-secret-sealed.yaml
│ ├── argocd-mcp-credentials.yaml
│ ├── argocdmcp-auth-oidc-sealed.yaml
│ ├── dot-ai-secrets.yaml
│ │ ├── forte10x-app-credentials-sealed.yaml
│ │ ├── gitea-backup-s3-sealed.yaml
│ │ ├── gitea-credentials-sealed.yaml
│ │ ├── gitea-runner-token-sealed.yaml
│ │ ├── gitea-smtp-secret-sealed.yaml
│ │ ├── keycloak-credentials-sealed.yaml
│ │ ├── musicman-auth-oidc-sealed.yaml
│ │ ├── musicman-credentials.yaml
│ │ └── renovate-env-sealed.yaml
│ └── overlays/ # Per-cloud overlays (reference base)
│ ├── aks-dev/kustomization.yaml
│ ├── aks-prod/kustomization.yaml
│ ├── eks-dev/kustomization.yaml
│ ├── eks-prod/kustomization.yaml
│ ├── gke-dev/kustomization.yaml
│ ├── gke-prod/kustomization.yaml
│ ├── upc-dev/kustomization.yaml
│ └── upc-prod/kustomization.yaml
├── scripts/ # Operational helper scripts
│ ├── gitea-backup.sh # S3 backup helper (list/download)
│ ├── gitea-restore.sh
│ └── backup/ # Per-cloud backup reference scripts
│ ├── s3-minio.sh # S3-compatible (UpCloud, MinIO, Wasabi)
│ ├── aws-s3.sh # Native AWS S3
│ ├── azure-blob.sh # Azure Blob Storage
│ └── gcp-gcs.sh # GCP Cloud Storage
├── private/ # Local-only (Git-ignored)
│ ├── *.yaml
@@ -686,6 +733,10 @@ spec:
**Chart**: `sealed-secrets/sealed-secrets-controller`
**Namespace**: `kube-system`
**Directory Structure**: `secrets/base/` contains all SealedSecrets with a `kustomization.yaml`. Per-cloud overlays in `secrets/overlays/<cloud>/` reference the base via Kustomize. The ArgoCD `secrets` Application points to the active overlay (e.g., `secrets/overlays/upc-dev`), and `infra/overlays/upc-prod` patches the path to `secrets/overlays/upc-prod`.
To add cloud-specific secrets, create a new SealedSecret in the overlay directory and add it to the overlay's `kustomization.yaml`.
**Public Certificate**:
```bash
kubeseal --fetch-cert \
@@ -1602,14 +1653,22 @@ Recommended resource allocation:
### Storage Classes
Default storage class used: **UpCloud default** (varies by provider)
Storage classes are cloud-specific and configured in per-cluster value overrides (`infra/values/{cluster}/gitea-values.yaml`):
| Cloud | Storage Class | Driver |
|-------|--------------|--------|
| **UpCloud** | `upcloud-block-storage-maxiops` | UpCloud CSI |
| **AWS EKS** | `gp3` | EBS CSI |
| **Azure AKS** | `managed-csi-premium` | Azure Disk CSI |
| **GCP GKE** | `premium-rwo` | PD CSI |
```yaml
# Example: base values omit storageClass (set in per-cluster overlay)
persistence:
enabled: true
storageClass: "" # Uses default
accessMode: ReadWriteOnce
size: 5Gi
# storageClass set by infra/values/{cluster}/gitea-values.yaml
```
---
@@ -1673,6 +1732,88 @@ POST /loki/api/v1/push
---
## Cloud Overlay Pattern
### Overview
Cloud-specific configuration (StorageClass, LoadBalancer annotations, pricing models, etc.) lives in per-cloud overlay value files, **not** in `base/`. Adding a new cloud provider only requires a new overlay directory — no base changes.
### Supported Clouds
| Cloud | Dev overlay | Prod overlay | StorageClass | LB type |
|-------|-----------|-------------|-------------|---------|
| **UpCloud** | `upc-dev` | `upc-prod` | `upcloud-block-storage-maxiops` | UpCloud LB (proxy protocol v2) |
| **Azure AKS** | `aks-dev` | `aks-prod` | `managed-csi-premium` | Azure LB |
| **AWS EKS** | `eks-dev` | `eks-prod` | `gp3` | AWS NLB (proxy protocol) |
| **GCP GKE** | `gke-dev` | `gke-prod` | `premium-rwo` | GCP NEG |
Bootstrap any cluster with: `./bootstrap.sh <cluster>` (e.g., `./bootstrap.sh aks-dev`)
### How It Works
Each ArgoCD Application uses **multi-source Helm values** with two value files:
```yaml
# infra/base/gitea.yaml (example)
helm:
valueFiles:
- $values/infra/values/base/gitea-values.yaml # [0] cloud-agnostic
- $values/infra/values/upc-dev/gitea-values.yaml # [1] cloud-specific (default: upc-dev)
```
The `upc-prod` Kustomize overlay patches index `[1]` to swap the cloud-specific file:
```yaml
# infra/overlays/upc-prod/kustomization.yaml
- target:
kind: Application
name: gitea
patch: |
- op: replace
path: /spec/sources/0/helm/valueFiles/1
value: $values/infra/values/upc-prod/gitea-values.yaml
```
### Components Using Cloud Overlays
| Component | Cloud-specific config | Overlay value file |
|-----------|----------------------|-------------------|
| **Traefik** | LB annotations, proxy protocol IPs | `traefik-values.yaml` |
| **Keycloak** | Hostname, TLS settings | `keycloak-values.yaml` |
| **Grafana** | Hostname, datasource URLs | `grafana-values.yaml` |
| **Gitea** | StorageClass (persistence + PostgreSQL) | `gitea-values.yaml` |
| **OpenCost** | Custom pricing model (CPU/RAM/storage rates) | `opencost-values.yaml` |
### Backup CronJob
The `gitea-backup` CronJob uses a generic `s3` alias for `minio/mc`. The actual endpoint and credentials come from the `gitea-backup-s3` Sealed Secret, which is per-cloud. Reference scripts for different cloud providers are in `scripts/backup/`:
| Script | Provider | Tool |
|--------|----------|------|
| `s3-minio.sh` | S3-compatible (UpCloud, MinIO, Wasabi) | `minio/mc` |
| `aws-s3.sh` | AWS S3 | `aws` CLI |
| `azure-blob.sh` | Azure Blob Storage | `az` CLI |
| `gcp-gcs.sh` | GCP Cloud Storage | `gsutil` |
### Adding a New Cloud Provider
To add support for a new cloud (e.g., `oci-dev` for Oracle Cloud):
1. **Cluster config**: `clusters/oci-dev.yaml` — clusterName, domain, trustedIPs, cloudProvider
2. **Overlay value files** in `infra/values/oci-dev/`:
- `traefik-values.yaml` — LB annotations, proxy protocol config
- `keycloak-values.yaml` — hostname
- `grafana-values.yaml` — hostname
- `gitea-values.yaml``storageClass` for persistence + PostgreSQL
- `opencost-values.yaml` — pricing model or cloud billing integration
3. **Kustomize overlay**: `infra/overlays/oci-dev/kustomization.yaml` — patch `valueFiles[1]` for each Application
4. **App-of-apps**: `_app-of-apps-oci-dev.yaml` — points to `infra/overlays/oci-dev`
5. **Secrets overlay**: `secrets/overlays/oci-dev/kustomization.yaml` — references `../../base`, add cloud-specific SealedSecrets if needed
6. **Secrets patch**: Add patch to `infra/overlays/oci-dev/kustomization.yaml` to swap secrets path to `secrets/overlays/oci-dev`
7. **Bootstrap**: `./bootstrap.sh oci-dev`
---
## Glossary
### Terms
@@ -1805,6 +1946,6 @@ team: platform
---
**Last Updated**: 2026-04-16
**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
**Version**: 1.0.0