GitOps Architecture & Repository Guide

Overview

This Kubernetes cluster uses a GitOps approach powered by ArgoCD, where Git repositories serve as the single source of truth for both infrastructure and application deployments. The cluster setup is cloud-agnostic, with ready-to-use configurations for UpCloud, AWS EKS, Azure AKS, and GCP GKE.

Key Characteristics

  • Environment: Production (internal use only)
  • Cluster Type: Multi-cloud, multi-cluster via Kustomize overlays (UpCloud, AWS, Azure, GCP)
  • GitOps Tool: ArgoCD
  • Deployment Pattern: App-of-Apps
  • Secret Management: Sealed Secrets (kubeseal)
  • Ingress: Traefik with Let's Encrypt TLS
  • Monitoring: Prometheus + Grafana + Loki + Tempo + Fluent-Bit
  • Policy Engine: Kyverno
  • Notifications: Slack integration for sync status

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                          Developer Workflow                              │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  Application Code   │      │   Helm Charts    │      │   Helm Values   │
│  Repositories       │──────│   Repository     │──────│   Repository    │
│  (Source Code)      │      │   (Templates)    │      │  (Config/Env)   │
└─────────────────────┘      └──────────────────┘      └─────────────────┘
         │                            │                          │
         │                            │                          │
    GitHub Actions                    │                          │
    Build & Push Image                │                          │
         │                            │                          │
         │                            │                          │
         └────────► Update image tag ─┴──────────────────────────┤
                    in helm-prod-values                          │
                                                                 │
                                                                 ▼
                                           ┌────────────────────────────────┐
                                           │   Config Repository            │
                                           │   (ArgoCD Applications)        │
                                           │   git.forteapps.net/Forte/     │
                                           │   launchpad                    │
                                           └────────────────────────────────┘
                                                        │
                                                        │
                                         ArgoCD monitors & syncs
                                                        │
                                                        ▼
                                           ┌────────────────────────────────┐
                                           │   Kubernetes Clusters          │
                                           │   (UpCloud, AWS, Azure, GCP)   │
                                           │                                │
                                           │  ┌──────────────────────────┐  │
                                           │  │    ArgoCD                │  │
                                           │  │    (GitOps Controller)   │  │
                                           │  └──────────────────────────┘  │
                                           │                                │
                                           │  ┌──────────────────────────┐  │
                                           │  │  Infrastructure Layer    │  │
                                           │  │  - Traefik (Ingress)     │  │
                                           │  │  - Cert-Manager (TLS)    │  │
                                           │  │  - Kyverno (Policies)    │  │
                                           │  │  - Sealed Secrets        │  │
                                           │  └──────────────────────────┘  │
                                           │                                │
                                           │  ┌──────────────────────────┐  │
                                           │  │  Monitoring Stack        │  │
                                           │  │  - Prometheus            │  │
                                           │  │  - Grafana               │  │
                                           │  │  - Loki                  │  │
                                           │  │  - Tempo                 │  │
                                           │  │  - Fluent-Bit            │  │
                                           │  └──────────────────────────┘  │
                                           │                                │
                                           │  ┌──────────────────────────┐  │
                                           │  │  Application Layer       │  │
                                           │  │  - mcp10x                │  │
                                           │  │  - musicman              │  │
                                           │  │  - dot-ai-stack          │  │
                                           │  │  - argo-mcp              │  │
                                           │  └──────────────────────────┘  │
                                           └────────────────────────────────┘
                                                        │
                                                        │
                                                        ▼
                                              ┌──────────────────┐
                                              │  Slack Channel   │
                                              │  (Notifications) │
                                              └──────────────────┘

Repository Structure

1. Config Repository (Current Repo)

Repository: https://git.forteapps.net/Forte/launchpad
Purpose: GitOps configuration - ArgoCD Applications and cluster resources
Location: C:\dev\k8s\launchpad

launchpad/
├── bootstrap.sh                      # Cluster initialization script
├── _app-of-apps-upc-dev.yaml         # Root ArgoCD Application (upc-dev cluster)
├── _app-of-apps-upc-prod.yaml        # Root ArgoCD Application (upc-prod cluster)
│
├── infra/                            # Infrastructure ArgoCD Applications (Kustomize)
│   ├── base/                         # Base Application manifests (upc-dev defaults)
│   │   ├── kustomization.yaml
│   │   ├── traefik-application.yaml
│   │   ├── keycloak.yaml
│   │   ├── grafana.yaml
│   │   ├── gitea.yaml
│   │   ├── gitea-actions.yaml
│   │   ├── tempo.yaml
│   │   ├── renovate.yaml
│   │   ├── ...                       # All other Application manifests
│   │   └── secrets.yaml
│   ├── overlays/                     # Per-cluster Kustomize overrides
│   │   ├── upc-dev/                  # UpCloud Dev (uses base as-is)
│   │   ├── upc-prod/                 # UpCloud Prod (patches value paths)
│   │   ├── eks-dev/                  # AWS EKS Dev
│   │   ├── eks-prod/                 # AWS EKS Prod
│   │   ├── aks-dev/                  # Azure AKS Dev
│   │   ├── aks-prod/                 # Azure AKS Prod
│   │   ├── gke-dev/                  # GCP GKE Dev
│   │   └── gke-prod/                 # GCP GKE Prod
│   ├── dashboards/                   # Grafana dashboard ConfigMaps
│   └── values/                       # Helm value overrides for infra
│       ├── base/                     # Cloud-agnostic shared values
│       ├── upc-{dev,prod}/           # UpCloud: storage class, LB, pricing
│       ├── aws-{dev,prod}/           # AWS: gp3, NLB, CUR pricing
│       ├── aks-{dev,prod}/           # Azure: managed-csi-premium, Standard LB
│       └── gcp-{dev,prod}/           # GCP: premium-rwo, L4 LB
│
├── apps/                             # Business Application ArgoCD manifests (Kustomize)
│   ├── base/                         # Base app manifests
│   │   ├── kustomization.yaml
│   │   ├── dot-ai-stack.yaml
│   │   └── ...
│   └── overlays/
│       ├── upc-dev/                  # Uses base as-is
│       └── upc-prod/                 # Patches value paths
│
├── cluster-resources/                # Cluster-wide Kubernetes resources
│   ├── ...
│   └── policies/                     # Kyverno policies
│
├── secrets/                          # Application secrets (sealed, per-cluster)
│   └── upc-dev/                      # Secrets for upc-dev cluster
│
├── private/                          # Local-only files (NOT in Git)
│
└── docs/                             # Documentation

Key Points:

  • _app-of-apps-upc-dev.yaml and _app-of-apps-upc-prod.yaml are the per-cluster root Applications
  • Kustomize overlays in infra/overlays/ render base Applications with per-cluster patches
  • Helm values are split: values/base/ (shared) + values/upc-dev/ or values/upc-prod/ (cluster-specific)
  • apps/ follows the same base/overlays pattern for business applications
  • Changes pushed to this repo trigger automatic syncs in ArgoCD
  • private/ folder contains local-only files (Git-ignored)
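
To illustrate the "uses base as-is" case, an overlay can consist of a single kustomization.yaml that pulls in the base unchanged. This is a minimal sketch: the relative path assumes the directory layout shown in the tree above, and the actual file in the repository may contain more.

```yaml
# infra/overlays/upc-dev/kustomization.yaml (illustrative sketch, not the real file)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base   # Render every Application from infra/base/ without patches
```

Running `kustomize build infra/overlays/upc-dev` locally should then print the same Application manifests that ArgoCD would render for that cluster.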

2. Helm Charts Repository

Repository: https://git.forteapps.net/Forte/forte-helm
Purpose: Reusable Helm chart templates for Forte applications
Location: C:\dev\k8s\forte-helm

forte-helm/
└── forteapp/                         # Generic Forte application chart
    ├── Chart.yaml                    # Chart metadata (v0.1.0)
    ├── values.yaml                   # Default values (base template)
    ├── templates/
    │   ├── _helpers.tpl              # Template helpers
    │   ├── namespace.yaml
    │   ├── deployment.yaml           # Main app deployment
    │   ├── service.yaml
    │   ├── ingressroute.yaml         # Traefik IngressRoute
    │   ├── certificate.yaml          # Cert-Manager Certificate
    │   ├── configmap.yaml
    │   ├── secret-auth-tokens.yaml
    │   ├── hpa.yaml                  # Horizontal Pod Autoscaler
    │   ├── database-statefulset.yaml # Optional PostgreSQL DB
    │   └── database-service.yaml
    └── README.md

Key Points:

  • Single generic chart (forteapp) used by all Forte applications
  • Supports optional PostgreSQL database (StatefulSet)
  • Configurable authentication (token-based or OIDC)
  • Traefik IngressRoute with automatic TLS via Cert-Manager
  • Designed for microservices with similar patterns

3. Helm Values Repository

Repository: git@github.com:fortedigital/helm-prod-values.git
Purpose: Environment-specific configuration for each application
Location: C:\dev\k8s\helm-prod-values

helm-prod-values/
├── mcp10x/
│   └── values.yaml                   # MCP 10X configuration
├── musicman/
│   └── values.yaml                   # Music Man configuration
└── argocd-mcp/
    └── values.yaml                   # ArgoCD MCP configuration

Key Points:

  • Each app has its own folder with values.yaml
  • Contains environment-specific settings (image tags, env vars, resources, etc.)
  • Referenced by ArgoCD Applications using multi-source pattern
  • Image tags are updated here by CI/CD pipelines
  • Secrets are referenced by name (actual secrets stored as SealedSecrets)

Example (mcp10x/values.yaml):

app:
  image:
    repository: ghcr.io/fortedigital/10x
    tag: 2.0.4                         # Updated by CI/CD
  extraEnv:
    - name: PORT
      value: "3000"
  envSecretName: "app-credentials"     # References SealedSecret

ingress:
  enabled: true
  host: mcp10x.forteapps.net           # Public domain

4. Application Source Code Repositories

Purpose: Application source code with CI/CD pipelines
Examples: Various private repositories

Typical Structure:

app-repository/
├── src/                              # Application source code
├── Dockerfile                        # Container build definition
├── .github/
│   └── workflows/
│       └── build-and-deploy.yml      # GitHub Actions workflow
└── package.json / requirements.txt   # Dependencies

CI/CD Workflow (GitHub Actions):

  1. Trigger on push to main branch
  2. Build Docker image
  3. Tag with version (e.g., 2.0.4, matching the tag format in values.yaml)
  4. Push to container registry (GHCR, Docker Hub, etc.)
  5. Update image tag in helm-prod-values repository
  6. ArgoCD detects change and syncs automatically

GitOps Workflow

The App-of-Apps Pattern

_app-of-apps-{cluster}.yaml (Root, per cluster — e.g. upc-dev, eks-prod, gke-dev)
    │
    ├── infrastructure-apps (manages infra/)
    │   ├── cluster-resources-application
    │   ├── traefik-application
    │   ├── cert-manager-application
    │   ├── kyverno
    │   ├── prometheus
    │   ├── grafana
    │   ├── tempo
    │   └── ... (other infra apps)
    │
    └── enterprise-apps (manages apps/)
        ├── mcp10x
        ├── musicman
        ├── dot-ai-stack
        └── argo-mcp

How It Works:

  1. Bootstrap script installs ArgoCD and applies _app-of-apps-upc-dev.yaml (or upc-prod)
  2. ArgoCD creates the root Application, whose infrastructure-apps child monitors the cluster's infra/overlays/ folder
  3. Kustomize renders base Applications with cluster-specific patches
  4. enterprise-apps Application monitors the cluster's apps/overlays/ folder
  5. ArgoCD continuously syncs (every 60s) and auto-heals drift
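
The infrastructure-apps Application from the tree above could be declared roughly as follows. This is a sketch under assumptions rather than a copy of the real manifest: the `argocd` namespace, the `default` project, and the in-cluster destination are not stated in this guide.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: infrastructure-apps
  namespace: argocd                # Assumption: ArgoCD runs in the argocd namespace
spec:
  project: default                 # Assumption: no dedicated AppProject
  source:
    repoURL: https://git.forteapps.net/Forte/launchpad
    targetRevision: HEAD
    path: infra/overlays/upc-dev   # Kustomize overlay for this cluster
  destination:
    server: https://kubernetes.default.svc   # The cluster ArgoCD itself runs in
    namespace: argocd              # Child Application objects are created here
  syncPolicy:
    automated:
      prune: true                  # Remove Applications deleted from Git
      selfHeal: true               # Revert manual drift
```

ArgoCD detects the kustomization.yaml in the path and renders it with Kustomize, so no tool-specific block is needed in the source.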

Sync Waves & Ordering

Applications deploy in order using argocd.argoproj.io/sync-wave annotations:

Wave -1: Namespaces (created first)
Wave  0: Kyverno (policies ready before resources)
Wave  1: Cluster resources, infrastructure apps
Wave  2+: Business applications

Example:

metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"
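
For the earliest wave, a Namespace carrying a negative wave number is applied before anything in wave 0 or later. A minimal sketch, with `example-app` as a hypothetical namespace name:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example-app                       # Hypothetical name
  annotations:
    argocd.argoproj.io/sync-wave: "-1"    # Must be a quoted string; lower waves sync first
```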

Multi-Source Pattern

Applications like mcp10x and musicman use multiple sources:

spec:
  sources:
  - repoURL: https://git.forteapps.net/Forte/forte-helm
    path: forteapp                     # Helm chart templates
    helm:
      valueFiles:
      - $values/mcp10x/values.yaml     # Reference to second source

  - repoURL: git@github.com:fortedigital/helm-prod-values.git
    targetRevision: HEAD
    ref: values                        # Named reference

Benefits:

  • Chart templates separated from configuration
  • Single chart reused across all apps
  • Easy to update all apps by changing the chart
  • Environment-specific values isolated in separate repo
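
Put together, a complete multi-source Application might look like the sketch below. The `sources` entries come from the fragment above. The metadata, project, destination, and the first source's `targetRevision` are assumptions added to make the manifest self-contained.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mcp10x
  namespace: argocd                        # Assumption
  annotations:
    argocd.argoproj.io/sync-wave: "2"      # Business apps deploy in wave 2+
spec:
  project: default                         # Assumption
  sources:
    - repoURL: https://git.forteapps.net/Forte/forte-helm
      targetRevision: HEAD                 # Assumption: track the default branch
      path: forteapp                       # Helm chart templates
      helm:
        valueFiles:
          - $values/mcp10x/values.yaml     # Resolved against the source with ref: values
    - repoURL: git@github.com:fortedigital/helm-prod-values.git
      targetRevision: HEAD
      ref: values                          # Exposes this repo's root as $values
  destination:
    server: https://kubernetes.default.svc
    namespace: mcp10x                      # Assumption: one namespace per app
```

The second source has no `path` or `chart`, so it renders nothing itself and only supplies files to the first source.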

Multi-Cluster Pattern

Kustomize overlays enable deploying the same Applications across clusters with different configurations:

# infra/base/ contains default (upc-dev) Applications
# Helm values are layered: base + cluster-specific
valueFiles:
- $values/infra/values/base/traefik-values.yaml    # Shared config
- $values/infra/values/upc-dev/traefik-values.yaml  # Cluster-specific

# infra/overlays/upc-prod/kustomization.yaml patches the second valueFile
patches:
- target:
    kind: Application
    name: traefik
  patch: |
    - op: replace
      path: /spec/sources/0/helm/valueFiles/1
      value: $values/infra/values/upc-prod/traefik-values.yaml

Cloud-specific values (storage classes, load balancer annotations, cost model) are isolated in per-cluster value files, while base values stay fully cloud-agnostic. The per-cloud settings are:

| Cloud     | Storage Class                 | Load Balancer                              | OpenCost Provider |
|-----------|-------------------------------|--------------------------------------------|-------------------|
| UpCloud   | upcloud-block-storage-maxiops | UpCloud LB (ProxyProtocol v2)              | Custom pricing    |
| AWS EKS   | gp3 (EBS CSI)                 | NLB (ProxyProtocol v2)                     | AWS CUR           |
| Azure AKS | managed-csi-premium           | Standard LB (externalTrafficPolicy: Local) | Azure Billing API |
| GCP GKE   | premium-rwo (PD CSI)          | L4 passthrough NLB                         | GCP Cloud Billing |

Benefits:

  • Single source of truth for Application definitions
  • Cluster-specific values isolated per overlay
  • Easy to add new clusters by creating a new overlay
  • Base values shared across all clusters reduce duplication
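
Adding a cluster then amounts to a new overlay directory plus matching value files. The sketch below shows what an overlay for the eks-dev cluster could contain. It assumes the Traefik values for that cluster live under `infra/values/aws-dev/` (as in the repository tree) and patches only one Application. A real overlay would likely need one patch per Application whose values differ.

```yaml
# infra/overlays/eks-dev/kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base                      # Start from the shared Application manifests

patches:
  - target:
      kind: Application
      name: traefik
    patch: |
      # JSON 6902 patch: swap the cluster-specific values file (index 1)
      - op: replace
        path: /spec/sources/0/helm/valueFiles/1
        value: $values/infra/values/aws-dev/traefik-values.yaml
```

Because the patch addresses the values file by list index, it relies on the base always listing the shared file first and the cluster-specific file second.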

CI/CD Pipeline

Continuous Integration

Application Repositories contain GitHub Actions workflows:

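name: Build and Deploy

on:
  push:
    branches: [ main ]

env:
  IMAGE: ghcr.io/fortedigital/app
  VERSION: ${{ github.sha }}          # Placeholder: replace with your semantic version source

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write                 # Lets GITHUB_TOKEN push images to GHCR
    steps:
      - uses: actions/checkout@v4

      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin

      - name: Build Docker image
        run: docker build -t "$IMAGE:$VERSION" .

      - name: Push to registry
        run: docker push "$IMAGE:$VERSION"

      - name: Update Helm values
        env:
          VALUES_DEPLOY_KEY: ${{ secrets.VALUES_DEPLOY_KEY }}   # Hypothetical secret: SSH deploy key with write access
        run: |
          # Make the deploy key available to git over SSH
          mkdir -p ~/.ssh
          echo "$VALUES_DEPLOY_KEY" > ~/.ssh/id_ed25519 && chmod 600 ~/.ssh/id_ed25519
          ssh-keyscan github.com >> ~/.ssh/known_hosts
          git clone git@github.com:fortedigital/helm-prod-values.git
          cd helm-prod-values/app
          # Note: rewrites every "tag:" line in the file
          sed -i "s/tag: .*/tag: $VERSION/" values.yaml
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git commit -am "Update app to $VERSION"
          git push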

Continuous Deployment

ArgoCD automatically syncs when changes are detected:

  1. Config Repo Change:

    • Developer updates an Application manifest (e.g., apps/base/myapp.yaml)
    • Pushes to launchpad repo
    • ArgoCD detects change (60s reconciliation)
    • Syncs application to cluster
  2. Helm Values Change:

    • CI/CD updates helm-prod-values/myapp/values.yaml
    • ArgoCD detects change
    • Re-renders the Helm chart with the updated values
    • Applies to cluster
  3. Sync Policy:

    syncPolicy:
      automated:
        prune: true        # Remove deleted resources
        selfHeal: true     # Revert manual changes
      retry:
        limit: 5           # Retry up to 5 times
        backoff:
          duration: 5s
          maxDuration: 3m
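
The 60s reconciliation interval mentioned in this section is shorter than ArgoCD's stock default, so it is presumably configured explicitly. One way to do that is the `timeout.reconciliation` key in the `argocd-cm` ConfigMap. The sketch below assumes ArgoCD is installed in the `argocd` namespace, and how this cluster actually sets the value is not stated in this guide.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd                      # Assumption
  labels:
    app.kubernetes.io/part-of: argocd    # Required for ArgoCD to pick up the ConfigMap
data:
  timeout.reconciliation: 60s            # How often ArgoCD polls Git for changes
```

Shorter intervals mean faster rollouts at the cost of more load on the Git servers. Webhooks are the usual alternative when polling becomes a bottleneck.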
    

Deployment Validation

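Before and during apply, each manifest passes several checks:

  • ArgoCD renders the manifests and rejects invalid YAML
  • ArgoCD validates resources against the Kubernetes schema
  • ArgoCD runs a server-side dry-run
  • The Kubernetes API server enforces resource quotas at admission
  • Kyverno evaluates its policies as an admission controller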

After applying:

  • Waits for resources to become healthy
  • Sends Slack notification (success/failure)
  • Tracks sync status in UI

Security Model

Secret Management

Sealed Secrets encrypt secrets for safe Git storage:

# Developer creates plain secret locally
kubectl create secret generic app-creds \
  --from-literal=API_KEY=secret123 \
  --dry-run=client -o yaml > private/app-creds.yaml

# Seal the secret using kubeseal
kubeseal --format=yaml \
  --cert=pub-cert.pem \
  < private/app-creds.yaml \
  > secrets/app-creds-sealed.yaml

# Commit sealed secret to Git
git add secrets/app-creds-sealed.yaml
git commit -m "Add app credentials"

Storage:

  • Sealed secrets committed to Git
  • Plain secrets kept in private/ (Git-ignored) or discarded
  • ⚠️ Secret rotation process not yet established
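
For orientation, the file produced by kubeseal has roughly the shape sketched below. The namespace is hypothetical and the ciphertext is a truncated placeholder. With the default (strict) scope, a SealedSecret is bound to its name and namespace, so the namespace should be set on the plain secret before sealing.

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: app-creds
  namespace: example-app          # Hypothetical; must match where the Secret is needed
spec:
  encryptedData:
    API_KEY: AgB...               # Placeholder ciphertext, only the controller can decrypt it
  template:
    metadata:
      name: app-creds
      namespace: example-app
    type: Opaque                  # Type of the Secret the controller will create
```

Once ArgoCD applies this resource, the Sealed Secrets controller decrypts it and creates a regular Secret named app-creds in the same namespace.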

Kyverno Policies

Policy Engine enforces security rules:

  1. Secret Cloning: Automatically clones secrets to new namespaces

    # cluster-resources/policies/secret-cloner.yaml
    # Secrets labeled "allowedToBeCloned: true" are synced
    
  2. Default Namespace Blocker: Prevents use of default namespace

  3. Bare Pod Cleaner: Removes pods without controllers (Deployments/StatefulSets)

  4. Deployment Verifier: Ensures pods have proper controllers

  5. Auth Sidecar Injector: Injects authentication proxy based on annotations
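
A secret-cloning policy of the kind described in item 1 could be written with a Kyverno `generate` rule and `cloneList`, as sketched below. This is not the content of secret-cloner.yaml: the policy and rule names and the source namespace `shared-secrets` are assumptions. Only the `allowedToBeCloned` label comes from this guide.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: secret-cloner                     # Hypothetical name
spec:
  rules:
    - name: clone-labeled-secrets         # Hypothetical name
      match:
        any:
          - resources:
              kinds:
                - Namespace               # Fires whenever a namespace is created
      generate:
        namespace: "{{request.object.metadata.name}}"   # Target: the new namespace
        synchronize: true                 # Keep clones in sync with the source
        cloneList:
          namespace: shared-secrets       # Assumption: namespace holding the source secrets
          kinds:
            - v1/Secret
          selector:
            matchLabels:
              allowedToBeCloned: "true"   # Only labeled secrets are cloned
```

Kyverno's background controller needs RBAC permission to read and create Secrets for a rule like this to take effect.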

Repository Access

Private Repository Credentials stored as SealedSecrets:

# cluster-resources/forte10x-repo-credentials-sealed.yaml

ArgoCD uses these to access private Helm values repositories.
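
Before sealing, a repository credential follows ArgoCD's declarative repository format: a Secret labeled with `argocd.argoproj.io/secret-type: repository`. The sketch below assumes SSH key authentication (suggested by the git@github.com URL). The Secret name is hypothetical and the key material is redacted. A plain file like this belongs in private/ only.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: helm-prod-values-repo                      # Hypothetical name
  namespace: argocd                                # Assumption
  labels:
    argocd.argoproj.io/secret-type: repository     # Marks the Secret as an ArgoCD repository
stringData:
  type: git
  url: git@github.com:fortedigital/helm-prod-values.git
  sshPrivateKey: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    <redacted>
    -----END OPENSSH PRIVATE KEY-----
```

kubeseal copies the labels into the SealedSecret template, so the unsealed Secret keeps the label ArgoCD looks for.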

Network Security

Traefik Ingress with TLS:

  • All HTTP traffic redirects to HTTPS
  • Let's Encrypt automatic certificate renewal
  • Cert-Manager manages certificate lifecycle
  • Per-application IngressRoutes with dedicated certificates
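
A per-application Certificate and IngressRoute pair could look like the following sketch. The host and port are taken from the mcp10x values example. The ClusterIssuer name `letsencrypt-prod`, the `websecure` entry point, the resource names, and the `traefik.io/v1alpha1` API group (older Traefik releases use `traefik.containo.us/v1alpha1`) are assumptions. The forteapp chart's actual templates may differ.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: mcp10x-tls
  namespace: mcp10x
spec:
  secretName: mcp10x-tls            # Cert-Manager writes the key pair into this Secret
  dnsNames:
    - mcp10x.forteapps.net
  issuerRef:
    name: letsencrypt-prod          # Assumption: name of the Let's Encrypt ClusterIssuer
    kind: ClusterIssuer
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: mcp10x
  namespace: mcp10x
spec:
  entryPoints:
    - websecure                     # Assumption: Traefik's HTTPS entry point
  routes:
    - match: Host(`mcp10x.forteapps.net`)
      kind: Rule
      services:
        - name: mcp10x              # Assumption: Service name matches the app
          port: 3000
  tls:
    secretName: mcp10x-tls          # Serve the certificate issued above
```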

Authentication

Application-Level Auth (optional):

  • Token-based authentication (static tokens)
  • OIDC integration (Keycloak, Okta, etc.)
  • Auth sidecar injected via Kyverno policy
  • Tokens stored in SealedSecrets

Example:

# In deployment.yaml template
annotations:
  policies.forteapps.io/auth: "true"
  policies.forteapps.io/auth-token-secret-name: "app-tokens"

Monitoring & Observability

Stack Components

  1. Prometheus: Metrics collection and storage
  2. Grafana: Metrics visualization and dashboards
  3. Loki: Log aggregation
  4. Tempo: Distributed tracing (OTLP)
  5. Fluent-Bit: Log shipping from pods to Loki
  6. Trivy: Container vulnerability scanning

Slack Notifications

All ArgoCD applications send notifications to a shared Slack channel:

metadata:
  annotations:
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: ""
    notifications.argoproj.io/subscribe.on-sync-failed.slack: ""
    notifications.argoproj.io/subscribe.on-degraded.slack: ""

Notifications include:

  • Sync succeeded
  • Sync failed
  • ⚠️ Application degraded

Disaster Recovery

Cluster Rebuild

Current State: No backup routines exist yet. Cluster can be rebuilt from Git.

Rebuild Process:

  1. Provision new Kubernetes cluster
  2. Clone launchpad repository
  3. Run ./bootstrap.sh
  4. ArgoCD installs and syncs all applications
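  5. Manually recreate the plain secrets and re-seal them against the new cluster's Sealed Secrets certificate (existing sealed secrets cannot be decrypted unless the old controller's private key was preserved)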

Data Loss:

  • Currently: Data loss is acceptable (internal use)
  • Future: One stateful application may require backup strategy

GitOps Advantages for DR

  • Infrastructure as Code: Entire cluster defined in Git
  • Reproducible: Cluster can be rebuilt identically
  • Auditable: All changes tracked in Git history
  • Rollback: Easy to revert to previous Git commit
  • Multi-Cluster: Same config can deploy to multiple clusters


Best Practices

Repository Organization

DO:

  • Separate infrastructure (infra/) from applications (apps/)
  • Use sync waves to control deployment order
  • Keep plain (unsealed) secrets in private/ folder (Git-ignored)
  • Commit only sealed secrets to Git
  • Use multi-source pattern for chart/values separation

DON'T:

  • Commit plain secrets to Git
  • Mix infrastructure and application configs
  • Hard-code environment-specific values in charts
  • Manually modify resources in cluster (use Git)

GitOps Workflow

DO:

  • All changes through Git (single source of truth)
  • Use PR reviews for production changes
  • Test changes in isolated namespaces first
  • Monitor ArgoCD sync status
  • Respond to Slack notifications

DON'T:

  • Use kubectl apply directly (breaks GitOps)
  • Ignore sync failures
  • Bypass ArgoCD for "quick fixes"
  • Edit resources in place (kubectl edit)

Application Development

DO:

  • Follow the forteapp chart pattern
  • Use semantic versioning for image tags
  • Update helm-prod-values via CI/CD
  • Test locally with Docker Compose
  • Document environment variables

DON'T:

  • Use latest image tag
  • Hard-code configuration in code
  • Skip local testing
  • Deploy untested images to production

Last Updated: 2026-04-22
Maintained By: Platform Team
Questions?: Contact #platform-support on Slack