Co-authored-by: Danijel Simeunovic <danijel.simeunovic@fortedigital.com> Reviewed-on: #14
GitOps Architecture & Repository Guide
Overview
This Kubernetes cluster uses a GitOps approach powered by ArgoCD, where Git repositories serve as the single source of truth for both infrastructure and application deployments. The cluster setup is cloud-agnostic, with ready-to-use configurations for UpCloud, AWS EKS, Azure AKS, and GCP GKE.
Key Characteristics
- Environment: Production (internal use only)
- Cluster Type: Multi-cloud, multi-cluster via Kustomize overlays (UpCloud, AWS, Azure, GCP)
- GitOps Tool: ArgoCD
- Deployment Pattern: App-of-Apps
- Secret Management: Sealed Secrets (kubeseal)
- Ingress: Traefik with Let's Encrypt TLS
- Monitoring: Prometheus + Grafana + Loki + Tempo + Fluent-Bit
- Policy Engine: Kyverno
- Notifications: Slack integration for sync status
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ Developer Workflow │
└─────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Application Code │ │ Helm Charts │ │ Helm Values │
│ Repositories │──────│ Repository │──────│ Repository │
│ (Source Code) │ │ (Templates) │ │ (Config/Env) │
└─────────────────────┘ └──────────────────┘ └─────────────────┘
│ │ │
│ │ │
GitHub Actions │ │
Build & Push Image │ │
│ │ │
│ │ │
└────────► Update image tag ─┴──────────────────────────┘
in helm-prod-values │
│
▼
┌────────────────────────────────┐
│ Config Repository │
│ (ArgoCD Applications) │
│ git.forteapps.net/Forte/ │
│ launchpad │
└────────────────────────────────┘
│
│
ArgoCD monitors & syncs
│
▼
┌────────────────────────────────┐
│ Kubernetes Clusters │
│ (UpCloud, AWS, Azure, GCP) │
│ │
│ ┌──────────────────────────┐ │
│ │ ArgoCD │ │
│ │ (GitOps Controller) │ │
│ └──────────────────────────┘ │
│ │
│ ┌──────────────────────────┐ │
│ │ Infrastructure Layer │ │
│ │ - Traefik (Ingress) │ │
│ │ - Cert-Manager (TLS) │ │
│ │ - Kyverno (Policies) │ │
│ │ - Sealed Secrets │ │
│ └──────────────────────────┘ │
│ │
│ ┌──────────────────────────┐ │
│ │ Monitoring Stack │ │
│ │ - Prometheus │ │
│ │ - Grafana │ │
│ │ - Loki │ │
│ │ - Tempo │ │
│ │ - Fluent-Bit │ │
│ └──────────────────────────┘ │
│ │
│ ┌──────────────────────────┐ │
│ │ Application Layer │ │
│ │ - mcp10x │ │
│ │ - musicman │ │
│ │ - dot-ai-stack │ │
│ │ - argo-mcp │ │
│ └──────────────────────────┘ │
└────────────────────────────────┘
│
│
▼
┌──────────────────┐
│ Slack Channel │
│ (Notifications) │
└──────────────────┘
Repository Structure
1. Config Repository (Current Repo)
Repository: https://git.forteapps.net/Forte/launchpad
Purpose: GitOps configuration - ArgoCD Applications and cluster resources
Location: C:\dev\k8s\launchpad
launchpad/
├── bootstrap.sh # Cluster initialization script
├── _app-of-apps-upc-dev.yaml # Root ArgoCD Application (upc-dev cluster)
├── _app-of-apps-upc-prod.yaml # Root ArgoCD Application (upc-prod cluster)
│
├── infra/ # Infrastructure ArgoCD Applications (Kustomize)
│ ├── base/ # Base Application manifests (upc-dev defaults)
│ │ ├── kustomization.yaml
│ │ ├── traefik-application.yaml
│ │ ├── keycloak.yaml
│ │ ├── grafana.yaml
│ │ ├── gitea.yaml
│ │ ├── gitea-actions.yaml
│ │ ├── tempo.yaml
│ │ ├── renovate.yaml
│ │ ├── ... # All other Application manifests
│ │ └── secrets.yaml
│ ├── overlays/ # Per-cluster Kustomize overrides
│ │ ├── upc-dev/ # UpCloud Dev (uses base as-is)
│ │ ├── upc-prod/ # UpCloud Prod (patches value paths)
│ │ ├── eks-dev/ # AWS EKS Dev
│ │ ├── eks-prod/ # AWS EKS Prod
│ │ ├── aks-dev/ # Azure AKS Dev
│ │ ├── aks-prod/ # Azure AKS Prod
│ │ ├── gke-dev/ # GCP GKE Dev
│ │ └── gke-prod/ # GCP GKE Prod
│ ├── dashboards/ # Grafana dashboard ConfigMaps
│ └── values/ # Helm value overrides for infra
│ ├── base/ # Cloud-agnostic shared values
│ ├── upc-{dev,prod}/ # UpCloud: storage class, LB, pricing
│ ├── aws-{dev,prod}/ # AWS: gp3, NLB, CUR pricing
│ ├── aks-{dev,prod}/ # Azure: managed-csi-premium, Standard LB
│ └── gcp-{dev,prod}/ # GCP: premium-rwo, L4 LB
│
├── apps/ # Business Application ArgoCD manifests (Kustomize)
│ ├── base/ # Base app manifests
│ │ ├── kustomization.yaml
│ │ ├── dot-ai-stack.yaml
│ │ └── ...
│ └── overlays/
│ ├── upc-dev/ # Uses base as-is
│ └── upc-prod/ # Patches value paths
│
├── cluster-resources/ # Cluster-wide Kubernetes resources
│ ├── ...
│ └── policies/ # Kyverno policies
│
├── secrets/ # Application secrets (sealed, per-cluster)
│ └── upc-dev/ # Secrets for upc-dev cluster
│
├── private/ # Local-only files (NOT in Git)
│
└── docs/ # Documentation
Key Points:
- _app-of-apps-upc-dev.yaml and _app-of-apps-upc-prod.yaml are the per-cluster root Applications
- Kustomize overlays in infra/overlays/ render base Applications with per-cluster patches
- Helm values are split: values/base/ (shared) + values/upc-dev/ or values/upc-prod/ (cluster-specific)
- apps/ follows the same base/overlays pattern for business applications
- Changes pushed to this repo trigger automatic syncs in ArgoCD
- private/ folder contains local-only files (Git-ignored)
2. Helm Charts Repository
Repository: https://git.forteapps.net/Forte/forte-helm
Purpose: Reusable Helm chart templates for Forte applications
Location: C:\dev\k8s\forte-helm
forte-helm/
└── forteapp/ # Generic Forte application chart
├── Chart.yaml # Chart metadata (v0.1.0)
├── values.yaml # Default values (base template)
├── templates/
│ ├── _helpers.tpl # Template helpers
│ ├── namespace.yaml
│ ├── deployment.yaml # Main app deployment
│ ├── service.yaml
│ ├── ingressroute.yaml # Traefik IngressRoute
│ ├── certificate.yaml # Cert-Manager Certificate
│ ├── configmap.yaml
│ ├── secret-auth-tokens.yaml
│ ├── hpa.yaml # Horizontal Pod Autoscaler
│ ├── database-statefulset.yaml # Optional PostgreSQL DB
│ └── database-service.yaml
└── README.md
Key Points:
- Single generic chart (forteapp) used by all Forte applications
- Supports optional PostgreSQL database (StatefulSet)
- Configurable authentication (token-based or OIDC)
- Traefik IngressRoute with automatic TLS via Cert-Manager
- Designed for microservices with similar patterns
3. Helm Values Repository
Repository: git@github.com:fortedigital/helm-prod-values.git
Purpose: Environment-specific configuration for each application
Location: C:\dev\k8s\helm-prod-values
helm-prod-values/
├── mcp10x/
│ └── values.yaml # MCP 10X configuration
├── musicman/
│ └── values.yaml # Music Man configuration
└── argocd-mcp/
└── values.yaml # ArgoCD MCP configuration
Key Points:
- Each app has its own folder with values.yaml
- Contains environment-specific settings (image tags, env vars, resources, etc.)
- Referenced by ArgoCD Applications using multi-source pattern
- Image tags are updated here by CI/CD pipelines
- Secrets are referenced by name (actual secrets stored as SealedSecrets)
Example (mcp10x/values.yaml):
app:
image:
repository: ghcr.io/fortedigital/10x
tag: 2.0.4 # Updated by CI/CD
extraEnv:
- name: PORT
value: "3000"
envSecretName: "app-credentials" # References SealedSecret
ingress:
enabled: true
host: mcp10x.forteapps.net # Public domain
4. Application Source Code Repositories
Purpose: Application source code with CI/CD pipelines
Examples: Various private repositories
Typical Structure:
app-repository/
├── src/ # Application source code
├── Dockerfile # Container build definition
├── .github/
│ └── workflows/
│ └── build-and-deploy.yml # GitHub Actions workflow
└── package.json / requirements.txt # Dependencies
CI/CD Workflow (GitHub Actions):
- Trigger on push to main branch
- Build Docker image
- Tag with version (e.g., v2.0.4)
- Push to container registry (GHCR, Docker Hub, etc.)
- Update image tag in helm-prod-values repository
- ArgoCD detects change and syncs automatically
GitOps Workflow
The App-of-Apps Pattern
_app-of-apps-{cluster}.yaml (Root, per cluster — e.g. upc-dev, eks-prod, gke-dev)
│
├── infrastructure-apps (manages infra/)
│ ├── cluster-resources-application
│ ├── traefik-application
│ ├── cert-manager-application
│ ├── kyverno
│ ├── prometheus
│ ├── grafana
│ ├── tempo
│ └── ... (other infra apps)
│
└── enterprise-apps (manages apps/)
├── mcp10x
├── musicman
├── dot-ai-stack
└── argo-mcp
How It Works:
- Bootstrap script installs ArgoCD and applies _app-of-apps-upc-dev.yaml (or upc-prod)
- ArgoCD creates the root Application which monitors the appropriate infra/overlays/ folder
- Kustomize renders base Applications with cluster-specific patches
- enterprise-apps Application monitors the cluster's apps/overlays/ folder
- ArgoCD continuously syncs (every 60s) and auto-heals drift
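The bootstrap sequence described above can be sketched roughly as follows. This is not the contents of bootstrap.sh, which this guide does not reproduce. The argocd namespace, the upstream install manifest URL, the DRY_RUN switch, and the function names are assumptions made for illustration.

```shell
# Hypothetical sketch of the bootstrap flow; NOT the real bootstrap.sh.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Assumption: ArgoCD is installed from the upstream "stable" install manifest.
ARGOCD_MANIFEST="https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

bootstrap_cluster() {
  cluster=$1   # e.g. upc-dev or upc-prod
  run kubectl create namespace argocd
  run kubectl apply -n argocd -f "$ARGOCD_MANIFEST"
  # Applying the root Application hands everything else over to ArgoCD.
  run kubectl apply -n argocd -f "_app-of-apps-${cluster}.yaml"
}
```

Calling bootstrap_cluster upc-dev in dry-run mode only prints the three kubectl commands, which makes the order of operations easy to review before running it against a real cluster.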
Sync Waves & Ordering
Applications deploy in order using argocd.argoproj.io/sync-wave annotations:
Wave -1: Namespaces (created first)
Wave 0: Kyverno (policies ready before resources)
Wave 1: Cluster resources, infrastructure apps
Wave 2+: Business applications
Example:
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"
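To review the resulting deployment order, the wave annotations can be listed and sorted. The helper below is a minimal sketch that assumes one Application per .yaml file in a single directory. The function name is hypothetical. Files without the annotation are reported as wave 0, which is ArgoCD's default.

```shell
# Hypothetical helper: print "<wave><TAB><file>" for every manifest in a
# directory, sorted by sync wave (lowest wave deploys first).
list_sync_waves() {
  dir=$1
  for f in "$dir"/*.yaml; do
    [ -e "$f" ] || continue
    # Extract the first sync-wave value, with or without quotes.
    wave=$(sed -n -E 's|^[[:space:]]*argocd\.argoproj\.io/sync-wave:[[:space:]]*"?(-?[0-9]+)"?.*|\1|p' "$f" | head -n 1)
    # Manifests without the annotation fall back to wave 0.
    printf '%s\t%s\n' "${wave:-0}" "$(basename "$f")"
  done | sort -k1,1n -k2,2
}
```

For example, list_sync_waves infra/base would show namespaces and Kyverno ahead of the wave 1 infrastructure apps, provided the annotations match the table above.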
Multi-Source Pattern
Applications like mcp10x and musicman use multiple sources:
spec:
  sources:
    - repoURL: https://git.forteapps.net/Forte/forte-helm
      path: forteapp # Helm chart templates
      helm:
        valueFiles:
          - $values/mcp10x/values.yaml # Reference to second source
    - repoURL: git@github.com:fortedigital/helm-prod-values.git
      targetRevision: HEAD
      ref: values # Named reference
Benefits:
- Chart templates separated from configuration
- Single chart reused across all apps
- Easy to update all apps by changing the chart
- Environment-specific values isolated in separate repo
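Because every app shares the same chart, a complete multi-source Application differs only by app name. The generator below is a sketch under assumptions: the repository URLs come from this guide, while the argocd namespace, the default project, the in-cluster destination, the per-app target namespace, and the function name are illustrative and may not match the real manifests.

```shell
# Hypothetical helper: print a multi-source ArgoCD Application for one app.
# project, destination, and namespaces are assumptions, not taken from the repo.
render_forteapp_application() {
  app=$1
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${app}
  namespace: argocd
spec:
  project: default
  sources:
    - repoURL: https://git.forteapps.net/Forte/forte-helm
      path: forteapp
      targetRevision: HEAD
      helm:
        valueFiles:
          - \$values/${app}/values.yaml
    - repoURL: git@github.com:fortedigital/helm-prod-values.git
      targetRevision: HEAD
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: ${app}
EOF
}
```

The \$values prefix is escaped so the shell leaves it untouched. ArgoCD resolves it against the source that declares ref: values.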
Multi-Cluster Pattern
Kustomize overlays enable deploying the same Applications across clusters with different configurations:
# infra/base/ contains default (upc-dev) Applications
# Helm values are layered: base + cluster-specific
valueFiles:
  - $values/infra/values/base/traefik-values.yaml # Shared config
  - $values/infra/values/upc-dev/traefik-values.yaml # Cluster-specific

# infra/overlays/upc-prod/kustomization.yaml patches the second valueFile
patches:
  - target:
      kind: Application
      name: traefik
    patch: |
      - op: replace
        path: /spec/sources/0/helm/valueFiles/1
        value: $values/infra/values/upc-prod/traefik-values.yaml
Cloud-specific values (storage classes, load balancer annotations, cost model) are isolated in per-cluster value files, so the base values stay fully cloud-agnostic. The per-cloud settings are:
| Cloud | Storage Class | Load Balancer | OpenCost Provider |
|---|---|---|---|
| UpCloud | upcloud-block-storage-maxiops | UpCloud LB (ProxyProtocol v2) | Custom pricing |
| AWS EKS | gp3 (EBS CSI) | NLB (ProxyProtocol v2) | AWS CUR |
| Azure AKS | managed-csi-premium | Standard LB (externalTrafficPolicy: Local) | Azure Billing API |
| GCP GKE | premium-rwo (PD CSI) | L4 passthrough NLB | GCP Cloud Billing |
Benefits:
- Single source of truth for Application definitions
- Cluster-specific values isolated per overlay
- Easy to add new clusters by creating a new overlay
- Base values shared across all clusters reduce duplication
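Adding a cluster then amounts to creating one overlay directory. The scaffold below is a minimal sketch that patches only the traefik Application, mirroring the upc-prod example above. The function name and the single-patch scope are assumptions, and a real overlay would need one patch per Application whose values differ. The values directory is a separate argument because the overlay and values folders are named differently for some clouds (for example eks-dev and aws-dev).

```shell
# Hypothetical scaffold for a new cluster overlay; only traefik is patched.
new_overlay() {
  repo_root=$1
  cluster=$2                   # overlay name, e.g. eks-dev
  values_dir=${3:-$cluster}    # values folder, e.g. aws-dev
  dir="${repo_root}/infra/overlays/${cluster}"
  mkdir -p "$dir"
  cat > "${dir}/kustomization.yaml" <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - target:
      kind: Application
      name: traefik
    patch: |
      - op: replace
        path: /spec/sources/0/helm/valueFiles/1
        value: \$values/infra/values/${values_dir}/traefik-values.yaml
EOF
}
```

A call such as new_overlay . eks-dev aws-dev writes infra/overlays/eks-dev/kustomization.yaml. The matching root Application for that cluster still has to be created separately.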
CI/CD Pipeline
Continuous Integration
Application Repositories contain GitHub Actions workflows:
name: Build and Deploy
on:
  push:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # $VERSION is assumed to be set earlier in the job (for example from a Git tag)
      - name: Build Docker image
        run: docker build -t ghcr.io/fortedigital/app:$VERSION .
      - name: Push to registry
        run: docker push ghcr.io/fortedigital/app:$VERSION
      - name: Update Helm values
        run: |
          git clone git@github.com:fortedigital/helm-prod-values.git
          cd helm-prod-values/app
          sed -i "s/tag: .*/tag: $VERSION/" values.yaml
          git commit -am "Update app to $VERSION"
          git push
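The tag-update step is the part most likely to misfire, since a broad pattern can also rewrite trailing comments. A slightly more careful variant might look like the sketch below. The function name is hypothetical, and it assumes an unquoted tag value on its own line, as in the mcp10x example earlier in this guide.

```shell
# Hypothetical helper: replace only the value after "tag:" and keep
# indentation and any trailing comment intact.
bump_image_tag() {
  values_file=$1
  new_tag=$2
  tmp=$(mktemp)
  # Match "<indent>tag: <value>" and swap the value up to the next space or '#'.
  if sed -E "s|^([[:space:]]*tag:[[:space:]]*)[^[:space:]#]+|\1${new_tag}|" "$values_file" > "$tmp"; then
    # Write back through the original file so its permissions are preserved.
    cat "$tmp" > "$values_file"
  fi
  rm -f "$tmp"
}
```

Usage inside the workflow would be bump_image_tag values.yaml "$VERSION" before the commit. A YAML-aware tool is the safer choice if the values files contain more than one tag key.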
Continuous Deployment
ArgoCD automatically syncs when changes are detected:
- Config Repo Change:
  - Developer updates apps/myapp.yaml
  - Pushes to launchpad repo
  - ArgoCD detects change (60s reconciliation)
  - Syncs application to cluster
- Helm Values Change:
  - CI/CD updates helm-prod-values/myapp/values.yaml
  - ArgoCD detects change
  - Pulls new Helm chart with updated values
  - Applies to cluster
- Sync Policy:
  syncPolicy:
    automated:
      prune: true # Remove deleted resources
      selfHeal: true # Revert manual changes
    retry:
      limit: 5 # Retry up to 5 times
      backoff:
        duration: 5s
        maxDuration: 3m
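To get a feel for how long a failing sync keeps retrying under this policy, the delays can be worked out by hand. The sketch below assumes a backoff factor of 2, which this guide does not state (it is ArgoCD's documented default when no factor is set), and it treats the first retry as waiting the base duration. The function name is hypothetical.

```shell
# Print the approximate delay in seconds before each retry.
# Arguments: base delay, factor, maximum delay, retry limit.
# Assumption: factor 2 is not configured in this guide; it is the ArgoCD default.
retry_delays() {
  delay=$1
  factor=$2
  max=$3
  limit=$4
  i=1
  out=""
  while [ "$i" -le "$limit" ]; do
    out="${out}${out:+ }${delay}"
    delay=$((delay * factor))
    # Cap the delay at maxDuration.
    [ "$delay" -gt "$max" ] && delay=$max
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

retry_delays 5 2 180 5   # prints: 5 10 20 40 80
```

With these numbers the 3m cap is never reached. All five retries fit into roughly two and a half minutes of waiting, plus the time each sync attempt takes.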
Deployment Validation
Before resources are persisted, ArgoCD and the Kubernetes API server together:
- ✅ Validate YAML syntax
- ✅ Check Kubernetes schema
- ✅ Run a server-side dry-run
- ✅ Enforce resource quotas (API server admission)
- ✅ Apply Kyverno policies (admission webhook)
After applying:
- ✅ Waits for resources to become healthy
- ✅ Sends Slack notification (success/failure)
- ✅ Tracks sync status in UI
Security Model
Secret Management
Sealed Secrets encrypt secrets for safe Git storage:
# Developer creates plain secret locally
kubectl create secret generic app-creds \
--from-literal=API_KEY=secret123 \
--dry-run=client -o yaml > private/app-creds.yaml
# Seal the secret using kubeseal
kubeseal --format=yaml \
--cert=pub-cert.pem \
< private/app-creds.yaml \
> secrets/app-creds-sealed.yaml
# Commit sealed secret to Git
git add secrets/app-creds-sealed.yaml
git commit -m "Add app credentials"
Storage:
- ✅ Sealed secrets committed to Git
- ❌ Plain secrets kept in private/ (Git-ignored) or discarded
- ⚠️ Secret rotation process not yet established
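A simple guard can catch a plain Secret before it reaches Git. The check below is a sketch rather than an existing hook in this repository: the function name is hypothetical, and it only looks for a top-level kind: Secret line in YAML files, so it will not detect secrets embedded in other formats.

```shell
# Hypothetical guard: list YAML files that contain a plain "kind: Secret".
# "kind: SealedSecret" is not reported because the pattern is anchored
# to the whole line. Exit status is non-zero when nothing is found.
find_plain_secrets() {
  dir=$1
  grep -rlE --include='*.yaml' --include='*.yml' \
    '^kind:[[:space:]]*Secret[[:space:]]*$' "$dir"
}
```

In a pre-commit hook or CI job this could be used as: if find_plain_secrets secrets/; then echo "plain Secret found" >&2; exit 1; fi.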
Kyverno Policies
Policy Engine enforces security rules:
- Secret Cloning: Automatically clones secrets to new namespaces
  # cluster-resources/policies/secret-cloner.yaml
  # Secrets labeled "allowedToBeCloned: true" are synced
- Default Namespace Blocker: Prevents use of default namespace
- Bare Pod Cleaner: Removes pods without controllers (Deployments/StatefulSets)
- Deployment Verifier: Ensures pods have proper controllers
- Auth Sidecar Injector: Injects authentication proxy based on annotations
Repository Access
Private Repository Credentials stored as SealedSecrets:
# cluster-resources/forte10x-repo-credentials-sealed.yaml
ArgoCD uses these to access private Helm values repositories.
Network Security
Traefik Ingress with TLS:
- All HTTP traffic redirects to HTTPS
- Let's Encrypt automatic certificate renewal
- Cert-Manager manages certificate lifecycle
- Per-application IngressRoutes with dedicated certificates
Authentication
Application-Level Auth (optional):
- Token-based authentication (static tokens)
- OIDC integration (Keycloak, Okta, etc.)
- Auth sidecar injected via Kyverno policy
- Tokens stored in SealedSecrets
Example:
# In deployment.yaml template
annotations:
  policies.forteapps.io/auth: "true"
  policies.forteapps.io/auth-token-secret-name: "app-tokens"
Monitoring & Observability
Stack Components
- Prometheus: Metrics collection and storage
- Grafana: Metrics visualization and dashboards
- Loki: Log aggregation
- Tempo: Distributed tracing (OTLP)
- Fluent-Bit: Log shipping from pods to Loki
- Trivy: Container vulnerability scanning
Slack Notifications
All ArgoCD applications send notifications to a shared Slack channel:
metadata:
  annotations:
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: ""
    notifications.argoproj.io/subscribe.on-sync-failed.slack: ""
    notifications.argoproj.io/subscribe.on-degraded.slack: ""
Notifications include:
- ✅ Sync succeeded
- ❌ Sync failed
- ⚠️ Application degraded
Disaster Recovery
Cluster Rebuild
Current State: No backup routines exist yet. Cluster can be rebuilt from Git.
Rebuild Process:
- Provision new Kubernetes cluster
- Clone launchpad repository
- Run ./bootstrap.sh
- ArgoCD installs and syncs all applications
- Manually recreate unsealed secrets and seal them
Data Loss:
- Currently: Data loss is acceptable (internal use)
- Future: One stateful application may require backup strategy
GitOps Advantages for DR
- ✅ Infrastructure as Code: Entire cluster defined in Git
- ✅ Reproducible: Cluster can be rebuilt identically
- ✅ Auditable: All changes tracked in Git history
- ✅ Rollback: Easy to revert to previous Git commit
- ✅ Multi-Cluster: Same config can deploy to multiple clusters
Best Practices
Repository Organization
✅ DO:
- Separate infrastructure (infra/) from applications (apps/)
- Use sync waves to control deployment order
- Keep secrets in private/ folder (Git-ignored)
- Commit only sealed secrets to Git
- Use multi-source pattern for chart/values separation
❌ DON'T:
- Commit plain secrets to Git
- Mix infrastructure and application configs
- Hard-code environment-specific values in charts
- Manually modify resources in cluster (use Git)
GitOps Workflow
✅ DO:
- All changes through Git (single source of truth)
- Use PR reviews for production changes
- Test changes in isolated namespaces first
- Monitor ArgoCD sync status
- Respond to Slack notifications
❌ DON'T:
- Use kubectl apply directly (breaks GitOps)
- Ignore sync failures
- Bypass ArgoCD for "quick fixes"
- Edit resources in place (kubectl edit)
Application Development
✅ DO:
- Follow the forteapp chart pattern
- Use semantic versioning for image tags
- Update helm-prod-values via CI/CD
- Test locally with Docker Compose
- Document environment variables
❌ DON'T:
- Use latest image tag
- Hard-code configuration in code
- Skip local testing
- Deploy untested images to production
Next Steps
📖 Continue to:
- Developer Guide - Learn how to deploy and manage applications
- Operations Runbook - Common operational tasks
- Technical Reference - Detailed component documentation
Last Updated: 2026-04-22
Maintained By: Platform Team
Questions?: Contact #platform-support on Slack