# GitOps Architecture & Repository Guide

## Table of Contents
- [Overview](#overview)
- [Architecture Diagram](#architecture-diagram)
- [Repository Structure](#repository-structure)
- [GitOps Workflow](#gitops-workflow)
- [CI/CD Pipeline](#cicd-pipeline)
- [Security Model](#security-model)
- [Monitoring & Observability](#monitoring--observability)
- [Disaster Recovery](#disaster-recovery)
- [Best Practices](#best-practices)
- [Next Steps](#next-steps)

---

## Overview

This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where Git repositories serve as the single source of truth for both infrastructure and application deployments. The cluster runs on **UpCloud Managed Kubernetes** but is designed to be cloud-agnostic.

### Key Characteristics

- **Environment**: Production (internal use only)
- **Cluster Type**: Single cluster, single environment
- **GitOps Tool**: ArgoCD
- **Deployment Pattern**: App-of-Apps
- **Secret Management**: Sealed Secrets (kubeseal)
- **Ingress**: Traefik with Let's Encrypt TLS
- **Monitoring**: Prometheus + Grafana + Loki + Fluent-Bit
- **Policy Engine**: Kyverno
- **Notifications**: Slack integration for sync status

---

## Architecture Diagram

```
┌────────────────────────────────────────────────────────────────────────┐
│                           Developer Workflow                           │
└────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  Application Code   │      │   Helm Charts    │      │  Helm Values    │
│   Repositories      │──────│   Repository     │──────│  Repository     │
│  (Source Code)      │      │   (Templates)    │      │  (Config/Env)   │
└─────────────────────┘      └──────────────────┘      └─────────────────┘
         │                            │                         │
         │                            │                         │
  GitHub Actions                      │                         │
Build & Push Image                    │                         │
         │                            │                         │
         │                            │                         │
         └────────► Update image tag ─┴─────────────────────────┘
                    in helm-values
                           │
                           │
                           ▼
           ┌────────────────────────────────┐
           │     Config Repository          │
           │  (ArgoCD Applications)         │
           │  github.com/fortedigital/      │
           │  sturdy-adventure              │
           └────────────────────────────────┘
                           │
                           │  ArgoCD monitors & syncs
                           │
                           ▼
           ┌────────────────────────────────┐
           │     Kubernetes Cluster         │
           │     (UpCloud Managed)          │
           │                                │
           │  ┌──────────────────────────┐  │
           │  │         ArgoCD           │  │
           │  │   (GitOps Controller)    │  │
           │  └──────────────────────────┘  │
           │                                │
           │  ┌──────────────────────────┐  │
           │  │  Infrastructure Layer    │  │
           │  │  - Traefik (Ingress)     │  │
           │  │  - Cert-Manager (TLS)    │  │
           │  │  - Kyverno (Policies)    │  │
           │  │  - Sealed Secrets        │  │
           │  └──────────────────────────┘  │
           │                                │
           │  ┌──────────────────────────┐  │
           │  │  Monitoring Stack        │  │
           │  │  - Prometheus            │  │
           │  │  - Grafana               │  │
           │  │  - Loki                  │  │
           │  │  - Fluent-Bit            │  │
           │  └──────────────────────────┘  │
           │                                │
           │  ┌──────────────────────────┐  │
           │  │  Application Layer       │  │
           │  │  - mcp10x                │  │
           │  │  - musicman              │  │
           │  │  - dot-ai-stack          │  │
           │  │  - argo-mcp              │  │
           │  └──────────────────────────┘  │
           └────────────────────────────────┘
                           │
                           │
                           ▼
                  ┌──────────────────┐
                  │  Slack Channel   │
                  │ (Notifications)  │
                  └──────────────────┘
```

---

## Repository Structure

### 1. **Config Repository** (Current Repo)

**Repository**: `https://github.com/fortedigital/sturdy-adventure.git`  
**Purpose**: GitOps configuration - ArgoCD Applications and cluster resources  
**Location**: `C:\dev\k8s\launchpad`

```
sturdy-adventure/
├── bootstrap.sh                       # Cluster initialization script
├── _app-of-apps.yaml                  # Root ArgoCD Application (App-of-Apps pattern)
│
├── infra/                             # Infrastructure ArgoCD Applications
│   ├── enterprise-apps.yaml           # Parent app managing all apps in apps/
│   ├── cluster-resources-application.yaml
│   ├── traefik-application.yaml
│   ├── cert-manager-application.yaml
│   ├── kyverno.yaml
│   ├── kyverno-policies.yaml
│   ├── prometheus.yaml
│   ├── grafana.yaml
│   ├── loki.yaml
│   ├── fluent-bit.yaml
│   ├── trivy.yaml
│   ├── sealedsecrets.yaml
│   ├── secrets.yaml
│   └── values/                        # Helm value overrides for infra
│       ├── argocd-values.yaml
│       ├── prometheus-values.yaml
│       ├── grafana-values.yaml
│       ├── loki-values.yaml
│       └── fluent-bit-values.yaml
│
├── apps/                              # Business Application ArgoCD manifests
│   ├── mcp10x.yaml                    # MCP 10X application
│   ├── musicman.yaml                  # Music Man application
│   ├── dot-ai-stack.yaml              # Dot AI Stack
│   └── argo-mcp.yaml                  # ArgoCD MCP server
│
├── cluster-resources/                 # Cluster-wide Kubernetes resources
│   ├── cert-manager-namespace.yaml
│   ├── secrets-namespace.yaml
│   ├── letsencrypt-issuer.yaml        # Let's Encrypt ClusterIssuer
│   ├── kyverno-config.yaml
│   ├── argocd-notifications-secret-sealed.yaml
│   ├── forte10x-repo-credentials-sealed.yaml
│   ├── mcp10x-repo-credentials-sealed.yaml
│   └── policies/                      # Kyverno policies
│       ├── deployment-verifier.yaml
│       ├── label-checker.yaml
│       ├── bare-pod-cleaner.yaml
│       ├── replicaset-cleaner.yaml
│       ├── default-ns-blocker.yaml
│       ├── secret-cloner.yaml
│       └── auth-sidecar-injector.yaml
│
├── secrets/                           # Application secrets (sealed)
│   ├── argocd-mcp-credentials.yaml
│   ├── dot-ai-secrets.yaml
│   ├── mcp10x-credentials-sealed.yaml
│   └── musicman-credentials.yaml
│
├── private/                           # Local-only files (NOT in Git)
│   ├── *.yaml                         # Unsealed secrets
│   └── *.sh                           # Helper scripts
│
└── docs/                              # Documentation
    ├── GITOPS-ARCHITECTURE.md         # This file
    ├── DEVELOPER-GUIDE.md
    ├── OPERATIONS-RUNBOOK.md
    └── REFERENCE.md
```

**Key Points**:
- `_app-of-apps.yaml` is the root Application that ArgoCD monitors
- `infra/enterprise-apps.yaml` auto-discovers all apps in `apps/` folder
- Changes pushed to this repo trigger automatic syncs in ArgoCD
- `private/` folder contains local-only files (Git-ignored)

---

### 2. **Helm Charts Repository**

**Repository**: `https://github.com/fortedigital/forte-helm`  
**Purpose**: Reusable Helm chart templates for Forte applications  
**Location**: `C:\dev\k8s\forte-helm`

```
forte-helm/
└── forteapp/                          # Generic Forte application chart
    ├── Chart.yaml                     # Chart metadata (v0.1.0)
    ├── values.yaml                    # Default values (base template)
    ├── templates/
    │   ├── _helpers.tpl               # Template helpers
    │   ├── namespace.yaml
    │   ├── deployment.yaml            # Main app deployment
    │   ├── service.yaml
    │   ├── ingressroute.yaml          # Traefik IngressRoute
    │   ├── certificate.yaml           # Cert-Manager Certificate
    │   ├── configmap.yaml
    │   ├── secret-auth-tokens.yaml
    │   ├── hpa.yaml                   # Horizontal Pod Autoscaler
    │   ├── database-statefulset.yaml  # Optional PostgreSQL DB
    │   └── database-service.yaml
    └── README.md
```

**Key Points**:
- Single generic chart (`forteapp`) used by all Forte applications
- Supports optional PostgreSQL database (StatefulSet)
- Configurable authentication (token-based or OIDC)
- Traefik IngressRoute with automatic TLS via Cert-Manager
- Designed for microservices with similar patterns

---

### 3. **Helm Values Repository**

**Repository**: `git@github.com:fortedigital/helm-values.git`  
**Purpose**: Environment-specific configuration for each application  
**Location**: `C:\dev\k8s\helm-prod-values`

```
helm-prod-values/
├── mcp10x/
│   └── values.yaml                    # MCP 10X configuration
├── musicman/
│   └── values.yaml                    # Music Man configuration
├── mcpcoder/
│   └── values.yaml                    # MCP Coder configuration
└── argocd-mcp/
    └── values.yaml                    # ArgoCD MCP configuration
```

**Key Points**:
- Each app has its own folder with `values.yaml`
- Contains environment-specific settings (image tags, env vars, resources, etc.)
- Referenced by ArgoCD Applications using multi-source pattern
- Image tags are updated here by CI/CD pipelines
- Secrets are referenced by name (actual secrets stored as SealedSecrets)

**Example** (`mcp10x/values.yaml`):
```yaml
app:
  image:
    repository: ghcr.io/fortedigital/10x
    tag: 2.0.4  # Updated by CI/CD
  extraEnv:
    - name: PORT
      value: "3000"
  envSecretName: "app-credentials"  # References SealedSecret
ingress:
  enabled: true
  host: mcp10x.forteapps.net  # Public domain
```

---

### 4. **Application Source Code Repositories**

**Purpose**: Application source code with CI/CD pipelines  
**Examples**: Various private repositories

**Typical Structure**:
```
app-repository/
├── src/                               # Application source code
├── Dockerfile                         # Container build definition
├── .github/
│   └── workflows/
│       └── build-and-deploy.yml       # GitHub Actions workflow
└── package.json / requirements.txt    # Dependencies
```

**CI/CD Workflow** (GitHub Actions):
1. Trigger on push to `main` branch
2. Build Docker image
3. Tag with version (e.g., `v2.0.4`)
4. Push to container registry (GHCR, Docker Hub, etc.)
5. Update image tag in `helm-values` repository
6. ArgoCD detects change and syncs automatically

---

## GitOps Workflow

### The App-of-Apps Pattern

```
_app-of-apps.yaml (Root)
│
├── infrastructure-apps (manages infra/)
│   ├── cluster-resources-application
│   ├── traefik-application
│   ├── cert-manager-application
│   ├── kyverno
│   ├── prometheus
│   ├── grafana
│   └── ... (other infra apps)
│
└── enterprise-apps (manages apps/)
    ├── mcp10x
    ├── musicman
    ├── dot-ai-stack
    └── argo-mcp
```

**How It Works**:
1. Bootstrap script installs ArgoCD and applies `_app-of-apps.yaml`
2. ArgoCD creates the root Application, which monitors the `infra/` folder
3. Each YAML in `infra/` becomes a child Application
4. `enterprise-apps.yaml` monitors the `apps/` folder and auto-discovers applications
5. ArgoCD continuously syncs (every 60s) and auto-heals drift

### Sync Waves & Ordering

Applications deploy in order using `argocd.argoproj.io/sync-wave` annotations:

```
Wave -1: Namespaces (created first)
Wave 0:  Kyverno (policies ready before resources)
Wave 1:  Cluster resources, infrastructure apps
Wave 2+: Business applications
```

Example:
```yaml
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"
```

### Multi-Source Pattern

Applications like `mcp10x` and `musicman` use multiple sources:

```yaml
spec:
  sources:
    - repoURL: https://github.com/fortedigital/forte-helm
      path: forteapp  # Helm chart templates
      helm:
        valueFiles:
          - $values/mcp10x/values.yaml  # Reference to second source
    - repoURL: git@github.com:fortedigital/helm-values.git
      targetRevision: HEAD
      ref: values  # Named reference
```

**Benefits**:
- Chart templates separated from configuration
- Single chart reused across all apps
- Easy to update all apps by changing the chart
- Environment-specific values isolated in separate repo

---

## CI/CD Pipeline

### Continuous Integration

**Application Repositories** contain GitHub Actions workflows:

```yaml
name: Build and Deploy

on:
  push:
    branches: [ main ]

env:
  # Placeholder: substitute the project's real release version (e.g. a semantic version)
  VERSION: ${{ github.sha }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write  # Needed to push to GHCR with GITHUB_TOKEN
    steps:
      - uses: actions/checkout@v4

      - name: Log in to registry
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin

      - name: Build Docker image
        run: docker build -t ghcr.io/fortedigital/app:$VERSION .

      - name: Push to registry
        run: docker push ghcr.io/fortedigital/app:$VERSION

      - name: Update Helm values
        # Assumes SSH credentials with write access to helm-values are already configured
        run: |
          git clone git@github.com:fortedigital/helm-values.git
          cd helm-values/app
          # Rewrites every "tag:" line in the file
          sed -i "s/tag: .*/tag: $VERSION/" values.yaml
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git commit -am "Update app to $VERSION"
          git push
```

### Continuous Deployment

**ArgoCD** automatically syncs when changes are detected:

1. **Config Repo Change**:
   - Developer updates `apps/myapp.yaml`
   - Pushes to `sturdy-adventure` repo
   - ArgoCD detects change (60s reconciliation)
   - Syncs application to cluster

2. **Helm Values Change**:
   - CI/CD updates `helm-values/myapp/values.yaml`
   - ArgoCD detects change
   - Re-renders the Helm chart with the updated values
   - Applies to cluster

3. **Sync Policy**:
   ```yaml
   syncPolicy:
     automated:
       prune: true      # Remove deleted resources
       selfHeal: true   # Revert manual changes
     retry:
       limit: 5         # Retry up to 5 times
       backoff:
         duration: 5s
         maxDuration: 3m
   ```

### Deployment Validation

Before resources are applied:
- ✅ ArgoCD validates YAML syntax
- ✅ ArgoCD checks manifests against the Kubernetes schema
- ✅ ArgoCD runs a server-side dry-run
- ✅ The Kubernetes API server enforces resource quotas
- ✅ Kyverno evaluates its admission policies

After applying, ArgoCD:
- ✅ Waits for resources to become healthy
- ✅ Sends Slack notification (success/failure)
- ✅ Tracks sync status in UI

---

## Security Model

### Secret Management

**Sealed Secrets** encrypt secrets for safe Git storage:

```bash
# Developer creates plain secret locally
kubectl create secret generic app-creds \
  --from-literal=API_KEY=secret123 \
  --dry-run=client -o yaml > private/app-creds.yaml

# Seal the secret using kubeseal
kubeseal --format=yaml \
  --cert=pub-cert.pem \
  < private/app-creds.yaml \
  > secrets/app-creds-sealed.yaml

# Commit sealed secret to Git
git add secrets/app-creds-sealed.yaml
git commit -m "Add app credentials"
```

**Storage**:
- ✅ Sealed secrets committed to Git
- ❌ Plain secrets never committed: kept in `private/` (Git-ignored) or discarded
- ⚠️ Secret rotation process not yet established

### Kyverno Policies

**Policy Engine** enforces security rules:

1. **Secret Cloning**: Automatically clones secrets to new namespaces
   ```yaml
   # cluster-resources/policies/secret-cloner.yaml
   # Secrets labeled "allowedToBeCloned: true" are synced
   ```

2. **Default Namespace Blocker**: Prevents use of `default` namespace

3. **Bare Pod Cleaner**: Removes pods without controllers (Deployments/StatefulSets)

4. **Deployment Verifier**: Ensures pods have proper controllers

5. **Auth Sidecar Injector**: Injects authentication proxy based on annotations

### Repository Access

**Private Repository Credentials** stored as SealedSecrets:
```yaml
# cluster-resources/forte10x-repo-credentials-sealed.yaml
```

ArgoCD uses these to access private Helm values repositories.

### Network Security

**Traefik Ingress** with TLS:
- All HTTP traffic redirects to HTTPS
- Let's Encrypt automatic certificate renewal
- Cert-Manager manages certificate lifecycle
- Per-application IngressRoutes with dedicated certificates

### Authentication

**Application-Level Auth** (optional):
- Token-based authentication (static tokens)
- OIDC integration (Keycloak, Okta, etc.)
- Auth sidecar injected via Kyverno policy
- Tokens stored in SealedSecrets

Example:
```yaml
# In deployment.yaml template
annotations:
  policies.forteapps.io/auth: "true"
  policies.forteapps.io/auth-token-secret-name: "app-tokens"
```

---

## Monitoring & Observability

### Stack Components

1. **Prometheus**: Metrics collection and storage
2. **Grafana**: Metrics visualization and dashboards
3. **Loki**: Log aggregation
4. **Fluent-Bit**: Log shipping from pods to Loki
5. **Trivy**: Container vulnerability scanning

### Slack Notifications

All ArgoCD applications send notifications to a shared Slack channel:

```yaml
metadata:
  annotations:
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: ""
    notifications.argoproj.io/subscribe.on-sync-failed.slack: ""
    notifications.argoproj.io/subscribe.on-degraded.slack: ""
```

Notifications include:
- ✅ Sync succeeded
- ❌ Sync failed
- ⚠️ Application degraded

---

## Disaster Recovery

### Cluster Rebuild

**Current State**: No backup routines exist yet. Cluster can be rebuilt from Git.

**Rebuild Process**:
1. Provision new Kubernetes cluster
2. Clone `sturdy-adventure` repository
3. Run `./bootstrap.sh`
4. ArgoCD is installed and syncs all applications
5. Manually recreate the plain secrets and re-seal them with the new cluster's Sealed Secrets certificate (without a backup of the old controller key, the existing SealedSecrets cannot be decrypted)

**Data Loss**:
- Currently: Data loss is acceptable (internal use)
- Future: One stateful application may require backup strategy

### GitOps Advantages for DR

- ✅ **Infrastructure as Code**: Entire cluster defined in Git
- ✅ **Reproducible**: Cluster can be rebuilt identically
- ✅ **Auditable**: All changes tracked in Git history
- ✅ **Rollback**: Easy to revert to previous Git commit
- ✅ **Multi-Cluster**: Same config can deploy to multiple clusters

---

## Best Practices

### Repository Organization

✅ **DO**:
- Separate infrastructure (`infra/`) from applications (`apps/`)
- Use sync waves to control deployment order
- Keep unsealed secrets in `private/` folder (Git-ignored)
- Commit only sealed secrets to Git
- Use multi-source pattern for chart/values separation

❌ **DON'T**:
- Commit plain secrets to Git
- Mix infrastructure and application configs
- Hard-code environment-specific values in charts
- Manually modify resources in cluster (use Git)

### GitOps Workflow

✅ **DO**:
- Make all changes through Git (single source of truth)
- Use PR reviews for production changes
- Test changes in isolated namespaces first
- Monitor ArgoCD sync status
- Respond to Slack notifications

❌ **DON'T**:
- Use `kubectl apply` directly (breaks GitOps)
- Ignore sync failures
- Bypass ArgoCD for "quick fixes"
- Edit resources in place (`kubectl edit`)

### Application Development

✅ **DO**:
- Follow the `forteapp` chart pattern
- Use semantic versioning for image tags
- Update helm-values via CI/CD
- Test locally with Docker Compose
- Document environment variables

❌ **DON'T**:
- Use `latest` image tag
- Hard-code configuration in code
- Skip local testing
- Deploy untested images to production

---

## Next Steps

📖 Continue to:
- **[Developer Guide](DEVELOPER-GUIDE.md)** - Learn how to deploy and manage applications
- **[Operations Runbook](OPERATIONS-RUNBOOK.md)** - Common operational tasks
- **[Technical Reference](REFERENCE.md)** - Detailed component documentation

---

**Last Updated**: 2026-03-16  
**Maintained By**: Platform Team  
**Questions?**: Contact #platform-support on Slack