# Technical Reference

## Table of Contents

- [Architecture Components](#architecture-components)
- [Repository Reference](#repository-reference)
- [Helm Chart Reference](#helm-chart-reference)
- [ArgoCD Configuration](#argocd-configuration)
- [Infrastructure Components](#infrastructure-components)
- [Kyverno Policies](#kyverno-policies)
- [Configuration Reference](#configuration-reference)
- [API Endpoints](#api-endpoints)
- [Cloud Overlay Pattern](#cloud-overlay-pattern)
- [Glossary](#glossary)

---

## Architecture Components

### Cluster Specifications

| Component | Value |
|-----------|-------|
| **Provider** | Multi-cloud (UpCloud, AWS EKS, Azure AKS, GCP GKE) |
| **Environment** | Dev + Production per cloud |
| **Active clusters** | UpCloud (upc-dev, upc-prod) |
| **Cloud-ready templates** | EKS, AKS, GKE (dev + prod each) |
| **GitOps Tool** | ArgoCD |
| **Ingress Controller** | Traefik v2 |
| **Certificate Management** | Cert-Manager + Let's Encrypt |
| **Policy Engine** | Kyverno |
| **Secret Management** | Sealed Secrets (Bitnami) |
| **Monitoring** | Prometheus + Grafana |
| **Logging** | Loki + Fluent-Bit |
| **Tracing** | Tempo (OTLP) |
| **Container Scanning** | Trivy |
| **Version Control** | Gitea |

### Network Architecture

```
Internet
   │
   ▼
[DNS: *.forteapps.net]
   │
   ▼
[Cloud Load Balancer]
   │
   ▼
[Traefik Ingress Controller]
   │
   ├──► IngressRoute (TLS termination via Cert-Manager)
   │
   ├──► Service (ClusterIP)
   │       │
   │       └──► Pod (Application Container)
   │
   └──► Service (Database - ClusterIP)
           │
           └──► StatefulSet (PostgreSQL)
```

---

## Repository Reference

### Config Repository: `launchpad`

**URL**: `https://git.forteapps.net/Forte/launchpad`

#### Directory Structure

```
launchpad/
├── bootstrap.sh                    # Cluster initialization (ArgoCD + GitOps)
├── _app-of-apps-{cluster}.yaml     # Root ArgoCD Application (per cluster)
│
├── .tofu/                          # Infrastructure provisioning (OpenTofu)
│   ├── platforms/                  # Per-platform IaC
│   │   ├── aks/                    # Azure: modules/cluster/, dev/, prod/, workload/
│   │   ├── 
eks/                    # AWS: same structure
│   │   ├── gke/                    # GCP
│   │   └── upc/                    # UpCloud
│   ├── configs/                    # Platform credentials (git-ignored)
│   └── scripts/                    # setup-cluster.sh, teardown-cluster.sh, get-kubeconfig.sh
│
├── clusters/                       # Cluster metadata YAML
│   ├── aks-dev.yaml
│   ├── upc-dev.yaml
│   └── ...
│
├── infra/                          # Infrastructure applications (Kustomize)
│   ├── base/                       # One subdirectory per component
│   │   ├── kustomization.yaml      # Aggregates all component subdirectories
│   │   ├── traefik-application/
│   │   │   ├── kustomization.yaml
│   │   │   └── traefik-application.yaml
│   │   ├── keycloak/
│   │   │   ├── kustomization.yaml
│   │   │   └── keycloak.yaml
│   │   ├── grafana/
│   │   ├── prometheus/
│   │   ├── loki/
│   │   ├── tempo/
│   │   ├── gitea/
│   │   ├── opencost/
│   │   ├── ...                     # Each component in own directory
│   │   └── secrets/
│   ├── overlays/                   # Per-cluster: include all or cherry-pick
│   │   ├── upc-dev/                # resources: [../../base] (all components)
│   │   ├── upc-prod/               # resources: [../../base] + patches
│   │   ├── aks-dev/                # resources: [../../base/grafana, ...] (selective)
│   │   └── .../                    # 8 clusters total
│   └── values/
│       ├── base/                   # Cloud-agnostic Helm values
│       │   ├── gitea-values.yaml
│       │   ├── opencost-values.yaml
│       │   ├── prometheus-values.yaml
│       │   └── ...
│       ├── upc-dev/                # UpCloud dev overlay values
│       │   ├── traefik-values.yaml
│       │   ├── keycloak-values.yaml
│       │   ├── grafana-values.yaml
│       │   ├── gitea-values.yaml
│       │   └── opencost-values.yaml
│       └── upc-prod/               # UpCloud prod overlay values
│           ├── traefik-values.yaml
│           ├── keycloak-values.yaml
│           ├── grafana-values.yaml
│           ├── gitea-values.yaml
│           └── opencost-values.yaml
│
├── apps/                           # Business applications (Kustomize)
│   ├── base/                       # One subdirectory per app
│   │   ├── kustomization.yaml
│   │   ├── musicman/
│   │   ├── mcp10x/
│   │   ├── dot-ai-stack/
│   │   ├── ts-mcp/
│   │   └── argo-mcp/
│   └── overlays/                   # Per-cluster: include all or cherry-pick
│       ├── upc-dev/
│       ├── upc-prod/
│       └── aks-dev/                # Selective apps only
│
├── cluster-resources/              # Cluster-level resources
│   ├── cert-manager-namespace.yaml
│   ├── secrets-namespace.yaml
│   ├── letsencrypt-issuer.yaml
│   ├── kyverno-config.yaml
│   ├── argocd-notifications-secret-sealed.yaml
│   ├── forte10x-repo-credentials-sealed.yaml
│   ├── mcp10x-repo-credentials-sealed.yaml
│   └── policies/
│       ├── deployment-verifier.yaml
│       ├── label-checker.yaml
│       ├── bare-pod-cleaner.yaml
│       ├── replicaset-cleaner.yaml
│       ├── default-ns-blocker.yaml
│       ├── secret-cloner.yaml
│       ├── keycloak-client-cloner.yaml
│       └── auth-sidecar-injector.yaml
│
├── secrets/                        # Application secrets (sealed)
│   ├── base/                       # All SealedSecrets (shared across clouds)
│   │   ├── kustomization.yaml
│   │   ├── argocd-forte-helm-secret-sealed.yaml
│   │   ├── argocd-mcp-credentials.yaml
│   │   ├── argocdmcp-auth-oidc-sealed.yaml
│   │   ├── dot-ai-secrets.yaml
│   │   ├── forte10x-app-credentials-sealed.yaml
│   │   ├── gitea-backup-s3-sealed.yaml
│   │   ├── gitea-credentials-sealed.yaml
│   │   ├── gitea-runner-token-sealed.yaml
│   │   ├── gitea-smtp-secret-sealed.yaml
│   │   ├── keycloak-credentials-sealed.yaml
│   │   ├── musicman-auth-oidc-sealed.yaml
│   │   ├── musicman-credentials.yaml
│   │   └── renovate-env-sealed.yaml
│   └── overlays/                   # Per-cloud overlays (reference base)
│       ├── aks-dev/kustomization.yaml
│       ├── aks-prod/kustomization.yaml
│       ├── 
eks-dev/kustomization.yaml
│       ├── eks-prod/kustomization.yaml
│       ├── gke-dev/kustomization.yaml
│       ├── gke-prod/kustomization.yaml
│       ├── upc-dev/kustomization.yaml
│       └── upc-prod/kustomization.yaml
│
├── scripts/                        # Operational helper scripts
│   ├── gitea-backup.sh             # S3 backup helper (list/download)
│   ├── gitea-restore.sh
│   └── backup/                     # Per-cloud backup reference scripts
│       ├── s3-minio.sh             # S3-compatible (UpCloud, MinIO, Wasabi)
│       ├── aws-s3.sh               # Native AWS S3
│       ├── azure-blob.sh           # Azure Blob Storage
│       └── gcp-gcs.sh              # GCP Cloud Storage
│
├── private/                        # Local-only (Git-ignored)
│   ├── *.yaml
│   └── *.sh
│
└── docs/                           # Documentation
    ├── GITOPS-ARCHITECTURE.md
    ├── DEVELOPER-GUIDE.md
    ├── OPERATIONS-RUNBOOK.md
    └── REFERENCE.md
```

#### Key Files

**`bootstrap.sh`**

```bash
#!/bin/zsh
# Initializes cluster with ArgoCD

ArgoCd() {
  helm upgrade --install argocd argo-cd \
    --repo https://argoproj.github.io/argo-helm \
    --namespace argocd --create-namespace \
    --values infra/values/base/argocd-values.yaml \
    --set notifications.context.clusterName="$CLUSTER_NAME" \
    --timeout 60s --atomic

  kubectl apply -f _app-of-apps-upc-dev.yaml -n argocd  # or _app-of-apps-upc-prod.yaml
}
```

**`_app-of-apps-upc-dev.yaml`** / **`_app-of-apps-upc-prod.yaml`**

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: infrastructure-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://git@git.forteapps.net:2222/Forte/launchpad.git
    path: infra
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

---

### Helm Charts Repository: `forte-helm`

**URL**: `https://git.forteapps.net/Forte/forte-helm`

#### Chart: `forteapp`

**Version**: 0.1.0
**App Version**: 1.0.0
**Type**: application

##### Templates

| Template | Purpose |
|----------|---------|
| `_helpers.tpl` | Template helper functions |
| `namespace.yaml` | Namespace resource |
| `deployment.yaml` | Main application Deployment |
| `service.yaml` | 
ClusterIP Service |
| `ingressroute.yaml` | Traefik IngressRoute |
| `certificate.yaml` | Cert-Manager Certificate |
| `configmap.yaml` | Application ConfigMap |
| `secret-auth-tokens.yaml` | Authentication tokens |
| `hpa.yaml` | Horizontal Pod Autoscaler |
| `database-statefulset.yaml` | Optional PostgreSQL StatefulSet |
| `database-service.yaml` | PostgreSQL Service |

##### Default Values Schema

```yaml
app:
  image:
    repository: ""          # Required
    tag: ""                 # Required
    pullPolicy: IfNotPresent
    containerPort: 3000
  replicaCount: 1
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
  hpa:
    enabled: false
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
  extraEnv: []
    # - name: KEY
    #   value: "value"
  envSecretName: ""         # Reference to Secret
  nodeEnv: production

db:
  enabled: false
  name: postgres
  image:
    repository: postgres
    tag: "16-alpine"
  service:
    type: ClusterIP
    port: 5432
    targetPort: 5432
  persistence:
    enabled: true
    storageClass: ""
    accessMode: ReadWriteOnce
    size: 5Gi
  resources:
    requests:
      memory: "256Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
  extraEnv: []
  envSecretName: ""
  livenessProbe:
    exec:
      command:
        - pg_isready
        - -U
        - db_user
        - -d
        - db_name
    initialDelaySeconds: 30
    periodSeconds: 10
  readinessProbe:
    exec:
      command:
        - pg_isready
        - -U
        - db_user
        - -d
        - db_name
    initialDelaySeconds: 5
    periodSeconds: 5

service:
  type: ClusterIP
  port: 3000

ingress:
  enabled: false
  host: ""
  entrypoint: websecure
  tls:
    enabled: true
    secretName: ""
    clusterIssuer: letsencrypt-prod

auth:
  enabled: false            # Enable authentication sidecar injection
  type: token               # Authentication mode: "token" or "oidc"

  # Token-based authentication configuration
  tokens: []                # List of valid bearer tokens (hex strings, 32+ bytes recommended)
    # - d4f88f6d9292c10cc3e21c4aad56d2be485db532b54fe961d738e1137d247823
    # - 8803f621acc3898df1d7a8f514bc3602551a0681a8f747bd4e43c3c5849d57a7

  # OIDC authentication configuration
  oidc:
    authority: ""           # OIDC provider URL (e.g., 
                            # https://auth.example.com/realms/master)
    clientId: ""            # OIDC client ID registered with provider
    scopes: "openid,profile,email"  # OAuth scopes (comma-separated)
    callbackPath: /auth/callback    # OAuth callback path (default: /auth/callback)
    # Note: Client secret must be in 'auth-oidc' Secret (client-secret key)
    #       Cookie secret must be in 'auth-oidc' Secret (cookie-secret key)

configmap: []               # Application ConfigMap key-value pairs
  # KEY: value
  # DB_HOST: postgres
  # DB_PORT: "5432"
```

---

### Helm Values Repository: `helm-prod-values`

**URL**: `https://git.forteapps.net/Forte/helm-prod-values.git`

#### Structure

```
helm-prod-values/
├── mcp10x/
│   └── values.yaml
├── musicman/
│   └── values.yaml
└── argocd-mcp/
    └── values.yaml
```

#### Example: `mcp10x/values.yaml`

```yaml
app:
  image:
    repository: ghcr.io/fortedigital/10x
    tag: 2.0.4  # Updated by CI/CD
  extraEnv:
    - name: PORT
      value: "3000"
    - name: SKILLS_DIR
      value: "/app/skills"
    - name: FLOWCASE_ENDPOINT
      value: "https://forte.cvpartner.com/api/"
  envSecretName: "app-credentials"

auth:
  enabled: false
  tokens:
    - d4f88f6d9292c10cc3e21c4aad56d2be485db532b54fe961d738e1137d247823

ingress:
  enabled: true
  host: mcp10x.forteapps.net
```

---

## Helm Chart Reference

### Template Functions

#### `forteapp.fullname`

```yaml
{{ include "forteapp.fullname" . }}
# Output:
```

#### `forteapp.labels`

```yaml
{{ include "forteapp.labels" . }}
# Output:
# app.kubernetes.io/name: forteapp
# app.kubernetes.io/instance:
# app.kubernetes.io/version:
# app.kubernetes.io/managed-by: Helm
```

#### `forteapp.selectorLabels`

```yaml
{{ include "forteapp.selectorLabels" . }}
# Output:
# app.kubernetes.io/name: forteapp
# app.kubernetes.io/instance:
```

### Deployment Specification

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "forteapp.fullname" . }}
  labels:
    {{- include "forteapp.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.app.replicaCount }}
  selector:
    matchLabels:
      {{- include "forteapp.selectorLabels" . 
        | nindent 6 }}
  template:
    metadata:
      annotations:
        policies.forteapps.io/auth: {{ .Values.auth.enabled | quote }}
      labels:
        {{- include "forteapp.selectorLabels" . | nindent 8 }}
    spec:
      containers:
      - name: app
        image: "{{ .Values.app.image.repository }}:{{ .Values.app.image.tag }}"
        imagePullPolicy: {{ .Values.app.image.pullPolicy }}
        ports:
        - name: http
          containerPort: {{ .Values.app.image.containerPort }}
        env:
        - name: NODE_ENV
          value: {{ .Values.app.nodeEnv | quote }}
        {{- with .Values.app.extraEnv }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
        {{- if .Values.app.envSecretName }}
        envFrom:
        - secretRef:
            name: {{ .Values.app.envSecretName }}
        {{- end }}
        resources:
          {{- toYaml .Values.app.resources | nindent 10 }}
        securityContext:
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
```

### IngressRoute Specification

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: {{ include "forteapp.fullname" . }}
spec:
  entryPoints:
    - {{ .Values.ingress.entrypoint }}
  routes:
    - match: Host(`{{ .Values.ingress.host }}`)
      kind: Rule
      services:
        - name: {{ include "forteapp.fullname" . }}
          port: {{ .Values.service.port }}
  {{- if .Values.ingress.tls.enabled }}
  tls:
    secretName: {{ default .Release.Name .Values.ingress.tls.secretName }}-tls
  {{- end }}
```

### Certificate Specification

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: {{ include "forteapp.fullname" . 
    }}-tls
spec:
  secretName: {{ default .Release.Name .Values.ingress.tls.secretName }}-tls
  issuerRef:
    name: {{ .Values.ingress.tls.clusterIssuer }}
    kind: ClusterIssuer
  dnsNames:
    - {{ .Values.ingress.host }}
```

---

## ArgoCD Configuration

### Application Manifest Schema

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name:
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: ""
    notifications.argoproj.io/subscribe.on-sync-failed.slack: ""
    notifications.argoproj.io/subscribe.on-degraded.slack: ""
  labels:
    app.kubernetes.io/name:
    app.kubernetes.io/part-of: apps
    app.kubernetes.io/managed-by: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default

  # Multi-source configuration
  sources:
    - repoURL: https://git.forteapps.net/Forte/forte-helm
      path: forteapp
      targetRevision: HEAD
      helm:
        valueFiles:
          - $values//values.yaml
    - repoURL: git@github.com:fortedigital/helm-prod-values.git
      targetRevision: HEAD
      ref: values

  destination:
    server: https://kubernetes.default.svc
    namespace:

  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - Validate=true
      - ServerSideApply=true
      - Replace=false
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
```

### Sync Waves

| Wave | Components | Purpose |
|------|------------|---------|
| `-1` | Namespaces | Create namespaces first |
| `0` | Kyverno | Install policy engine |
| `1` | Cluster resources, infrastructure | Base infrastructure |
| `2+` | Applications | Business applications |

### Sync Options

| Option | Description |
|--------|-------------|
| `CreateNamespace=true` | Automatically create target namespace |
| `Validate=true` | Validate resources before applying |
| `ServerSideApply=true` | Use server-side apply (safer) |
| `Replace=false` | Don't use kubectl replace |
| `Prune=true` | 
Delete resources not in Git |

### Retry Policy

```yaml
retry:
  limit: 5            # Max retry attempts
  backoff:
    duration: 5s      # Initial backoff
    factor: 2         # Exponential factor
    maxDuration: 3m   # Max backoff time
```

**Retry Schedule** (each delay is the previous one multiplied by `factor`):

1. 5 seconds
2. 10 seconds
3. 20 seconds
4. 40 seconds
5. 80 seconds (still below the 3-minute cap; the cap would only apply to later attempts)

### Global Settings (`argocd-cm`)

| Setting | Value | Purpose |
|---------|-------|---------|
| `application.resourceTrackingMethod` | `annotation` | Track resources via annotations |
| `timeout.reconciliation` | `60s` | Reconciliation interval |
| `admin.enabled` | `false` | Admin login disabled (SSO-only) |
| `url` | `https://argocd.forteapps.net` | External URL for ArgoCD UI |

**Git Submodule Disable**: Set via `configs.params`, NOT `repoServer.env` (the latter causes strategic merge conflicts with the chart's `valueFrom` entries):

```yaml
configs:
  params:
    "reposerver.enable.git.submodule": "false"
```

This writes to the `argocd-cmd-params-cm` ConfigMap, which the chart already reads via `valueFrom`. Submodules (e.g., `shared-prompts`) are not needed for K8s manifest generation.

**Break-Glass Admin Access**: Admin login is disabled (`admin.enabled: false`). The admin password remains in `argocd-secret`. To re-enable temporarily:

```bash
# Enable admin login
kubectl patch cm argocd-cm -n argocd -p '{"data":{"admin.enabled":"true"}}'

# Log in as admin, do what's needed, then disable again
kubectl patch cm argocd-cm -n argocd -p '{"data":{"admin.enabled":"false"}}'
```

ArgoCD picks up ConfigMap changes within the reconciliation timeout (60s). Note: ArgoCD will revert this patch on the next sync. This is intentional, so the access stays temporary.
**OIDC Authentication** (Keycloak):

```yaml
configs:
  cm:
    oidc.config: |
      name: Forte SSO
      issuer: https://id.forteapps.net/realms/forte
      clientID: argocd
      clientSecret: $oidc.clientSecret
      requestedScopes: ["openid", "email", "profile"]
  rbacConfig:
    policy.csv: |
      g, ArgoCD Admins, role:admin
      g, ArgoCD Viewers, role:readonly
    # Deny users not in any declared KC group
    policy.default: ""
    scopes: '[groups]'
```

**Access Control**: Only users in Keycloak groups `ArgoCD Admins` or `ArgoCD Viewers` can access ArgoCD. Users not in either group are denied (empty `policy.default`). Assign users to groups in the Keycloak admin console.

- ArgoCD does NOT add `openid` implicitly, so it must be included in `requestedScopes`
- Do NOT add `groups` as a scope: the KC groups mapper emits the claim regardless
- `$oidc.clientSecret` references the `oidc.clientSecret` key in `argocd-secret`
- OIDC secret is synced by CronJob `argocd-oidc-sync` (see `cluster-resources/argocd-oidc-secret-sync.yaml`)
- The CronJob bridges `argocd-oidc-credentials` (from KC registrar) → `argocd-secret` every 2 min
- Safe for fresh deploys: no-ops if source secret doesn't exist yet

**Ingress** (Traefik + TLS):

```yaml
server:
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    tls: true
  extraArgs:
    - --insecure
configs:
  params:
    "server.insecure": true
```

TLS terminates at Traefik; ArgoCD runs in insecure mode behind the proxy.

---

## Infrastructure Components

### Homepage (Platform Dashboard)

**Chart**: `jameswynn/homepage`
**Namespace**: `homepage`
**URL**: `https://start.forteapps.net`

Platform dashboard that auto-discovers deployed apps via Kubernetes service annotations.

**Discovery mechanism**: Services annotated with `gethomepage.dev/enabled: "true"` appear in the dashboard. Apps not deployed = annotations absent = not shown. Fully dynamic per environment.
**Annotated services**:

| Service | Namespace | Group | Widget |
|---------|-----------|-------|--------|
| `gitea-http` | `gitea` | DevOps | `gitea` |
| `argocd-server` | `argocd` | DevOps | `argocd` |
| `keycloak` | `keycloak` | Identity | none |
| `grafana` | `monitoring` | Monitoring | `grafana` |
| `karpor-server` | `karpor` | DevOps | none |

**Adding a new app**: Annotate the app's Service in its Helm values:

```yaml
service:
  annotations:
    gethomepage.dev/enabled: "true"
    gethomepage.dev/name: "My App"
    gethomepage.dev/description: "What it does"
    gethomepage.dev/group: "GroupName"
    gethomepage.dev/icon: "icon-name"  # https://github.com/walkxcode/dashboard-icons
    gethomepage.dev/href: "https://myapp.forteapps.net"
    # Optional live widget:
    gethomepage.dev/widget.type: "myapp"
    gethomepage.dev/widget.url: "https://myapp.forteapps.net"
    # gethomepage.dev/widget.key: "{{HOMEPAGE_VAR_MYAPP_TOKEN}}"
```

**Widget API credentials**: Inject via env vars into the Homepage pod:

```yaml
# In homepage-values.yaml per environment
env:
  - name: HOMEPAGE_VAR_GRAFANA_TOKEN
    valueFrom:
      secretKeyRef:
        name: homepage-widget-credentials
        key: grafana-token
```

Then reference as `gethomepage.dev/widget.key: "{{HOMEPAGE_VAR_GRAFANA_TOKEN}}"`.
**Values files**:

- `infra/values/base/homepage-values.yaml`: RBAC, kubernetes mode, layout
- `infra/values/{env}/homepage-values.yaml`: hostname per environment

---

### Traefik

**Chart**: `traefik/traefik`
**Version**: Latest
**Namespace**: `traefik`

**Configuration**:

```yaml
# infra/base/traefik-application.yaml
replicas: 2
service:
  type: LoadBalancer
ingressRoute:
  dashboard:
    enabled: false
ports:
  web:
    redirectTo: websecure  # HTTP → HTTPS redirect
  websecure:
    tls:
      enabled: true
```

**Endpoints**:

- HTTP: `:80` → Redirects to HTTPS
- HTTPS: `:443`

### Cert-Manager

**Chart**: `jetstack/cert-manager`
**Namespace**: `cert-manager`

**ClusterIssuer**:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@forteapps.net
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: traefik
```

### Kyverno

**Chart**: `kyverno/kyverno`
**Namespace**: `kyverno`

**Policies**:

- Secret cloner
- Default namespace blocker
- Bare pod cleaner
- ReplicaSet cleaner
- Deployment verifier
- Auth sidecar injector

### Sealed Secrets

**Chart**: `sealed-secrets/sealed-secrets-controller`
**Namespace**: `kube-system`

**Directory Structure**: `secrets/base/` contains all SealedSecrets with a `kustomization.yaml`. Per-cloud overlays in `secrets/overlays//` reference the base via Kustomize. The ArgoCD `secrets` Application points to the active overlay (e.g., `secrets/overlays/upc-dev`), and `infra/overlays/upc-prod` patches the path to `secrets/overlays/upc-prod`. To add cloud-specific secrets, create a new SealedSecret in the overlay directory and add it to the overlay's `kustomization.yaml`.
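As a minimal sketch of the overlay pattern described above, an overlay `kustomization.yaml` that pulls in the shared base and adds one cloud-specific SealedSecret might look like the following. The file name `cloud-storage-credentials-sealed.yaml` is hypothetical and used only for illustration; the actual overlay files in the repository may differ.

```yaml
# secrets/overlays/upc-prod/kustomization.yaml (illustrative sketch, not the actual file)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                               # all shared SealedSecrets
  - cloud-storage-credentials-sealed.yaml    # hypothetical cloud-specific SealedSecret
```

Because the overlay lists the base as a resource, every shared SealedSecret is still rendered; the extra entry only adds to that set.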
**Public Certificate**:

```bash
kubeseal --fetch-cert \
  --controller-name=sealed-secrets-controller \
  --controller-namespace=kube-system \
  > pub-cert.pem
```

### Prometheus

**Chart**: `prometheus-community/prometheus`
**Namespace**: `monitoring`

**Configuration**:

```yaml
server:
  persistentVolume:
    enabled: true
    size: 10Gi
alertmanager:
  enabled: false
nodeExporter:
  enabled: true
kubeStateMetrics:
  enabled: true
```

### Grafana

**Chart**: `grafana/grafana`
**Namespace**: `monitoring`

**Datasources**:

- Prometheus
- Loki
- Tempo

**Ingress**: Exposed via Traefik at `https://grafana.forteapps.net` with cert-manager TLS.

**OIDC Authentication** (Keycloak):

- Uses `grafana.ini.auth.generic_oauth` with KC `grafana` client
- Secret `grafana-oidc-credentials` synced by KC registrar, loaded via `envFromSecrets`
- SSO-only mode: `auth.disable_login_form: true` + `auth.generic_oauth.auto_login: true`
- Role mapping via JMESPath on `resource_access.grafana.roles` claim (requires KC client role mapper)
- Roles: KC client roles `Admin`/`Editor` map to Grafana roles; default is `Viewer`

### Loki

**Chart**: `grafana/loki-stack`
**Namespace**: `monitoring`

**Configuration**:

```yaml
loki:
  persistence:
    enabled: true
    size: 10Gi
promtail:
  enabled: false  # Using Fluent-Bit instead
```

### Tempo

**Chart**: `grafana/tempo`
**Version**: 1.24.4
**Namespace**: `monitoring`

**Purpose**: Distributed tracing backend receiving OTLP traces from Traefik and other instrumented services.
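For an application deployed with the `forteapp` chart, one way to send traces to Tempo is to set the standard OpenTelemetry SDK environment variables through `app.extraEnv`. This is only a sketch under assumptions: the in-cluster Service DNS name `tempo.monitoring.svc.cluster.local` and the service name `my-app` are not taken from this document, and the application must already include an OpenTelemetry SDK that honours these variables.

```yaml
# forteapp values fragment (illustrative; values marked "assumed" are not from this document)
app:
  extraEnv:
    - name: OTEL_SERVICE_NAME
      value: "my-app"                                          # assumed service name
    - name: OTEL_EXPORTER_OTLP_PROTOCOL
      value: "http/protobuf"                                   # OTLP over HTTP
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://tempo.monitoring.svc.cluster.local:4318"  # assumed Service DNS name, OTLP/HTTP port
```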
**Configuration**:

```yaml
tempo:
  storage:
    trace:
      backend: local
      local:
        path: /var/tempo/traces
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"
persistence:
  enabled: true
  size: 10Gi
```

**Endpoints**:

- gRPC OTLP receiver: `:4317`
- HTTP OTLP receiver: `:4318`
- Query API: `:3200`

**Grafana Integration**:

- Trace-to-logs correlation with Loki (by namespace, pod, container)
- Trace-to-metrics correlation with Prometheus (by service name)
- Service graph and node graph visualization

### Fluent-Bit

**Chart**: `fluent/fluent-bit`
**Namespace**: `monitoring`
**Output**: Loki

### Gitea

**Chart**: `gitea/gitea`
**Version**: 12.5.0 (app v1.25.4)
**Namespace**: `gitea`

**Purpose**: Self-hosted Git repository hosting with pull requests, issues, CI/CD (Gitea Actions), container registry, and package registry.

**Configuration**:

```yaml
# infra/base/gitea.yaml + infra/values/base/gitea-values.yaml
ingress:
  host: git.forteapps.net
  tls: cert-manager (letsencrypt-prod)
gitea:
  admin:
    existingSecret: gitea-credentials
  config:
    service:
      DISABLE_REGISTRATION: true
      ALLOW_ONLY_EXTERNAL_REGISTRATION: true
    actions:
      ENABLED: true
    packages:
      ENABLED: true
    metrics:
      ENABLED: true
postgresql:
  enabled: true
  persistence: 8Gi (upcloud-block-storage-maxiops)
```

**Authentication**: Keycloak OIDC via `forte` realm (client ID: `gitea`). Protocol mapper: `email_verified` hardcoded claim (`true`, boolean) on ID token, Access token, and Userinfo.

**External User Sync**: Disabled (`cron.sync_external_users.ENABLED: false`). This Gitea cron job is designed for LDAP and deactivates OIDC-only users because it cannot enumerate them, which causes "Sign-in prohibited" errors after the sync runs.

**Email Notifications**: Enabled (`ENABLE_NOTIFY_MAIL: true`). SMTP credentials injected via `gitea-smtp-secret` using `additionalConfigFromEnvs` with `GITEA__mailer__USER` / `GITEA__mailer__PASSWD` environment variables.
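A values fragment for the SMTP injection described above could look roughly like this. It is a sketch, assuming the `gitea-smtp-secret` Secret stores its values under the keys `username` and `password`; the real key names are not stated in this document.

```yaml
# gitea-values.yaml fragment (illustrative sketch)
gitea:
  additionalConfigFromEnvs:
    - name: GITEA__mailer__USER
      valueFrom:
        secretKeyRef:
          name: gitea-smtp-secret
          key: username        # assumed key name
    - name: GITEA__mailer__PASSWD
      valueFrom:
        secretKeyRef:
          name: gitea-smtp-secret
          key: password        # assumed key name
```

Keeping the credentials in `secretKeyRef` entries means they never appear in the values files committed to Git.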
**Auto-Watch**: Disabled (`AUTO_WATCH_ON_CHANGES: false`, `AUTO_WATCH_NEW_REPOS: false`). Prevents contributors from being auto-subscribed to repo notifications on push, reducing email noise from CI bots (e.g., ai-review PR comments). Users who were already watching before this change need to manually unwatch or switch to "Only participating".

**Endpoints**:

- Web UI: `https://git.forteapps.net`
- SSH: port 22 (ClusterIP)
- Metrics: `/metrics` (Prometheus scrape)

**Secrets**:

- `gitea-credentials` (SealedSecret): admin password
- `gitea-oidc-credentials` (registrar-managed): OIDC client ID + secret
- `gitea-smtp-secret` (SealedSecret): SMTP username + password

### Gitea Actions Runners

**Chart**: `actions` (from `https://dl.gitea.com/charts`)
**Namespace**: `gitea`
**Sync Wave**: 2 (deploys after Gitea)

**Purpose**: Act runners execute Gitea Actions CI/CD workflows. Deployed as a StatefulSet with a Docker-in-Docker sidecar for container-based job execution.

**Configuration**:

```yaml
# infra/base/gitea-actions.yaml + infra/values/base/gitea-actions-values.yaml
replicaCount: 3
runner:
  labels:
    - "ubuntu-latest:docker://node:20-bookworm"
    - "ubuntu-22.04:docker://node:20-bookworm"
  existingSecret: gitea-runner-token
gitea:
  instance:
    url: http://gitea-http.gitea.svc.cluster.local:3000
dind:
  enabled: true  # Docker-in-Docker sidecar (privileged)
```

**Resources**:

| Container | CPU Request | Memory Request | CPU Limit | Memory Limit |
|-----------|-------------|----------------|-----------|--------------|
| Runner | 250m | 256Mi | 1 | 1Gi |
| DinD sidecar | 250m | 256Mi | 1 | 1Gi |

**Secrets**: `gitea-runner-token` (SealedSecret) containing `token` (instance-level runner registration token from `/admin/runners`)

**Setup Steps**:

1. Get runner registration token from Gitea admin panel (`/admin/runners`)
2. Fill in `private/gitea-runner-token.yaml` with the token
3. 
Seal: `kubeseal --format yaml < private/gitea-runner-token.yaml > secrets/base/gitea-runner-token-sealed.yaml`
4. Commit and push; ArgoCD then deploys the runners automatically

**Verification**:

- `kubectl get statefulset -n gitea`: 3/3 runners ready
- Gitea admin panel (`/admin/runners`): runners show as Online
- Create test workflow in `.gitea/workflows/test.yml`: job executes

### Vaultwarden

**Chart**: `guerzon/vaultwarden`
**Version**: 0.36.4 (app v1.36.0-alpine)
**Namespace**: `vaultwarden`

**Purpose**: Self-hosted Bitwarden-compatible password manager.

**Configuration**:

```yaml
# infra/overlays/upc-dev/vaultwarden/ + infra/values/
domain: "https://bitwarden.forteapps.net"
ingress:
  enabled: true
  class: "traefik"
  tls: true
  tlsSecret: vaultwarden-tls
  hostname: bitwarden.forteapps.net
  additionalAnnotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
database:
  type: postgresql
  host: vaultwarden-postgresql  # StatefulSet in overlay
  existingSecret: prod-db-creds
storage:
  data: 5Gi (ReadWriteOnce)
  attachments: 5Gi (ReadWriteOnce)
```

**TLS**: cert-manager auto-provisions a Let's Encrypt certificate via the `letsencrypt-prod` ClusterIssuer (same pattern as Gitea, Grafana, etc.).

**SSO**: Keycloak OIDC via `forte` realm (client ID: `vaultwarden`). Self-service client config Secret (`keycloak-client-vaultwarden`) triggers the registrar to create the KC client and sync credentials to `vaultwarden-oidc-credentials`. PKCE enabled.

**Endpoints**:

- Web UI: `https://bitwarden.forteapps.net`

**Database**: Separate ArgoCD Application `vaultwarden-postgresql` (sync-wave `"0"`) deploys a PostgreSQL 16 StatefulSet + SealedSecret before Vaultwarden (wave `"1"`). 2Gi PVC. The chart does NOT include a PostgreSQL subchart, so the database must be provisioned separately.
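The ordering described above relies on the `argocd.argoproj.io/sync-wave` annotation. A minimal sketch of the two Application headers follows; the `spec` sections are omitted, and the Application name `vaultwarden` is an assumption (only `vaultwarden-postgresql` is named in this document).

```yaml
# Database Application: lower wave, so it is synced first
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vaultwarden-postgresql
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "0"
# spec omitted in this sketch
---
# Vaultwarden Application: synced after wave "0" is healthy
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vaultwarden            # assumed Application name
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"
# spec omitted in this sketch
```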
**Secrets**:

- `prod-db-creds` (SealedSecret): PostgreSQL credentials (`pgusername`, `pgpassword`) + SMTP credentials
- `vaultwarden-oidc-credentials` (registrar-managed): OIDC client ID + secret
- `vaultwarden-tls`: auto-managed by cert-manager

### AI Code Review (ai-review)

**Type**: Gitea Actions workflow (`.gitea/workflows/ai-review.yaml`)
**Trigger**: `pull_request` events (`opened`, `synchronize`)
**Runner**: `ubuntu-latest` (container: `nikitafilonov/ai-review:latest`)

**Purpose**: Automated AI-powered code review on pull requests using Claude (Anthropic). Posts inline comments on changed lines and a PR summary comment highlighting infrastructure impact.

**Architecture**:

- Uses [xai-review](https://github.com/nicktechnologies/xai-review) Docker image
- Shared configuration and prompts live in the `shared-prompts` Git submodule (→ `Forte/ai-review-prompts`)
- Review mode: `ONLY_ADDED_WITH_CONTEXT`, which reviews only new/changed lines plus surrounding context (token-efficient)
- Agent mode: disabled (one-shot review, no multi-turn reasoning)
- LLM: Claude Sonnet (`claude-sonnet-4-20250514`)

**Shared Prompts Structure** (submodule: `Forte/ai-review-prompts`):

```
shared-prompts/
  base/
    security.md          # org-wide security rules (all profiles)
  iac/
    .ai-review.yaml      # IaC/GitOps profile config
    inline.md            # inline review prompt
    summary.md           # PR summary prompt
  # future profiles: backend/, frontend/, etc.
```

**Configuration** (`shared-prompts/iac/.ai-review.yaml`):

```yaml
llm:
  provider: CLAUDE
  model: claude-sonnet-4-20250514
vcs:
  provider: GITEA
review:
  mode: ONLY_ADDED_WITH_CONTEXT
agent:
  enabled: false
prompt:
  inline_prompt_files:  # concatenated in order
    - ./shared-prompts/base/security.md
    - ./shared-prompts/iac/inline.md
  summary_prompt_files:
    - ./shared-prompts/iac/summary.md
ignore:
  - "*.sealed.yaml"
  - "*.lock"
  - "docs/**"
```

**Custom Prompts** (IaC profile):

- `shared-prompts/base/security.md`: org-wide security rules, concatenated before every inline review prompt
- `shared-prompts/iac/inline.md`: IaC-specific inline review (YAML, Helm, K8s manifests, shell scripts), max 7 comments
- `shared-prompts/iac/summary.md`: PR summary covering affected services/namespaces, infrastructure impact, security flags

**Prompt composition**: ai-review does not support Jinja includes. Instead, list multiple files under `inline_prompt_files` / `summary_prompt_files`; they are concatenated in order, separated by double newlines.

**Adding a new profile**: Create a new directory (e.g., `backend/`) with its own `.ai-review.yaml`, `inline.md`, and `summary.md`. The `inline_prompt_files` list should include `base/security.md` first, then the profile-specific prompt. Reference it in the consuming repo's workflow: `AI_REVIEW_CONFIG_FILE_YAML=./shared-prompts/backend/.ai-review.yaml`

**Required Secrets** (configure in Gitea repo or org settings):

| Secret | Purpose |
|--------|---------|
| `ANTHROPIC_API_KEY` | Claude API key (from Anthropic console) |
| `AI_REVIEW_TOKEN` | Gitea API token with `write:repository` + `read:repository` scopes (use a bot/service account) |

**Setup Steps**:

1. Create a Gitea bot/service account and generate an API token with `write:repository` + `read:repository` scopes
2. Add `AI_REVIEW_TOKEN` secret in Gitea repo settings → Actions → Secrets
3. Add `ANTHROPIC_API_KEY` secret with your Anthropic API key
4. 
Ensure the `shared-prompts` submodule is initialized (`git submodule update --init`)
5. Push the workflow file; it triggers automatically on PR creation/update

**Verification**:

- Open a PR with infrastructure changes → workflow runs → inline comments + summary appear
- Check Gitea Actions tab for workflow run status and logs
- Monitor Anthropic usage dashboard for token consumption

### Keycloak Browser Flow (IdP Auto-Redirect)

**File**: `infra/values/base/keycloak-values.yaml` (inside `forte-realm.json`)

The realm uses a custom browser authentication flow (`browser-auto-idp`) that skips the Keycloak login page and redirects directly to the Entra ID identity provider.

**Flow executions**:

| Priority | Authenticator | Requirement | Purpose |
|----------|--------------|-------------|---------|
| 10 | `auth-cookie` | ALTERNATIVE | Reuse existing session (no redirect) |
| 20 | `identity-provider-redirector` | ALTERNATIVE | Auto-redirect to `forte-entra` IdP |

**Key fields in realm JSON**:

- `"browserFlow": "browser-auto-idp"`: overrides the default `browser` flow at realm level
- `"authenticationFlows"`: defines the custom flow with its executions
- `"authenticatorConfig"`: sets `defaultProvider: "forte-entra"` on the redirector

**Why custom flow**: The default KC browser flow shows a username/password form with an IdP button. Since all authentication is via Entra ID, the custom flow eliminates this step. The `auth-cookie` execution preserves session reuse so returning users aren't redirected again.

**Important**: The `forte-entra` identity provider must exist in Keycloak (currently configured manually in the KC admin console). If the IdP alias changes, update the `defaultProvider` value in the realm JSON.

---

### Keycloak Client Registrar

**Type**: CronJob (deployed via Keycloak Helm chart `extraDeploy`)
**Namespace**: `keycloak`
**Schedule**: `*/2 * * * *` (every 2 minutes)

**Purpose**: Handles two responsibilities:

1. 
**Legacy sync** — extracts secrets from Keycloak clients with `k8s.secret.sync: "true"` attribute (same as former PostSync syncer) 2. **Self-service registration** — processes config Secrets (cloned by Kyverno) to register new OIDC clients and sync their credentials **How It Works**: *Legacy path (existing clients like Gitea):* 1. Authenticates to Keycloak Admin API using admin credentials from `keycloak-credentials` secret 2. Queries all clients in the `forte` realm 3. Filters clients with `k8s.secret.sync: "true"` attribute 4. For each matching client, retrieves the auto-generated secret via Keycloak Admin API 5. Creates/updates a K8s Secret in the target namespace (from `k8s.secret.namespace` attribute) 6. Always writes a central copy to the `secrets` namespace *Self-service path (new clients):* 1. Lists Secrets in `keycloak` namespace with label `keycloak.forteapps.net/client-config=true` 2. For each config Secret, parses `client.json` and computes a config hash 3. Skips if hash matches annotation and credential Secret already exists 4. Creates or updates the Keycloak client via Admin API 5. Fetches the generated client secret 6. Upserts credential Secret in target namespace + central `secrets` namespace 7. 
Annotates config Secret with sync status, config hash, and timestamp **Resources**: - `ServiceAccount`: `keycloak-client-registrar` (namespace: `keycloak`) - `ClusterRole`: `keycloak-client-registrar` - Secrets: `get`, `list`, `create`, `update`, `patch` - Namespaces: `get`, `list` - `ClusterRoleBinding`: `keycloak-client-registrar` - `CronJob`: `keycloak-client-registrar` - **Schedule**: `*/2 * * * *` (every 2 minutes) - **Concurrency Policy**: `Forbid` (prevents concurrent runs) - **Backoff Limit**: 3 retries per job - **History**: 1 successful job, 3 failed jobs retained - **Resources**: 50m CPU / 64Mi memory (requests), 200m CPU / 128Mi memory (limits) **Container**: Alpine 3.20 with `curl` and `jq` installed **Kyverno Policy**: `keycloak-client-config-cloner` — clones labeled Secrets from app namespaces to `keycloak` namespace (see [Kyverno Policies](#kyverno-policies)) **Legacy Client Attributes** (set in `forte-realm.json`): | Attribute | Required | Default | Description | |-----------|----------|---------|-------------| | `k8s.secret.sync` | Yes | — | Set to `"true"` to enable syncing | | `k8s.secret.namespace` | Yes | — | Target K8s namespace | | `k8s.secret.name` | Yes | — | Name of the K8s Secret | | `k8s.secret.client-id-key` | No | `client-id` | Field name for client ID in the Secret | | `k8s.secret.client-secret-key` | No | `client-secret` | Field name for client secret in the Secret | **Self-Service Config Secret Schema**: ```yaml apiVersion: v1 kind: Secret metadata: name: keycloak-client- namespace: labels: keycloak.forteapps.net/client-config: "true" stringData: client.json: | { "clientId": "", "name": "", "redirectUris": ["https://.forteapps.net/*"], "webOrigins": ["https://.forteapps.net"], "defaultClientScopes": ["openid", "email", "profile"], "protocolMappers": [], "secret": { "namespace": "", "name": "-oidc-credentials", "keys": { "clientId": "client-id", "clientSecret": "client-secret" } } } ``` **Created Credential Secret Format**: ```yaml 
apiVersion: v1
kind: Secret
metadata:
  name:
  namespace:
  labels:
    app.kubernetes.io/managed-by: keycloak-client-registrar
type: Opaque
data:
  :
  :
```

**Config Secret Annotations** (set by registrar):

| Annotation | Description |
|-----------|-------------|
| `keycloak.forteapps.net/config-hash` | SHA-256 hash of client.json for change detection |
| `keycloak.forteapps.net/sync-status` | `synced` or `error` |
| `keycloak.forteapps.net/last-sync` | ISO 8601 timestamp of last successful sync |

**Verification**:

```bash
# Check CronJob status
kubectl get cronjobs -n keycloak

# View latest registrar logs
kubectl logs -n keycloak job/$(kubectl get jobs -n keycloak --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1].metadata.name}')

# Verify created secret
kubectl get secret -n -o yaml

# Check config Secret annotations (self-service)
kubectl get secret keycloak-client- -n keycloak -o jsonpath='{.metadata.annotations}'
```

**See**: [Developer Guide - Adding a New Keycloak Client](DEVELOPER-GUIDE.md#adding-a-new-keycloak-client)

### Karpor

**Chart**: `karpor` from `https://kusionstack.github.io/charts`
**Version**: 0.7.6 (app v0.6.4)
**Namespace**: `karpor`
**Sync Wave**: 1

**Purpose**: Kubernetes visualization and intelligence tool. Provides cross-cluster resource search, compliance checking, and topology visualization. Gives platform engineers a unified view of all cluster resources and their relationships.
**Architecture** (4 components):

- **Server** — main Karpor API/UI (port 7443)
- **Syncer** — syncs cluster state into the search index
- **ElasticSearch** — search backend for resource indexing
- **etcd** — persistent key-value store (10Gi PVC)

**Configuration** (`infra/values/base/karpor-values.yaml`):

- `namespaceEnabled: false` — ArgoCD manages namespace creation
- Default resource limits tuned for small clusters
- ElasticSearch: 2 CPU / 4Gi memory (the heaviest component)
- AI features available but not enabled (requires `server.ai.authToken` + backend config)

**Access**: Port-forward to reach the UI:

```bash
kubectl port-forward svc/karpor-release-server -n karpor 7443:7443
# Open https://localhost:7443
```

### Renovate

**Chart**: `renovate` (OCI: `ghcr.io/renovatebot/charts`)
**Version**: 46.109.0 (app v43.113.0)
**Namespace**: `renovate`
**Sync Wave**: 2

**Purpose**: Automated dependency update bot. Runs as a CronJob that scans Gitea repositories for outdated dependencies and creates pull requests with updates.

**Configuration**:

```yaml
# infra/base/renovate.yaml + infra/values/base/renovate-values.yaml
cronjob:
  schedule: "@daily"
  concurrencyPolicy: Forbid
renovate:
  config:
    platform: gitea
    endpoint: https://git.forteapps.net
    autodiscover: true
    gitAuthor: "Renovate Bot "
    packageRules:
      - matchRepositories: ["**/10x"]
        assignees: ["edvard.unsvag"]
        reviewers: ["edvard.unsvag"]
      - matchRepositories: ["**/auth-sidecar"]
        assignees: ["danijel.simeunovic"]
        reviewers: ["danijel.simeunovic"]
      - matchRepositories: ["**/forte-helm"]
        assignees: ["danijel.simeunovic"]
        reviewers: ["danijel.simeunovic"]
resources:
  requests: { cpu: 500m, memory: 1Gi }
  limits: { cpu: "2", memory: 4Gi }
```

**Note**: Assignees and reviewers are only applied at PR creation time. Existing PRs must be closed and recreated for new assignment rules to take effect.
**Secrets**: `renovate-env` (SealedSecret in `secrets` namespace, cloned by Kyverno) containing:

- `RENOVATE_TOKEN` — Gitea PAT with repo write + issue write permissions
- `RENOVATE_GITHUB_COM_TOKEN` — GitHub PAT (public_repo read-only) for changelog fetching

**Setup Steps**:

1. Fill in `private/renovate-env.yaml` with tokens
2. Seal: `kubeseal --format yaml < private/renovate-env.yaml > secrets/renovate-env-sealed.yaml`
3. Commit and push — ArgoCD deploys the CronJob, Kyverno clones the secret

**Verification**:

- `kubectl get cronjob -n renovate` — CronJob exists
- `kubectl create job --from=cronjob/renovate renovate-test -n renovate` — manual trigger
- `kubectl logs -n renovate job/renovate-test` — check logs

---

## Kyverno Policies

### Secret Cloner

**File**: `cluster-resources/policies/secret-cloner.yaml`
**Purpose**: Automatically clone secrets from the `secrets` namespace to new namespaces

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: sync-secret-with-multi-clone
spec:
  rules:
    - name: clone-secret
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: v1
        kind: Secret
        name: "{{ request.object.metadata.name }}"
        namespace: "{{ request.object.metadata.name }}"
        synchronize: true
        clone:
          namespace: secrets
          name: shared-credentials
```

**Label Requirement**: Secrets must have `allowedToBeCloned: "true"`

### Keycloak Client Config Cloner

**File**: `cluster-resources/policies/keycloak-client-cloner.yaml`

**Purpose**: Clones Secrets labeled `keycloak.forteapps.net/client-config: "true"` from app namespaces to the `keycloak` namespace. This allows apps to declare their OIDC client configuration in their own namespace, which the [Keycloak Client Registrar](#keycloak-client-registrar) then processes.

**Trigger**: Any Secret with label `keycloak.forteapps.net/client-config: "true"` created outside the `keycloak` namespace.
**Behavior**:

- Generates a copy of the Secret in the `keycloak` namespace with the same name
- Adds source tracking annotations (`keycloak.forteapps.net/source-namespace`, `keycloak.forteapps.net/source-name`)
- `synchronize: true` — changes to the source Secret are reflected in the clone

### Default Namespace Blocker

**File**: `cluster-resources/policies/default-ns-blocker.yaml`
**Purpose**: Prevent resources from being created in the `default` namespace

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-default-namespace
spec:
  validationFailureAction: enforce
  rules:
    - name: validate-namespace
      match:
        any:
          - resources:
              kinds:
                - Pod
                - Deployment
                - Service
      validate:
        message: "Using 'default' namespace is not allowed"
        pattern:
          metadata:
            namespace: "!default"
```

### Bare Pod Cleaner

**File**: `cluster-resources/policies/bare-pod-cleaner.yaml`
**Purpose**: Delete pods without ownerReferences (not managed by Deployment/StatefulSet)

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cleanup-bare-pods
spec:
  rules:
    - name: delete-bare-pod
      match:
        any:
          - resources:
              kinds:
                - Pod
      preconditions:
        all:
          - key: "{{ request.object.metadata.ownerReferences[] || '' }}"
            operator: Equals
            value: ""
      validate:
        message: "Bare pods (without controllers) are not allowed"
        deny: {}
```

### Auth Sidecar Injector

**File**: `cluster-resources/policies/auth-sidecar-injector.yaml`
**Purpose**: Automatically inject an authentication sidecar into pods with authentication enabled

**Rules**: 6 rules in the policy

1. `generate-auth-tokens-secret` - Creates Secret for token mode
2. `generate-auth-oidc-secret` - Creates Secret for OIDC mode
3. `inject-sidecar-token` - Injects auth sidecar for token mode
4. `inject-sidecar-oidc` - Injects auth sidecar for OIDC mode
5. `inject-sidecar-mcp` - Injects auth sidecar for MCP OAuth mode (RFC 9728 / RFC 7591)
6. `generate-auth-network-policy` - Creates NetworkPolicy to restrict ingress

#### Trigger Annotation

```yaml
policies.forteapps.io/auth: "true"
```

#### Authentication Modes

**Token Mode** (default):

```yaml
# Annotations
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "token"
policies.forteapps.io/auth-token-secret-name: "auth-tokens"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"

# Optional customization
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```

**OIDC Mode**:

```yaml
# Annotations (required)
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "oidc"
policies.forteapps.io/auth-oidc-authority: "https://auth.example.com/realms/master"
policies.forteapps.io/auth-oidc-client-id: "myapp"

# Optional annotations
policies.forteapps.io/auth-oidc-callback-path: "/auth/callback"
policies.forteapps.io/auth-oidc-scopes: "openid,profile,email"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```

**MCP Mode** (OAuth 2.0 for MCP servers, implements RFC 9728 / RFC 7591):

```yaml
# Annotations (required)
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "mcp"
policies.forteapps.io/auth-mcp-resource: "https://mcp.example.com"
policies.forteapps.io/auth-mcp-authority: "https://auth.example.com"

# Optional annotations
policies.forteapps.io/auth-mcp-scopes: "read,write"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"
policies.forteapps.io/auth-log-level: "info"
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```

#### Sidecar Container Specification

**Token Mode**:

```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
  - containerPort: 8080
    name: auth
    protocol: TCP
env:
  - name: AUTH_MODE
    value: "token"
  - name: AUTH_LISTEN_ADDR
    value: ":8080"
  - name: AUTH_UPSTREAM_URL
    value: "http://localhost:3000"
  - name: AUTH_TOKEN_FILE
    value: "/etc/auth/tokens"
volumeMounts:
  - name: auth-tokens
    mountPath: /etc/auth
    readOnly: true
resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    cpu: 50m
    memory: 64Mi
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: [ALL]
```

**OIDC Mode**:

```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
  - containerPort: 8080
    name: auth
    protocol: TCP
env:
  - name: AUTH_MODE
    value: "oidc"
  - name: AUTH_LISTEN_ADDR
    value: ":8080"
  - name: AUTH_UPSTREAM_URL
    value: "http://localhost:3000"
  - name: AUTH_OIDC_AUTHORITY
    value: "https://auth.example.com/realms/master"
  - name: AUTH_OIDC_CLIENT_ID
    value: "myapp"
  - name: AUTH_OIDC_CALLBACK_PATH
    value: "/auth/callback"
  - name: AUTH_OIDC_SCOPES
    value: "openid,profile,email"
  - name: AUTH_OIDC_COOKIE_SECRET
    valueFrom:
      secretKeyRef:
        name: auth-oidc
        key: cookie-secret
  - name: AUTH_OIDC_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: auth-oidc
        key: client-secret
resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    cpu: 50m
    memory: 64Mi
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: [ALL]
```

**MCP Mode**:

```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
  - containerPort: 8080
    name: auth
    protocol: TCP
env:
  - name: AUTH_MODE
    value: "mcp"
  - name: AUTH_LISTEN_ADDR
    value: ":8080"
  - name: AUTH_LOG_LEVEL
    value: "info"
  - name: AUTH_UPSTREAM_URL
    value: "http://localhost:3000"
  - name: AUTH_MCP_RESOURCE
    value: "https://mcp.example.com"
  - name: AUTH_MCP_AUTHORIZATION_SERVERS
    value: "https://auth.example.com"
  - name: AUTH_MCP_SCOPES_SUPPORTED
    value: "read,write"
resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    cpu: 50m
    memory: 64Mi
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: [ALL]
```

#### Generated Resources

**Secret (Token
Mode)**:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: auth-tokens
  namespace:
  labels:
    app.kubernetes.io/managed-by: kyverno
    app.kubernetes.io/created-by: inject-auth-sidecar
type: Opaque
data: {}  # Populated by Helm chart
```

**Secret (OIDC Mode)**:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: auth-oidc
  namespace:
  labels:
    app.kubernetes.io/managed-by: kyverno
    app.kubernetes.io/created-by: inject-auth-sidecar
type: Opaque
data:
  client-secret:
  cookie-secret:
```

**NetworkPolicy**:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: -auth-ingress
  namespace:
  labels:
    app.kubernetes.io/managed-by: kyverno
    app.kubernetes.io/created-by: inject-auth-sidecar
spec:
  podSelector:
    matchLabels:
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - port: 8080
          protocol: TCP
```

#### Excluded Namespaces

The policy does NOT apply to:

- `kube-system`
- `kyverno`
- `argocd`
- `cert-manager`
- `monitoring`

#### Health Checks

```yaml
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 2
  periodSeconds: 5

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```

#### Request Flow

```
External Request → Traefik
    ↓
Service (port 8080)
    ↓
Pod: Auth Sidecar (port 8080)
    ├─ Validate credentials
    │   • Token mode: Check Bearer token
    │   • OIDC mode: Validate session or redirect to IdP
    │   • MCP mode: OAuth 2.0 via RFC 9728 discovery / RFC 7591 dynamic registration
    ↓
Forward to Application (localhost:3000)
    ↓
Application processes request
```

#### Forwarded Headers

After successful authentication, the sidecar injects user identity as HTTP headers before forwarding the request to the application container:

| Header | Description | Auth Modes |
|--------|-------------|------------|
| `X-Auth-User` | Username or display name | Token, OIDC, MCP |
| `X-Auth-Email` | User email address | OIDC |
| `X-Auth-Subject` | OIDC `sub` claim (stable user ID) | OIDC, MCP |
| `X-Auth-Groups` | Comma-separated group memberships | OIDC (if `groups` scope) |
| `X-Auth-Token` | The validated access token | All modes |

These headers are trustworthy because the auto-generated `NetworkPolicy` restricts pod ingress to the sidecar port only — external traffic cannot reach the application container directly, so headers cannot be spoofed.

Applications should read these headers to obtain authenticated user information (e.g. for display, authorisation decisions, or audit logging) instead of implementing their own authentication.

**See**: [Developer Guide - Accessing Authenticated User Information](DEVELOPER-GUIDE.md#accessing-authenticated-user-information) for code examples.

---

## Configuration Reference

### Environment Variables

Common environment variables used across applications:

| Variable | Purpose | Example |
|----------|---------|---------|
| `NODE_ENV` | Node.js environment | `production` |
| `PORT` | Application port | `3000` |
| `DB_HOST` | Database host | `postgres` |
| `DB_PORT` | Database port | `5432` |
| `DB_USER` | Database user | `app_user` |
| `DB_NAME` | Database name | `app_db` |
| `DB_PASSWORD` | Database password | From secret |
| `API_KEY` | External API key | From secret |

### Resource Limits

Recommended resource allocation:

| Application Type | CPU Request | Memory Request | CPU Limit | Memory Limit |
|------------------|-------------|----------------|-----------|--------------|
| **Lightweight API** | 100m | 128Mi | 500m | 512Mi |
| **Standard Web App** | 200m | 256Mi | 1000m | 1Gi |
| **Heavy Processing** | 500m | 512Mi | 2000m | 2Gi |
| **Database** | 250m | 256Mi | 1000m | 1Gi |

### Storage Classes

Storage classes are cloud-specific and configured in per-cluster value overrides (`infra/values/{cluster}/gitea-values.yaml`):

| Cloud | Storage Class | Driver |
|-------|--------------|--------|
| **UpCloud** | `upcloud-block-storage-maxiops` | UpCloud CSI |
| **AWS EKS** | `gp3` | EBS CSI |
| **Azure AKS** | `managed-csi-premium` | Azure Disk CSI |
| **GCP GKE** | `premium-rwo` | PD CSI |

```yaml
# Example: base values omit storageClass (set in per-cluster overlay)
persistence:
  enabled: true
  accessMode: ReadWriteOnce
  size: 5Gi
  # storageClass set by infra/values/{cluster}/gitea-values.yaml
```

---

## API Endpoints

### ArgoCD API

```
# Server
https://argocd.127.0.0.1.nip.io

# Applications endpoint
GET /api/v1/applications

# Application details
GET /api/v1/applications/{name}

# Sync application
POST /api/v1/applications/{name}/sync
```

### Prometheus API

```
# Query endpoint
GET /api/v1/query?query={promql}

# Query range
GET /api/v1/query_range?query={promql}&start={time}&end={time}&step={duration}

# Metrics
GET /api/v1/label/__name__/values
```

### Tempo API

```
# Search traces
GET /api/search?q={traceql}

# Get trace by ID
GET /api/traces/{traceID}

# Service tag values
GET /api/v2/search/tag/resource.service.name/values
```

### Loki API

```
# Query logs
GET /loki/api/v1/query?query={logql}

# Query range
GET /loki/api/v1/query_range?query={logql}&start={time}&end={time}

# Push logs
POST /loki/api/v1/push
```

---

## Cloud Overlay Pattern

### Overview

Cloud-specific configuration (StorageClass, LoadBalancer annotations, pricing models, etc.) lives in per-cloud overlay value files, **not** in `base/`. Adding a new cloud provider only requires a new overlay directory — no base changes.
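As a rough illustration of what "a new overlay directory" involves, the sketch below scaffolds empty per-cloud value files for a cluster name. The component file names are the ones this document lists for the overlay pattern; the helper name `scaffold_overlay` and the idea of starting from empty placeholder files are assumptions for illustration, not part of the repository's tooling.

```bash
#!/bin/sh
# Sketch only: scaffold_overlay is a hypothetical helper, not part of launchpad.
# Usage: scaffold_overlay <repo-root> <cluster>   (e.g. scaffold_overlay . oci-dev)
scaffold_overlay() {
  root=$1
  cluster=$2
  if [ -z "$root" ] || [ -z "$cluster" ]; then
    echo "usage: scaffold_overlay <repo-root> <cluster>" >&2
    return 1
  fi

  # One directory for value files, one for the Kustomize overlay.
  mkdir -p "$root/infra/values/$cluster" "$root/infra/overlays/$cluster" || return 1

  # One value file per component that uses the cloud overlay pattern.
  for component in traefik keycloak grafana gitea opencost; do
    file="$root/infra/values/$cluster/$component-values.yaml"
    # Create an empty placeholder, but never overwrite an existing file.
    [ -e "$file" ] || : > "$file"
    echo "$file"
  done
}
```

Running `scaffold_overlay . oci-dev` from the repository root would print the five value-file paths it ensured exist; the files still need real content, plus the Kustomize overlay and app-of-apps manifests, before the cluster can be bootstrapped.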
### Supported Clouds

| Cloud | Dev overlay | Prod overlay | StorageClass | LB type |
|-------|-----------|-------------|-------------|---------|
| **UpCloud** | `upc-dev` | `upc-prod` | `upcloud-block-storage-maxiops` | UpCloud LB (proxy protocol v2) |
| **Azure AKS** | `aks-dev` | `aks-prod` | `managed-csi-premium` | Azure LB |
| **AWS EKS** | `eks-dev` | `eks-prod` | `gp3` | AWS NLB (proxy protocol) |
| **GCP GKE** | `gke-dev` | `gke-prod` | `premium-rwo` | GCP NEG |

Bootstrap any cluster with: `./bootstrap.sh ` (e.g., `./bootstrap.sh aks-dev`)

### How It Works

Each ArgoCD Application uses **multi-source Helm values** with two value files:

```yaml
# infra/base/gitea.yaml (example)
helm:
  valueFiles:
    - $values/infra/values/base/gitea-values.yaml     # [0] cloud-agnostic
    - $values/infra/values/upc-dev/gitea-values.yaml  # [1] cloud-specific (default: upc-dev)
```

The `upc-prod` Kustomize overlay patches index `[1]` to swap the cloud-specific file:

```yaml
# infra/overlays/upc-prod/kustomization.yaml
- target:
    kind: Application
    name: gitea
  patch: |
    - op: replace
      path: /spec/sources/0/helm/valueFiles/1
      value: $values/infra/values/upc-prod/gitea-values.yaml
```

### Components Using Cloud Overlays

| Component | Cloud-specific config | Overlay value file |
|-----------|----------------------|-------------------|
| **Traefik** | LB annotations, proxy protocol IPs | `traefik-values.yaml` |
| **Keycloak** | Hostname, TLS settings | `keycloak-values.yaml` |
| **Grafana** | Hostname, datasource URLs | `grafana-values.yaml` |
| **Gitea** | StorageClass (persistence + PostgreSQL) | `gitea-values.yaml` |
| **OpenCost** | Custom pricing model (CPU/RAM/storage rates) | `opencost-values.yaml` |

### Backup CronJob

The `gitea-backup` CronJob uses a generic `s3` alias for `minio/mc`. The actual endpoint and credentials come from the `gitea-backup-s3` Sealed Secret, which is per-cloud.
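To make the generic-alias idea concrete, here is a minimal sketch of how a backup script might register the `s3` alias and upload an archive. The environment variable names (`S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`, `S3_BUCKET`), the helper names, and the archive naming are all assumptions; the actual keys in `gitea-backup-s3` and the real CronJob script may differ.

```bash
#!/bin/sh
# Sketch only: variable names, helper names and archive naming are hypothetical.

# Build the destination path under the generic "s3" alias.
backup_target() {
  bucket=$1
  stamp=$2
  printf 's3/%s/gitea-backup-%s.zip\n' "$bucket" "$stamp"
}

# Fail early if any of the named variables is unset or empty.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Register the alias against this cloud's endpoint, then copy the archive.
upload_backup() {
  archive=$1
  require_env S3_ENDPOINT S3_ACCESS_KEY S3_SECRET_KEY S3_BUCKET || return 1
  mc alias set s3 "$S3_ENDPOINT" "$S3_ACCESS_KEY" "$S3_SECRET_KEY" || return 1
  mc cp "$archive" "$(backup_target "$S3_BUCKET" "$(date +%Y%m%d-%H%M%S)")"
}
```

Because only the endpoint and credentials change per cloud, the same script body could run unchanged against any S3-compatible store; providers without an S3-compatible API would need a different tool instead.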
Reference scripts for different cloud providers are in `scripts/backup/`:

| Script | Provider | Tool |
|--------|----------|------|
| `s3-minio.sh` | S3-compatible (UpCloud, MinIO, Wasabi) | `minio/mc` |
| `aws-s3.sh` | AWS S3 | `aws` CLI |
| `azure-blob.sh` | Azure Blob Storage | `az` CLI |
| `gcp-gcs.sh` | GCP Cloud Storage | `gsutil` |

### Adding a New Cloud Provider

To add support for a new cloud (e.g., `oci-dev` for Oracle Cloud):

1. **Cluster config**: `clusters/oci-dev.yaml` — clusterName, domain, trustedIPs, cloudProvider
2. **Overlay value files** in `infra/values/oci-dev/`:
   - `traefik-values.yaml` — LB annotations, proxy protocol config
   - `keycloak-values.yaml` — hostname
   - `grafana-values.yaml` — hostname
   - `gitea-values.yaml` — `storageClass` for persistence + PostgreSQL
   - `opencost-values.yaml` — pricing model or cloud billing integration
3. **Kustomize overlay**: `infra/overlays/oci-dev/kustomization.yaml` — patch `valueFiles[1]` for each Application
4. **App-of-apps**: `_app-of-apps-oci-dev.yaml` — points to `infra/overlays/oci-dev`
5. **Secrets overlay**: `secrets/overlays/oci-dev/kustomization.yaml` — references `../../base`, add cloud-specific SealedSecrets if needed
6. **Secrets patch**: Add patch to `infra/overlays/oci-dev/kustomization.yaml` to swap secrets path to `secrets/overlays/oci-dev`
7. **Bootstrap**: `./bootstrap.sh oci-dev`

---

## Glossary

### Terms

**App-of-Apps**: ArgoCD pattern where a parent Application manages child Applications

**GitOps**: Operations approach where Git is the single source of truth

**IngressRoute**: Traefik CRD for routing external traffic to services

**Multi-Source**: ArgoCD feature allowing multiple Git sources per Application

**SealedSecret**: Encrypted secret that can be safely stored in Git

**Sync Wave**: Ordered deployment using annotations

**Self-Heal**: ArgoCD automatically reverts manual cluster changes

**Prune**: Automatically delete resources removed from Git

---

## Annotations Reference

### ArgoCD Annotations

```yaml
# Sync wave (deployment order)
argocd.argoproj.io/sync-wave: "1"

# Refresh application
argocd.argoproj.io/refresh: "hard"

# Compare options
argocd.argoproj.io/compare-options: IgnoreExtraneous

# Sync options per resource
argocd.argoproj.io/sync-options: Prune=false
```

### Kyverno Annotations

```yaml
# Exclude from policy
policies.kyverno.io/exclude: "true"

# Severity
policies.kyverno.io/severity: high
```

### Custom Annotations

```yaml
# Authentication enabled
policies.forteapps.io/auth: "true"

# OIDC configuration
policies.forteapps.io/auth-oidc-authority: "https://..."
policies.forteapps.io/auth-oidc-client-id: "client-id"
```

---

## Labels Reference

### Standard Labels

```yaml
# Application name
app.kubernetes.io/name: myapp

# Application instance
app.kubernetes.io/instance: myapp

# Application version
app.kubernetes.io/version: "1.0.0"

# Component type
app.kubernetes.io/component: frontend

# Part of larger application
app.kubernetes.io/part-of: ecommerce

# Managed by
app.kubernetes.io/managed-by: argocd
```

### Custom Labels

```yaml
# Allow secret cloning
allowedToBeCloned: "true"

# Environment
environment: production

# Team ownership
team: platform
```

---

## Version Matrix

### Component Versions

| Component | Version | Chart Version |
|-----------|---------|---------------|
| **ArgoCD** | 2.9.0+ | Latest |
| **Traefik** | 2.10.0+ | Latest |
| **Cert-Manager** | 1.13.0+ | Latest |
| **Kyverno** | 1.10.0+ | Latest |
| **Sealed Secrets** | 0.24.0+ | Latest |
| **Prometheus** | 2.47.0+ | Latest |
| **Grafana** | 10.0.0+ | Latest |
| **Loki** | 2.9.0+ | Latest |
| **Tempo** | 2.6.0+ | 1.24.4 |
| **Fluent-Bit** | 2.1.0+ | Latest |
| **Gitea** | 1.25.4 | 12.5.0 |
| **Gitea Act Runner** | Latest | Latest |
| **Renovate** | v43.113.0 | 46.109.0 |
| **PostgreSQL** | 16-alpine | N/A |
| **Trivy** | Latest | Latest |

### Kubernetes Compatibility

- **Minimum**: 1.24+
- **Tested**: 1.28+
- **Recommended**: Latest stable

---

**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
**Version**: 1.0.0