Files
launchpad/docs/REFERENCE.md
2026-04-25 11:49:17 +02:00

2022 lines
57 KiB
Markdown

# Technical Reference
## Table of Contents
- [Architecture Components](#architecture-components)
- [Repository Reference](#repository-reference)
- [Helm Chart Reference](#helm-chart-reference)
- [ArgoCD Configuration](#argocd-configuration)
- [Infrastructure Components](#infrastructure-components)
- [Kyverno Policies](#kyverno-policies)
- [Configuration Reference](#configuration-reference)
- [API Endpoints](#api-endpoints)
- [Cloud Overlay Pattern](#cloud-overlay-pattern)
- [Glossary](#glossary)
---
## Architecture Components
### Cluster Specifications
| Component | Value |
|-----------|-------|
| **Provider** | Multi-cloud (UpCloud, AWS EKS, Azure AKS, GCP GKE) |
| **Environment** | Dev + Production per cloud |
| **Active clusters** | UpCloud (upc-dev, upc-prod) |
| **Cloud-ready templates** | EKS, AKS, GKE (dev + prod each) |
| **GitOps Tool** | ArgoCD |
| **Ingress Controller** | Traefik v2 |
| **Certificate Management** | Cert-Manager + Let's Encrypt |
| **Policy Engine** | Kyverno |
| **Secret Management** | Sealed Secrets (Bitnami) |
| **Monitoring** | Prometheus + Grafana |
| **Logging** | Loki + Fluent-Bit |
| **Tracing** | Tempo (OTLP) |
| **Container Scanning** | Trivy |
| **Version Control** | Gitea |
### Network Architecture
```
Internet
[DNS: *.forteapps.net]
[Cloud Load Balancer]
[Traefik Ingress Controller]
├──► IngressRoute (TLS termination via Cert-Manager)
├──► Service (ClusterIP)
│ │
│ └──► Pod (Application Container)
└──► Service (Database - ClusterIP)
└──► StatefulSet (PostgreSQL)
```
---
## Repository Reference
### Config Repository: `launchpad`
**URL**: `https://git.forteapps.net/Forte/launchpad`
#### Directory Structure
```
launchpad/
├── bootstrap.sh # Cluster initialization script
├── _app-of-apps-upc-dev.yaml # Root ArgoCD Application (upc-dev)
├── _app-of-apps-upc-prod.yaml # Root ArgoCD Application (upc-prod)
├── infra/ # Infrastructure applications
│ ├── cluster-resources-application.yaml
│ ├── enterprise-apps.yaml
│ ├── traefik-application.yaml
│ ├── cert-manager-application.yaml
│ ├── kyverno.yaml
│ ├── kyverno-policies.yaml
│ ├── prometheus.yaml
│ ├── grafana.yaml
│ ├── loki.yaml
│ ├── tempo.yaml
│ ├── fluent-bit.yaml
│ ├── gitea.yaml
│ ├── gitea-actions.yaml
│ ├── sealedsecrets.yaml
│ ├── secrets.yaml
│ ├── renovate.yaml
│ ├── base/ # ArgoCD Application manifests (Kustomize base)
│ │ ├── gitea.yaml
│ │ ├── opencost.yaml
│ │ ├── traefik-application.yaml
│ │ ├── keycloak.yaml
│ │ ├── grafana.yaml
│ │ └── ...
│ ├── overlays/
│ │ └── upc-prod/
│ │ └── kustomization.yaml # Patches upc-dev → upc-prod valueFile paths
│ └── values/
│ ├── base/ # Cloud-agnostic Helm values
│ │ ├── gitea-values.yaml
│ │ ├── opencost-values.yaml
│ │ ├── prometheus-values.yaml
│ │ └── ...
│ ├── upc-dev/ # UpCloud dev overlay values
│ │ ├── traefik-values.yaml
│ │ ├── keycloak-values.yaml
│ │ ├── grafana-values.yaml
│ │ ├── gitea-values.yaml
│ │ └── opencost-values.yaml
│ └── upc-prod/ # UpCloud prod overlay values
│ ├── traefik-values.yaml
│ ├── keycloak-values.yaml
│ ├── grafana-values.yaml
│ ├── gitea-values.yaml
│ └── opencost-values.yaml
├── apps/ # Business applications
│ ├── mcp10x.yaml
│ ├── musicman.yaml
│ ├── dot-ai-stack.yaml
│ └── argo-mcp.yaml
├── cluster-resources/ # Cluster-level resources
│ ├── cert-manager-namespace.yaml
│ ├── secrets-namespace.yaml
│ ├── letsencrypt-issuer.yaml
│ ├── kyverno-config.yaml
│ ├── argocd-notifications-secret-sealed.yaml
│ ├── forte10x-repo-credentials-sealed.yaml
│ ├── mcp10x-repo-credentials-sealed.yaml
│ └── policies/
│ ├── deployment-verifier.yaml
│ ├── label-checker.yaml
│ ├── bare-pod-cleaner.yaml
│ ├── replicaset-cleaner.yaml
│ ├── default-ns-blocker.yaml
│ ├── secret-cloner.yaml
│ ├── keycloak-client-cloner.yaml
│ └── auth-sidecar-injector.yaml
├── secrets/ # Application secrets (sealed)
│ ├── base/ # All SealedSecrets (shared across clouds)
│ │ ├── kustomization.yaml
│ │ ├── argocd-forte-helm-secret-sealed.yaml
│ │ ├── argocd-mcp-credentials.yaml
│ │ ├── argocdmcp-auth-oidc-sealed.yaml
│ │ ├── dot-ai-secrets.yaml
│ │ ├── forte10x-app-credentials-sealed.yaml
│ │ ├── gitea-backup-s3-sealed.yaml
│ │ ├── gitea-credentials-sealed.yaml
│ │ ├── gitea-runner-token-sealed.yaml
│ │ ├── gitea-smtp-secret-sealed.yaml
│ │ ├── keycloak-credentials-sealed.yaml
│ │ ├── musicman-auth-oidc-sealed.yaml
│ │ ├── musicman-credentials.yaml
│ │ └── renovate-env-sealed.yaml
│ └── overlays/ # Per-cloud overlays (reference base)
│ ├── aks-dev/kustomization.yaml
│ ├── aks-prod/kustomization.yaml
│ ├── eks-dev/kustomization.yaml
│ ├── eks-prod/kustomization.yaml
│ ├── gke-dev/kustomization.yaml
│ ├── gke-prod/kustomization.yaml
│ ├── upc-dev/kustomization.yaml
│ └── upc-prod/kustomization.yaml
├── scripts/ # Operational helper scripts
│ ├── gitea-backup.sh # S3 backup helper (list/download)
│ ├── gitea-restore.sh
│ └── backup/ # Per-cloud backup reference scripts
│ ├── s3-minio.sh # S3-compatible (UpCloud, MinIO, Wasabi)
│ ├── aws-s3.sh # Native AWS S3
│ ├── azure-blob.sh # Azure Blob Storage
│ └── gcp-gcs.sh # GCP Cloud Storage
├── private/ # Local-only (Git-ignored)
│ ├── *.yaml
│ └── *.sh
└── docs/ # Documentation
├── GITOPS-ARCHITECTURE.md
├── DEVELOPER-GUIDE.md
├── OPERATIONS-RUNBOOK.md
└── REFERENCE.md
```
#### Key Files
**`bootstrap.sh`**
```bash
#!/bin/zsh
# Initializes cluster with ArgoCD
ArgoCd() {
helm upgrade --install argocd argo-cd \
--repo https://argoproj.github.io/argo-helm \
--namespace argocd --create-namespace \
--values infra/values/base/argocd-values.yaml \
--set notifications.context.clusterName="$CLUSTER_NAME" \
--timeout 60s --atomic
kubectl apply -f _app-of-apps-upc-dev.yaml -n argocd # or _app-of-apps-upc-prod.yaml
}
```
**`_app-of-apps-upc-dev.yaml`** / **`_app-of-apps-upc-prod.yaml`**
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: infrastructure-apps
namespace: argocd
spec:
project: default
source:
repoURL: ssh://git@git.forteapps.net:2222/Forte/launchpad.git
path: infra
destination:
server: https://kubernetes.default.svc
namespace: default
syncPolicy:
automated:
prune: true
selfHeal: true
```
---
### Helm Charts Repository: `forte-helm`
**URL**: `https://git.forteapps.net/Forte/forte-helm`
#### Chart: `forteapp`
**Version**: 0.1.0
**App Version**: 1.0.0
**Type**: application
##### Templates
| Template | Purpose |
|----------|---------|
| `_helpers.tpl` | Template helper functions |
| `namespace.yaml` | Namespace resource |
| `deployment.yaml` | Main application Deployment |
| `service.yaml` | ClusterIP Service |
| `ingressroute.yaml` | Traefik IngressRoute |
| `certificate.yaml` | Cert-Manager Certificate |
| `configmap.yaml` | Application ConfigMap |
| `secret-auth-tokens.yaml` | Authentication tokens |
| `hpa.yaml` | Horizontal Pod Autoscaler |
| `database-statefulset.yaml` | Optional PostgreSQL StatefulSet |
| `database-service.yaml` | PostgreSQL Service |
##### Default Values Schema
```yaml
app:
image:
repository: "" # Required
tag: "" # Required
pullPolicy: IfNotPresent
containerPort: 3000
replicaCount: 1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
hpa:
enabled: false
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70
extraEnv: []
# - name: KEY
# value: "value"
envSecretName: "" # Reference to Secret
nodeEnv: production
db:
enabled: false
name: postgres
image:
repository: postgres
tag: "16-alpine"
service:
type: ClusterIP
port: 5432
targetPort: 5432
persistence:
enabled: true
storageClass: ""
accessMode: ReadWriteOnce
size: 5Gi
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
extraEnv: []
envSecretName: ""
livenessProbe:
exec:
command:
- pg_isready
- -U
- db_user
- -d
- db_name
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- pg_isready
- -U
- db_user
- -d
- db_name
initialDelaySeconds: 5
periodSeconds: 5
service:
type: ClusterIP
port: 3000
ingress:
enabled: false
host: ""
entrypoint: websecure
tls:
enabled: true
secretName: ""
clusterIssuer: letsencrypt-prod
auth:
enabled: false # Enable authentication sidecar injection
type: token # Authentication mode: "token" or "oidc"
# Token-based authentication configuration
tokens: [] # List of valid bearer tokens (hex strings, 32+ bytes recommended)
# - d4f88f6d9292c10cc3e21c4aad56d2be485db532b54fe961d738e1137d247823
# - 8803f621acc3898df1d7a8f514bc3602551a0681a8f747bd4e43c3c5849d57a7
# OIDC authentication configuration
oidc:
authority: "" # OIDC provider URL (e.g., https://auth.example.com/realms/master)
clientId: "" # OIDC client ID registered with provider
scopes: "openid,profile,email" # OAuth scopes (comma-separated)
callbackPath: /auth/callback # OAuth callback path (default: /auth/callback)
# Note: Client secret must be in 'auth-oidc' Secret (client-secret key)
# Cookie secret must be in 'auth-oidc' Secret (cookie-secret key)
configmap: [] # Application ConfigMap key-value pairs
# KEY: value
# DB_HOST: postgres
# DB_PORT: "5432"
```
---
### Helm Values Repository: `helm-prod-values`
**URL**: `https://git.forteapps.net/Forte/helm-prod-values.git`
#### Structure
```
helm-prod-values/
├── mcp10x/
│ └── values.yaml
├── musicman/
│ └── values.yaml
└── argocd-mcp/
└── values.yaml
```
#### Example: `mcp10x/values.yaml`
```yaml
app:
image:
repository: ghcr.io/fortedigital/10x
tag: 2.0.4 # Updated by CI/CD
extraEnv:
- name: PORT
value: "3000"
- name: SKILLS_DIR
value: "/app/skills"
- name: FLOWCASE_ENDPOINT
value: "https://forte.cvpartner.com/api/"
envSecretName: "app-credentials"
auth:
enabled: false
tokens:
- d4f88f6d9292c10cc3e21c4aad56d2be485db532b54fe961d738e1137d247823
ingress:
enabled: true
host: mcp10x.forteapps.net
```
---
## Helm Chart Reference
### Template Functions
#### `forteapp.fullname`
```yaml
{{ include "forteapp.fullname" . }}
# Output: <release-name>
```
#### `forteapp.labels`
```yaml
{{ include "forteapp.labels" . }}
# Output:
# app.kubernetes.io/name: forteapp
# app.kubernetes.io/instance: <release-name>
# app.kubernetes.io/version: <chart-version>
# app.kubernetes.io/managed-by: Helm
```
#### `forteapp.selectorLabels`
```yaml
{{ include "forteapp.selectorLabels" . }}
# Output:
# app.kubernetes.io/name: forteapp
# app.kubernetes.io/instance: <release-name>
```
### Deployment Specification
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "forteapp.fullname" . }}
labels:
{{- include "forteapp.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.app.replicaCount }}
selector:
matchLabels:
{{- include "forteapp.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
policies.forteapps.io/auth: {{ .Values.auth.enabled | quote }}
labels:
{{- include "forteapp.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: app
image: "{{ .Values.app.image.repository }}:{{ .Values.app.image.tag }}"
imagePullPolicy: {{ .Values.app.image.pullPolicy }}
ports:
- name: http
containerPort: {{ .Values.app.image.containerPort }}
env:
- name: NODE_ENV
value: {{ .Values.app.nodeEnv | quote }}
{{- with .Values.app.extraEnv }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.app.envSecretName }}
envFrom:
- secretRef:
name: {{ .Values.app.envSecretName }}
{{- end }}
resources:
{{- toYaml .Values.app.resources | nindent 10 }}
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
```
### IngressRoute Specification
```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: {{ include "forteapp.fullname" . }}
spec:
entryPoints:
- {{ .Values.ingress.entrypoint }}
routes:
- match: Host(`{{ .Values.ingress.host }}`)
kind: Rule
services:
- name: {{ include "forteapp.fullname" . }}
port: {{ .Values.service.port }}
{{- if .Values.ingress.tls.enabled }}
tls:
secretName: {{ default .Release.Name .Values.ingress.tls.secretName }}-tls
{{- end }}
```
### Certificate Specification
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: {{ include "forteapp.fullname" . }}-tls
spec:
secretName: {{ default .Release.Name .Values.ingress.tls.secretName }}-tls
issuerRef:
name: {{ .Values.ingress.tls.clusterIssuer }}
kind: ClusterIssuer
dnsNames:
- {{ .Values.ingress.host }}
```
---
## ArgoCD Configuration
### Application Manifest Schema
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: <app-name>
namespace: argocd
annotations:
argocd.argoproj.io/sync-wave: "1"
notifications.argoproj.io/subscribe.on-sync-succeeded.slack: ""
notifications.argoproj.io/subscribe.on-sync-failed.slack: ""
notifications.argoproj.io/subscribe.on-degraded.slack: ""
labels:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/part-of: apps
app.kubernetes.io/managed-by: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io
spec:
project: default
# Multi-source configuration
sources:
- repoURL: https://git.forteapps.net/Forte/forte-helm
path: forteapp
targetRevision: HEAD
helm:
valueFiles:
- $values/<app-name>/values.yaml
- repoURL: git@github.com:fortedigital/helm-prod-values.git
targetRevision: HEAD
ref: values
destination:
server: https://kubernetes.default.svc
namespace: <app-name>
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- CreateNamespace=true
- Validate=true
- ServerSideApply=true
- Replace=false
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m
ignoreDifferences:
- group: apps
kind: Deployment
jsonPointers:
- /spec/replicas
```
### Sync Waves
| Wave | Components | Purpose |
|------|------------|---------|
| `-1` | Namespaces | Create namespaces first |
| `0` | Kyverno | Install policy engine |
| `1` | Cluster resources, infrastructure | Base infrastructure |
| `2+` | Applications | Business applications |
### Sync Options
| Option | Description |
|--------|-------------|
| `CreateNamespace=true` | Automatically create target namespace |
| `Validate=true` | Validate resources before applying |
| `ServerSideApply=true` | Use server-side apply (safer) |
| `Replace=false` | Don't use kubectl replace |
| `Prune=true` | Delete resources not in Git |
### Retry Policy
```yaml
retry:
limit: 5 # Max retry attempts
backoff:
duration: 5s # Initial backoff
factor: 2 # Exponential factor
maxDuration: 3m # Max backoff time
```
**Retry Schedule**:
1. 5 seconds
2. 10 seconds
3. 20 seconds
4. 40 seconds
5. 80 seconds (capped at 3 minutes)
### Global Settings (`argocd-cm`)
| Setting | Value | Purpose |
|---------|-------|---------|
| `application.resourceTrackingMethod` | `annotation` | Track resources via annotations |
| `timeout.reconciliation` | `60s` | Reconciliation interval |
| `admin.enabled` | `false` | Admin login disabled (SSO-only) |
| `url` | `https://argocd.forteapps.net` | External URL for ArgoCD UI |
**Git Submodule Disable**: Set via `configs.params` (NOT `repoServer.env` — that causes strategic merge conflicts with chart's `valueFrom` entries):
```yaml
configs:
params:
"reposerver.enable.git.submodule": "false"
```
This writes to `argocd-cmd-params-cm` ConfigMap, which the chart already reads via `valueFrom`. Submodules (e.g., `shared-prompts`) are not needed for K8s manifest generation.
**Break-Glass Admin Access**: Admin login is disabled (`admin.enabled: false`). The admin password remains in `argocd-secret`. To re-enable temporarily:
```bash
# Enable admin login
kubectl patch cm argocd-cm -n argocd -p '{"data":{"admin.enabled":"true"}}'
# Log in as admin, do what's needed, then disable again
kubectl patch cm argocd-cm -n argocd -p '{"data":{"admin.enabled":"false"}}'
```
ArgoCD picks up ConfigMap changes within the reconciliation timeout (60s). Note: ArgoCD will revert this on next sync — this is intentional (temporary access only).
**OIDC Authentication** (Keycloak):
```yaml
configs:
cm:
oidc.config: |
name: Forte SSO
issuer: https://id.forteapps.net/realms/forte
clientID: argocd
clientSecret: $oidc.clientSecret
requestedScopes: ["openid", "email", "profile"]
rbacConfig:
policy.csv: |
g, ArgoCD Admins, role:admin
g, ArgoCD Viewers, role:readonly
# Deny users not in any declared KC group
policy.default: ""
scopes: '[groups]'
```
**Access Control**: Only users in Keycloak groups `ArgoCD Admins` or `ArgoCD Viewers` can access ArgoCD. Users not in either group are denied (empty `policy.default`). Assign users to groups in Keycloak admin console.
- ArgoCD does NOT add `openid` implicitly — must include in `requestedScopes`
- Do NOT add `groups` as a scope — the KC groups mapper emits the claim regardless
- `$oidc.clientSecret` references the `oidc.clientSecret` key in `argocd-secret`
- OIDC secret is synced by CronJob `argocd-oidc-sync` (see `cluster-resources/argocd-oidc-secret-sync.yaml`)
- The CronJob bridges `argocd-oidc-credentials` (from KC registrar) → `argocd-secret` every 2 min
- Safe for fresh deploys: no-ops if source secret doesn't exist yet
**Ingress** (Traefik + TLS):
```yaml
server:
ingress:
enabled: true
ingressClassName: traefik
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
tls: true
extraArgs:
- --insecure
configs:
params:
"server.insecure": true
```
TLS terminates at Traefik; ArgoCD runs in insecure mode behind the proxy.
---
## Infrastructure Components
### Traefik
**Chart**: `traefik/traefik`
**Version**: Latest
**Namespace**: `traefik`
**Configuration**:
```yaml
# infra/base/traefik-application.yaml
replicas: 2
service:
type: LoadBalancer
ingressRoute:
dashboard:
enabled: false
ports:
web:
redirectTo: websecure # HTTP → HTTPS redirect
websecure:
tls:
enabled: true
```
**Endpoints**:
- HTTP: `:80` → Redirects to HTTPS
- HTTPS: `:443`
### Cert-Manager
**Chart**: `jetstack/cert-manager`
**Namespace**: `cert-manager`
**ClusterIssuer**:
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: admin@forteapps.net
privateKeySecretRef:
name: letsencrypt-prod-key
solvers:
- http01:
ingress:
class: traefik
```
### Kyverno
**Chart**: `kyverno/kyverno`
**Namespace**: `kyverno`
**Policies**:
- Secret cloner
- Default namespace blocker
- Bare pod cleaner
- ReplicaSet cleaner
- Deployment verifier
- Auth sidecar injector
### Sealed Secrets
**Chart**: `sealed-secrets/sealed-secrets-controller`
**Namespace**: `kube-system`
**Directory Structure**: `secrets/base/` contains all SealedSecrets with a `kustomization.yaml`. Per-cloud overlays in `secrets/overlays/<cloud>/` reference the base via Kustomize. The ArgoCD `secrets` Application points to the active overlay (e.g., `secrets/overlays/upc-dev`), and `infra/overlays/upc-prod` patches the path to `secrets/overlays/upc-prod`.
To add cloud-specific secrets, create a new SealedSecret in the overlay directory and add it to the overlay's `kustomization.yaml`.
**Public Certificate**:
```bash
kubeseal --fetch-cert \
--controller-name=sealed-secrets-controller \
--controller-namespace=kube-system \
> pub-cert.pem
```
### Prometheus
**Chart**: `prometheus-community/prometheus`
**Namespace**: `monitoring`
**Configuration**:
```yaml
server:
persistentVolume:
enabled: true
size: 10Gi
alertmanager:
enabled: false
nodeExporter:
enabled: true
kubeStateMetrics:
enabled: true
```
### Grafana
**Chart**: `grafana/grafana`
**Namespace**: `monitoring`
**Datasources**:
- Prometheus
- Loki
- Tempo
**Ingress**: Exposed via Traefik at `https://grafana.forteapps.net` with cert-manager TLS.
**OIDC Authentication** (Keycloak):
- Uses `grafana.ini.auth.generic_oauth` with KC `grafana` client
- Secret `grafana-oidc-credentials` synced by KC registrar, loaded via `envFromSecrets`
- SSO-only mode: `auth.disable_login_form: true` + `auth.generic_oauth.auto_login: true`
- Role mapping via JMESPath on `resource_access.grafana.roles` claim (requires KC client role mapper)
- Roles: KC client roles `Admin`/`Editor` map to Grafana roles; default is `Viewer`
### Loki
**Chart**: `grafana/loki-stack`
**Namespace**: `monitoring`
**Configuration**:
```yaml
loki:
persistence:
enabled: true
size: 10Gi
promtail:
enabled: false # Using Fluent-Bit instead
```
### Tempo
**Chart**: `grafana/tempo`
**Version**: 1.24.4
**Namespace**: `monitoring`
**Purpose**: Distributed tracing backend receiving OTLP traces from Traefik and other instrumented services.
**Configuration**:
```yaml
tempo:
storage:
trace:
backend: local
local:
path: /var/tempo/traces
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
http:
endpoint: "0.0.0.0:4318"
persistence:
enabled: true
size: 10Gi
```
**Endpoints**:
- gRPC OTLP receiver: `:4317`
- HTTP OTLP receiver: `:4318`
- Query API: `:3200`
**Grafana Integration**:
- Trace-to-logs correlation with Loki (by namespace, pod, container)
- Trace-to-metrics correlation with Prometheus (by service name)
- Service graph and node graph visualization
### Fluent-Bit
**Chart**: `fluent/fluent-bit`
**Namespace**: `monitoring`
**Output**: Loki
### Gitea
**Chart**: `gitea/gitea`
**Version**: 12.5.0 (app v1.25.4)
**Namespace**: `gitea`
**Purpose**: Self-hosted Git repository hosting with pull requests, issues, CI/CD (Gitea Actions), container registry, and package registry.
**Configuration**:
```yaml
# infra/base/gitea.yaml + infra/values/base/gitea-values.yaml
ingress:
host: git.forteapps.net
tls: cert-manager (letsencrypt-prod)
gitea:
admin:
existingSecret: gitea-credentials
config:
service:
DISABLE_REGISTRATION: true
ALLOW_ONLY_EXTERNAL_REGISTRATION: true
actions:
ENABLED: true
packages:
ENABLED: true
metrics:
ENABLED: true
postgresql:
enabled: true
persistence: 8Gi (upcloud-block-storage-maxiops)
```
**Authentication**: Keycloak OIDC via `forte` realm (client ID: `gitea`). Protocol mapper: `email_verified` hardcoded claim (`true`, boolean) on ID token, Access token, and Userinfo.
**External User Sync**: Disabled (`cron.sync_external_users.ENABLED: false`). This Gitea cron job is designed for LDAP and deactivates OIDC-only users because it cannot enumerate them — causing "Sign-in prohibited" errors after the sync runs.
**Email Notifications**: Enabled (`ENABLE_NOTIFY_MAIL: true`). SMTP credentials injected via `gitea-smtp-secret` using `additionalConfigFromEnvs` with `GITEA__mailer__USER` / `GITEA__mailer__PASSWD` environment variables.
**Auto-Watch**: Disabled (`AUTO_WATCH_ON_CHANGES: false`, `AUTO_WATCH_NEW_REPOS: false`). Prevents contributors from being auto-subscribed to repo notifications on push, reducing email noise from CI bots (e.g., ai-review PR comments). Users who were already watching before this change need to manually unwatch or switch to "Only participating".
**Endpoints**:
- Web UI: `https://git.forteapps.net`
- SSH: port 22 (ClusterIP)
- Metrics: `/metrics` (Prometheus scrape)
**Secrets**:
- `gitea-credentials` (SealedSecret) — admin password
- `gitea-oidc-credentials` (registrar-managed) — OIDC client ID + secret
- `gitea-smtp-secret` (SealedSecret) — SMTP username + password
### Gitea Actions Runners
**Chart**: `actions` (from `https://dl.gitea.com/charts`)
**Namespace**: `gitea`
**Sync Wave**: 2 (deploys after Gitea)
**Purpose**: Act runners execute Gitea Actions CI/CD workflows. Deployed as a StatefulSet with a Docker-in-Docker sidecar for container-based job execution.
**Configuration**:
```yaml
# infra/base/gitea-actions.yaml + infra/values/base/gitea-actions-values.yaml
replicaCount: 3
runner:
labels:
- "ubuntu-latest:docker://node:20-bookworm"
- "ubuntu-22.04:docker://node:20-bookworm"
existingSecret: gitea-runner-token
gitea:
instance:
url: http://gitea-http.gitea.svc.cluster.local:3000
dind:
enabled: true # Docker-in-Docker sidecar (privileged)
```
**Resources**:
| Container | CPU Request | Memory Request | CPU Limit | Memory Limit |
|-----------|-------------|----------------|-----------|--------------|
| Runner | 250m | 256Mi | 1 | 1Gi |
| DinD sidecar | 250m | 256Mi | 1 | 1Gi |
**Secrets**: `gitea-runner-token` (SealedSecret) containing `token` (instance-level runner registration token from `/admin/runners`)
**Setup Steps**:
1. Get runner registration token from Gitea admin panel (`/admin/runners`)
2. Fill in `private/gitea-runner-token.yaml` with the token
3. Seal: `kubeseal --format yaml < private/gitea-runner-token.yaml > secrets/gitea-runner-token-sealed.yaml`
4. Commit and push — ArgoCD deploys runners automatically
**Verification**:
- `kubectl get statefulset -n gitea` — 3/3 runners ready
- Gitea admin panel (`/admin/runners`) — runners show as Online
- Create test workflow in `.gitea/workflows/test.yml` — job executes
### AI Code Review (ai-review)
**Type**: Gitea Actions workflow (`.gitea/workflows/ai-review.yaml`)
**Trigger**: `pull_request` events (`opened`, `synchronize`)
**Runner**: `ubuntu-latest` (container: `nikitafilonov/ai-review:latest`)
**Purpose**: Automated AI-powered code review on pull requests using Claude (Anthropic). Posts inline comments on changed lines and a PR summary comment highlighting infrastructure impact.
**Architecture**:
- Uses [xai-review](https://github.com/nicktechnologies/xai-review) Docker image
- Shared configuration and prompts live in the `shared-prompts` Git submodule (→ `Forte/ai-review-prompts`)
- Review mode: `ONLY_ADDED_WITH_CONTEXT` — reviews only new/changed lines plus surrounding context (token-efficient)
- Agent mode: disabled (one-shot review, no multi-turn reasoning)
- LLM: Claude Sonnet (`claude-sonnet-4-20250514`)
**Shared Prompts Structure** (submodule: `Forte/ai-review-prompts`):
```
shared-prompts/
base/
security.md # org-wide security rules (all profiles)
iac/
.ai-review.yaml # IaC/GitOps profile config
inline.md # inline review prompt
summary.md # PR summary prompt
# future profiles: backend/, frontend/, etc.
```
**Configuration** (`shared-prompts/iac/.ai-review.yaml`):
```yaml
llm:
provider: CLAUDE
model: claude-sonnet-4-20250514
vcs:
provider: GITEA
review:
mode: ONLY_ADDED_WITH_CONTEXT
agent:
enabled: false
prompt:
inline_prompt_files: # concatenated in order
- ./shared-prompts/base/security.md
- ./shared-prompts/iac/inline.md
summary_prompt_files:
- ./shared-prompts/iac/summary.md
ignore:
- "*.sealed.yaml"
- "*.lock"
- "docs/**"
```
**Custom Prompts** (IaC profile):
- `shared-prompts/base/security.md` — org-wide security rules, concatenated before every inline review prompt
- `shared-prompts/iac/inline.md` — IaC-specific inline review (YAML, Helm, K8s manifests, shell scripts), max 7 comments
- `shared-prompts/iac/summary.md` — PR summary: affected services/namespaces, infrastructure impact, security flags
**Prompt composition**: ai-review does not support Jinja includes. Instead, list multiple files under `inline_prompt_files` / `summary_prompt_files` — they are concatenated in order with double newlines.
**Adding a new profile**: Create a new directory (e.g., `backend/`) with its own `.ai-review.yaml`, `inline.md`, and `summary.md`. The `inline_prompt_files` list should include `base/security.md` first, then the profile-specific prompt. Reference it in the consuming repo's workflow: `AI_REVIEW_CONFIG_FILE_YAML=./shared-prompts/backend/.ai-review.yaml`
**Required Secrets** (configure in Gitea repo or org settings):
| Secret | Purpose |
|--------|---------|
| `ANTHROPIC_API_KEY` | Claude API key (from Anthropic console) |
| `AI_REVIEW_TOKEN` | Gitea API token with `write:repository` + `read:repository` scopes (use a bot/service account) |
**Setup Steps**:
1. Create a Gitea bot/service account and generate an API token with `write:repository` + `read:repository` scopes
2. Add `AI_REVIEW_TOKEN` secret in Gitea repo settings → Actions → Secrets
3. Add `ANTHROPIC_API_KEY` secret with your Anthropic API key
4. Ensure the `shared-prompts` submodule is initialized (`git submodule update --init`)
5. Push the workflow file — it triggers automatically on PR creation/update
**Verification**:
- Open a PR with infrastructure changes → workflow runs → inline comments + summary appear
- Check Gitea Actions tab for workflow run status and logs
- Monitor Anthropic usage dashboard for token consumption
### Keycloak Client Registrar
**Type**: CronJob (deployed via Keycloak Helm chart `extraDeploy`)
**Namespace**: `keycloak`
**Schedule**: `*/2 * * * *` (every 2 minutes)
**Purpose**: Handles two responsibilities:
1. **Legacy sync** — extracts secrets from Keycloak clients with `k8s.secret.sync: "true"` attribute (same as former PostSync syncer)
2. **Self-service registration** — processes config Secrets (cloned by Kyverno) to register new OIDC clients and sync their credentials
**How It Works**:
*Legacy path (existing clients like Gitea):*
1. Authenticates to Keycloak Admin API using admin credentials from `keycloak-credentials` secret
2. Queries all clients in the `forte` realm
3. Filters clients with `k8s.secret.sync: "true"` attribute
4. For each matching client, retrieves the auto-generated secret via Keycloak Admin API
5. Creates/updates a K8s Secret in the target namespace (from `k8s.secret.namespace` attribute)
6. Always writes a central copy to the `secrets` namespace
*Self-service path (new clients):*
1. Lists Secrets in `keycloak` namespace with label `keycloak.forteapps.net/client-config=true`
2. For each config Secret, parses `client.json` and computes a config hash
3. Skips if hash matches annotation and credential Secret already exists
4. Creates or updates the Keycloak client via Admin API
5. Fetches the generated client secret
6. Upserts credential Secret in target namespace + central `secrets` namespace
7. Annotates config Secret with sync status, config hash, and timestamp
**Resources**:
- `ServiceAccount`: `keycloak-client-registrar` (namespace: `keycloak`)
- `ClusterRole`: `keycloak-client-registrar` (secrets: get/list/create/update/patch; namespaces: get/list)
- `ClusterRoleBinding`: `keycloak-client-registrar`
- `CronJob`: `keycloak-client-registrar`
**Kyverno Policy**: `keycloak-client-config-cloner` — clones labeled Secrets from app namespaces to `keycloak` namespace (see [Kyverno Policies](#kyverno-policies))
**Legacy Client Attributes** (set in `forte-realm.json`):
| Attribute | Required | Default | Description |
|-----------|----------|---------|-------------|
| `k8s.secret.sync` | Yes | — | Set to `"true"` to enable syncing |
| `k8s.secret.namespace` | Yes | — | Target K8s namespace |
| `k8s.secret.name` | Yes | — | Name of the K8s Secret |
| `k8s.secret.client-id-key` | No | `client-id` | Field name for client ID in the Secret |
| `k8s.secret.client-secret-key` | No | `client-secret` | Field name for client secret in the Secret |
**Self-Service Config Secret Schema**:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: keycloak-client-<app>
namespace: <app-namespace>
labels:
keycloak.forteapps.net/client-config: "true"
stringData:
client.json: |
{
"clientId": "<app>",
"name": "<App Name>",
"redirectUris": ["https://<app>.forteapps.net/*"],
"webOrigins": ["https://<app>.forteapps.net"],
"defaultClientScopes": ["openid", "email", "profile"],
"protocolMappers": [],
"secret": {
"namespace": "<app-namespace>",
"name": "<app>-oidc-credentials",
"keys": { "clientId": "client-id", "clientSecret": "client-secret" }
}
}
```
**Created Credential Secret Format**:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: <target-name>
namespace: <target-namespace>
labels:
app.kubernetes.io/managed-by: keycloak-client-registrar
type: Opaque
data:
<client-id-key>: <base64-encoded client ID>
<client-secret-key>: <base64-encoded client secret>
```
**Config Secret Annotations** (set by registrar):
| Annotation | Description |
|-----------|-------------|
| `keycloak.forteapps.net/config-hash` | SHA-256 hash of client.json for change detection |
| `keycloak.forteapps.net/sync-status` | `synced` or `error` |
| `keycloak.forteapps.net/last-sync` | ISO 8601 timestamp of last successful sync |
**Verification**:
```bash
# Check CronJob status
kubectl get cronjobs -n keycloak
# View latest registrar logs
kubectl logs -n keycloak job/$(kubectl get jobs -n keycloak --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1].metadata.name}')
# Verify created secret
kubectl get secret <name> -n <namespace> -o yaml
# Check config Secret annotations (self-service)
kubectl get secret keycloak-client-<app> -n keycloak -o jsonpath='{.metadata.annotations}'
```
**See**: [Developer Guide - Adding a New Keycloak Client](DEVELOPER-GUIDE.md#adding-a-new-keycloak-client)
### Karpor
**Chart**: `karpor` from `https://kusionstack.github.io/charts`
**Version**: 0.7.6 (app v0.6.4)
**Namespace**: `karpor`
**Sync Wave**: 1
**Purpose**: Kubernetes visualization and intelligence tool. Provides cross-cluster resource search, compliance checking, and topology visualization. Gives platform engineers a unified view of all cluster resources and their relationships.
**Architecture** (4 components):
- **Server** — main Karpor API/UI (port 7443)
- **Syncer** — syncs cluster state into the search index
- **ElasticSearch** — search backend for resource indexing
- **etcd** — persistent key-value store (10Gi PVC)
**Configuration** (`infra/values/base/karpor-values.yaml`):
- `namespaceEnabled: false` — ArgoCD manages namespace creation
- Default resource limits tuned for small clusters
- ElasticSearch: 2 CPU / 4Gi memory (the heaviest component)
- AI features available but not enabled (requires `server.ai.authToken` + backend config)
**Access**: Port-forward to reach the UI:
```bash
kubectl port-forward svc/karpor-release-server -n karpor 7443:7443
# Open https://localhost:7443
```
### Renovate
**Chart**: `renovate` (OCI: `ghcr.io/renovatebot/charts`)
**Version**: 46.109.0 (app v43.113.0)
**Namespace**: `renovate`
**Sync Wave**: 2
**Purpose**: Automated dependency update bot. Runs as a CronJob that scans Gitea repositories for outdated dependencies and creates pull requests with updates.
**Configuration**:
```yaml
# infra/base/renovate.yaml + infra/values/base/renovate-values.yaml
cronjob:
schedule: "@daily"
concurrencyPolicy: Forbid
renovate:
config:
platform: gitea
endpoint: https://git.forteapps.net
autodiscover: true
gitAuthor: "Renovate Bot <renovate@forteapps.net>"
packageRules:
- matchRepositories: ["**/10x"]
assignees: ["edvard.unsvag"]
reviewers: ["edvard.unsvag"]
- matchRepositories: ["**/auth-sidecar"]
assignees: ["danijel.simeunovic"]
reviewers: ["danijel.simeunovic"]
- matchRepositories: ["**/forte-helm"]
assignees: ["danijel.simeunovic"]
reviewers: ["danijel.simeunovic"]
resources:
requests: { cpu: 500m, memory: 1Gi }
limits: { cpu: "2", memory: 4Gi }
```
**Note**: Assignees and reviewers are only applied at PR creation time. Existing PRs must be closed and recreated for new assignment rules to take effect.
**Secrets**: `renovate-env` (SealedSecret in `secrets` namespace, cloned by Kyverno) containing:
- `RENOVATE_TOKEN` — Gitea PAT with repo write + issue write permissions
- `RENOVATE_GITHUB_COM_TOKEN` — GitHub PAT (public_repo read-only) for changelog fetching
**Setup Steps**:
1. Fill in `private/renovate-env.yaml` with tokens
2. Seal: `kubeseal --format yaml < private/renovate-env.yaml > secrets/renovate-env-sealed.yaml`
3. Commit and push — ArgoCD deploys the CronJob, Kyverno clones the secret
**Verification**:
- `kubectl get cronjob -n renovate` — CronJob exists
- `kubectl create job --from=cronjob/renovate renovate-test -n renovate` — manual trigger
- `kubectl logs -n renovate job/renovate-test` — check logs
---
## Kyverno Policies
### Secret Cloner
**File**: `cluster-resources/policies/secret-cloner.yaml`
**Purpose**: Automatically clone secrets from `secrets` namespace to new namespaces
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: sync-secret-with-multi-clone
spec:
rules:
- name: clone-secret
match:
any:
- resources:
kinds:
- Namespace
generate:
apiVersion: v1
kind: Secret
name: "{{ request.object.metadata.name }}"
namespace: "{{ request.object.metadata.name }}"
synchronize: true
clone:
namespace: secrets
name: shared-credentials
```
**Label Requirement**: Secrets must have `allowedToBeCloned: "true"`
### Keycloak Client Config Cloner
**File**: `cluster-resources/policies/keycloak-client-cloner.yaml`
**Purpose**: Clones Secrets labeled `keycloak.forteapps.net/client-config: "true"` from app namespaces to the `keycloak` namespace. This allows apps to declare their OIDC client configuration in their own namespace, which the [Keycloak Client Registrar](#keycloak-client-registrar) then processes.
**Trigger**: Any Secret with label `keycloak.forteapps.net/client-config: "true"` created outside the `keycloak` namespace.
**Behavior**:
- Generates a copy of the Secret in the `keycloak` namespace with the same name
- Adds source tracking annotations (`keycloak.forteapps.net/source-namespace`, `keycloak.forteapps.net/source-name`)
- `synchronize: true` — changes to the source Secret are reflected in the clone
### Default Namespace Blocker
**File**: `cluster-resources/policies/default-ns-blocker.yaml`
**Purpose**: Prevent resources from being created in `default` namespace
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: disallow-default-namespace
spec:
validationFailureAction: enforce
rules:
- name: validate-namespace
match:
any:
- resources:
kinds:
- Pod
- Deployment
- Service
validate:
message: "Using 'default' namespace is not allowed"
pattern:
metadata:
namespace: "!default"
```
### Bare Pod Cleaner
**File**: `cluster-resources/policies/bare-pod-cleaner.yaml`
**Purpose**: Delete pods without ownerReferences (not managed by Deployment/StatefulSet)
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: cleanup-bare-pods
spec:
rules:
- name: delete-bare-pod
match:
any:
- resources:
kinds:
- Pod
preconditions:
all:
- key: "{{ request.object.metadata.ownerReferences[] || '' }}"
operator: Equals
value: ""
validate:
message: "Bare pods (without controllers) are not allowed"
deny: {}
```
### Auth Sidecar Injector
**File**: `cluster-resources/policies/auth-sidecar-injector.yaml`
**Purpose**: Automatically inject authentication sidecar into pods with authentication enabled
**Rules**: 6 rules in the policy
1. `generate-auth-tokens-secret` - Creates Secret for token mode
2. `generate-auth-oidc-secret` - Creates Secret for OIDC mode
3. `inject-sidecar-token` - Injects auth sidecar for token mode
4. `inject-sidecar-oidc` - Injects auth sidecar for OIDC mode
5. `inject-sidecar-mcp` - Injects auth sidecar for MCP OAuth mode (RFC 9728 / RFC 7591)
6. `generate-auth-network-policy` - Creates NetworkPolicy to restrict ingress
#### Trigger Annotation
```yaml
policies.forteapps.io/auth: "true"
```
#### Authentication Modes
**Token Mode** (default):
```yaml
# Annotations
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "token"
policies.forteapps.io/auth-token-secret-name: "auth-tokens"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"
# Optional customization
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```
**OIDC Mode**:
```yaml
# Annotations (required)
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "oidc"
policies.forteapps.io/auth-oidc-authority: "https://auth.example.com/realms/master"
policies.forteapps.io/auth-oidc-client-id: "myapp"
# Optional annotations
policies.forteapps.io/auth-oidc-callback-path: "/auth/callback"
policies.forteapps.io/auth-oidc-scopes: "openid,profile,email"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```
**MCP Mode** (OAuth 2.0 for MCP servers, implements RFC 9728 / RFC 7591):
```yaml
# Annotations (required)
policies.forteapps.io/auth: "true"
policies.forteapps.io/auth-type: "mcp"
policies.forteapps.io/auth-mcp-resource: "https://mcp.example.com"
policies.forteapps.io/auth-mcp-authority: "https://auth.example.com"
# Optional annotations
policies.forteapps.io/auth-mcp-scopes: "read,write"
policies.forteapps.io/auth-upstream-url: "http://localhost:3000"
policies.forteapps.io/auth-log-level: "info"
policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
policies.forteapps.io/auth-image-version: "latest"
```
#### Sidecar Container Specification
**Token Mode**:
```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
- containerPort: 8080
name: auth
protocol: TCP
env:
- name: AUTH_MODE
value: "token"
- name: AUTH_LISTEN_ADDR
value: ":8080"
- name: AUTH_UPSTREAM_URL
value: "http://localhost:3000"
- name: AUTH_TOKEN_FILE
value: "/etc/auth/tokens"
volumeMounts:
- name: auth-tokens
mountPath: /etc/auth
readOnly: true
resources:
requests:
cpu: 10m
memory: 32Mi
limits:
cpu: 50m
memory: 64Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: [ALL]
```
**OIDC Mode**:
```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
- containerPort: 8080
name: auth
protocol: TCP
env:
- name: AUTH_MODE
value: "oidc"
- name: AUTH_LISTEN_ADDR
value: ":8080"
- name: AUTH_UPSTREAM_URL
value: "http://localhost:3000"
- name: AUTH_OIDC_AUTHORITY
value: "https://auth.example.com/realms/master"
- name: AUTH_OIDC_CLIENT_ID
value: "myapp"
- name: AUTH_OIDC_CALLBACK_PATH
value: "/auth/callback"
- name: AUTH_OIDC_SCOPES
value: "openid,profile,email"
- name: AUTH_OIDC_COOKIE_SECRET
valueFrom:
secretKeyRef:
name: auth-oidc
key: cookie-secret
- name: AUTH_OIDC_CLIENT_SECRET
valueFrom:
secretKeyRef:
name: auth-oidc
key: client-secret
resources:
requests:
cpu: 10m
memory: 32Mi
limits:
cpu: 50m
memory: 64Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: [ALL]
```
**MCP Mode**:
```yaml
name: authn
image: ghcr.io/fortedigital/auth-sidecar:latest
ports:
- containerPort: 8080
name: auth
protocol: TCP
env:
- name: AUTH_MODE
value: "mcp"
- name: AUTH_LISTEN_ADDR
value: ":8080"
- name: AUTH_LOG_LEVEL
value: "info"
- name: AUTH_UPSTREAM_URL
value: "http://localhost:3000"
- name: AUTH_MCP_RESOURCE
value: "https://mcp.example.com"
- name: AUTH_MCP_AUTHORIZATION_SERVERS
value: "https://auth.example.com"
- name: AUTH_MCP_SCOPES_SUPPORTED
value: "read,write"
resources:
requests:
cpu: 10m
memory: 32Mi
limits:
cpu: 50m
memory: 64Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: [ALL]
```
#### Generated Resources
**Secret (Token Mode)**:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: auth-tokens
namespace: <app-namespace>
labels:
app.kubernetes.io/managed-by: kyverno
app.kubernetes.io/created-by: inject-auth-sidecar
type: Opaque
data: {} # Populated by Helm chart
```
**Secret (OIDC Mode)**:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: auth-oidc
namespace: <app-namespace>
labels:
app.kubernetes.io/managed-by: kyverno
app.kubernetes.io/created-by: inject-auth-sidecar
type: Opaque
data:
client-secret: <base64>
cookie-secret: <base64>
```
**NetworkPolicy**:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: <pod-name>-auth-ingress
namespace: <app-namespace>
labels:
app.kubernetes.io/managed-by: kyverno
app.kubernetes.io/created-by: inject-auth-sidecar
spec:
podSelector:
matchLabels: <pod-labels>
policyTypes:
- Ingress
ingress:
- ports:
- port: 8080
protocol: TCP
```
#### Excluded Namespaces
The policy does NOT apply to:
- `kube-system`
- `kyverno`
- `argocd`
- `cert-manager`
- `monitoring`
#### Health Checks
```yaml
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 2
periodSeconds: 5
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
```
#### Request Flow
```
External Request → Traefik
Service (port 8080)
Pod: Auth Sidecar (port 8080)
├─ Validate credentials
│ • Token mode: Check Bearer token
│ • OIDC mode: Validate session or redirect to IdP
│ • MCP mode: OAuth 2.0 via RFC 9728 discovery / RFC 7591 dynamic registration
Forward to Application (localhost:3000)
Application processes request
```
#### Forwarded Headers
After successful authentication, the sidecar injects user identity as HTTP headers before forwarding the request to the application container:
| Header | Description | Auth Modes |
|--------|-------------|------------|
| `X-Auth-User` | Username or display name | Token, OIDC, MCP |
| `X-Auth-Email` | User email address | OIDC |
| `X-Auth-Subject` | OIDC `sub` claim (stable user ID) | OIDC, MCP |
| `X-Auth-Groups` | Comma-separated group memberships | OIDC (if `groups` scope) |
| `X-Auth-Token` | The validated access token | All modes |
These headers are trustworthy because the auto-generated `NetworkPolicy` restricts pod ingress to the sidecar port only — external traffic cannot reach the application container directly, so headers cannot be spoofed.
Applications should read these headers to obtain authenticated user information (e.g. for display, authorisation decisions, or audit logging) instead of implementing their own authentication.
**See**: [Developer Guide - Accessing Authenticated User Information](DEVELOPER-GUIDE.md#accessing-authenticated-user-information) for code examples.
---
## Configuration Reference
### Environment Variables
Common environment variables used across applications:
| Variable | Purpose | Example |
|----------|---------|---------|
| `NODE_ENV` | Node.js environment | `production` |
| `PORT` | Application port | `3000` |
| `DB_HOST` | Database host | `postgres` |
| `DB_PORT` | Database port | `5432` |
| `DB_USER` | Database user | `app_user` |
| `DB_NAME` | Database name | `app_db` |
| `DB_PASSWORD` | Database password | From secret |
| `API_KEY` | External API key | From secret |
### Resource Limits
Recommended resource allocation:
| Application Type | CPU Request | Memory Request | CPU Limit | Memory Limit |
|------------------|-------------|----------------|-----------|--------------|
| **Lightweight API** | 100m | 128Mi | 500m | 512Mi |
| **Standard Web App** | 200m | 256Mi | 1000m | 1Gi |
| **Heavy Processing** | 500m | 512Mi | 2000m | 2Gi |
| **Database** | 250m | 256Mi | 1000m | 1Gi |
### Storage Classes
Storage classes are cloud-specific and configured in per-cluster value overrides (`infra/values/{cluster}/gitea-values.yaml`):
| Cloud | Storage Class | Driver |
|-------|--------------|--------|
| **UpCloud** | `upcloud-block-storage-maxiops` | UpCloud CSI |
| **AWS EKS** | `gp3` | EBS CSI |
| **Azure AKS** | `managed-csi-premium` | Azure Disk CSI |
| **GCP GKE** | `premium-rwo` | PD CSI |
```yaml
# Example: base values omit storageClass (set in per-cluster overlay)
persistence:
enabled: true
accessMode: ReadWriteOnce
size: 5Gi
# storageClass set by infra/values/{cluster}/gitea-values.yaml
```
---
## API Endpoints
### ArgoCD API
```
# Server
https://argocd.127.0.0.1.nip.io
# Applications endpoint
GET /api/v1/applications
# Application details
GET /api/v1/applications/{name}
# Sync application
POST /api/v1/applications/{name}/sync
```
### Prometheus API
```
# Query endpoint
GET /api/v1/query?query={promql}
# Query range
GET /api/v1/query_range?query={promql}&start={time}&end={time}&step={duration}
# Metrics
GET /api/v1/label/__name__/values
```
### Tempo API
```
# Search traces
GET /api/search?q={traceql}
# Get trace by ID
GET /api/traces/{traceID}
# Service tag values
GET /api/v2/search/tag/resource.service.name/values
```
### Loki API
```
# Query logs
GET /loki/api/v1/query?query={logql}
# Query range
GET /loki/api/v1/query_range?query={logql}&start={time}&end={time}
# Push logs
POST /loki/api/v1/push
```
---
## Cloud Overlay Pattern
### Overview
Cloud-specific configuration (StorageClass, LoadBalancer annotations, pricing models, etc.) lives in per-cloud overlay value files, **not** in `base/`. Adding a new cloud provider only requires a new overlay directory — no base changes.
### Supported Clouds
| Cloud | Dev overlay | Prod overlay | StorageClass | LB type |
|-------|-----------|-------------|-------------|---------|
| **UpCloud** | `upc-dev` | `upc-prod` | `upcloud-block-storage-maxiops` | UpCloud LB (proxy protocol v2) |
| **Azure AKS** | `aks-dev` | `aks-prod` | `managed-csi-premium` | Azure LB |
| **AWS EKS** | `eks-dev` | `eks-prod` | `gp3` | AWS NLB (proxy protocol) |
| **GCP GKE** | `gke-dev` | `gke-prod` | `premium-rwo` | GCP NEG |
Bootstrap any cluster with: `./bootstrap.sh <cluster>` (e.g., `./bootstrap.sh aks-dev`)
### How It Works
Each ArgoCD Application uses **multi-source Helm values** with two value files:
```yaml
# infra/base/gitea.yaml (example)
helm:
valueFiles:
- $values/infra/values/base/gitea-values.yaml # [0] cloud-agnostic
- $values/infra/values/upc-dev/gitea-values.yaml # [1] cloud-specific (default: upc-dev)
```
The `upc-prod` Kustomize overlay patches index `[1]` to swap the cloud-specific file:
```yaml
# infra/overlays/upc-prod/kustomization.yaml
- target:
kind: Application
name: gitea
patch: |
- op: replace
path: /spec/sources/0/helm/valueFiles/1
value: $values/infra/values/upc-prod/gitea-values.yaml
```
### Components Using Cloud Overlays
| Component | Cloud-specific config | Overlay value file |
|-----------|----------------------|-------------------|
| **Traefik** | LB annotations, proxy protocol IPs | `traefik-values.yaml` |
| **Keycloak** | Hostname, TLS settings | `keycloak-values.yaml` |
| **Grafana** | Hostname, datasource URLs | `grafana-values.yaml` |
| **Gitea** | StorageClass (persistence + PostgreSQL) | `gitea-values.yaml` |
| **OpenCost** | Custom pricing model (CPU/RAM/storage rates) | `opencost-values.yaml` |
### Backup CronJob
The `gitea-backup` CronJob uses a generic `s3` alias for `minio/mc`. The actual endpoint and credentials come from the `gitea-backup-s3` Sealed Secret, which is per-cloud. Reference scripts for different cloud providers are in `scripts/backup/`:
| Script | Provider | Tool |
|--------|----------|------|
| `s3-minio.sh` | S3-compatible (UpCloud, MinIO, Wasabi) | `minio/mc` |
| `aws-s3.sh` | AWS S3 | `aws` CLI |
| `azure-blob.sh` | Azure Blob Storage | `az` CLI |
| `gcp-gcs.sh` | GCP Cloud Storage | `gsutil` |
### Adding a New Cloud Provider
To add support for a new cloud (e.g., `oci-dev` for Oracle Cloud):
1. **Cluster config**: `clusters/oci-dev.yaml` — clusterName, domain, trustedIPs, cloudProvider
2. **Overlay value files** in `infra/values/oci-dev/`:
- `traefik-values.yaml` — LB annotations, proxy protocol config
- `keycloak-values.yaml` — hostname
- `grafana-values.yaml` — hostname
- `gitea-values.yaml``storageClass` for persistence + PostgreSQL
- `opencost-values.yaml` — pricing model or cloud billing integration
3. **Kustomize overlay**: `infra/overlays/oci-dev/kustomization.yaml` — patch `valueFiles[1]` for each Application
4. **App-of-apps**: `_app-of-apps-oci-dev.yaml` — points to `infra/overlays/oci-dev`
5. **Secrets overlay**: `secrets/overlays/oci-dev/kustomization.yaml` — references `../../base`, add cloud-specific SealedSecrets if needed
6. **Secrets patch**: Add patch to `infra/overlays/oci-dev/kustomization.yaml` to swap secrets path to `secrets/overlays/oci-dev`
7. **Bootstrap**: `./bootstrap.sh oci-dev`
---
## Glossary
### Terms
**App-of-Apps**: ArgoCD pattern where a parent Application manages child Applications
**GitOps**: Operations approach where Git is the single source of truth
**IngressRoute**: Traefik CRD for routing external traffic to services
**Multi-Source**: ArgoCD feature allowing multiple Git sources per Application
**SealedSecret**: Encrypted secret that can be safely stored in Git
**Sync Wave**: Ordered deployment using annotations
**Self-Heal**: ArgoCD automatically reverts manual cluster changes
**Prune**: Automatically delete resources removed from Git
---
## Annotations Reference
### ArgoCD Annotations
```yaml
# Sync wave (deployment order)
argocd.argoproj.io/sync-wave: "1"
# Refresh application
argocd.argoproj.io/refresh: "hard"
# Compare options
argocd.argoproj.io/compare-options: IgnoreExtraneous
# Sync options per resource
argocd.argoproj.io/sync-options: Prune=false
```
### Kyverno Annotations
```yaml
# Exclude from policy
policies.kyverno.io/exclude: "true"
# Severity
policies.kyverno.io/severity: high
```
### Custom Annotations
```yaml
# Authentication enabled
policies.forteapps.io/auth: "true"
# OIDC configuration
policies.forteapps.io/auth-oidc-authority: "https://..."
policies.forteapps.io/auth-oidc-client-id: "client-id"
```
---
## Labels Reference
### Standard Labels
```yaml
# Application name
app.kubernetes.io/name: myapp
# Application instance
app.kubernetes.io/instance: myapp
# Application version
app.kubernetes.io/version: "1.0.0"
# Component type
app.kubernetes.io/component: frontend
# Part of larger application
app.kubernetes.io/part-of: ecommerce
# Managed by
app.kubernetes.io/managed-by: argocd
```
### Custom Labels
```yaml
# Allow secret cloning
allowedToBeCloned: "true"
# Environment
environment: production
# Team ownership
team: platform
```
---
## Version Matrix
### Component Versions
| Component | Version | Chart Version |
|-----------|---------|---------------|
| **ArgoCD** | 2.9.0+ | Latest |
| **Traefik** | 2.10.0+ | Latest |
| **Cert-Manager** | 1.13.0+ | Latest |
| **Kyverno** | 1.10.0+ | Latest |
| **Sealed Secrets** | 0.24.0+ | Latest |
| **Prometheus** | 2.47.0+ | Latest |
| **Grafana** | 10.0.0+ | Latest |
| **Loki** | 2.9.0+ | Latest |
| **Tempo** | 2.6.0+ | 1.24.4 |
| **Fluent-Bit** | 2.1.0+ | Latest |
| **Gitea** | 1.25.4 | 12.5.0 |
| **Gitea Act Runner** | Latest | Latest |
| **Renovate** | v43.113.0 | 46.109.0 |
| **PostgreSQL** | 16-alpine | N/A |
| **Trivy** | Latest | Latest |
### Kubernetes Compatibility
- **Minimum**: 1.24+
- **Tested**: 1.28+
- **Recommended**: Latest stable
---
**Last Updated**: 2026-04-22
**Maintained By**: Platform Team
**Version**: 1.0.0