25 Commits

Author SHA1 Message Date
2a50028e51 merge
All checks were successful
AI Code Review / ai-review (pull_request) Successful in 10s
2026-06-04 15:26:20 +02:00
Sten
e5da47efb3 refactor(apps): move forte-drop apps from base to upc-dev overlay
forte-drop, forte-drop-mcp and forte-drop-postgresql lived under apps/base/
but were only ever wired into the upc-dev overlay (never listed in
apps/base/kustomization.yaml). They carry hackathon-domain hardcoded values
and must not sync to upc-prod, so they belong in the overlay alongside
dbunk-demo — per danijel.simeunovic's review on PR #18.

- git mv the three dirs into apps/overlays/upc-dev/ (history preserved)
- rewrite overlay kustomization refs from ../../base/forte-drop* to local
- repoint forte-drop-postgresql Application path
  apps/base/... -> apps/overlays/upc-dev/forte-drop-postgresql/resources

Render-verified: kubectl kustomize apps/overlays/upc-dev differs only by the
postgres path line; apps/overlays/upc-prod render byte-identical (forte-drop
never reaches prod).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 15:22:34 +02:00
Sten
e49c0928d2 refactor(apps): registrar-managed oidc creds, drop mcp client, DRY secret
Per platform review (danijel):
- keycloak-client-forte-drop: add the secret{} block telling the
  registrar where to write the credential Secret + key names
  (forte-drop-oidc-credentials, client-id/client-secret). The
  forte-helm oidc sidecar consumes that registrar-created Secret —
  no manual auth-oidc SealedSecret step (removed that NOTE).
- Delete keycloak-client-forte-drop-mcp: auth.type: mcp auto-registers
  the MCP client; no manual config needed.
- Re-seal forte-drop-secrets with all shared env (BASE_DOMAIN, PG*,
  S3_*, PASSWORD_GATE_SECRET) so both deployments get identical values
  via envSecretName (values extraEnv now carries only APP_MODE).
2026-06-04 15:22:05 +02:00
Sten
d83cbdc7ca chore(apps): clarify auth-oidc follow-up (drop commented-out resource line)
ai-review: a commented-out resource line reads as GitOps debt. Replace
the '# - auth-oidc-sealed.yaml' line with an explicit NOTE explaining
it's a deliberate post-deploy step (needs the registrar-generated
client-secret), not a disabled resource.
2026-06-04 15:22:05 +02:00
Sten
5913e0c4c0 refactor(apps): move forte-drop postgres from infra to apps
Per reviewer (danijel): forte-drop's DB deployment belongs in apps/,
not infra/. Straight relocation — same structure (Application +
resources/ subdir), source.path updated to apps/base/forte-drop-postgresql/resources,
wired into apps/overlays/upc-dev. Backup CronJob + RESTORE.md + sealed
pg creds move with it.

Consolidates the whole forte-drop deployment (postgres + web + mcp)
under apps/. The infra PR (#17) is now superseded by this.
2026-06-04 15:22:05 +02:00
Sten
6f6f8c1c55 fix(apps): explicit forte-drop namespace (sync-wave -1, Prune=false)
Codex review: the apps overlay applies namespaced resources
(keycloak-client Secrets, forte-drop-secrets, PDB) to forte-drop, but
no base created the namespace — first sync on a fresh cluster raced
ahead of the Applications' CreateNamespace and failed with
'namespaces forte-drop not found' until a retry.

Add an explicit Namespace at sync-wave -1 so it exists before the
wave-0 namespaced resources (covers both web + mcp bases via the
shared parent). Prune=false prevents the removal of a base from
cascade-deleting the namespace + postgres data + the other deployment.
2026-06-04 15:22:04 +02:00
Sten
6d25437e98 feat(apps): add forte-drop-secrets sealed secret
Sealed forte-drop-secrets with the real UpCloud Managed Object Storage
creds (existing drops bucket), PG creds matching the deployed
forte-drop-pg-creds, and PASSWORD_GATE_SECRET. Consumed by both web +
mcp deployments (envSecretName) and the pg-backup CronJob (S3 creds).
2026-06-04 15:22:04 +02:00
Sten
46f2d2d661 feat(apps): PodDisruptionBudget for forte-drop web (minAvailable 1) 2026-06-04 15:22:04 +02:00
c840dbb4b5 merge 2026-06-04 15:21:35 +02:00
Sten
a1a7c048c1 docs(apps): clarify mcp deployment needs no auth-oidc secret 2026-06-04 14:53:18 +02:00
Sten
d6e61c5663 feat(apps): forte-drop web + mcp ArgoCD applications
Two ArgoCD apps from the same forte-drop image:
- forte-drop (web): admin + public drops, sidecar in oidc mode,
  ingress drop-k8s.hackathon.forteapps.net.
- forte-drop-mcp (mcp): MCP-over-HTTP, sidecar in mcp mode,
  ingress mcp.drop-k8s.hackathon.forteapps.net.

Plus two labeled Keycloak client config Secrets — the registrar
creates the OIDC clients in the forte realm within ~2 min.

Sealed secrets (forte-drop-secrets + auth-oidc) added in a
follow-up commit by the maintainer:
  cd /Users/sten/dev/work/forte_k8/launchpad
  kubeseal --format=yaml \
    --controller-name=sealed-secrets-controller \
    --controller-namespace=kube-system \
    < private/forte-drop-secrets.yaml \
    > apps/base/forte-drop/forte-drop-secrets-sealed.yaml
  # auth-oidc: wait for registrar, copy client-secret into private/,
  # then seal as apps/base/forte-drop/auth-oidc-sealed.yaml.
  # (mcp deployment is sidecar type=mcp — no auth-oidc Secret needed;
  # only the web deployment requires it.)
2026-06-04 14:53:18 +02:00
dffb9c43f0 dbunk delete 2026-06-03 20:16:37 +02:00
33f0463c1f upc dev spec 2026-06-03 20:14:21 +02:00
a997a6b81e kc cleanup 2026-06-03 17:41:10 +02:00
071f57f1d3 kc cleanup 2026-06-03 17:39:02 +02:00
ecf871f0e4 kc fix 2026-06-03 17:36:29 +02:00
376d81a5ac keycloak client cleanup 2026-06-03 17:28:08 +02:00
428de7af78 tofu config and docs 2026-05-31 20:48:25 +02:00
24c59256c9 tofu+tools 2026-05-31 19:53:26 +02:00
e319295f62 bunker host 2026-05-29 22:06:08 +02:00
a7106bc8f4 new tls wildcard 2026-05-29 21:58:34 +02:00
6d874111da tenantID 2026-05-29 21:51:27 +02:00
a8cc103e4c dns01 2026-05-29 21:48:32 +02:00
Ghost
a9dbaf5354 feature/tofu (#15)
@thomas.solbjor here is an "import" of tofu from your repo, with adjustments to fit the patterns used here. It is also minimized to only create the cluster, with no managed services such as postgres etc. Take a look.

Co-authored-by: Danijel Simeunovic <danijel.simeunovic@fortedigital.com>
Reviewed-on: #15
Reviewed-by: Danijel Simeunovic <danijel.simeunovic@fortedigital.com>
Co-authored-by: Ghost <>
Co-committed-by: Ghost <>
2026-05-29 15:48:28 +00:00
6e175e9e8c docs 2026-05-29 15:20:51 +02:00
75 changed files with 3414 additions and 39 deletions

.gitignore

@@ -16,3 +16,11 @@ devbox.d/
devbox.lock
.devbox/
bash.exe.stackdump
# OpenTofu
.tofu/configs/*.env
.tofu/scripts/*.config
.tofu/platforms/**/.terraform/
.tofu/platforms/**/terraform.tfstate*
.tofu/platforms/**/tfplan
.tofu/platforms/**/.terraform.lock.hcl


@@ -0,0 +1,9 @@
# Azure AKS credentials — copy to aks.env and fill in values
# NEVER commit aks.env to git!
# Required
AZURE_TENANT_ID=your-azure-tenant-id
AZURE_SUBSCRIPTION_ID=your-azure-subscription-id
# Optional — defaults to cluster name if not set
ARM_RESOURCE_GROUP=


@@ -0,0 +1,10 @@
# AWS EKS credentials — copy to eks.env and fill in values
# NEVER commit eks.env to git!
# Required — AWS CLI profile or access key
AWS_PROFILE=default
AWS_REGION=eu-west-1
# Optional — override with explicit keys instead of profile
# AWS_ACCESS_KEY_ID=
# AWS_SECRET_ACCESS_KEY=


@@ -0,0 +1,9 @@
# GCP GKE credentials — copy to gke.env and fill in values
# NEVER commit gke.env to git!
# Required
GCP_PROJECT_ID=your-gcp-project-id
GCP_REGION=europe-west4
# Optional — path to service account JSON key (if not using gcloud auth)
# GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa-key.json


@@ -0,0 +1,8 @@
# UpCloud credentials — copy to upc.env and fill in values
# NEVER commit upc.env to git!
# Required
UPCLOUD_TOKEN=your-upcloud-api-token
# Optional — set after cluster creation for kubeconfig retrieval
UPCLOUD_CLUSTER_ID=


@@ -0,0 +1,18 @@
module "cluster" {
source = "../modules/cluster"
prefix = "clst-dev"
location = "norwayeast"
resource_group_name = "clst-dev-rg"
# AKS — small dev nodes
aks_node_vm_size = "Standard_B2s"
aks_node_count = 2
enable_delete_lock = false
tags = {
Environment = "dev"
ManagedBy = "tofu"
}
}


@@ -0,0 +1,26 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_name" {
value = module.cluster.cluster_name
}
output "resource_group_name" {
value = module.cluster.resource_group_name
}
output "kubernetes_version" {
value = module.cluster.kubernetes_version
}
output "location" {
value = module.cluster.location
}
output "oidc_issuer_url" {
value = module.cluster.oidc_issuer_url
}
output "kubeconfig" {
value = module.cluster.kubeconfig
sensitive = true
}


@@ -0,0 +1,17 @@
terraform {
required_version = ">= 1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
}
}
provider "azurerm" {
features {}
# Credentials via environment variables:
# ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET
# Or: az login (uses your Azure CLI session)
}
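
A note on the credentials comment above: azurerm 4.x requires a subscription ID on every provider instance, taken either from `ARM_SUBSCRIPTION_ID` or from the `subscription_id` argument. The example env file in this PR uses `AZURE_SUBSCRIPTION_ID`, which as far as I know the provider does not read by itself, so a wrapper script presumably maps it. A minimal sketch of the explicit alternative, assuming a hypothetical `subscription_id` variable that is not part of this PR:

```hcl
# Hypothetical variable, not defined anywhere in this PR.
variable "subscription_id" {
  description = "Azure subscription ID passed to the azurerm provider"
  type        = string
}

provider "azurerm" {
  features {}

  # Explicit alternative to exporting ARM_SUBSCRIPTION_ID.
  subscription_id = var.subscription_id
}
```

The value could then be supplied through `TF_VAR_subscription_id`, which keeps it out of the committed files in the same way the `.env` approach does.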


@@ -0,0 +1,72 @@
# Current Azure/Entra ID context — provides tenant_id used in outputs
data "azurerm_client_config" "current" {}
# ─── Resource Group ───────────────────────────────────────────────────
resource "azurerm_resource_group" "main" {
name = var.resource_group_name
location = var.location
tags = var.tags
}
resource "azurerm_management_lock" "main" {
count = var.enable_delete_lock ? 1 : 0
name = "${var.prefix}-delete-lock"
scope = azurerm_resource_group.main.id
lock_level = "CanNotDelete"
notes = "Prevents accidental deletion of production resources"
}
# ─── Networking ───────────────────────────────────────────────────────
resource "azurerm_virtual_network" "main" {
name = "${var.prefix}-vnet"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
address_space = [var.vnet_address_space]
tags = var.tags
}
# AKS nodes subnet
resource "azurerm_subnet" "aks" {
name = "${var.prefix}-aks-subnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = [var.aks_subnet_cidr]
}
# ─── AKS Cluster ──────────────────────────────────────────────────────
resource "azurerm_kubernetes_cluster" "main" {
name = "${var.prefix}-aks"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
dns_prefix = replace(var.prefix, "-", "")
kubernetes_version = var.aks_kubernetes_version
tags = var.tags
default_node_pool {
name = "system"
node_count = var.aks_node_count
vm_size = var.aks_node_vm_size
vnet_subnet_id = azurerm_subnet.aks.id
node_labels = {
prefix = var.prefix
role = "worker"
env = lookup(var.tags, "Environment", "dev")
}
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
network_policy = "azure"
}
# Enable Workload Identity for keyless Azure service access (MSI)
oidc_issuer_enabled = true
workload_identity_enabled = true
}
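
Because `azurerm_management_lock.main` uses `count`, it is a list with zero or one element, so other code cannot reference it as a single object. A minimal sketch of how the module could expose the lock ID safely; the output name `delete_lock_id` is an assumption, not something this PR defines:

```hcl
# Hypothetical output, to be placed inside the same module as the lock.
output "delete_lock_id" {
  description = "ID of the resource group delete lock, or null when enable_delete_lock is false"

  # one() returns the single element of a 0-or-1 element list,
  # or null when the list is empty.
  value = one(azurerm_management_lock.main[*].id)
}
```

Indexing with `[0]` would also work for prod, but it would fail in dev, where `enable_delete_lock = false` leaves the list empty.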


@@ -0,0 +1,32 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_name" {
description = "AKS cluster name"
value = azurerm_kubernetes_cluster.main.name
}
output "resource_group_name" {
description = "Resource group name"
value = azurerm_resource_group.main.name
}
output "kubernetes_version" {
description = "Kubernetes version"
value = azurerm_kubernetes_cluster.main.kubernetes_version
}
output "location" {
description = "Azure region"
value = azurerm_resource_group.main.location
}
output "oidc_issuer_url" {
description = "AKS OIDC issuer URL (for workload identity federation)"
value = azurerm_kubernetes_cluster.main.oidc_issuer_url
}
output "kubeconfig" {
description = "Kubeconfig for the AKS cluster"
value = azurerm_kubernetes_cluster.main.kube_config_raw
sensitive = true
}


@@ -0,0 +1,18 @@
terraform {
required_version = ">= 1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
azuread = {
source = "hashicorp/azuread"
version = "~> 3.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.0"
}
}
}


@@ -0,0 +1,56 @@
# ─── Cluster ─────────────────────────────────────────────────────────
variable "prefix" {
description = "Prefix for resource names"
type = string
}
variable "location" {
description = "Azure region (e.g., norwayeast, westeurope, northeurope)"
type = string
}
variable "resource_group_name" {
description = "Name of the Azure Resource Group to create"
type = string
}
variable "vnet_address_space" {
description = "Address space for the virtual network"
type = string
default = "10.100.0.0/16"
}
variable "aks_subnet_cidr" {
description = "CIDR block for the AKS node subnet"
type = string
default = "10.100.0.0/22"
}
variable "aks_node_vm_size" {
description = "VM size for AKS worker nodes (e.g., Standard_B2s, Standard_D4s_v3)"
type = string
}
variable "aks_node_count" {
description = "Number of AKS worker nodes"
type = number
}
variable "aks_kubernetes_version" {
description = "Kubernetes version for AKS (null = latest stable)"
type = string
default = null
}
variable "enable_delete_lock" {
description = "Protect the resource group from accidental deletion"
type = bool
default = false
}
variable "tags" {
description = "Tags applied to all resources"
type = map(string)
default = {}
}


@@ -0,0 +1,18 @@
module "cluster" {
source = "../modules/cluster"
prefix = "clst"
location = "westeurope"
resource_group_name = "clst-prod-rg"
# AKS — general-purpose nodes for production
aks_node_vm_size = "Standard_D4s_v3"
aks_node_count = 3
enable_delete_lock = true
tags = {
Environment = "prod"
ManagedBy = "tofu"
}
}


@@ -0,0 +1,26 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_name" {
value = module.cluster.cluster_name
}
output "resource_group_name" {
value = module.cluster.resource_group_name
}
output "kubernetes_version" {
value = module.cluster.kubernetes_version
}
output "location" {
value = module.cluster.location
}
output "oidc_issuer_url" {
value = module.cluster.oidc_issuer_url
}
output "kubeconfig" {
value = module.cluster.kubeconfig
sensitive = true
}


@@ -0,0 +1,17 @@
terraform {
required_version = ">= 1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
}
}
provider "azurerm" {
features {}
# Credentials via environment variables:
# ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET
# Or: az login (uses your Azure CLI session)
}


@@ -0,0 +1,173 @@
# =============================================================================
# Azure Workload Cluster
# =============================================================================
# A lean AKS cluster for running application workloads. No managed data
# services — those live on the platform cluster. ArgoCD (on the platform
# cluster) deploys apps to this cluster via the app-of-apps pattern.
#
# Platform components deployed by deploy-workload.sh:
# nginx-ingress, cert-manager, external-dns, external-secrets, alloy
#
# Usage:
# tofu init && tofu plan && tofu apply
# ./sync-tofu-outputs.sh --env azure-workload
# ./deploy-workload.sh --env azure-workload
# =============================================================================
variable "prefix" {
description = "Prefix for resource names (e.g., clst-workload)"
type = string
default = "clst-workload"
}
variable "location" {
description = "Azure region"
type = string
default = "norwayeast"
}
variable "resource_group_name" {
description = "Name of the Azure Resource Group to create"
type = string
default = "clst-workload-rg"
}
variable "vnet_address_space" {
description = "Address space for the virtual network"
type = string
default = "10.110.0.0/16"
}
variable "aks_subnet_cidr" {
description = "CIDR block for the AKS node subnet"
type = string
default = "10.110.0.0/22"
}
variable "aks_node_vm_size" {
description = "VM size for AKS worker nodes"
type = string
default = "Standard_B2s"
}
variable "aks_node_count" {
description = "Number of AKS worker nodes"
type = number
default = 2
}
variable "aks_kubernetes_version" {
description = "Kubernetes version for AKS (null = latest stable)"
type = string
default = null
}
variable "domain" {
description = "Public domain name — must have an existing Azure DNS zone"
type = string
}
variable "dns_zone_resource_group" {
description = "Resource group containing the Azure DNS zone (defaults to cluster RG)"
type = string
default = ""
}
variable "tags" {
description = "Tags applied to all resources"
type = map(string)
default = {
Environment = "workload"
ManagedBy = "tofu"
}
}
# ─── Resource Group ───────────────────────────────────────────────────
resource "azurerm_resource_group" "main" {
name = var.resource_group_name
location = var.location
tags = var.tags
}
# ─── Networking ───────────────────────────────────────────────────────
resource "azurerm_virtual_network" "main" {
name = "${var.prefix}-vnet"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
address_space = [var.vnet_address_space]
tags = var.tags
}
resource "azurerm_subnet" "aks" {
name = "${var.prefix}-aks-subnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = [var.aks_subnet_cidr]
}
# ─── AKS Cluster ──────────────────────────────────────────────────────
resource "azurerm_kubernetes_cluster" "main" {
name = "${var.prefix}-aks"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
dns_prefix = replace(var.prefix, "-", "")
kubernetes_version = var.aks_kubernetes_version
tags = var.tags
default_node_pool {
name = "system"
node_count = var.aks_node_count
vm_size = var.aks_node_vm_size
vnet_subnet_id = azurerm_subnet.aks.id
node_labels = {
prefix = var.prefix
role = "worker"
env = lookup(var.tags, "Environment", "workload")
}
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
network_policy = "azure"
}
oidc_issuer_enabled = true
workload_identity_enabled = true
}
# ─── External-DNS Workload Identity ──────────────────────────────────
# Allows external-dns to manage Azure DNS records for app ingresses.
data "azurerm_dns_zone" "main" {
name = var.domain
resource_group_name = var.dns_zone_resource_group != "" ? var.dns_zone_resource_group : azurerm_resource_group.main.name
}
resource "azurerm_user_assigned_identity" "external_dns" {
name = "${var.prefix}-external-dns-identity"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
tags = var.tags
}
resource "azurerm_role_assignment" "external_dns_dns_contributor" {
scope = data.azurerm_dns_zone.main.id
role_definition_name = "DNS Zone Contributor"
principal_id = azurerm_user_assigned_identity.external_dns.principal_id
}
resource "azurerm_federated_identity_credential" "external_dns" {
name = "${var.prefix}-external-dns-fedcred"
resource_group_name = azurerm_resource_group.main.name
parent_id = azurerm_user_assigned_identity.external_dns.id
audience = ["api://AzureADTokenExchange"]
issuer = azurerm_kubernetes_cluster.main.oidc_issuer_url
subject = "system:serviceaccount:external-dns:external-dns"
}
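
The federated credential only matches tokens whose subject is exactly `system:serviceaccount:<namespace>:<name>`, so the hardcoded string has to stay in sync with wherever external-dns is installed. A minimal sketch that derives the subject from two locals and shows the Kubernetes side of the link. It is a fragment meant to sit next to the resources in this file, and it assumes the `hashicorp/kubernetes` provider is configured (it is not in this PR) and that the external-dns install does not already create its own ServiceAccount:

```hcl
locals {
  # Hypothetical names; they mirror the hardcoded subject in the credential.
  external_dns_namespace       = "external-dns"
  external_dns_service_account = "external-dns"

  # Could replace the literal in azurerm_federated_identity_credential.external_dns.subject.
  external_dns_subject = "system:serviceaccount:${local.external_dns_namespace}:${local.external_dns_service_account}"
}

resource "kubernetes_service_account_v1" "external_dns" {
  metadata {
    name      = local.external_dns_service_account
    namespace = local.external_dns_namespace

    annotations = {
      # Tells the workload identity webhook which managed identity to use.
      "azure.workload.identity/client-id" = azurerm_user_assigned_identity.external_dns.client_id
    }
  }
}
```

The external-dns pods additionally need the label `azure.workload.identity/use: "true"` for the token to be injected. In this repo the client ID is exported as an output instead, presumably so that `deploy-workload.sh` can set the annotation itself.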


@@ -0,0 +1,4 @@
output "cluster_name" { value = azurerm_kubernetes_cluster.main.name }
output "resource_group_name" { value = azurerm_resource_group.main.name }
output "location" { value = azurerm_resource_group.main.location }
output "external_dns_identity_client_id" { value = azurerm_user_assigned_identity.external_dns.client_id }


@@ -0,0 +1,21 @@
terraform {
required_version = ">= 1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.0"
}
}
}
provider "azurerm" {
features {}
# Credentials via environment variables:
# ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET
# Or: az login (uses your Azure CLI session)
}


@@ -0,0 +1,21 @@
module "cluster" {
source = "../modules/cluster"
region = var.region
prefix = "clst-dev"
# VPC
availability_zones = ["${var.region}a", "${var.region}b"]
# EKS — small dev nodes
node_instance_type = "t3.medium"
node_count = 2
node_min_count = 1
node_max_count = 4
kubernetes_version = "1.30"
tags = {
Environment = "dev"
ManagedBy = "tofu"
}
}
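
The `availability_zones` list above is built by appending `a` and `b` to the region name, which assumes those zone suffixes exist and are enabled for the account. A hedged alternative sketch that asks AWS for the zones instead; the local name `azs` is illustrative only:

```hcl
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  # First two zones AWS reports as available in the provider's region.
  azs = slice(data.aws_availability_zones.available.names, 0, 2)
}
```

The module call would then pass `availability_zones = local.azs`. The hardcoded suffixes have the advantage that the subnet layout never changes behind your back, so this is a trade-off rather than a clear improvement.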


@@ -0,0 +1,5 @@
output "cluster_name" { value = module.cluster.cluster_name }
output "aws_region" { value = module.cluster.aws_region }
output "oidc_issuer_url" { value = module.cluster.oidc_issuer_url }
output "oidc_provider_arn" { value = module.cluster.oidc_provider_arn }
output "vpc_id" { value = module.cluster.vpc_id }


@@ -0,0 +1,24 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
# Authentication: set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
# or configure an AWS profile: export AWS_PROFILE=clst
provider "aws" {
region = var.region
}
variable "region" {
description = "AWS region for dev environment"
type = string
default = "eu-west-1"
}


@@ -0,0 +1,207 @@
# ─── VPC ──────────────────────────────────────────────────────────────
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(var.tags, { Name = "${var.prefix}-vpc" })
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = merge(var.tags, { Name = "${var.prefix}-igw" })
}
# Public subnets (one per AZ) — for NAT gateways and load balancers
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.prefix}-public-${count.index + 1}"
"kubernetes.io/cluster/${var.prefix}-eks" = "shared"
"kubernetes.io/role/elb" = "1"
})
}
# Private subnets (one per AZ) — for EKS nodes
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index + length(var.availability_zones))
availability_zone = var.availability_zones[count.index]
tags = merge(var.tags, {
Name = "${var.prefix}-private-${count.index + 1}"
"kubernetes.io/cluster/${var.prefix}-eks" = "shared"
"kubernetes.io/role/internal-elb" = "1"
})
}
# NAT Gateway (single, in first public subnet — use one per AZ for prod HA)
resource "aws_eip" "nat" {
domain = "vpc"
tags = merge(var.tags, { Name = "${var.prefix}-nat-eip" })
}
resource "aws_nat_gateway" "main" {
allocation_id = aws_eip.nat.id
subnet_id = aws_subnet.public[0].id
tags = merge(var.tags, { Name = "${var.prefix}-nat" })
depends_on = [aws_internet_gateway.main]
}
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = merge(var.tags, { Name = "${var.prefix}-public-rt" })
}
resource "aws_route_table_association" "public" {
count = length(var.availability_zones)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table" "private" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main.id
}
tags = merge(var.tags, { Name = "${var.prefix}-private-rt" })
}
resource "aws_route_table_association" "private" {
count = length(var.availability_zones)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private.id
}
# ─── EKS Cluster ──────────────────────────────────────────────────────
resource "aws_iam_role" "eks_cluster" {
name_prefix = "${var.prefix}-eks-cluster-"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "eks.amazonaws.com" }
}]
})
tags = var.tags
}
resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.eks_cluster.name
}
resource "aws_eks_cluster" "main" {
name = "${var.prefix}-eks"
role_arn = aws_iam_role.eks_cluster.arn
version = var.kubernetes_version
vpc_config {
subnet_ids = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
endpoint_private_access = true
endpoint_public_access = true
}
# Cluster access: accept both EKS access entries (API) and the aws-auth ConfigMap.
# The IAM OIDC provider needed for IRSA is created separately below.
access_config {
authentication_mode = "API_AND_CONFIG_MAP"
}
tags = var.tags
depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy]
}
# OIDC provider — required for IRSA (IAM Roles for Service Accounts)
data "tls_certificate" "eks" {
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
resource "aws_iam_openid_connect_provider" "eks" {
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
tags = var.tags
}
# EKS Node Group
resource "aws_iam_role" "eks_nodes" {
name_prefix = "${var.prefix}-eks-nodes-"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "ec2.amazonaws.com" }
}]
})
tags = var.tags
}
resource "aws_iam_role_policy_attachment" "eks_worker_node_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.eks_nodes.name
}
resource "aws_iam_role_policy_attachment" "eks_cni_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.eks_nodes.name
}
resource "aws_iam_role_policy_attachment" "eks_ecr_readonly" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.eks_nodes.name
}
resource "aws_eks_node_group" "main" {
cluster_name = aws_eks_cluster.main.name
node_group_name = "${var.prefix}-nodes"
node_role_arn = aws_iam_role.eks_nodes.arn
subnet_ids = aws_subnet.private[*].id
instance_types = [var.node_instance_type]
scaling_config {
desired_size = var.node_count
max_size = var.node_max_count
min_size = var.node_min_count
}
update_config {
max_unavailable = 1
}
tags = var.tags
depends_on = [
aws_iam_role_policy_attachment.eks_worker_node_policy,
aws_iam_role_policy_attachment.eks_cni_policy,
aws_iam_role_policy_attachment.eks_ecr_readonly,
]
}
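
The subnet layout in this module follows from `cidrsubnet(var.vpc_cidr, 4, index)`: adding 4 bits to a /16 gives /20 blocks, public subnets take indexes 0 to N-1 and private subnets take N to 2N-1, where N is the number of zones. A small standalone sketch with the default `10.100.0.0/16` and two zones; the local and output names are illustrative:

```hcl
locals {
  vpc_cidr = "10.100.0.0/16"
  az_count = 2

  # Same expressions as aws_subnet.public and aws_subnet.private.
  public_cidrs  = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 4, i)]
  private_cidrs = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 4, i + local.az_count)]
}

output "public_cidrs" {
  value = local.public_cidrs # ["10.100.0.0/20", "10.100.16.0/20"]
}

output "private_cidrs" {
  value = local.private_cidrs # ["10.100.32.0/20", "10.100.48.0/20"]
}
```

One consequence worth keeping in mind: the private indexes depend on the zone count, so growing an existing environment from two to three zones would move the private subnets to different CIDR blocks and force them to be recreated.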


@@ -0,0 +1,26 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_name" {
description = "EKS cluster name"
value = aws_eks_cluster.main.name
}
output "aws_region" {
description = "AWS region"
value = var.region
}
output "oidc_issuer_url" {
description = "EKS OIDC issuer URL (for IRSA)"
value = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
output "oidc_provider_arn" {
description = "IAM OIDC provider ARN (for IRSA trust policies)"
value = aws_iam_openid_connect_provider.eks.arn
}
output "vpc_id" {
description = "VPC ID"
value = aws_vpc.main.id
}
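
The two OIDC outputs exist so that callers can write IRSA trust policies. A minimal sketch of such a policy in an environment root that calls this module as `module.cluster`; the role name and the `external-dns` ServiceAccount are assumptions chosen for illustration, and nothing in this PR creates them:

```hcl
locals {
  # IAM condition keys use the issuer host and path without the scheme.
  oidc_issuer_host = trimprefix(module.cluster.oidc_issuer_url, "https://")
}

data "aws_iam_policy_document" "external_dns_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [module.cluster.oidc_provider_arn]
    }

    # Only this ServiceAccount (namespace:name) may assume the role.
    condition {
      test     = "StringEquals"
      variable = "${local.oidc_issuer_host}:sub"
      values   = ["system:serviceaccount:external-dns:external-dns"]
    }

    condition {
      test     = "StringEquals"
      variable = "${local.oidc_issuer_host}:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}

# Hypothetical role; permissions policies would be attached separately.
resource "aws_iam_role" "external_dns" {
  name_prefix        = "clst-external-dns-"
  assume_role_policy = data.aws_iam_policy_document.external_dns_trust.json
}
```

The ServiceAccount would then carry the annotation `eks.amazonaws.com/role-arn` with this role's ARN.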


@@ -0,0 +1,12 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}


@@ -0,0 +1,61 @@
# ─── Region ──────────────────────────────────────────────────────────
variable "region" {
description = "AWS region (e.g., eu-west-1, us-east-1)"
type = string
}
variable "prefix" {
description = "Prefix for resource names (e.g., clst-dev)"
type = string
}
# ─── Networking ───────────────────────────────────────────────────────
variable "vpc_cidr" {
description = "VPC CIDR block"
type = string
default = "10.100.0.0/16"
}
variable "availability_zones" {
description = "List of AZs for subnets (2-3 recommended)"
type = list(string)
}
# ─── EKS Cluster ─────────────────────────────────────────────────────
variable "node_instance_type" {
description = "EKS node instance type (e.g., t3.medium, m5.xlarge)"
type = string
}
variable "node_count" {
description = "Desired number of EKS worker nodes"
type = number
}
variable "node_min_count" {
description = "Minimum number of EKS worker nodes"
type = number
default = 1
}
variable "node_max_count" {
description = "Maximum number of EKS worker nodes"
type = number
}
variable "kubernetes_version" {
description = "Kubernetes version for EKS (e.g., \"1.30\")"
type = string
default = "1.30"
}
# ─── Tags ─────────────────────────────────────────────────────────────
variable "tags" {
description = "Tags applied to all resources"
type = map(string)
default = {}
}
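
EKS needs subnets in at least two Availability Zones, so a single-element `availability_zones` list would pass `plan` and only fail late, when the cluster is created. A minimal sketch of a `validation` block that could be added to the variable to fail earlier; the error wording is my own:

```hcl
variable "availability_zones" {
  description = "List of AZs for subnets"
  type        = list(string)

  validation {
    condition     = length(var.availability_zones) >= 2
    error_message = "Provide at least two availability zones; EKS requires subnets in two or more AZs."
  }
}
```

The check only looks at the variable itself, so it should work on any OpenTofu or Terraform version that supports custom variable validation.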


@@ -0,0 +1,21 @@
module "cluster" {
source = "../modules/cluster"
region = var.region
prefix = "clst"
# VPC
availability_zones = ["${var.region}a", "${var.region}b", "${var.region}c"]
# EKS — general-purpose nodes for production
node_instance_type = "m5.xlarge"
node_count = 3
node_min_count = 3
node_max_count = 6
kubernetes_version = "1.30"
tags = {
Environment = "prod"
ManagedBy = "tofu"
}
}
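
This prod environment spans three zones, while the shared module creates a single NAT gateway and notes in a comment that prod HA would want one per AZ. A hedged sketch of what that change could look like inside the module, replacing the existing `aws_eip.nat`, `aws_nat_gateway.main`, `aws_route_table.private` and its association; nothing in this PR implements it:

```hcl
resource "aws_eip" "nat" {
  count  = length(var.availability_zones)
  domain = "vpc"
  tags   = merge(var.tags, { Name = "${var.prefix}-nat-eip-${count.index + 1}" })
}

# One NAT gateway per AZ, each in that AZ's public subnet.
resource "aws_nat_gateway" "main" {
  count         = length(var.availability_zones)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  tags          = merge(var.tags, { Name = "${var.prefix}-nat-${count.index + 1}" })
  depends_on    = [aws_internet_gateway.main]
}

# One private route table per AZ so traffic leaves through the local NAT gateway.
resource "aws_route_table" "private" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = merge(var.tags, { Name = "${var.prefix}-private-rt-${count.index + 1}" })
}

resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
```

Each NAT gateway is billed separately, so a module variable that switches between the single and per-AZ layouts may be preferable to applying this to dev as well.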


@@ -0,0 +1,5 @@
output "cluster_name" { value = module.cluster.cluster_name }
output "aws_region" { value = module.cluster.aws_region }
output "oidc_issuer_url" { value = module.cluster.oidc_issuer_url }
output "oidc_provider_arn" { value = module.cluster.oidc_provider_arn }
output "vpc_id" { value = module.cluster.vpc_id }


@@ -0,0 +1,22 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
provider "aws" {
region = var.region
}
variable "region" {
description = "AWS region for prod environment"
type = string
default = "eu-west-1"
}


@@ -0,0 +1,339 @@
# =============================================================================
# AWS Workload Cluster
# =============================================================================
# A lean EKS cluster for running application workloads. No managed data
# services — those live on the platform cluster. ArgoCD (on the platform
# cluster) deploys apps to this cluster via the app-of-apps pattern.
#
# Platform components deployed by deploy-workload.sh:
# nginx-ingress, cert-manager, external-dns, external-secrets, alloy
#
# Usage:
# tofu init && tofu plan && tofu apply
# ./sync-tofu-outputs.sh --env aws-workload
# ./deploy-workload.sh --env aws-workload
# =============================================================================
variable "prefix" {
description = "Prefix for resource names (e.g., clst-workload)"
type = string
default = "clst-workload"
}
variable "availability_zones" {
description = "List of AZs for subnets"
type = list(string)
default = ["eu-west-1a", "eu-west-1b"]
}
variable "vpc_cidr" {
description = "VPC CIDR block"
type = string
default = "10.110.0.0/16"
}
variable "node_instance_type" {
description = "EKS node instance type"
type = string
default = "t3.medium"
}
variable "node_count" {
description = "Desired number of EKS worker nodes"
type = number
default = 2
}
variable "node_min_count" {
description = "Minimum number of EKS worker nodes"
type = number
default = 1
}
variable "node_max_count" {
description = "Maximum number of EKS worker nodes"
type = number
default = 4
}
variable "kubernetes_version" {
description = "Kubernetes version for EKS"
type = string
default = "1.30"
}
variable "domain" {
description = "Public domain name — must have an existing Route53 hosted zone"
type = string
}
variable "tags" {
description = "Tags applied to all resources"
type = map(string)
default = {
Environment = "workload"
ManagedBy = "tofu"
}
}
# ─── VPC ──────────────────────────────────────────────────────────────
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(var.tags, { Name = "${var.prefix}-vpc" })
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = merge(var.tags, { Name = "${var.prefix}-igw" })
}
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.prefix}-public-${count.index + 1}"
"kubernetes.io/cluster/${var.prefix}-eks" = "shared"
"kubernetes.io/role/elb" = "1"
})
}
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index + length(var.availability_zones))
availability_zone = var.availability_zones[count.index]
tags = merge(var.tags, {
Name = "${var.prefix}-private-${count.index + 1}"
"kubernetes.io/cluster/${var.prefix}-eks" = "shared"
"kubernetes.io/role/internal-elb" = "1"
})
}
resource "aws_eip" "nat" {
domain = "vpc"
tags = merge(var.tags, { Name = "${var.prefix}-nat-eip" })
}
resource "aws_nat_gateway" "main" {
allocation_id = aws_eip.nat.id
subnet_id = aws_subnet.public[0].id
tags = merge(var.tags, { Name = "${var.prefix}-nat" })
depends_on = [aws_internet_gateway.main]
}
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = merge(var.tags, { Name = "${var.prefix}-public-rt" })
}
resource "aws_route_table_association" "public" {
count = length(var.availability_zones)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table" "private" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main.id
}
tags = merge(var.tags, { Name = "${var.prefix}-private-rt" })
}
resource "aws_route_table_association" "private" {
count = length(var.availability_zones)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private.id
}
# ─── EKS Cluster ──────────────────────────────────────────────────────
resource "aws_iam_role" "eks_cluster" {
name_prefix = "${var.prefix}-eks-cluster-"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "eks.amazonaws.com" }
}]
})
tags = var.tags
}
resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.eks_cluster.name
}
resource "aws_eks_cluster" "main" {
name = "${var.prefix}-eks"
role_arn = aws_iam_role.eks_cluster.arn
version = var.kubernetes_version
vpc_config {
subnet_ids = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
endpoint_private_access = true
endpoint_public_access = true
}
access_config {
authentication_mode = "API_AND_CONFIG_MAP"
}
tags = var.tags
depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy]
}
# OIDC provider — required for IRSA
data "tls_certificate" "eks" {
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
resource "aws_iam_openid_connect_provider" "eks" {
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
tags = var.tags
}
resource "aws_iam_role" "eks_nodes" {
name_prefix = "${var.prefix}-eks-nodes-"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "ec2.amazonaws.com" }
}]
})
tags = var.tags
}
resource "aws_iam_role_policy_attachment" "eks_worker_node_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.eks_nodes.name
}
resource "aws_iam_role_policy_attachment" "eks_cni_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.eks_nodes.name
}
resource "aws_iam_role_policy_attachment" "eks_ecr_readonly" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.eks_nodes.name
}
resource "aws_eks_node_group" "main" {
cluster_name = aws_eks_cluster.main.name
node_group_name = "${var.prefix}-nodes"
node_role_arn = aws_iam_role.eks_nodes.arn
subnet_ids = aws_subnet.private[*].id
instance_types = [var.node_instance_type]
scaling_config {
desired_size = var.node_count
max_size = var.node_max_count
min_size = var.node_min_count
}
update_config {
max_unavailable = 1
}
tags = var.tags
depends_on = [
aws_iam_role_policy_attachment.eks_worker_node_policy,
aws_iam_role_policy_attachment.eks_cni_policy,
aws_iam_role_policy_attachment.eks_ecr_readonly,
]
}
# ─── External-DNS IRSA ───────────────────────────────────────────────
# Allows external-dns to manage Route53 records for app ingresses.
data "aws_route53_zone" "main" {
name = var.domain
private_zone = false
}
data "aws_iam_policy_document" "external_dns_assume_role" {
statement {
effect = "Allow"
principals {
type = "Federated"
identifiers = [aws_iam_openid_connect_provider.eks.arn]
}
actions = ["sts:AssumeRoleWithWebIdentity"]
condition {
test = "StringEquals"
variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub"
values = ["system:serviceaccount:external-dns:external-dns"]
}
condition {
test = "StringEquals"
variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:aud"
values = ["sts.amazonaws.com"]
}
}
}
resource "aws_iam_role" "external_dns_irsa" {
name_prefix = "${var.prefix}-external-dns-irsa-"
assume_role_policy = data.aws_iam_policy_document.external_dns_assume_role.json
tags = var.tags
}
data "aws_iam_policy_document" "external_dns_route53" {
statement {
effect = "Allow"
actions = ["route53:ChangeResourceRecordSets"]
resources = ["arn:aws:route53:::hostedzone/${data.aws_route53_zone.main.zone_id}"]
}
statement {
effect = "Allow"
actions = ["route53:ListHostedZones", "route53:ListResourceRecordSets", "route53:ListTagsForResource"]
resources = ["*"]
}
}
resource "aws_iam_role_policy" "external_dns_route53" {
name_prefix = "${var.prefix}-external-dns-route53-"
role = aws_iam_role.external_dns_irsa.id
policy = data.aws_iam_policy_document.external_dns_route53.json
}
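
Note on the subnet layout in this file: `cidrsubnet(var.vpc_cidr, 4, n)` adds 4 bits to the VPC prefix, so with the default `10.110.0.0/16` it yields the n-th /20. Public subnets take indices 0..N-1 and private subnets N..2N-1, where N is the number of availability zones. The following is a minimal shell sketch of that arithmetic, assuming a /16 base; the helper name `subnet_20` is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: mimics cidrsubnet("<a.b>.0.0/16", 4, n).
# Each /20 spans 16 values of the third octet, so network n starts at n * 16.
subnet_20() {
  # $1 = first two octets of the /16 (e.g. 10.110), $2 = netnum (0..15)
  echo "$1.$(( $2 * 16 )).0/20"
}

az_count=2   # length(var.availability_zones) with the defaults
i=0
while [ "$i" -lt "$az_count" ]; do
  echo "public[$i]  = $(subnet_20 10.110 "$i")"
  echo "private[$i] = $(subnet_20 10.110 $(( i + az_count )))"
  i=$(( i + 1 ))
done
```

With the defaults this prints 10.110.0.0/20 and 10.110.16.0/20 for the public subnets and 10.110.32.0/20 and 10.110.48.0/20 for the private ones, which leaves twelve of the sixteen /20 blocks unused for later growth.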

@@ -0,0 +1,3 @@
output "cluster_name" { value = aws_eks_cluster.main.name }
output "aws_region" { value = var.region }
output "external_dns_irsa_role_arn" { value = aws_iam_role.external_dns_irsa.arn }

@@ -0,0 +1,24 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
# Authentication: set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
# or configure an AWS profile: export AWS_PROFILE=clst
provider "aws" {
region = var.region
}
variable "region" {
description = "AWS region for the workload environment"
type = string
default = "eu-west-1"
}

@@ -0,0 +1,17 @@
module "cluster" {
source = "../modules/cluster"
project_id = var.project_id
region = var.region
prefix = "clst-dev"
# GKE — small dev nodes
node_machine_type = "e2-standard-2"
node_count = 2
deletion_protection = false
labels = {
environment = "dev"
managed-by = "tofu"
}
}

@@ -0,0 +1,3 @@
output "cluster_name" { value = module.cluster.cluster_name }
output "project_id" { value = module.cluster.project_id }
output "region" { value = module.cluster.region }

@@ -0,0 +1,26 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
# Authentication: use Application Default Credentials (gcloud auth application-default login)
# or set GOOGLE_APPLICATION_CREDENTIALS to a service account key file.
provider "google" {
project = var.project_id
region = var.region
}
variable "project_id" {
description = "GCP project ID for the dev environment"
type = string
}
variable "region" {
description = "GCP region"
type = string
default = "europe-west4"
}

@@ -0,0 +1,115 @@
# ─── Required APIs ────────────────────────────────────────────────────
resource "google_project_service" "compute" {
project = var.project_id
service = "compute.googleapis.com"
disable_on_destroy = false
}
resource "google_project_service" "container" {
project = var.project_id
service = "container.googleapis.com"
disable_on_destroy = false
}
# ─── Networking ───────────────────────────────────────────────────────
resource "google_compute_network" "main" {
project = var.project_id
name = "${var.prefix}-vpc"
auto_create_subnetworks = false
depends_on = [google_project_service.compute]
}
resource "google_compute_subnetwork" "main" {
project = var.project_id
name = "${var.prefix}-subnet"
ip_cidr_range = "10.100.0.0/22"
region = var.region
network = google_compute_network.main.id
# Secondary ranges required for GKE VPC-native cluster
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.200.0.0/14" # /14 = ~262k pod IPs
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.204.0.0/20" # /20 = ~4k service IPs
}
}
# ─── GKE Cluster ──────────────────────────────────────────────────────
#
# Regional cluster (3 control-plane replicas) for HA.
# Workload Identity enabled — allows K8s service accounts to impersonate
# Google Service Accounts for keyless access to GCP services.
resource "google_container_cluster" "main" {
project = var.project_id
name = "${var.prefix}-gke"
location = var.region # regional cluster
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.main.id
# VPC-native cluster with alias IP ranges
ip_allocation_policy {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
# Workload Identity pool — enables OIDC token projection for pods
workload_identity_config {
workload_pool = "${var.project_id}.svc.id.goog"
}
# Remove default node pool — we manage our own below
remove_default_node_pool = true
initial_node_count = 1
deletion_protection = var.deletion_protection
dynamic "release_channel" {
for_each = var.kubernetes_version == null ? [1] : []
content {
channel = "STABLE"
}
}
resource_labels = var.labels
depends_on = [google_project_service.container]
}
resource "google_container_node_pool" "main" {
project = var.project_id
name = "${var.prefix}-nodes"
location = var.region
cluster = google_container_cluster.main.name
node_count = var.node_count
node_config {
machine_type = var.node_machine_type
# GKE_METADATA mode is required for Workload Identity
workload_metadata_config {
mode = "GKE_METADATA"
}
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform",
]
labels = merge(var.labels, {
role = "worker"
})
}
management {
auto_repair = true
auto_upgrade = true
}
}
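
Note on the secondary ranges in this module: the inline comments quote the size of each range (`/14` is about 262k addresses, `/20` about 4k). A range must also start on its own block boundary, so a /14 has to begin where the second octet is a multiple of 4. The sketch below checks both properties in plain shell. The function names are hypothetical, and `10.210.0.0/14` is used only as an example of a misaligned value.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking CIDR ranges before `tofu plan`.

# Convert dotted-quad A.B.C.D to a 32-bit integer (runs in a subshell so IFS is not leaked).
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Number of addresses in a CIDR block: 2^(32 - prefix length).
cidr_size() {
  echo $(( 1 << (32 - ${1#*/}) ))
}

# Succeeds when the base address sits on the block boundary (all host bits are zero).
cidr_is_aligned() {
  base=$(ip_to_int "${1%/*}")
  size=$(cidr_size "$1")
  [ $(( base & (size - 1) )) -eq 0 ]
}

for cidr in 10.200.0.0/14 10.204.0.0/20 10.210.0.0/14; do
  if cidr_is_aligned "$cidr"; then
    echo "$cidr: $(cidr_size "$cidr") addresses, aligned"
  else
    echo "$cidr: $(cidr_size "$cidr") addresses, NOT aligned"
  fi
done
```

The first two values are the `pods` and `services` ranges from this module and both pass; the third fails because 210 is not a multiple of 4.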

@@ -0,0 +1,16 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_name" {
description = "GKE cluster name"
value = google_container_cluster.main.name
}
output "project_id" {
description = "GCP project ID"
value = var.project_id
}
output "region" {
description = "GCP region"
value = var.region
}

@@ -0,0 +1,8 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}

@@ -0,0 +1,48 @@
# ─── Project / Region ────────────────────────────────────────────────
variable "project_id" {
description = "GCP project ID"
type = string
}
variable "region" {
description = "GCP region (e.g., europe-west4, europe-west1)"
type = string
}
variable "prefix" {
description = "Prefix for resource names (e.g., clst-dev)"
type = string
}
# ─── GKE Cluster ─────────────────────────────────────────────────────
variable "node_machine_type" {
description = "GKE node machine type (e.g., e2-standard-2, e2-standard-4)"
type = string
}
variable "node_count" {
description = "Number of nodes per zone (regional cluster spawns nodes in each zone)"
type = number
}
variable "kubernetes_version" {
description = "GKE Kubernetes version (null = use the STABLE release channel)"
type = string
default = null
}
variable "deletion_protection" {
description = "Prevent cluster deletion (set true for production)"
type = bool
default = false
}
# ─── Labels ──────────────────────────────────────────────────────────
variable "labels" {
description = "Labels applied to all resources"
type = map(string)
default = {}
}

@@ -0,0 +1,17 @@
module "cluster" {
source = "../modules/cluster"
project_id = var.project_id
region = var.region
prefix = "clst"
# GKE — general-purpose nodes for production
node_machine_type = "e2-standard-4"
node_count = 3
deletion_protection = true
labels = {
environment = "prod"
managed-by = "tofu"
}
}

@@ -0,0 +1,3 @@
output "cluster_name" { value = module.cluster.cluster_name }
output "project_id" { value = module.cluster.project_id }
output "region" { value = module.cluster.region }

@@ -0,0 +1,24 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
provider "google" {
project = var.project_id
region = var.region
}
variable "project_id" {
description = "GCP project ID for the prod environment"
type = string
}
variable "region" {
description = "GCP region"
type = string
default = "europe-west1"
}

@@ -0,0 +1,194 @@
# =============================================================================
# GCP Workload Cluster
# =============================================================================
# A lean GKE cluster for running application workloads. No managed data
# services — those live on the platform cluster. ArgoCD (on the platform
# cluster) deploys apps to this cluster via the app-of-apps pattern.
#
# Platform components deployed by deploy-workload.sh:
# nginx-ingress, cert-manager, external-dns, external-secrets, alloy
#
# Usage:
# tofu init && tofu plan && tofu apply
# ./sync-tofu-outputs.sh --env gcp-workload
# ./deploy-workload.sh --env gcp-workload
# =============================================================================
variable "prefix" {
description = "Prefix for resource names (e.g., clst-workload)"
type = string
default = "clst-workload"
}
variable "node_machine_type" {
description = "GKE node machine type"
type = string
default = "e2-standard-2"
}
variable "node_count" {
description = "Number of nodes per zone"
type = number
default = 1
}
variable "kubernetes_version" {
description = "GKE Kubernetes version (null = STABLE release channel)"
type = string
default = null
}
variable "deletion_protection" {
description = "Prevent cluster deletion"
type = bool
default = false
}
variable "labels" {
description = "Labels applied to all resources"
type = map(string)
default = {
environment = "workload"
managed-by = "tofu"
}
}
# ─── Required APIs ────────────────────────────────────────────────────
resource "google_project_service" "compute" {
project = var.project_id
service = "compute.googleapis.com"
disable_on_destroy = false
}
resource "google_project_service" "container" {
project = var.project_id
service = "container.googleapis.com"
disable_on_destroy = false
}
resource "google_project_service" "iam" {
project = var.project_id
service = "iam.googleapis.com"
disable_on_destroy = false
}
resource "google_project_service" "dns" {
project = var.project_id
service = "dns.googleapis.com"
disable_on_destroy = false
}
# ─── Networking ───────────────────────────────────────────────────────
resource "google_compute_network" "main" {
project = var.project_id
name = "${var.prefix}-vpc"
auto_create_subnetworks = false
depends_on = [google_project_service.compute]
}
resource "google_compute_subnetwork" "main" {
project = var.project_id
name = "${var.prefix}-subnet"
ip_cidr_range = "10.110.0.0/22"
region = var.region
network = google_compute_network.main.id
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.208.0.0/14" # /14 must start on a multiple of 4 in the second octet
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.214.0.0/20"
}
}
# ─── GKE Cluster ──────────────────────────────────────────────────────
resource "google_container_cluster" "main" {
project = var.project_id
name = "${var.prefix}-gke"
location = var.region
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.main.id
ip_allocation_policy {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
workload_identity_config {
workload_pool = "${var.project_id}.svc.id.goog"
}
remove_default_node_pool = true
initial_node_count = 1
deletion_protection = var.deletion_protection
dynamic "release_channel" {
for_each = var.kubernetes_version == null ? [1] : []
content {
channel = "STABLE"
}
}
resource_labels = var.labels
depends_on = [google_project_service.container]
}
resource "google_container_node_pool" "main" {
project = var.project_id
name = "${var.prefix}-nodes"
location = var.region
cluster = google_container_cluster.main.name
node_count = var.node_count
node_config {
machine_type = var.node_machine_type
workload_metadata_config {
mode = "GKE_METADATA"
}
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform",
]
labels = merge(var.labels, { role = "worker" })
}
management {
auto_repair = true
auto_upgrade = true
}
}
# ─── External-DNS Workload Identity ──────────────────────────────────
# Allows external-dns to manage Cloud DNS records for app ingresses.
resource "google_service_account" "external_dns" {
project = var.project_id
account_id = "${var.prefix}-external-dns"
display_name = "External-DNS Service Account (Workload Identity)"
depends_on = [google_project_service.iam]
}
resource "google_project_iam_member" "external_dns_dns_admin" {
project = var.project_id
role = "roles/dns.admin"
member = "serviceAccount:${google_service_account.external_dns.email}"
}
resource "google_service_account_iam_member" "external_dns_workload_identity" {
service_account_id = google_service_account.external_dns.name
role = "roles/iam.workloadIdentityUser"
member = "serviceAccount:${var.project_id}.svc.id.goog[external-dns/external-dns]"
}

@@ -0,0 +1,4 @@
output "cluster_name" { value = google_container_cluster.main.name }
output "project_id" { value = var.project_id }
output "region" { value = var.region }
output "external_dns_gsa_email" { value = google_service_account.external_dns.email }

@@ -0,0 +1,26 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
# Authentication: use Application Default Credentials (gcloud auth application-default login)
# or set GOOGLE_APPLICATION_CREDENTIALS to a service account key file.
provider "google" {
project = var.project_id
region = var.region
}
variable "project_id" {
description = "GCP project ID for the workload environment"
type = string
}
variable "region" {
description = "GCP region"
type = string
default = "europe-west4"
}

@@ -0,0 +1,14 @@
module "cluster" {
source = "../modules/cluster"
prefix = "clst-dev"
zone = "no-svg1"
node_plan = "DEV-1xCPU-2GB"
node_count = 2
network_cidr = "10.100.0.0/24"
tags = {
Environment = "dev"
ManagedBy = "tofu"
}
}

@@ -0,0 +1,13 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_id" {
value = module.cluster.cluster_id
}
output "cluster_name" {
value = module.cluster.cluster_name
}
output "zone" {
value = module.cluster.zone
}

@@ -0,0 +1,14 @@
terraform {
required_version = ">= 1.0"
required_providers {
upcloud = {
source = "UpCloudLtd/upcloud"
version = "~> 5.0"
}
}
}
provider "upcloud" {
# Set via environment variables: UPCLOUD_USERNAME, UPCLOUD_PASSWORD
}

@@ -0,0 +1,64 @@
# Router for the private network
resource "upcloud_router" "kubernetes" {
name = "${var.prefix}-${var.cluster_name}-router"
}
# Gateway for internet connectivity
resource "upcloud_gateway" "kubernetes" {
name = "${var.prefix}-${var.cluster_name}-gateway"
zone = var.zone
features = ["nat"]
router {
id = upcloud_router.kubernetes.id
}
}
# Private network for the Kubernetes cluster
resource "upcloud_network" "kubernetes" {
name = "${var.prefix}-${var.cluster_name}-network"
zone = var.zone
router = upcloud_router.kubernetes.id
ip_network {
address = var.network_cidr
dhcp = true
dhcp_default_route = true
family = "IPv4"
gateway = cidrhost(var.network_cidr, 1)
}
depends_on = [upcloud_gateway.kubernetes]
}
# Kubernetes cluster
resource "upcloud_kubernetes_cluster" "main" {
name = "${var.prefix}-${var.cluster_name}"
zone = var.zone
network = upcloud_network.kubernetes.id
control_plane_ip_filter = var.control_plane_ip_filter
private_node_groups = true
}
# Node group for worker nodes
resource "upcloud_kubernetes_node_group" "workers" {
cluster = upcloud_kubernetes_cluster.main.id
name = "${var.prefix}-${var.cluster_name}-workers"
node_count = var.node_count
plan = var.node_plan
anti_affinity = var.node_count > 1
dynamic "cloud_native_plan" {
for_each = var.storage_size != null ? [1] : []
content {
storage_size = var.storage_size
}
}
labels = {
prefix = var.prefix
cluster = var.cluster_name
role = "worker"
env = lookup(var.tags, "Environment", "dev")
}
}

@@ -0,0 +1,31 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_id" {
description = "The ID of the Kubernetes cluster"
value = upcloud_kubernetes_cluster.main.id
}
output "cluster_name" {
description = "The name of the Kubernetes cluster"
value = upcloud_kubernetes_cluster.main.name
}
output "network_id" {
description = "The ID of the private network"
value = upcloud_network.kubernetes.id
}
output "network_cidr" {
description = "The CIDR block of the private network"
value = var.network_cidr
}
output "kubernetes_version" {
description = "The Kubernetes version of the cluster"
value = upcloud_kubernetes_cluster.main.version
}
output "zone" {
description = "The zone where the cluster is deployed"
value = var.zone
}

@@ -0,0 +1,8 @@
terraform {
required_providers {
upcloud = {
source = "UpCloudLtd/upcloud"
version = "~> 5.0"
}
}
}

@@ -0,0 +1,50 @@
# ─── Cluster ─────────────────────────────────────────────────────────
variable "prefix" {
description = "Prefix for resource names"
type = string
}
variable "cluster_name" {
description = "Name of the Kubernetes cluster"
type = string
default = "main"
}
variable "zone" {
description = "UpCloud zone"
type = string
}
variable "node_plan" {
description = "UpCloud server plan for worker nodes"
type = string
}
variable "node_count" {
description = "Number of worker nodes"
type = number
}
variable "network_cidr" {
description = "CIDR block for the private network"
type = string
default = "10.100.0.0/24"
}
variable "control_plane_ip_filter" {
description = "CIDRs allowed to access the K8s API"
type = list(string)
default = ["0.0.0.0/0"]
}
variable "storage_size" {
description = "Storage size in GB for worker nodes (overrides plan default via cloud_native_plan block)"
type = number
default = null
}
variable "tags" {
description = "Labels to apply to resources"
type = map(string)
}

@@ -0,0 +1,120 @@
# =============================================================================
# UpCloud Workload Cluster
# =============================================================================
# A lean UKS cluster for running application workloads. No managed data
# services — those live on the platform cluster. ArgoCD (on the platform
# cluster) deploys apps to this cluster via the app-of-apps pattern.
#
# Platform components deployed by deploy-workload.sh:
# nginx-ingress, cert-manager, external-dns, external-secrets, alloy
#
# Usage:
# tofu init && tofu plan && tofu apply
# ./sync-tofu-outputs.sh --env upcloud-workload
# ./deploy-workload.sh --env upcloud-workload
# =============================================================================
variable "prefix" {
description = "Prefix for resource names"
type = string
default = "clst-workload"
}
variable "zone" {
description = "UpCloud zone"
type = string
default = "no-svg1"
}
variable "node_plan" {
description = "UpCloud server plan for worker nodes"
type = string
default = "2xCPU-4GB"
}
variable "node_count" {
description = "Number of worker nodes"
type = number
default = 2
}
variable "network_cidr" {
description = "CIDR block for the private network"
type = string
default = "10.110.0.0/24"
}
variable "control_plane_ip_filter" {
description = "CIDRs allowed to access the K8s API"
type = list(string)
default = ["0.0.0.0/0"]
}
variable "tags" {
description = "Labels to apply to resources"
type = map(string)
default = {
Environment = "workload"
ManagedBy = "tofu"
}
}
module "cluster" {
source = "../modules/cluster"
prefix = "clst-prod"
zone = "no-svg1"
node_plan = "CLOUDNATIVE-4xCPU-8GB"
node_count = 4
storage_size = 30
network_cidr = "10.100.0.0/24"
control_plane_ip_filter = ["0.0.0.0/0"] # TODO: restrict to known CIDRs
tags = {
Environment = "prod"
ManagedBy = "tofu"
}
}
# ─── Networking ───────────────────────────────────────────────────────
resource "upcloud_router" "kubernetes" {
name = "${var.prefix}-workload-router"
}
resource "upcloud_gateway" "kubernetes" {
name = "${var.prefix}-workload-gateway"
zone = var.zone
features = ["nat"]
router {
id = upcloud_router.kubernetes.id
}
}
resource "upcloud_network" "kubernetes" {
name = "${var.prefix}-workload-network"
zone = var.zone
router = upcloud_router.kubernetes.id
ip_network {
address = var.network_cidr
dhcp = true
dhcp_default_route = true
family = "IPv4"
gateway = cidrhost(var.network_cidr, 1)
}
depends_on = [upcloud_gateway.kubernetes]
}
# ─── Kubernetes Cluster ───────────────────────────────────────────────
resource "upcloud_kubernetes_cluster" "main-prod" {
name = "${var.prefix}-workload"
zone = var.zone
network = upcloud_network.kubernetes.id
control_plane_ip_filter = var.control_plane_ip_filter
private_node_groups = true
}

@@ -0,0 +1,13 @@
# ─── Cluster ─────────────────────────────────────────────────────────
output "cluster_id" {
value = module.cluster.cluster_id
}
output "cluster_name" {
value = module.cluster.cluster_name
}
output "zone" {
value = module.cluster.zone
}

@@ -0,0 +1,14 @@
terraform {
required_version = ">= 1.0"
required_providers {
upcloud = {
source = "UpCloudLtd/upcloud"
version = "~> 5.0"
}
}
}
provider "upcloud" {
# Set via environment variables: UPCLOUD_USERNAME, UPCLOUD_PASSWORD
}

@@ -0,0 +1,116 @@
# =============================================================================
# UpCloud Workload Cluster
# =============================================================================
# A lean UKS cluster for running application workloads. No managed data
# services — those live on the platform cluster. ArgoCD (on the platform
# cluster) deploys apps to this cluster via the app-of-apps pattern.
#
# Platform components deployed by deploy-workload.sh:
# nginx-ingress, cert-manager, external-dns, external-secrets, alloy
#
# Usage:
# tofu init && tofu plan && tofu apply
# ./sync-tofu-outputs.sh --env upcloud-workload
# ./deploy-workload.sh --env upcloud-workload
# =============================================================================
variable "prefix" {
description = "Prefix for resource names"
type = string
default = "clst-workload"
}
variable "zone" {
description = "UpCloud zone"
type = string
default = "no-svg1"
}
variable "node_plan" {
description = "UpCloud server plan for worker nodes"
type = string
default = "2xCPU-4GB"
}
variable "node_count" {
description = "Number of worker nodes"
type = number
default = 2
}
variable "network_cidr" {
description = "CIDR block for the private network"
type = string
default = "10.110.0.0/24"
}
variable "control_plane_ip_filter" {
description = "CIDRs allowed to access the K8s API"
type = list(string)
default = ["0.0.0.0/0"]
}
variable "tags" {
description = "Labels to apply to resources"
type = map(string)
default = {
Environment = "workload"
ManagedBy = "tofu"
}
}
# ─── Networking ───────────────────────────────────────────────────────
resource "upcloud_router" "kubernetes" {
name = "${var.prefix}-workload-router"
}
resource "upcloud_gateway" "kubernetes" {
name = "${var.prefix}-workload-gateway"
zone = var.zone
features = ["nat"]
router {
id = upcloud_router.kubernetes.id
}
}
resource "upcloud_network" "kubernetes" {
name = "${var.prefix}-workload-network"
zone = var.zone
router = upcloud_router.kubernetes.id
ip_network {
address = var.network_cidr
dhcp = true
dhcp_default_route = true
family = "IPv4"
gateway = cidrhost(var.network_cidr, 1)
}
depends_on = [upcloud_gateway.kubernetes]
}
# ─── Kubernetes Cluster ───────────────────────────────────────────────
resource "upcloud_kubernetes_cluster" "main" {
name = "${var.prefix}-workload"
zone = var.zone
network = upcloud_network.kubernetes.id
control_plane_ip_filter = var.control_plane_ip_filter
private_node_groups = true
}
resource "upcloud_kubernetes_node_group" "workers" {
cluster = upcloud_kubernetes_cluster.main.id
name = "${var.prefix}-workload-workers"
node_count = var.node_count
plan = var.node_plan
anti_affinity = var.node_count > 1
labels = {
prefix = var.prefix
cluster = "workload"
role = "worker"
env = lookup(var.tags, "Environment", "workload")
}
}

@@ -0,0 +1,3 @@
output "cluster_name" { value = upcloud_kubernetes_cluster.main.name }
output "cluster_id" { value = upcloud_kubernetes_cluster.main.id }
output "zone" { value = var.zone }

@@ -0,0 +1,14 @@
terraform {
required_version = ">= 1.0"
required_providers {
upcloud = {
source = "UpCloudLtd/upcloud"
version = "~> 5.0"
}
}
}
provider "upcloud" {
# Set via environment variables: UPCLOUD_USERNAME, UPCLOUD_PASSWORD
}

@@ -0,0 +1,66 @@
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TOFU_ROOT="$(dirname "$SCRIPT_DIR")"
PROJECT_ROOT="$(dirname "$TOFU_ROOT")"
CLUSTER="${1:?Usage: $0 <cluster> (e.g., aks-dev, eks-prod)}"
PLATFORM="${CLUSTER%%-*}"
ENV="${CLUSTER#*-}"
KUBECONFIG_FILE="$PROJECT_ROOT/private/$CLUSTER/kubeconfig"
if [[ -f "$KUBECONFIG_FILE" ]]; then
echo "Kubeconfig already exists: $KUBECONFIG_FILE"
echo ""
echo " export KUBECONFIG=$KUBECONFIG_FILE"
else
echo "No cached kubeconfig. Fetching from platform..."
# Load platform credentials
ENV_FILE="$TOFU_ROOT/configs/$PLATFORM.env"
if [[ -f "$ENV_FILE" ]]; then
set -a; source "$ENV_FILE"; set +a
fi
TOFU_DIR="$TOFU_ROOT/platforms/$PLATFORM/$ENV"
mkdir -p "$(dirname "$KUBECONFIG_FILE")"
case "$PLATFORM" in
aks)
cd "$TOFU_DIR"
RG=$(tofu output -raw resource_group_name 2>/dev/null || echo "$CLUSTER-rg")
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "$CLUSTER")
az aks get-credentials --resource-group "$RG" --name "$NAME" --file "$KUBECONFIG_FILE" --overwrite-existing
;;
eks)
cd "$TOFU_DIR"
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "$CLUSTER")
REGION=$(tofu output -raw aws_region 2>/dev/null || echo "${AWS_REGION:-eu-west-1}")
aws eks update-kubeconfig --name "$NAME" --region "$REGION" --kubeconfig "$KUBECONFIG_FILE"
;;
gke)
cd "$TOFU_DIR"
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "$CLUSTER")
REGION=$(tofu output -raw region 2>/dev/null || echo "${GCP_REGION:-europe-west4}")
PROJECT=$(tofu output -raw project_id 2>/dev/null || echo "${GCP_PROJECT_ID:-}")
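# Write straight to the per-cluster file: gcloud honours $KUBECONFIG, so
# ~/.kube/config (and any unrelated contexts in it) is never copied.
KUBECONFIG="$KUBECONFIG_FILE" gcloud container clusters get-credentials "$NAME" --region "$REGION" --project "$PROJECT"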
;;
upc)
cd "$TOFU_DIR"
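CLUSTER_ID=$(tofu output -raw cluster_id 2>/dev/null || echo "${UPCLOUD_CLUSTER_ID:-}")
# Fail early so an empty ID cannot leave behind an empty "cached" kubeconfig.
[[ -n "$CLUSTER_ID" ]] || { echo "Error: no cluster_id in tofu outputs and UPCLOUD_CLUSTER_ID is not set"; exit 1; }
upctl kubernetes config "$CLUSTER_ID" > "$KUBECONFIG_FILE"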
;;
*)
echo "Error: unknown platform '$PLATFORM'"
exit 1
;;
esac
chmod 600 "$KUBECONFIG_FILE"
echo "Kubeconfig saved: $KUBECONFIG_FILE"
echo ""
echo " export KUBECONFIG=$KUBECONFIG_FILE"
fi
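
Note on the `<platform>-<env>` convention used by this script: `${CLUSTER%%-*}` strips the longest suffix starting at the first dash, and `${CLUSTER#*-}` strips the shortest prefix ending at the first dash. The sketch below isolates that split so its edge cases are easy to see; the function name `split_cluster` and the input `upc-prod-eu` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: splits "<platform>-<env>" on the FIRST dash.
split_cluster() {
  cluster="$1"
  platform="${cluster%%-*}"     # longest "-*" suffix removed  -> text before the first dash
  environment="${cluster#*-}"   # shortest "*-" prefix removed -> text after the first dash
  echo "$platform $environment"
}

split_cluster aks-dev        # aks dev
split_cluster gke-workload   # gke workload
split_cluster upc-prod-eu    # upc prod-eu  (only the first dash splits)
split_cluster aks            # aks aks      (no dash: neither pattern matches)
```

The last case shows why a bare platform name is worth rejecting: without a dash both expansions return the input unchanged, so the environment silently equals the platform.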

@@ -0,0 +1,246 @@
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TOFU_ROOT="$(dirname "$SCRIPT_DIR")"
PROJECT_ROOT="$(dirname "$TOFU_ROOT")"
# ─── Usage ────────────────────────────────────────────────────────────
usage() {
cat <<EOF
Usage: $0 <cluster> [options]
Provision a Kubernetes cluster using OpenTofu.
Mirrors bootstrap.sh convention: cluster = <platform>-<env>
Clusters: aks-dev | aks-prod | eks-dev | eks-prod
gke-dev | gke-prod | upc-dev | upc-prod
<platform>-workload (for workload clusters)
Options:
--plan Plan only, don't apply
--destroy Destroy the cluster (use teardown-cluster.sh instead)
--auto Skip confirmation prompts
-h, --help Show this help
Examples:
$0 aks-dev
$0 eks-prod --plan
$0 upc-dev --auto
Prerequisites:
- tofu, kubectl, helm installed
- Platform credentials in .tofu/configs/<platform>.env
- Cluster config in clusters/<cluster>.yaml
After provisioning, run:
./bootstrap.sh <cluster>
EOF
exit "${1:-0}"
}
# ─── Parse arguments ──────────────────────────────────────────────────
CLUSTER=""
PLAN_ONLY=false
DESTROY=false
AUTO_APPROVE=false
while [[ $# -gt 0 ]]; do
case "$1" in
--plan) PLAN_ONLY=true; shift ;;
--destroy) DESTROY=true; shift ;;
--auto) AUTO_APPROVE=true; shift ;;
-h|--help) usage 0 ;;
-*) echo "Unknown option: $1"; usage 1 ;;
*)
if [[ -z "$CLUSTER" ]]; then
CLUSTER="$1"
else
echo "Error: unexpected argument '$1'"
usage 1
fi
shift
;;
esac
done
[[ -z "$CLUSTER" ]] && { echo "Error: <cluster> argument required"; usage 1; }
# ─── Map cluster → platform + env ────────────────────────────────────
PLATFORM="${CLUSTER%%-*}" # aks-dev → aks
ENV="${CLUSTER#*-}" # aks-dev → dev
case "$PLATFORM" in
aks|eks|gke|upc) ;;
*) echo "Error: unknown platform '$PLATFORM'. Expected: aks, eks, gke, upc"; exit 1 ;;
esac
TOFU_DIR="$TOFU_ROOT/platforms/$PLATFORM/$ENV"
if [[ ! -d "$TOFU_DIR" ]]; then
echo "Error: tofu directory not found: $TOFU_DIR"
echo "Available environments for $PLATFORM:"
ls -1 "$TOFU_ROOT/platforms/$PLATFORM/" 2>/dev/null | grep -v modules || echo " (none)"
exit 1
fi
echo "========================================="
echo " Kubernetes Cluster Setup"
echo "========================================="
echo ""
echo " Cluster: $CLUSTER"
echo " Platform: $PLATFORM"
echo " Env: $ENV"
echo " Tofu dir: $TOFU_DIR"
echo ""
# ─── Prerequisites ────────────────────────────────────────────────────
echo "=== Checking Prerequisites ==="
command -v tofu >/dev/null 2>&1 || { echo "Error: tofu is not installed."; exit 1; }
command -v kubectl >/dev/null 2>&1 || { echo "Error: kubectl is not installed."; exit 1; }
command -v helm >/dev/null 2>&1 || { echo "Error: helm is not installed."; exit 1; }
echo " tofu, kubectl, helm: OK"
# ─── Load platform credentials ────────────────────────────────────────
ENV_FILE="$TOFU_ROOT/configs/$PLATFORM.env"
if [[ -f "$ENV_FILE" ]]; then
echo " Loading credentials from configs/$PLATFORM.env"
set -a
# shellcheck disable=SC1090
source "$ENV_FILE"
set +a
else
echo " Warning: $ENV_FILE not found — using existing environment/CLI auth"
echo " Copy configs/$PLATFORM.env.example → configs/$PLATFORM.env to configure"
fi
# ─── Load cluster config (if exists) ──────────────────────────────────
CLUSTER_CONFIG="$PROJECT_ROOT/clusters/$CLUSTER.yaml"
if [[ -f "$CLUSTER_CONFIG" ]]; then
echo " Loading cluster config from clusters/$CLUSTER.yaml"
if command -v yq >/dev/null 2>&1; then
eval "$(yq -r 'to_entries[] | "export CLUSTER_\(.key)=\"\(.value)\""' "$CLUSTER_CONFIG")"
echo " Cluster name: ${CLUSTER_clusterName:-$CLUSTER}"
else
echo " Warning: yq not installed — cluster config not loaded"
fi
else
echo " Warning: $CLUSTER_CONFIG not found — using defaults"
fi
echo ""
# ─── Run OpenTofu ─────────────────────────────────────────────────────
cd "$TOFU_DIR"
echo "=== Initializing OpenTofu ==="
tofu init
echo ""
if $DESTROY; then
echo "=== Planning Destruction ==="
tofu plan -destroy -out=tfplan
if ! $AUTO_APPROVE; then
echo ""
read -rp "DESTROY cluster $CLUSTER? This is irreversible. (yes/no) " REPLY
[[ "$REPLY" == "yes" ]] || { echo "Cancelled."; exit 1; }
fi
echo "Destroying infrastructure..."
tofu apply tfplan
echo ""
echo "=== Cluster $CLUSTER Destroyed ==="
elif $PLAN_ONLY; then
echo "=== Planning Infrastructure ==="
tofu plan
echo ""
echo "=== Plan complete (--plan mode, no changes applied) ==="
else
echo "=== Planning Infrastructure ==="
tofu plan -out=tfplan
if ! $AUTO_APPROVE; then
echo ""
read -rp "Apply this plan for $CLUSTER? (y/n) " -n 1 REPLY
echo
[[ "$REPLY" =~ ^[Yy]$ ]] || { echo "Cancelled."; exit 1; }
fi
echo "Applying infrastructure..."
tofu apply tfplan
# ─── Save kubeconfig ──────────────────────────────────────────────
KUBECONFIG_DIR="$PROJECT_ROOT/private/$CLUSTER"
mkdir -p "$KUBECONFIG_DIR"
KUBECONFIG_FILE="$KUBECONFIG_DIR/kubeconfig"
echo ""
echo "=== Saving Kubeconfig ==="
case "$PLATFORM" in
aks)
if tofu output -raw kubeconfig > "$KUBECONFIG_FILE" 2>/dev/null; then
echo " Saved from tofu output"
else
echo " Fetching from Azure CLI..."
RG=$(tofu output -raw resource_group_name 2>/dev/null || echo "${CLUSTER_clusterName:-$CLUSTER}-rg")
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "${CLUSTER_clusterName:-$CLUSTER}")
az aks get-credentials --resource-group "$RG" --name "$NAME" --file "$KUBECONFIG_FILE" --overwrite-existing
fi
;;
eks)
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "${CLUSTER_clusterName:-$CLUSTER}")
REGION=$(tofu output -raw aws_region 2>/dev/null || echo "${AWS_REGION:-eu-west-1}")
aws eks update-kubeconfig --name "$NAME" --region "$REGION" --kubeconfig "$KUBECONFIG_FILE"
;;
gke)
NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "${CLUSTER_clusterName:-$CLUSTER}")
REGION=$(tofu output -raw region 2>/dev/null || echo "${GCP_REGION:-europe-west4}")
PROJECT=$(tofu output -raw project_id 2>/dev/null || echo "${GCP_PROJECT_ID:-}")
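# gcloud writes to the file named by KUBECONFIG, so ~/.kube/config is not copied wholesale
if KUBECONFIG="$KUBECONFIG_FILE" gcloud container clusters get-credentials "$NAME" --region "$REGION" --project "$PROJECT" 2>/dev/null; then
echo " Saved via gcloud"
else
echo " Warning: could not fetch kubeconfig via gcloud"
fi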
;;
upc)
if tofu output -raw kubeconfig > "$KUBECONFIG_FILE" 2>/dev/null; then
echo " Saved from tofu output"
else
CLUSTER_ID=$(tofu output -raw cluster_id 2>/dev/null || echo "${UPCLOUD_CLUSTER_ID:-}")
if [[ -n "$CLUSTER_ID" ]]; then
upctl kubernetes config "$CLUSTER_ID" > "$KUBECONFIG_FILE"
else
echo " Warning: could not determine cluster ID for kubeconfig"
fi
fi
;;
esac
if [[ -f "$KUBECONFIG_FILE" ]]; then
chmod 600 "$KUBECONFIG_FILE"
echo " Kubeconfig: $KUBECONFIG_FILE"
fi
# ─── Wait for nodes ──────────────────────────────────────────────
echo ""
echo "=== Waiting for Cluster Nodes ==="
export KUBECONFIG="$KUBECONFIG_FILE"
if kubectl wait --for=condition=Ready nodes --all --timeout=300s 2>/dev/null; then
echo " All nodes ready"
else
echo " Warning: nodes not ready within timeout — check cluster status"
fi
# ─── Summary ─────────────────────────────────────────────────────
echo ""
echo "========================================="
echo " Cluster $CLUSTER Provisioned"
echo "========================================="
echo ""
echo " Kubeconfig: $KUBECONFIG_FILE"
echo ""
echo " Next steps:"
echo " export KUBECONFIG=$KUBECONFIG_FILE"
echo " ./bootstrap.sh $CLUSTER"
echo ""
fi

@@ -0,0 +1,7 @@
#!/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Delegate to setup-cluster.sh with --destroy flag
exec "$SCRIPT_DIR/setup-cluster.sh" "$@" --destroy

@@ -80,8 +80,23 @@ This repository contains the complete GitOps configuration for our Kubernetes cl
``` ```
. .
├── bootstrap.sh # Cluster initialization script ├── bootstrap.sh # Cluster initialization (ArgoCD + GitOps)
├── _app-of-apps.yaml # Root ArgoCD Application (App-of-Apps pattern) ├── _app-of-apps-{cluster}.yaml # Root ArgoCD Application (per cluster)
├── .tofu/ # Infrastructure provisioning (OpenTofu)
│ ├── platforms/ # Per-platform IaC (one dir per cloud)
│ │ ├── aks/ # Azure AKS (modules/ + dev/ + prod/ + workload/)
│ │ ├── eks/ # AWS EKS
│ │ ├── gke/ # GCP GKE
│ │ └── upc/ # UpCloud
│ ├── configs/ # Platform credentials (git-ignored)
│ │ └── *.env.example # Template for each platform
│ └── scripts/ # Cluster lifecycle scripts
│ ├── setup-cluster.sh # Create cluster: ./setup-cluster.sh aks-dev
│ ├── teardown-cluster.sh
│ └── get-kubeconfig.sh
├── clusters/ # Cluster metadata (domain, trustedIPs, etc.)
├── infra/ # Infrastructure ArgoCD Applications (Kustomize multi-cluster) ├── infra/ # Infrastructure ArgoCD Applications (Kustomize multi-cluster)
│ ├── base/ # Base ArgoCD Application manifests (one dir per component) │ ├── base/ # Base ArgoCD Application manifests (one dir per component)

@@ -5,7 +5,6 @@ resources:
- forte-drop-postgresql - forte-drop-postgresql
- forte-drop - forte-drop
- forte-drop-mcp - forte-drop-mcp
- dbunk-demo
# No patches needed — base apps already default to "upc-dev" value paths # No patches needed — base apps already default to "upc-dev" value paths
# (upc-dev is the default/base cluster). # (upc-dev is the default/base cluster).

@@ -0,0 +1,15 @@
---
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
creationTimestamp: null
name: azuredns-config
namespace: cert-manager
spec:
encryptedData:
client-secret: AgBPCix6yTt8gXV2pMfRx6weWtLWUeMa/cBwuaDBZkqZE4CTSDxjQWWK8ul5OtndlEX7+gXV2HpuAmHRhGC8P1z39yWRxIDbKf+AH4JOGSIfu57tnrGvzAwjanKtqeFCEM67Y42Kz1rahCEtdwJj2YIL9oVdCtRPkpAJrAJclqmrCvJZtqQYcDeEn5ONK5Gchxc6x2/J0U0LTjLz/MP74CEKw3CiZ+9VKA4ppvPqjqoE4yyAXSglSnhYTqtMZpFg5sr+at6rAB1ufmtiS70Vfks46OL0/ZJjdS6h5wjIZhl/1DgIFXqc1yoAjuEoMRzMeW9ji1PnjRqas4lU19tuEOf9/Eq65BYOqSIqJ3saG4I/+z013coCGCalo4ghuufmu5pcsi6ywcszz+g20N/PZcVcLbzZbMnKcXsaE007MDLxGY1QD89aribAvsypDUNEuNh7w2n/OAyQdw+nRDGIi8Oz3FoJUmC0HBxin2hstuBnrAXtxdqRX1NU25HfR2s3qt/yQ33TFx4xHA3NAll8Riyg2rvgqAKo9x9rmlcnzML011vlNu5oLKAuCqJlH8W/Nnnh6LNcZJy8Cj+fqXPeWHS4Qk7nEAbYJN9/sNAUg/VzsP0yYejPfjwzoDDaPLvTHROwv9nZ+Lr/U5epHr231jc5+i3x8dLuBtg+aa5PHoS/Ml1a4811w3Bxj3u36q8UPJGnszQXLKCpucynVVstAj4ufhXhhNXJdK/U31Zrc6j3Skw4zgF8Ddv0
template:
metadata:
creationTimestamp: null
name: azuredns-config
namespace: cert-manager

@@ -12,10 +12,25 @@ spec:
privateKeySecretRef: privateKeySecretRef:
name: letsencrypt-staging-key name: letsencrypt-staging-key
solvers: solvers:
- dns01:
azureDNS:
subscriptionID: 1b52bc03-6815-4574-b579-60745dce544d
resourceGroupName: forteapps-domain
hostedZoneName: forteapps.net
environment: AzurePublicCloud
tenantID: 063afd9e-5fcb-48d2-a769-ca31b0f5b443
clientID: 3b7a4ebf-894c-4f5d-9b1e-2b61312f8e74
clientSecretSecretRef:
name: azuredns-config
key: client-secret
selector:
dnsNames:
- '*.forteapps.net'
- 'forteapps.net'
# HTTP-01 fallback for non-wildcard certificates
- http01: - http01:
ingress: ingress:
class: traefik class: traefik
--- ---
# Production ClusterIssuer for browser-trusted certificates # Production ClusterIssuer for browser-trusted certificates
apiVersion: cert-manager.io/v1 apiVersion: cert-manager.io/v1
@@ -30,6 +45,147 @@ spec:
privateKeySecretRef: privateKeySecretRef:
name: letsencrypt-prod-key name: letsencrypt-prod-key
solvers: solvers:
# DNS-01 solver for wildcard certificates (*.forteapps.net)
- dns01:
azureDNS:
subscriptionID: 1b52bc03-6815-4574-b579-60745dce544d
resourceGroupName: forteapps-domain
hostedZoneName: forteapps.net
environment: AzurePublicCloud
tenantID: 063afd9e-5fcb-48d2-a769-ca31b0f5b443
clientID: 3b7a4ebf-894c-4f5d-9b1e-2b61312f8e74
clientSecretSecretRef:
name: azuredns-config
key: client-secret
selector:
dnsNames:
- '*.forteapps.net'
- 'forteapps.net'
# HTTP-01 fallback for non-wildcard certificates
- http01: - http01:
ingress: ingress:
class: traefik class: traefik
# =============================================================================
# CONFIGURATION INSTRUCTIONS FOR AZURE DNS WITH WILDCARD CERTIFICATES
# =============================================================================
#
# PREREQUISITES IN AZURE DNS PORTAL:
# ----------------------------------
# 1. Ensure you have an Azure DNS Zone for "forteapps.net" created in your
# Azure subscription. If not, create it in Azure Portal:
# - Search for "DNS zones" → Create → Zone name: forteapps.net
# - Note the Resource Group where you create it (e.g., "dns-zones-rg")
#
# 2. Configure NS records at your domain registrar to point to Azure DNS:
# - In Azure Portal → DNS zones → forteapps.net
# - Note the 4 NS records shown (e.g., ns1-04.azure-dns.com, etc.)
# - Go to your domain registrar and update the NS records to these values
#
# AUTHENTICATION (Service Principal - Required for UpCloud/non-Azure clusters):
# ----------------------------------------------------------------------------
# Since your cluster runs on UpCloud (not AKS), you must use Service Principal
# authentication. Managed Identity only works with Azure-hosted resources.
#
# =============================================================================
# SETUP: Service Principal for UpCloud Clusters
# =============================================================================
#
# 1. Create Azure AD App Registration:
# az ad sp create-for-rbac --name cert-manager-dns
# # Save the JSON output - you'll need appId (clientID) and password (clientSecret)
#
# 2. Assign DNS Zone Contributor role:
# az role assignment create \
# --role "DNS Zone Contributor" \
# --assignee <SERVICE_PRINCIPAL_CLIENT_ID> \
# --scope /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<DNS_RESOURCE_GROUP>/providers/Microsoft.Network/dnszones/forteapps.net
#
# 3. Create Kubernetes secret for the service principal:
# kubectl create secret generic azuredns-config \
# --namespace cert-manager \
# --from-literal=client-secret=YOUR_CLIENT_SECRET
#
# 4. Update the ClusterIssuer above with:
# - subscriptionID: Your Azure subscription ID
# - resourceGroupName: The resource group containing your DNS zone
# - clientID: The Service Principal appId/clientID
# - clientSecretSecretRef: References the secret created in step 3
#
# =============================================================================
# ALTERNATIVE DNS PROVIDERS (for reference):
# =============================================================================
# -----------------------------------------------------------------------------
# Cloudflare (original configuration)
# -----------------------------------------------------------------------------
# Create secret with: kubectl create secret generic cloudflare-api-token-secret \
# --from-literal=api-token=YOUR_CLOUDFLARE_API_TOKEN -n cert-manager
#
# dns01:
# cloudflare:
# email: your-cloudflare-email@example.com
# apiTokenSecretRef:
# name: cloudflare-api-token-secret
# key: api-token
# -----------------------------------------------------------------------------
# AWS Route53
# -----------------------------------------------------------------------------
# Create secret with: kubectl create secret generic route53-credentials \
# --from-literal=secret-access-key=YOUR_SECRET_KEY -n cert-manager
#
# dns01:
# route53:
# region: us-east-1
# hostedZoneID: ZXXXXXXXXXXXXX
# accessKeyID: YOUR_ACCESS_KEY_ID
# secretAccessKeySecretRef:
# name: route53-credentials
# key: secret-access-key
# -----------------------------------------------------------------------------
# Google Cloud DNS
# -----------------------------------------------------------------------------
# Create secret with service account JSON key:
# kubectl create secret generic clouddns-service-account \
# --from-file=service-account.json=path/to/key.json -n cert-manager
#
# dns01:
# cloudDNS:
# project: YOUR_GCP_PROJECT_ID
# hostedZoneName: example-com
# serviceAccountSecretRef:
# name: clouddns-service-account
# key: service-account.json
# -----------------------------------------------------------------------------
# GoDaddy
# -----------------------------------------------------------------------------
# Requires external webhook: https://github.com/snowdrop/godaddy-webhook
#
# dns01:
# webhook:
# groupName: acme.yourcompany.com
# solverName: godaddy
# config:
# apiKeySecretRef:
# name: godaddy-api-credentials
# key: api-key
# apiSecretSecretRef:
# name: godaddy-api-credentials
# key: api-secret
# -----------------------------------------------------------------------------
# Manual/Dynamic DNS (for homelab)
# -----------------------------------------------------------------------------
# Requires RFC2136 provider or external webhook
#
# dns01:
# rfc2136:
# nameserver: your-dns-server.example.com
# tsigKeyName: cert-manager-key
# tsigAlgorithm: HMACSHA256
# tsigSecretSecretRef:
# name: tsig-secret
# key: secret

@@ -10,7 +10,7 @@ metadata:
policies.kyverno.io/severity: medium policies.kyverno.io/severity: medium
policies.kyverno.io/subject: Pod policies.kyverno.io/subject: Pod
policies.kyverno.io/description: >- policies.kyverno.io/description: >-
Injects an auth sidecar container into Pods annotated with policies.forteapps.io/auth: "true". Supports three auth modes controlled by the policies.forteapps.io/auth-type annotation: "token" (default), "oidc", and "mcp". In token mode the sidecar reads credentials from a mounted Secret volume. In OIDC mode the sidecar uses OpenID Connect with authority and client-id provided via required annotations (policies.forteapps.io/auth-oidc-authority and policies.forteapps.io/auth-oidc-client-id) and secrets from an auth-oidc Secret. In MCP mode the sidecar implements OAuth 2.0 for MCP servers per RFC 9728 (Protected Resource Metadata) and RFC 7591 (Dynamic Client Registration), configured via policies.forteapps.io/auth-mcp-resource and policies.forteapps.io/auth-mcp-authority annotations. The sidecar port defaults to 9001 and can be overridden via the policies.forteapps.io/auth-port annotation. A NetworkPolicy is generated to restrict ingress to the sidecar port only. Injects an auth sidecar container into Pods annotated with policies.forteapps.io/auth: "true". Supports three auth modes controlled by the policies.forteapps.io/auth-type annotation: "token" (default), "oidc", and "mcp". In token mode the sidecar reads credentials from a mounted Secret volume. In OIDC mode the sidecar uses OpenID Connect with authority and client-id provided via required annotations (policies.forteapps.io/auth-oidc-authority and policies.forteapps.io/auth-oidc-client-id) and secrets from an auth-oidc Secret. In MCP mode the sidecar implements OAuth 2.0 for MCP servers per RFC 9728 (Protected Resource Metadata); Dynamic Client Registration (RFC 7591) is handled natively by Keycloak and consumed directly by MCP clients. Configured via policies.forteapps.io/auth-mcp-resource and policies.forteapps.io/auth-mcp-authority annotations. The sidecar port defaults to 9001 and can be overridden via the policies.forteapps.io/auth-port annotation. 
A NetworkPolicy is generated to restrict ingress to the sidecar port only.
spec: spec:
background: false background: false
rules: rules:

@@ -17,7 +17,11 @@
"claude-code@latest", "claude-code@latest",
"go@latest", "go@latest",
"dotnet-sdk@latest", "dotnet-sdk@latest",
"opentofu@1.11.6" "opentofu@1.11.6",
"_1password@latest",
"github-cli@latest",
"upcloud-cli@3.29.0",
"awscli2@2.34.24"
], ],
"shell": { "shell": {
"init_hook": [ "init_hook": [

@@ -772,7 +772,7 @@ Internet → Traefik → Service:8080 → Auth Sidecar:8080 → localhost → Yo
Three authentication modes are supported: Three authentication modes are supported:
1. **Token-based**: Static tokens (simple, good for service-to-service or internal apps) 1. **Token-based**: Static tokens (simple, good for service-to-service or internal apps)
2. **OIDC**: OpenID Connect (full SSO, good for user-facing apps) 2. **OIDC**: OpenID Connect (full SSO, good for user-facing apps)
3. **MCP**: OAuth 2.0 for MCP servers via RFC 9728 / RFC 7591 (good for MCP tool servers requiring OAuth-based access control) 3. **MCP**: OAuth 2.0 for MCP servers via RFC 9728 (Protected Resource Metadata); Keycloak provides native RFC 7591 Dynamic Client Registration (good for MCP tool servers requiring OAuth-based access control)
--- ---
@@ -1013,7 +1013,7 @@ auth:
scopes: "openid,profile,email" # OIDC scopes (optional) scopes: "openid,profile,email" # OIDC scopes (optional)
callbackPath: /auth/callback # OAuth callback path (optional) callbackPath: /auth/callback # OAuth callback path (optional)
# MCP mode configuration (RFC 9728 / RFC 7591) # MCP mode configuration (RFC 9728)
mcp: mcp:
resource: "" # Protected resource URL (required for MCP) resource: "" # Protected resource URL (required for MCP)
authority: "" # Authorization server URL (required for MCP) authority: "" # Authorization server URL (required for MCP)
@@ -1161,7 +1161,7 @@ ingress:
host: mcp-server.forteapps.net host: mcp-server.forteapps.net
``` ```
The MCP auth mode implements RFC 9728 (OAuth 2.0 Protected Resource Metadata) for authorization server discovery and RFC 7591 (OAuth 2.0 Dynamic Client Registration) for automatic client registration. MCP clients discover the authorization server and scopes from the `/.well-known/oauth-protected-resource` endpoint served by the sidecar. The MCP auth mode implements RFC 9728 (OAuth 2.0 Protected Resource Metadata) for authorization server discovery. Dynamic Client Registration (RFC 7591) is handled natively by Keycloak; MCP clients discover the authorization server and scopes from the `/.well-known/oauth-protected-resource` endpoint served by the sidecar and then register directly with Keycloak.
#### Example 4: Disabling Authentication #### Example 4: Disabling Authentication
@@ -1336,16 +1336,34 @@ stringData:
| Field | Required | Description | | Field | Required | Description |
|-------|----------|-------------| |-------|----------|-------------|
| `clientId` | Yes | Keycloak client ID | | `clientId` | Yes | Keycloak client ID (must be unique in realm) |
| `name` | Yes | Display name in Keycloak | | `name` | Yes | Display name in Keycloak UI |
| `redirectUris` | Yes | Allowed redirect URIs | | `redirectUris` | Yes | Allowed OAuth redirect URLs (supports wildcards like `/*`) |
| `webOrigins` | Yes | Allowed web origins (CORS) | | `webOrigins` | Yes | Allowed CORS origins |
| `defaultClientScopes` | No | Scopes (default: `["openid", "email", "profile"]`) | | `defaultClientScopes` | No | OIDC scopes (default: `["openid", "email", "profile"]`) |
| `protocolMappers` | No | Custom claim mappers (default: `[]`) | | `protocolMappers` | No | Custom claim mappers for tokens (see examples below) |
| `secret.namespace` | No | Namespace for the credential Secret (default: source namespace) | | `secret.namespace` | No | Target namespace for credentials (default: `source-namespace` annotation value) |
| `secret.name` | No | Name of the credential Secret (default: `<clientId>-oidc-credentials`) | | `secret.name` | No | Credential Secret name (default: `<clientId>-oidc-credentials`) |
| `secret.keys.clientId` | No | Key name for client ID in credential Secret (default: `client-id`) | | `secret.keys.clientId` | No | Key name for client ID (default: `client-id`) |
| `secret.keys.clientSecret` | No | Key name for client secret in credential Secret (default: `client-secret`) | | `secret.keys.clientSecret` | No | Key name for client secret (default: `client-secret`) |
**Protocol Mappers Example**:
```json
"protocolMappers": [
{
"name": "groups",
"protocol": "openid-connect",
"protocolMapper": "oidc-group-membership-mapper",
"config": {
"claim.name": "groups",
"full.path": "false",
"id.token.claim": "true",
"access.token.claim": "true",
"userinfo.token.claim": "true"
}
}
]
```
#### Step 2: Reference the Credential Secret #### Step 2: Reference the Credential Secret

@@ -115,9 +115,30 @@ This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where
``` ```
launchpad/ launchpad/
├── bootstrap.sh # Cluster initialization script ├── bootstrap.sh # Cluster initialization (ArgoCD + GitOps)
├── _app-of-apps-upc-dev.yaml # Root ArgoCD Application (upc-dev cluster) ├── _app-of-apps-{cluster}.yaml # Root ArgoCD Application (per cluster)
├── _app-of-apps-upc-prod.yaml # Root ArgoCD Application (upc-prod cluster)
├── .tofu/ # Infrastructure provisioning (OpenTofu)
│ ├── platforms/ # Per-platform IaC
│ │ ├── aks/ # Azure AKS
│ │ │ ├── modules/cluster/ # Reusable AKS module
│ │ │ ├── dev/ # tofu root for aks-dev
│ │ │ ├── prod/ # tofu root for aks-prod
│ │ │ └── workload/ # workload cluster (no data services)
│ │ ├── eks/ # AWS EKS (same structure)
│ │ ├── gke/ # GCP GKE
│ │ └── upc/ # UpCloud
│ ├── configs/ # Platform credentials (git-ignored)
│ │ └── {platform}.env.example # Template per platform
│ └── scripts/
│ ├── setup-cluster.sh # ./setup-cluster.sh <cluster> [--plan|--auto]
│ ├── teardown-cluster.sh # ./teardown-cluster.sh <cluster>
│ └── get-kubeconfig.sh # ./get-kubeconfig.sh <cluster>
├── clusters/ # Cluster metadata YAML (domain, IPs, etc.)
│ ├── aks-dev.yaml
│ ├── upc-dev.yaml
│ └── ...
├── infra/ # Infrastructure ArgoCD Applications (Kustomize) ├── infra/ # Infrastructure ArgoCD Applications (Kustomize)
│ ├── base/ # Base Application manifests (one dir per component) │ ├── base/ # Base Application manifests (one dir per component)

@@ -2,6 +2,12 @@
## Table of Contents ## Table of Contents
- [Overview](#overview) - [Overview](#overview)
- [Infrastructure Provisioning (OpenTofu)](#infrastructure-provisioning-opentofu)
- [Prerequisites](#provisioning-prerequisites)
- [Provisioning a Cluster](#provisioning-a-cluster)
- [Tearing Down a Cluster](#tearing-down-a-cluster)
- [Retrieving Kubeconfig](#retrieving-kubeconfig)
- [Platform Credentials](#platform-credentials)
- [Cluster Bootstrap](#cluster-bootstrap) - [Cluster Bootstrap](#cluster-bootstrap)
- [Initial Cluster Setup](#initial-cluster-setup) - [Initial Cluster Setup](#initial-cluster-setup)
- [ArgoCD Repository Access Setup](#argocd-repository-access-setup) - [ArgoCD Repository Access Setup](#argocd-repository-access-setup)
@@ -29,6 +35,120 @@ This runbook provides operational procedures for maintaining the Kubernetes clus
--- ---
## Infrastructure Provisioning (OpenTofu)
The `.tofu/` directory contains multi-cloud Kubernetes infrastructure-as-code using [OpenTofu](https://opentofu.org/). It provisions clusters on four cloud platforms (AKS, EKS, GKE, UpCloud), each with three environment tiers: **dev**, **prod**, and **workload**.
### Provisioning Prerequisites {#provisioning-prerequisites}
- **OpenTofu** (`tofu`) installed
- **kubectl** installed
- **helm** installed
- **yq** (optional — loads cluster config from `clusters/<cluster>.yaml`)
- Platform CLI tools:
- **AKS**: `az` (Azure CLI)
- **EKS**: `aws` (AWS CLI)
- **GKE**: `gcloud` (Google Cloud SDK)
- **UPC**: `upctl` (UpCloud CLI)
### Provisioning a Cluster
```bash
# Navigate to the scripts directory
cd .tofu/scripts
# 1. Copy and fill in credentials for your platform
cp ../configs/aks.env.example ../configs/aks.env
# Edit ../configs/aks.env with your credentials
# 2. Provision cluster (interactive — prompts before applying)
./setup-cluster.sh aks-dev
# 3. Dry-run only (plan without applying)
./setup-cluster.sh aks-dev --plan
# 4. Non-interactive (skip confirmations)
./setup-cluster.sh aks-dev --auto
```
**Cluster name format**: `<platform>-<env>` — e.g., `aks-dev`, `eks-prod`, `gke-workload`, `upc-dev`
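The split into platform and environment can be sketched with plain parameter expansion, which is how `setup-cluster.sh` derives the two values. The helper name `split_cluster` is hypothetical and exists only for this illustration.
```bash
# Sketch: split "<platform>-<env>" the way setup-cluster.sh does.
# split_cluster is a hypothetical helper name, not part of the repository.
split_cluster() {
  local cluster="$1"
  local platform="${cluster%%-*}"   # text before the first "-" (aks-dev -> aks)
  local env="${cluster#*-}"         # text after the first "-"  (aks-dev -> dev)
  case "$platform" in
    aks|eks|gke|upc) ;;
    *) echo "unknown platform '$platform'" >&2; return 1 ;;
  esac
  printf '%s %s\n' "$platform" "$env"
}

split_cluster aks-dev        # prints "aks dev"
split_cluster gke-workload   # prints "gke workload"
```
The resulting pair selects the tofu root at `platforms/<platform>/<env>`, so a name whose prefix is not one of the four platforms is rejected before any tofu command runs.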
**What `setup-cluster.sh` does**:
1. Validates cluster name, extracts platform and environment
2. Checks prerequisites (tofu, kubectl, helm)
3. Loads credentials from `configs/<platform>.env`
4. Optionally loads cluster config from `clusters/<cluster>.yaml` (via yq)
5. Runs `tofu init` → `tofu plan` → prompts → `tofu apply`
6. Fetches and caches kubeconfig to `private/<cluster>/kubeconfig`
7. Waits for all nodes to reach Ready state (300s timeout)
8. Outputs next steps: `export KUBECONFIG` + `./bootstrap.sh`
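How the flags select the tofu action in step 5 can be sketched as a small function. `resolve_action` is a hypothetical name; the real script branches inline on its `DESTROY` and `PLAN_ONLY` variables, and the sketch assumes the same precedence as that `if`/`elif` chain.
```bash
# Sketch: which action a flag combination leads to in setup-cluster.sh.
# resolve_action is a hypothetical helper name, not part of the repository.
resolve_action() {
  local arg destroy=false plan_only=false
  for arg in "$@"; do
    case "$arg" in
      --destroy) destroy=true ;;
      --plan)    plan_only=true ;;
    esac
  done
  if $destroy; then
    echo "destroy"   # tofu plan -destroy, confirm, tofu apply
  elif $plan_only; then
    echo "plan"      # tofu plan only, nothing applied
  else
    echo "apply"     # tofu plan, confirm, tofu apply, then kubeconfig + node wait
  fi
}

resolve_action --plan             # prints "plan"
resolve_action --plan --destroy   # prints "destroy"
```
Because the destroy branch is checked first, combining `--plan` with `--destroy` does not produce a dry run; as the script is written, it proceeds to the destroy path.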
### Tearing Down a Cluster
```bash
# Destroy cluster infrastructure
./teardown-cluster.sh aks-dev
# Equivalent to:
./setup-cluster.sh aks-dev --destroy
```
### Retrieving Kubeconfig
```bash
# Get kubeconfig for an existing cluster (uses cache or platform CLI)
./get-kubeconfig.sh aks-dev
# Cached kubeconfigs stored in: private/<cluster>/kubeconfig
```
Platform-specific retrieval fallbacks:
- **AKS**: `az aks get-credentials`
- **EKS**: `aws eks update-kubeconfig`
- **GKE**: `gcloud container clusters get-credentials`
- **UPC**: `upctl kubernetes config`
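The prerequisite check in `setup-cluster.sh` covers only `tofu`, `kubectl`, and `helm`, so a missing platform CLI surfaces late, at the kubeconfig step. A minimal pre-check might look like the sketch below; `required_cli` and `check_cli` are hypothetical helper names, not functions from the repository.
```bash
# Sketch: map a platform to the CLI its kubeconfig fallback relies on.
# required_cli and check_cli are hypothetical helper names.
required_cli() {
  case "$1" in
    aks) echo az ;;
    eks) echo aws ;;
    gke) echo gcloud ;;
    upc) echo upctl ;;
    *)   echo "unknown platform '$1'" >&2; return 1 ;;
  esac
}

# Fail early if the CLI for the chosen platform is not on PATH.
check_cli() {
  local cli
  cli="$(required_cli "$1")" || return 1
  command -v "$cli" >/dev/null 2>&1 || {
    echo "Error: $cli is not installed." >&2
    return 1
  }
}
```
Calling `check_cli upc` before `./get-kubeconfig.sh upc-dev` would report a missing `upctl` up front instead of partway through the retrieval.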
### Platform Credentials
Each platform has a `configs/<platform>.env.example` template. Copy it to `configs/<platform>.env` and fill in the values:
| Platform | Required Variables | Optional |
|----------|--------------------|----------|
| **AKS** | `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID` | `ARM_RESOURCE_GROUP` (defaults to cluster name) |
| **EKS** | `AWS_PROFILE` (default: "default"), `AWS_REGION` (default: "eu-west-1") | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` |
| **GKE** | `GCP_PROJECT_ID`, `GCP_REGION` (default: "europe-west4") | `GOOGLE_APPLICATION_CREDENTIALS` (SA JSON path) |
| **UPC** | `UPCLOUD_TOKEN` | `UPCLOUD_CLUSTER_ID` (set after creation) |
> **Note**: `.env` files are git-ignored. Never commit credentials.
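The scripts load these files with `set -a` / `source` / `set +a`, which exports every variable the file assigns. The sketch below wraps that pattern and adds a check for required variables; `load_env` and `require_vars` are hypothetical helper names, and the required-variable check is an addition that the scripts in this change do not perform.
```bash
# Sketch: load configs/<platform>.env and export everything it sets.
# load_env and require_vars are hypothetical helper names.
load_env() {
  local env_file="$1"
  [ -f "$env_file" ] || { echo "Warning: $env_file not found" >&2; return 1; }
  set -a            # auto-export every variable assigned while sourcing
  # shellcheck disable=SC1090
  . "$env_file"
  set +a
}

# Report any listed variable that is unset or empty.
# printenv sees the values because load_env exported them.
require_vars() {
  local name missing=0
  for name in "$@"; do
    [ -n "$(printenv "$name")" ] || { echo "Missing: $name" >&2; missing=1; }
  done
  return "$missing"
}
```
For example, `load_env ../configs/upc.env && require_vars UPCLOUD_TOKEN` would stop before provisioning if the token was left blank in the copied template.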
### End-to-End Workflow
The full cluster lifecycle runs provision → bootstrap → operate → teardown:
```bash
# 1. Provision infrastructure
cd .tofu/scripts
./setup-cluster.sh aks-dev
# 2. Export kubeconfig (printed by setup-cluster.sh)
export KUBECONFIG=$(pwd)/../../private/aks-dev/kubeconfig
# 3. Bootstrap GitOps (ArgoCD + App-of-Apps)
cd ../..
./bootstrap.sh aks-dev
# 4. Verify
kubectl get applications -n argocd
# ... operate cluster ...
# 5. Teardown when done
cd .tofu/scripts
./teardown-cluster.sh aks-dev
```
---
## Cluster Bootstrap ## Cluster Bootstrap
### Initial Cluster Setup ### Initial Cluster Setup
@@ -37,7 +157,7 @@ Bootstrap a new cluster from scratch:
#### Prerequisites #### Prerequisites
1. **Kubernetes cluster running** (UpCloud, AWS EKS, Azure AKS, GCP GKE, or any K8s cluster) 1. **Kubernetes cluster running** (provisioned via `.tofu/scripts/setup-cluster.sh` or manually on UpCloud, AWS EKS, Azure AKS, GCP GKE)
2. **kubectl configured** with admin access 2. **kubectl configured** with admin access
3. **Repositories cloned** locally 3. **Repositories cloned** locally
@@ -1286,14 +1406,17 @@ spec:
```bash ```bash
# 1. Provision new Kubernetes cluster # 1. Provision new Kubernetes cluster
cd .tofu/scripts
./setup-cluster.sh upc-dev # or aks-dev, eks-prod, etc.
export KUBECONFIG=$(pwd)/../../private/upc-dev/kubeconfig
# 2. Configure kubectl # 2. Verify cluster access
kubectl config use-context new-cluster
kubectl cluster-info kubectl cluster-info
kubectl get nodes
# 3. Bootstrap cluster # 3. Bootstrap cluster
cd ~/dev/k8s/launchpad cd ../..
./bootstrap.sh ./bootstrap.sh upc-dev
# 4. Wait for ArgoCD to sync all applications # 4. Wait for ArgoCD to sync all applications
kubectl get applications -n argocd -w kubectl get applications -n argocd -w

@@ -3,6 +3,7 @@
## Table of Contents ## Table of Contents
- [Architecture Components](#architecture-components) - [Architecture Components](#architecture-components)
- [Repository Reference](#repository-reference) - [Repository Reference](#repository-reference)
- [OpenTofu Infrastructure Reference](#opentofu-infrastructure-reference)
- [Helm Chart Reference](#helm-chart-reference) - [Helm Chart Reference](#helm-chart-reference)
- [ArgoCD Configuration](#argocd-configuration) - [ArgoCD Configuration](#argocd-configuration)
- [Infrastructure Components](#infrastructure-components) - [Infrastructure Components](#infrastructure-components)
@@ -72,9 +73,22 @@ Internet
``` ```
launchpad/ launchpad/
├── bootstrap.sh # Cluster initialization script ├── bootstrap.sh # Cluster initialization (ArgoCD + GitOps)
├── _app-of-apps-upc-dev.yaml # Root ArgoCD Application (upc-dev) ├── _app-of-apps-{cluster}.yaml # Root ArgoCD Application (per cluster)
├── _app-of-apps-upc-prod.yaml # Root ArgoCD Application (upc-prod)
├── .tofu/ # Infrastructure provisioning (OpenTofu)
│ ├── platforms/ # Per-platform IaC
│ │ ├── aks/ # Azure: modules/cluster/, dev/, prod/, workload/
│ │ ├── eks/ # AWS: same structure
│ │ ├── gke/ # GCP
│ │ └── upc/ # UpCloud
│ ├── configs/ # Platform credentials (git-ignored)
│ └── scripts/ # setup-cluster.sh, teardown-cluster.sh, get-kubeconfig.sh
├── clusters/ # Cluster metadata YAML
│ ├── aks-dev.yaml
│ ├── upc-dev.yaml
│ └── ...
├── infra/ # Infrastructure applications (Kustomize) ├── infra/ # Infrastructure applications (Kustomize)
│ ├── base/ # One subdirectory per component │ ├── base/ # One subdirectory per component
@@ -194,6 +208,196 @@ launchpad/
└── REFERENCE.md └── REFERENCE.md
``` ```
---
## OpenTofu Infrastructure Reference
The `.tofu/` directory provides multi-cloud Kubernetes cluster provisioning using OpenTofu.
### Directory Structure
```
.tofu/
├── configs/                     # Platform credential templates (git-ignored .env files)
│   ├── aks.env.example
│   ├── eks.env.example
│   ├── gke.env.example
│   └── upc.env.example
├── platforms/                   # OpenTofu modules per cloud provider
│   ├── aks/                     # Azure AKS
│   │   ├── modules/cluster/     # Reusable AKS module
│   │   │   ├── main.tf          # Resource group, VNet, subnet, AKS cluster
│   │   │   ├── variables.tf
│   │   │   ├── outputs.tf
│   │   │   └── providers.tf
│   │   ├── dev/                 # Dev environment root
│   │   ├── prod/                # Prod environment root
│   │   └── workload/            # Workload cluster (+ external-dns identity)
│   ├── eks/                     # AWS EKS (same structure)
│   ├── gke/                     # GCP GKE
│   └── upc/                     # UpCloud Kubernetes
└── scripts/
    ├── setup-cluster.sh         # Provision cluster
    ├── teardown-cluster.sh      # Destroy cluster
    └── get-kubeconfig.sh        # Retrieve/cache kubeconfig
```
### Three-Tier Cluster Strategy
Each platform defines three environment tiers:
| Tier | Purpose | Typical Sizing | Notes |
|------|---------|---------------|-------|
| **dev** | Development/testing | Small, economical nodes (2 nodes) | No delete locks, minimal HA |
| **prod** | Production workloads | Larger nodes, multiple AZs (3 nodes) | Delete locks, HA networking |
| **workload** | Application-only cluster | Medium nodes (2 nodes) | Includes external-DNS integration, no platform services |
### Platform Specifications
#### AKS (Azure Kubernetes Service)
| Resource | Description |
|----------|-------------|
| `azurerm_resource_group` | Container for all Azure resources |
| `azurerm_management_lock` | Optional CanNotDelete lock (prod) |
| `azurerm_virtual_network` | VNet, default `10.100.0.0/16` |
| `azurerm_subnet` | Node subnet, default `10.100.0.0/22` |
| `azurerm_kubernetes_cluster` | AKS with Azure CNI, OIDC issuer, Workload Identity |
**Dev**: Standard_B2s, 2 nodes, norwayeast, no delete lock
**Prod**: Standard_D4s_v3, 3 nodes, westeurope, delete lock enabled
**Workload**: Adds `azurerm_user_assigned_identity` + federated credential for external-dns with DNS Zone Contributor role
**Variables** (`modules/cluster/variables.tf`):
- `prefix` — resource name prefix
- `location` — Azure region
- `vnet_address_space` — default `10.100.0.0/16`
- `aks_subnet_cidr` — default `10.100.0.0/22`
- `aks_node_vm_size` — VM size (e.g., `Standard_B2s`)
- `aks_node_count` — number of nodes
- `aks_kubernetes_version` — `null` = latest
- `enable_delete_lock` — default `false`
#### EKS (Amazon Elastic Kubernetes Service)
| Resource | Description |
|----------|-------------|
| `aws_vpc` | VPC with DNS enabled, default `10.100.0.0/16` |
| `aws_subnet` (public) | Per-AZ, tagged `kubernetes.io/role/elb=1` |
| `aws_subnet` (private) | Per-AZ, tagged `kubernetes.io/role/internal-elb=1` |
| `aws_nat_gateway` | Single NAT (dev); prod should use one per AZ |
| `aws_eks_cluster` | EKS with public+private endpoints, OIDC issuer |
| `aws_iam_openid_connect_provider` | IRSA (IAM Roles for Service Accounts) |
| `aws_eks_node_group` | Managed nodes with auto-scaling |
**Dev**: t3.medium, 2 nodes (min 1, max 4), eu-west-1a/b, K8s 1.30
**Prod**: m5.xlarge, 3 nodes (min 3, max 6), eu-west-1a/b/c
**Workload**: Adds IRSA role for external-dns with Route53 permissions (ChangeResourceRecordSets, ListHostedZones, ListResourceRecordSets, ListTagsForResource)
**Variables**:
- `region` — AWS region
- `vpc_cidr` — default `10.100.0.0/16`
- `availability_zones` — list of AZs (2-3 recommended)
- `node_instance_type`, `node_count`, `node_min_count`, `node_max_count`
- `kubernetes_version` — default `1.30`
#### GKE (Google Kubernetes Engine)
| Resource | Description |
|----------|-------------|
| `google_project_service` | Enables compute and container APIs |
| `google_compute_network` | Custom VPC (no auto subnets) |
| `google_compute_subnetwork` | Primary `10.100.0.0/22`, pods `10.200.0.0/14`, services `10.204.0.0/20` |
| `google_container_cluster` | Regional cluster, VPC-native, Workload Identity |
| `google_container_node_pool` | Auto-repair, auto-upgrade, GKE_METADATA mode |
**Dev**: e2-standard-2, 2 nodes/zone, no deletion protection
**Prod**: e2-standard-4, 3 nodes/zone, deletion protection enabled
**Workload**: Adds Google SA for external-dns with `dns.admin` role + Workload Identity binding
**Variables**:
- `project_id` — GCP project (required)
- `region` — GCP region
- `node_machine_type`, `node_count`
- `kubernetes_version` — `null` = STABLE release channel
- `deletion_protection` — default `false`
#### UPC (UpCloud Kubernetes)
| Resource | Description |
|----------|-------------|
| `upcloud_router` | Private router for cluster network |
| `upcloud_gateway` | NAT gateway for outbound internet |
| `upcloud_network` | Private network, DHCP, default `10.100.0.0/24` |
| `upcloud_kubernetes_cluster` | Managed K8s, private node groups |
| `upcloud_kubernetes_node_group` | Anti-affinity if node_count > 1 |
**Dev**: DEV-1xCPU-2GB, 2 nodes, no-svg1
**Prod**: 4xCPU-8GB, 3 nodes, no-svg1
**Workload**: 2xCPU-4GB, 2 nodes, no-svg1, CIDR `10.110.0.0/24`
> **Note**: UpCloud has no native workload identity — external-DNS integration not available.
### Workload Identity & External-DNS
Workload clusters on AKS, EKS, and GKE include keyless cloud access for external-DNS; UpCloud has no equivalent:
| Platform | Identity Mechanism | DNS Permissions |
|----------|--------------------|-----------------|
| **AKS** | Azure Workload Identity (federated credential) | DNS Zone Contributor |
| **EKS** | IRSA (OIDC federation) | Route53 ChangeResourceRecordSets, ListHostedZones |
| **GKE** | Workload Identity (K8s SA → Google SA) | dns.admin role |
| **UPC** | N/A | N/A |
### Naming Conventions
- Cluster: `<prefix>-aks` / `-eks` / `-gke` (derived from platform)
- Resource groups: `<prefix>-rg` (Azure only)
- VPCs/Networks: `<prefix>-vpc`
- Node groups: `<prefix>-nodes`
- Dev prefix: `clst-dev`, Prod prefix: `clst`, Workload prefix: `clst-workload`
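As a rough illustration of the conventions above, the following sketch derives the names from a prefix and a platform. The function `resource_names` is hypothetical and not part of the repository; only the name patterns come from the list above. UpCloud is rejected because the list gives no cluster suffix for it.

```bash
# Hypothetical helper, not part of the repository: print the resource names
# implied by the naming conventions for a given prefix and platform.
resource_names() {
  local prefix="$1" platform="$2"
  case "$platform" in
    aks|eks|gke) ;;
    *) echo "no documented cluster suffix for platform: $platform" >&2; return 1 ;;
  esac
  echo "cluster=${prefix}-${platform}"
  echo "network=${prefix}-vpc"
  echo "node_group=${prefix}-nodes"
  # Resource groups exist on Azure only.
  if [ "$platform" = "aks" ]; then
    echo "resource_group=${prefix}-rg"
  fi
}

resource_names clst-dev aks
```

With the dev prefix this prints `cluster=clst-dev-aks`, `network=clst-dev-vpc`, `node_group=clst-dev-nodes` and `resource_group=clst-dev-rg`, one per line.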
### Provider Authentication
| Platform | Auth Method | Config Source |
|----------|-------------|---------------|
| **AKS** | Azure CLI or env vars (`ARM_SUBSCRIPTION_ID`, `ARM_TENANT_ID`) | `configs/aks.env` |
| **EKS** | AWS CLI profile or explicit credentials | `configs/eks.env` |
| **GKE** | Application Default Credentials or SA JSON | `configs/gke.env` |
| **UPC** | API token (`UPCLOUD_TOKEN`) | `configs/upc.env` |
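One possible way a script could consume these config files is sketched below. It assumes the `.env` files are plain `KEY=value` lines that a shell can source, which the table does not confirm; `load_platform_env` and its default directory are hypothetical. Only `UPCLOUD_TOKEN` is checked, because it is the only variable the table marks as the sole auth method for a platform.

```bash
# Hypothetical sketch: load <config_dir>/<platform>.env and fail early when
# a required variable is missing. Assumes shell-sourceable KEY=value lines.
load_platform_env() {
  local platform="$1" config_dir="${2:-.tofu/configs}"
  local file="${config_dir}/${platform}.env"
  if [ ! -f "$file" ]; then
    echo "missing $file (copy ${platform}.env.example and fill it in)" >&2
    return 1
  fi
  set -a        # export every variable assigned while sourcing
  . "$file"
  set +a
  case "$platform" in
    upc)
      [ -n "${UPCLOUD_TOKEN:-}" ] || { echo "UPCLOUD_TOKEN is not set" >&2; return 1; }
      ;;
  esac
}

# Usage (not executed here): load_platform_env upc && tofu plan
```

Exporting with `set -a` keeps the credentials visible to child processes such as `tofu` without listing each variable by name.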
### Scripts Reference
#### `setup-cluster.sh`
```bash
./setup-cluster.sh <platform>-<env> [--plan] [--destroy] [--auto]
```
| Flag | Effect |
|------|--------|
| (none) | Interactive: plan → prompt → apply |
| `--plan` | Dry-run only (tofu plan) |
| `--destroy` | Destroy infrastructure |
| `--auto` | Skip confirmation prompts |
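The argument and flag handling described in the table could look roughly like the sketch below. This is not the actual `setup-cluster.sh`; the function name `parse_cluster_args` and the variables `platform`, `cluster_env`, `mode` and `auto` are assumptions made for illustration.

```bash
# Hypothetical sketch of interpreting "<platform>-<env>" plus flags.
# Sets the global variables platform, cluster_env, mode and auto.
parse_cluster_args() {
  local target="$1"
  shift
  case "$target" in
    *-*) ;;
    *) echo "expected <platform>-<env>, got: $target" >&2; return 1 ;;
  esac
  platform="${target%%-*}"     # text before the first "-", e.g. "upc"
  cluster_env="${target#*-}"   # text after the first "-", e.g. "dev"
  mode="apply"                 # default: plan, prompt, then apply
  auto="false"
  for flag in "$@"; do
    case "$flag" in
      --plan)    mode="plan" ;;
      --destroy) mode="destroy" ;;
      --auto)    auto="true" ;;
      *) echo "unknown flag: $flag" >&2; return 1 ;;
    esac
  done
}

parse_cluster_args upc-dev --plan
echo "platform=$platform env=$cluster_env mode=$mode auto=$auto"
```

The last line prints `platform=upc env=dev mode=plan auto=false`.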
#### `teardown-cluster.sh`
```bash
./teardown-cluster.sh <platform>-<env>
# Delegates to: setup-cluster.sh "$@" --destroy
```
#### `get-kubeconfig.sh`
```bash
./get-kubeconfig.sh <platform>-<env>
# Checks cache: private/<cluster>/kubeconfig
# Falls back to platform CLI if no cache
```
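The cache-then-fallback behaviour noted in the comments above can be sketched as follows. The real script is not shown in this document, so `get_kubeconfig` and `fetch_kubeconfig` are hypothetical; `fetch_kubeconfig` is a stand-in for whatever platform CLI call retrieves the kubeconfig.

```bash
# Hypothetical stand-in for the platform CLI call (not shown in this document).
fetch_kubeconfig() {
  echo "# placeholder kubeconfig for $1"
}

# Return the path of a cached kubeconfig, fetching it only when the cache
# file is missing or empty.
get_kubeconfig() {
  local cluster="$1" root="${2:-private}"
  local cache="${root}/${cluster}/kubeconfig"
  if [ ! -s "$cache" ]; then
    mkdir -p "$(dirname "$cache")"
    fetch_kubeconfig "$cluster" > "$cache"
    chmod 600 "$cache"   # kubeconfigs hold credentials
  fi
  echo "$cache"
}

# Usage (not executed here): export KUBECONFIG="$(get_kubeconfig upc-dev)"
```

Returning the path rather than the file contents lets callers assign it directly to `KUBECONFIG`.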
---
#### Key Files
**`bootstrap.sh`**
@@ -1242,9 +1446,18 @@ The realm uses a custom browser authentication flow (`browser-auto-idp`) that sk
 **Resources**:
 - `ServiceAccount`: `keycloak-client-registrar` (namespace: `keycloak`)
-- `ClusterRole`: `keycloak-client-registrar` (secrets: get/list/create/update/patch; namespaces: get/list)
+- `ClusterRole`: `keycloak-client-registrar`
+  - Secrets: `get`, `list`, `create`, `update`, `patch`
+  - Namespaces: `get`, `list`
 - `ClusterRoleBinding`: `keycloak-client-registrar`
 - `CronJob`: `keycloak-client-registrar`
+  - **Schedule**: `*/2 * * * *` (every 2 minutes)
+  - **Concurrency Policy**: `Forbid` (prevents concurrent runs)
+  - **Backoff Limit**: 3 retries per job
+  - **History**: 1 successful job, 3 failed jobs retained
+  - **Resources**: 50m CPU / 64Mi memory (requests), 200m CPU / 128Mi memory (limits)
+**Container**: Alpine 3.20 with `curl` and `jq` installed
 **Kyverno Policy**: `keycloak-client-config-cloner` — clones labeled Secrets from app namespaces to `keycloak` namespace (see [Kyverno Policies](#kyverno-policies))
@@ -1523,7 +1736,7 @@ spec:
 2. `generate-auth-oidc-secret` - Creates Secret for OIDC mode
 3. `inject-sidecar-token` - Injects auth sidecar for token mode
 4. `inject-sidecar-oidc` - Injects auth sidecar for OIDC mode
-5. `inject-sidecar-mcp` - Injects auth sidecar for MCP OAuth mode (RFC 9728 / RFC 7591)
+5. `inject-sidecar-mcp` - Injects auth sidecar for MCP OAuth mode (RFC 9728)
 6. `generate-auth-network-policy` - Creates NetworkPolicy to restrict ingress
 #### Trigger Annotation
@@ -1563,7 +1776,7 @@ policies.forteapps.io/auth-image: "ghcr.io/fortedigital/auth-sidecar"
 policies.forteapps.io/auth-image-version: "latest"
 ```
-**MCP Mode** (OAuth 2.0 for MCP servers, implements RFC 9728 / RFC 7591):
+**MCP Mode** (OAuth 2.0 for MCP servers, implements RFC 9728; MCP clients use Keycloak's native RFC 7591 endpoint for Dynamic Client Registration):
 ```yaml
 # Annotations (required)
 policies.forteapps.io/auth: "true"
@@ -1791,7 +2004,7 @@ Pod: Auth Sidecar (port 8080)
 ├─ Validate credentials
 │   • Token mode: Check Bearer token
 │   • OIDC mode: Validate session or redirect to IdP
-│   • MCP mode: OAuth 2.0 via RFC 9728 discovery / RFC 7591 dynamic registration
+│   • MCP mode: OAuth 2.0 via RFC 9728 discovery; Keycloak handles RFC 7591 dynamic registration natively
 Forward to Application (localhost:3000)


@@ -1,9 +1,30 @@
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 resources:
-  - ../../base
+  - ../../base/cert-manager-application
+  - ../../base/cluster-resources-application
+  - ../../base/enterprise-apps
+  - ../../base/fluent-bit
+  - ../../base/gitea
+  - ../../base/gitea-actions
+  - ../../base/grafana
+  - ../../base/grafana-dashboards
+  - ../../base/homepage
+  - ../../base/karpor
+  - ../../base/keycloak
+  - ../../base/kyverno
+  - ../../base/kyverno-policies
+  - ../../base/loki
+  - ../../base/opencost
+  - ../../base/prometheus
+  - ../../base/renovate
+  - ../../base/sealedsecrets
+  - ../../base/tempo
+  - ../../base/traefik-application
+  - ../../base/vault
   - vaultwarden-postgresql
   - vaultwarden
+  - wildcard-tls-certificate.yaml
 # No patches needed — base already has "upc-dev" paths
 # upc-dev is the default/base cluster


@@ -0,0 +1,38 @@
---
# Wildcard Certificate for *.forteapps.net
# This creates a certificate that covers ALL subdomains of forteapps.net
# Once created, you can use it for any app like:
# - myapp.forteapps.net
# - api.forteapps.net
# - git.forteapps.net
# - vaultwarden.forteapps.net
# - etc.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-forteapps-net
  namespace: cert-manager # Can be in any namespace, cert-manager namespace is common
spec:
  # The secret where the TLS certificate will be stored
  # This secret can be referenced by IngressRoutes in any namespace
  secretName: wildcard-forteapps-net-tls
  # Use the production issuer (use letsencrypt-staging for testing)
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  # DNS names this certificate will cover
  # Both wildcard AND apex domain are recommended
  dnsNames:
    - '*.forteapps.net' # Covers: myapp.forteapps.net, api.forteapps.net, etc.
    - 'forteapps.net' # Also include apex domain explicitly
  # Optional: Configure certificate duration and renewal
  duration: 2160h0m0s # 90 days (Let's Encrypt default)
  renewBefore: 720h0m0s # Renew 30 days before expiry
  # Optional: Private key settings
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 4096


@@ -22,7 +22,8 @@ ingress:
   # TLS configuration
   tls:
     enabled: true # Set to true to enable TLS
-    secretName: "databunker-tls" # Name of the secret containing TLS certificate
+    # secretName: "databunker-tls" # Name of the secret containing TLS certificate
+    secretName: "wildcard-forteapps-net-tls" # Name of the secret containing TLS certificate
 # Pin PostgreSQL password — chart uses randAlphaNum without lookup,
 # so each ArgoCD sync would regenerate the password while PVC keeps the old one.


@@ -602,3 +602,148 @@ extraDeploy:
                    items:
                      - key: admin-password
                        path: admin-password
  # -- ServiceAccount for the client cleanup CronJob
  - apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: keycloak-client-cleanup
      namespace: keycloak
  # -- CronJob: cleans up stale dynamically registered Keycloak clients
  - apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: keycloak-client-cleanup
      namespace: keycloak
    spec:
      schedule: "0 3 * * 0"
      concurrencyPolicy: Forbid
      successfulJobsHistoryLimit: 1
      failedJobsHistoryLimit: 3
      jobTemplate:
        spec:
          backoffLimit: 3
          template:
            spec:
              serviceAccountName: keycloak-client-cleanup
              restartPolicy: Never
              containers:
                - name: cleanup
                  image: alpine:3.20
                  command: [ "/bin/sh", "-c" ]
                  args:
                    - |
                      set -e
                      apk add --no-cache curl jq > /dev/null 2>&1
                      KEYCLOAK_URL="http://keycloak:80"
                      REALM="forte"
                      ADMIN_USER="admin"
                      ADMIN_PASS=$(cat /secrets/admin-password)
                      DRY_RUN="${DRY_RUN:-true}"
                      MIN_AGE_DAYS="${MIN_AGE_DAYS:-7}"
                      if [ -z "$CLIENT_ID_PATTERN" ]; then
                        CLIENT_ID_PATTERN='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
                      fi
                      echo "=== Keycloak DCR client cleanup ==="
                      echo "Dry run: ${DRY_RUN}"
                      echo "Min age: ${MIN_AGE_DAYS} days"
                      echo "Pattern: ${CLIENT_ID_PATTERN}"
                      # Authenticate to Keycloak Admin API
                      TOKEN=$(curl -sf -X POST "${KEYCLOAK_URL}/realms/master/protocol/openid-connect/token" \
                        -d "client_id=admin-cli" \
                        -d "username=${ADMIN_USER}" \
                        -d "password=${ADMIN_PASS}" \
                        -d "grant_type=password" | jq -r '.access_token')
                      if [ -z "$TOKEN" ] || [ "$TOKEN" = "null" ]; then
                        echo "ERROR: Failed to authenticate to Keycloak"
                        exit 1
                      fi
                      NOW_SEC=$(date +%s)
                      MIN_AGE_SEC=$((MIN_AGE_DAYS * 86400))
                      # Hardcoded protected clients (never delete these)
                      PROTECTED_JSON='["gitea","grafana","argocd","vaultwarden","account","account-console","admin-cli","broker","realm-management","security-admin-console"]'
                      echo "Fetching clients from realm '${REALM}'..."
                      CLIENTS=$(curl -sf -H "Authorization: Bearer ${TOKEN}" \
                        "${KEYCLOAK_URL}/admin/realms/${REALM}/clients")
                      CANDIDATES=$(echo "$CLIENTS" | jq -c --argjson protected "$PROTECTED_JSON" --argjson now "$NOW_SEC" --argjson min_age "$MIN_AGE_SEC" --arg pattern "$CLIENT_ID_PATTERN" '
                        [
                          .[]
                          | select(.clientId as $cid | $protected | index($cid) | not)
                          | select(.attributes["k8s.secret.sync"] != "true")
                          | select(.clientId | test($pattern; "i"))
                          | select((.createdTimestamp // 0) / 1000 < ($now - $min_age))
                        ]
                      ')
                      COUNT=$(echo "$CANDIDATES" | jq 'length')
                      echo "Found ${COUNT} candidate client(s) matching pattern and age threshold"
                      if [ "$COUNT" -eq 0 ]; then
                        echo "Nothing to clean up."
                        exit 0
                      fi
                      while IFS= read -r CLIENT; do
                        CLIENT_ID=$(echo "$CLIENT" | jq -r '.clientId')
                        CLIENT_UUID=$(echo "$CLIENT" | jq -r '.id')
                        CREATED=$(echo "$CLIENT" | jq -r '.createdTimestamp')
                        # Check active sessions
                        SESSION_RESPONSE=$(curl -s -H "Authorization: Bearer ${TOKEN}" \
                          "${KEYCLOAK_URL}/admin/realms/${REALM}/clients/${CLIENT_UUID}/session-count" || true)
                        SESSION_COUNT=$(echo "$SESSION_RESPONSE" | jq -r '.count // 0')
                        if [ "$SESSION_COUNT" -gt 0 ]; then
                          echo " SKIP: '${CLIENT_ID}' has ${SESSION_COUNT} active session(s)"
                          continue
                        fi
                        AGE_DAYS=$(( (NOW_SEC - (CREATED / 1000)) / 86400 ))
                        if [ "$DRY_RUN" = "true" ]; then
                          echo " DRY-RUN: would delete '${CLIENT_ID}' (uuid: ${CLIENT_UUID}, age: ${AGE_DAYS}d)"
                        else
                          echo " DELETE: '${CLIENT_ID}' (uuid: ${CLIENT_UUID}, age: ${AGE_DAYS}d)"
                          HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
                            -H "Authorization: Bearer ${TOKEN}" \
                            -X DELETE \
                            "${KEYCLOAK_URL}/admin/realms/${REALM}/clients/${CLIENT_UUID}" || echo "000")
                          if [ "$HTTP_CODE" != "204" ] && [ "$HTTP_CODE" != "200" ]; then
                            echo " ERROR: Failed to delete '${CLIENT_ID}' (HTTP ${HTTP_CODE})"
                          fi
                        fi
                      done < <(echo "$CANDIDATES" | jq -c '.[]')
                      echo "Cleanup run complete."
                  env:
                    - name: DRY_RUN
                      value: "false"
                    - name: MIN_AGE_DAYS
                      value: "7"
                  volumeMounts:
                    - name: keycloak-credentials
                      mountPath: /secrets
                      readOnly: true
                  resources:
                    requests:
                      cpu: 50m
                      memory: 64Mi
                    limits:
                      cpu: 200m
                      memory: 128Mi
              volumes:
                - name: keycloak-credentials
                  secret:
                    secretName: keycloak-credentials
                    items:
                      - key: admin-password
                        path: admin-password
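
The age filter in the cleanup script above compares a millisecond `createdTimestamp` with a threshold given in days. The sketch below isolates that arithmetic in plain shell; the function name `is_older_than` is hypothetical, and the comparison mirrors the `jq` expression `(.createdTimestamp // 0) / 1000 < ($now - $min_age)` only approximately, since shell arithmetic truncates to whole seconds.

```bash
# Hypothetical helper mirroring the age check in the cleanup script:
# succeeds when the creation time lies further back than min_age_days.
is_older_than() {
  local created_ms="$1" min_age_days="$2" now_sec="$3"
  local min_age_sec=$((min_age_days * 86400))   # days to seconds
  [ $((created_ms / 1000)) -lt $((now_sec - min_age_sec)) ]
}

# A missing timestamp falls back to 0 in the script's jq filter, which this
# check treats as older than any threshold.
if is_older_than 0 7 "$(date +%s)"; then
  echo "timestamp 0 counts as older than 7 days"
fi
```

One consequence worth checking against the target Keycloak version: if the client list does not carry `createdTimestamp`, the `// 0` fallback makes every client that matches the pattern pass the age threshold.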


@@ -1,10 +1,10 @@
 ingress:
   enabled: true
-  host: databunker.forteapps.net
+  host: bunker.forteapps.net
   annotations:
     gethomepage.dev/enabled: "true"
     gethomepage.dev/name: "Databunker"
     gethomepage.dev/description: "Secure Database for PII and PCI Records"
     gethomepage.dev/group: "Security"
     gethomepage.dev/icon: "double-take"
-    gethomepage.dev/href: "https://databunker.forteapps.net"
+    gethomepage.dev/href: "https://bunker.forteapps.net"