# Kubernetes Cluster - GitOps Configuration

> **Kubernetes cluster bootstrapping and GitOps configuration repository** using ArgoCD for multi-cloud Kubernetes (UpCloud, AWS EKS, Azure AKS, GCP GKE)

[![GitOps](https://img.shields.io/badge/GitOps-ArgoCD-blue)](https://argoproj.github.io/cd/)
[![Kubernetes](https://img.shields.io/badge/Kubernetes-Multi--Cloud-orange)]()

---

## 📚 Complete Documentation

**New developers and operators**: Please refer to our comprehensive documentation for detailed guides and references:

### 🎯 [**START HERE: Documentation Index**](docs/README.md)

| Document | Description | Audience |
|----------|-------------|----------|
| **[GitOps Architecture](docs/GITOPS-ARCHITECTURE.md)** | System architecture, repository structure, GitOps workflows, security model | Everyone (start here) |
| **[Developer Guide](docs/DEVELOPER-GUIDE.md)** | Local setup, deploying apps, managing secrets, troubleshooting | Developers |
| **[Operations Runbook](docs/OPERATIONS-RUNBOOK.md)** | Cluster bootstrap, day-to-day operations, incident response, maintenance | Platform Engineers, SREs |
| **[Technical Reference](docs/REFERENCE.md)** | Component specs, Helm charts, ArgoCD config, Kyverno policies, API docs | Everyone (reference) |

---

## 🚀 Quick Start

### For New Developers

```bash
# 1. Clone repositories
git clone https://git.forteapps.net/Forte/launchpad.git
git clone ssh://git@git.forteapps.net:2222/Forte/helm-prod-values.git

# 2. Read the guides
#    - Start: docs/GITOPS-ARCHITECTURE.md
#    - Follow: docs/DEVELOPER-GUIDE.md

# 3. Deploy your first app (see Developer Guide)
```

### For Operators

```bash
# 1. Bootstrap new cluster
./bootstrap.sh

# 2. Verify deployment
kubectl get applications -n argocd
kubectl get pods --all-namespaces

# 3. Read Operations Runbook for day-to-day tasks
```

---

## 📋 Overview

This repository contains the complete GitOps configuration for our Kubernetes cluster, using the **App-of-Apps pattern** with ArgoCD.
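
As a rough illustration of the App-of-Apps pattern, the root Application might look something like the sketch below. It points ArgoCD at one of the per-cluster overlay directories described under Repository Structure. The Application name, `project`, and `targetRevision` are assumptions made for the example; the actual root manifest in this repository may differ.

```yaml
# Sketch only: a root "App-of-Apps" Application for one cluster.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-of-apps            # hypothetical name
  namespace: argocd
spec:
  project: default             # assumed project
  source:
    repoURL: https://git.forteapps.net/Forte/launchpad.git
    targetRevision: main       # assumed branch
    path: infra/overlays/upc-dev   # Kustomize overlay that lists the child Applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd          # child Application objects are created here
  syncPolicy:
    automated:
      prune: true              # remove child Applications deleted from Git
      selfHeal: true           # revert manual changes to match Git
```

Because the overlay directory contains a `kustomization.yaml`, ArgoCD renders it with Kustomize and creates one child Application per resource it finds.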

### What's Inside

- **Infrastructure Applications**: Traefik, Cert-Manager, Kyverno, Prometheus, Grafana, Loki, Tempo, Sealed Secrets, Homepage (platform dashboard)
- **Business Applications**: MCP10X, MusicMan, Dot-AI Stack, ArgoCD MCP
- **Policies**: Kyverno security policies for secret management, namespace controls, pod verification
- **Monitoring**: Full observability stack with metrics, logs, traces, and alerting
- **Secrets**: Sealed Secrets for secure Git storage

### Key Features

- ✅ **GitOps-Native**: Git is the single source of truth
- ✅ **Auto-Sync**: Changes automatically deployed (60s reconciliation)
- ✅ **Self-Healing**: Manual cluster changes are reverted
- ✅ **Multi-Source**: Separate chart templates from configuration
- ✅ **Policy Enforcement**: Kyverno ensures security and compliance
- ✅ **Authentication**: Automatic sidecar injection (token & OIDC support)
- ✅ **TLS Everywhere**: Automatic Let's Encrypt certificates
- ✅ **Full Observability**: Prometheus, Grafana, Loki, Tempo integration

---

## 🗂️ Repository Structure

```
.
├── bootstrap.sh                      # Cluster initialization script
├── _app-of-apps.yaml                 # Root ArgoCD Application (App-of-Apps pattern)
│
├── infra/                            # Infrastructure ArgoCD Applications (Kustomize multi-cluster)
│   ├── base/                         # Base ArgoCD Application manifests (one dir per component)
│   │   ├── kustomization.yaml        # Aggregates all component subdirectories
│   │   ├── traefik-application/
│   │   │   ├── kustomization.yaml
│   │   │   └── traefik-application.yaml
│   │   ├── keycloak/
│   │   │   ├── kustomization.yaml
│   │   │   └── keycloak.yaml
│   │   ├── grafana/
│   │   ├── prometheus/
│   │   ├── ...                       # Each component in its own subdirectory
│   │   └── secrets/
│   ├── overlays/                     # Per-cluster overrides (Kustomize)
│   │   ├── upc-dev/                  # UpCloud Dev - includes all base components
│   │   ├── upc-prod/                 # UpCloud Prod - all components + patches
│   │   ├── aks-dev/                  # Azure AKS Dev - selective components only
│   │   ├── aks-prod/                 # Azure AKS Prod
│   │   ├── eks-dev/                  # AWS EKS Dev
│   │   ├── eks-prod/                 # AWS EKS Prod
│   │   ├── gke-dev/                  # GCP GKE Dev
│   │   └── gke-prod/                 # GCP GKE Prod
│   ├── dashboards/                   # Grafana dashboard ConfigMaps
│   └── values/                       # Helm value overrides
│       ├── base/                     # Shared cloud-agnostic values
│       ├── upc-dev/                  # UpCloud Dev (storage, LB, pricing)
│       ├── upc-prod/                 # UpCloud Prod
│       ├── eks-dev/                  # AWS EKS Dev
│       ├── eks-prod/                 # AWS EKS Prod
│       ├── aks-dev/                  # Azure AKS Dev
│       ├── aks-prod/                 # Azure AKS Prod
│       ├── gke-dev/                  # GCP GKE Dev
│       └── gke-prod/                 # GCP GKE Prod
│
├── apps/                             # Business Applications (Kustomize, same pattern as infra)
│   ├── base/                         # One subdirectory per app
│   │   ├── kustomization.yaml
│   │   ├── musicman/
│   │   ├── mcp10x/
│   │   ├── dot-ai-stack/
│   │   ├── ts-mcp/
│   │   └── argo-mcp/
│   └── overlays/                     # Per-cluster: cherry-pick or include all
│       ├── upc-dev/                  # All apps
│       ├── upc-prod/                 # All apps + patches
│       └── aks-dev/                  # Selective apps only
│
├── cluster-resources/                # Cluster-wide Kubernetes resources
│   ├── letsencrypt-issuer.yaml
│   ├── kyverno-config.yaml
│   ├── *-sealed.yaml                 # Sealed secrets
│   └── policies/                     # Kyverno policies
│       ├── secret-cloner.yaml
│       ├── default-ns-blocker.yaml
│       ├── bare-pod-cleaner.yaml
│       └── auth-sidecar-injector.yaml
│
├── secrets/                          # Application secrets (sealed)
│   └── *-credentials-sealed.yaml
│
├── private/                          # Local-only files (Git-ignored)
│   └── *.yaml                        # Unsealed secrets (never committed)
│
└── docs/                             # 📚 Comprehensive documentation
    ├── README.md                     # Documentation index
    ├── GITOPS-ARCHITECTURE.md        # Architecture guide
    ├── DEVELOPER-GUIDE.md            # Developer onboarding
    ├── OPERATIONS-RUNBOOK.md         # Operations procedures
    └── REFERENCE.md                  # Technical reference
```

**See [GitOps Architecture - Repository Structure](docs/GITOPS-ARCHITECTURE.md#repository-structure) for a detailed explanation.**

---

## 🏗️ Architecture

### Three-Repository Pattern

| Repository | Purpose | Who Edits | How Often |
|------------|---------|-----------|-----------|
| **[launchpad](https://git.forteapps.net/Forte/launchpad)** (this repo) | ArgoCD Applications, cluster resources | Platform / DevOps engineers | ✅ Often |
| **[forte-helm](https://git.forteapps.net/Forte/forte-helm)** | Generic Helm chart templates | Platform engineers | ❌ Rarely |
| **[helm-prod-values](ssh://git@git.forteapps.net:2222/Forte/helm-prod-values.git)** | App-specific configuration & versions | Developers / CI pipelines | ✅ Sometimes |

### GitOps Workflow

```
Developer commits code → CI/CD builds image → Updates helm-prod-values → ArgoCD syncs → Deployed to cluster
```

**Learn more**: [GitOps Architecture - GitOps Workflow](docs/GITOPS-ARCHITECTURE.md#gitops-workflow)

---

## 🔧 Common Tasks

### Deploy a New Application

**See detailed guide**: [Developer Guide - Deploying Your First Application](docs/DEVELOPER-GUIDE.md#deploying-your-first-application)

**Quick version**:

1. Create `apps/base/myapp/myapp.yaml` (ArgoCD Application manifest) with a `kustomization.yaml` next to it, matching the `apps/base/` layout above
2. Create `helm-prod-values/myapp/values.yaml` (configuration)
3. Create sealed secrets if needed
4. Commit and push - ArgoCD auto-syncs!
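
The repository structure above places each app in its own subdirectory under `apps/base/`, so step 1 also involves some Kustomize wiring. A minimal sketch of the three files involved is shown below, using the hypothetical app name `myapp`. The existing entries in `apps/base/kustomization.yaml` are assumed from the directory listing, and whether a given overlay includes all apps or cherry-picks them depends on the cluster.

```yaml
# File 1: apps/base/myapp/kustomization.yaml (hypothetical new app directory)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - myapp.yaml               # the ArgoCD Application manifest for the app
---
# File 2: apps/base/kustomization.yaml (aggregate; existing entries are assumed)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - musicman
  - mcp10x
  - dot-ai-stack
  - ts-mcp
  - argo-mcp
  - myapp                    # new entry
---
# File 3: apps/overlays/aks-dev/kustomization.yaml (cherry-picking overlay)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base/myapp         # include only the apps this cluster should run
```

These are three separate files shown in one block for brevity. Overlays that include everything reference `../../base` instead and pick up the new app through the aggregate file.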

### Update an Existing Application

**See detailed guide**: [Developer Guide - Updating an Existing Application](docs/DEVELOPER-GUIDE.md#updating-an-existing-application)

**Quick version**:

- **Update code**: Push to app repo → CI/CD updates image tag in helm-prod-values
- **Update config**: Edit `helm-prod-values/myapp/values.yaml` → commit → push

### Manage Secrets

**See detailed guide**: [Developer Guide - Working with Secrets](docs/DEVELOPER-GUIDE.md#working-with-secrets)

```bash
# Create plain secret (written to the Git-ignored private/ directory)
kubectl create secret generic myapp-creds \
  --from-literal=KEY=value \
  --dry-run=client -o yaml > private/myapp-creds.yaml

# Seal it
kubeseal --format=yaml --cert=pub-cert.pem \
  < private/myapp-creds.yaml > secrets/myapp-creds-sealed.yaml

# Commit sealed version
git add secrets/myapp-creds-sealed.yaml
git commit -m "Add myapp credentials"
git push
```

### Enable Authentication

**See detailed guide**: [Developer Guide - Enabling Authentication](docs/DEVELOPER-GUIDE.md#enabling-authentication-for-applications)

**Quick version**:

```yaml
# In helm-prod-values/myapp/values.yaml
# Use exactly one of the two `auth:` blocks below.

# Option 1: token-based auth (simple)
auth:
  enabled: true
  type: token
  tokens:
    - your-secret-token-here

# Option 2: OIDC auth (SSO)
auth:
  enabled: true
  type: oidc
  oidc:
    authority: https://auth.example.com/realms/master
    clientId: myapp
```

Then create the OIDC secret (if using OIDC):

```bash
# --dry-run=client -o yaml prints the Secret manifest for kubeseal
# instead of creating the plain secret in the cluster
kubectl create secret generic auth-oidc \
  --from-literal=client-secret=your-oidc-secret \
  --from-literal=cookie-secret=$(openssl rand -hex 32) \
  --namespace=myapp \
  --dry-run=client -o yaml | \
  kubeseal --format=yaml --cert=pub-cert.pem --namespace=myapp | \
  kubectl apply -f -
```

### Bootstrap Cluster

**See detailed guide**: [Operations Runbook - Cluster Bootstrap](docs/OPERATIONS-RUNBOOK.md#cluster-bootstrap)

```bash
# Initialize new cluster
./bootstrap.sh

# Verify
kubectl get applications -n argocd
kubectl get pods --all-namespaces
```

---

## 🛠️ Quick Reference

### Monitor Applications

```bash
# List all ArgoCD applications
kubectl get applications -n argocd

# Watch sync status
kubectl get applications -n argocd -w

# Check specific application
kubectl describe application myapp -n argocd

# View application logs
kubectl logs -n myapp <pod-name>
```

### Access UIs

```bash
# ArgoCD UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Access: https://localhost:8080 (no auth required)

# Grafana
kubectl port-forward -n monitoring svc/grafana 3000:80
# Access: http://localhost:3000

# Prometheus
kubectl port-forward -n monitoring svc/prometheus-server 9090:80
# Access: http://localhost:9090
```

### Troubleshooting

```bash
# Check pod status
kubectl get pods -n myapp

# View pod logs
kubectl logs -n myapp <pod-name>

# Check pod events
kubectl describe pod -n myapp <pod-name>

# Check ArgoCD sync errors
kubectl describe application myapp -n argocd

# Force a hard refresh (re-reads Git; auto-sync then applies any detected drift)
kubectl patch application myapp -n argocd \
  --type merge -p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'
```

**Full troubleshooting guide**: [Developer Guide - Troubleshooting](docs/DEVELOPER-GUIDE.md#troubleshooting)

---

## 🔐 Security

### Secret Management

- ✅ Sealed Secrets for Git storage
- ✅ Kyverno auto-clones secrets to namespaces
- ❌ Never commit plain secrets

### Network Security

- ✅ All traffic TLS-encrypted (Let's Encrypt)
- ✅ HTTP → HTTPS redirect
- ✅ Traefik IngressRoute per application

### Policy Enforcement

- ✅ Kyverno policies for security
- ✅ Default namespace blocked
- ✅ Bare pods not allowed
- ✅ Optional authentication sidecar injection

**Learn more**: [GitOps Architecture - Security Model](docs/GITOPS-ARCHITECTURE.md#security-model)

---

## 📊 Infrastructure Components

| Component | Purpose | Namespace | Replicas |
|-----------|---------|-----------|----------|
| **ArgoCD** | GitOps controller | `argocd` | 1 |
| **Traefik** | Ingress controller | `traefik` | 2 |
| **Cert-Manager** | TLS certificates | `cert-manager` | 1 |
| **Kyverno** | Policy engine | `kyverno` | 1 |
| **Sealed Secrets** | Secret encryption | `kube-system` | 1 |
| **Prometheus** | Metrics | `monitoring` | 1 |
| **Grafana** | Dashboards | `monitoring` | 1 |
| **Loki** | Logs | `monitoring` | 1 |
| **Tempo** | Distributed tracing | `monitoring` | 1 |
| **Fluent-Bit** | Log shipping | `monitoring` | DaemonSet |
| **OpenCost** | Cost monitoring | `monitoring` | 1 |
| **Renovate** | Dependency updates | `renovate` | CronJob |

**Full specs**: [Technical Reference - Infrastructure Components](docs/REFERENCE.md#infrastructure-components)

---

## 🌐 Domains & Networking

- **Local development**: `*.127.0.0.1.nip.io`
- **Production**: `*.forteapps.net`
- **DNS**: Manual configuration (contact platform team)
- **TLS**: Automatic via Let's Encrypt

---

## 📖 Key Concepts

### App-of-Apps Pattern

`_app-of-apps-{cluster}.yaml` is the root Application that manages all other Applications in `infra/`.

Each component in `infra/base/` lives in its own subdirectory (e.g., `infra/base/grafana/`). Overlays can either include **all** components (via `../../base`) or **cherry-pick** specific ones (via `../../base/grafana`, `../../base/prometheus`, etc.). Per-cluster patches swap Helm value file paths.

Supported clusters: `upc-dev`, `upc-prod`, `eks-dev`, `eks-prod`, `aks-dev`, `aks-prod`, `gke-dev`, `gke-prod`.

### Multi-Source Pattern

Applications reference both:

1. **Helm charts** from `forte-helm` (templates)
2. **Values** from `helm-prod-values` (configuration)

This separates reusable templates from environment-specific config.
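
A multi-source Application along these lines could look like the sketch below, which uses ArgoCD's `sources` list with a `ref` so that the chart source can load a values file from the second repository. The app name `myapp`, the chart path `charts/generic-app`, the `project`, and both `targetRevision` values are hypothetical; only the repository URLs and the `myapp/values.yaml` location are taken from this README.

```yaml
# Sketch only: Helm chart from forte-helm, values from helm-prod-values.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp                          # hypothetical app name
  namespace: argocd
spec:
  project: default                     # assumed project
  sources:
    # Source 1: the reusable chart templates
    - repoURL: https://git.forteapps.net/Forte/forte-helm
      targetRevision: main             # assumed branch
      path: charts/generic-app         # hypothetical chart path
      helm:
        valueFiles:
          - $values/myapp/values.yaml  # resolved against the source with ref "values"
    # Source 2: the values repository, exposed to source 1 as $values
    - repoURL: ssh://git@git.forteapps.net:2222/Forte/helm-prod-values.git
      targetRevision: main             # assumed branch
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

With this shape, a CI pipeline only needs to change `myapp/values.yaml` in `helm-prod-values` to roll out a new image tag; the chart and the Application manifest stay untouched.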

### Sync Waves

Applications deploy in order using `argocd.argoproj.io/sync-wave`:

- Wave `-1`: Namespaces
- Wave `0`: Kyverno (policies)
- Wave `1`: Infrastructure
- Wave `2+`: Applications

### Auto-Sync & Self-Heal

- **Auto-Sync**: ArgoCD automatically deploys Git changes (60s polling)
- **Self-Heal**: Manual cluster changes are reverted to match Git
- **Prune**: Deleted resources in Git are removed from cluster

**Learn more**: [GitOps Architecture - GitOps Workflow](docs/GITOPS-ARCHITECTURE.md#gitops-workflow)

---

## ⚙️ Configuration

### ArgoCD Settings

- **Reconciliation**: Every 60 seconds
- **Sync timeout**: 5 minutes per application
- **Retry policy**: 5 attempts with exponential backoff
- **Authentication**: Disabled (internal use only)

### Application Defaults

- **Auto-sync**: Enabled
- **Self-heal**: Enabled
- **Prune**: Enabled
- **Validation**: Server-side validation enabled
- **Server-side apply**: Enabled

**Full configuration**: [Technical Reference - ArgoCD Configuration](docs/REFERENCE.md#argocd-configuration)

---

## 🆘 Getting Help

### Documentation

1. **Start here**: [Documentation Index](docs/README.md)
2. **For development**: [Developer Guide](docs/DEVELOPER-GUIDE.md)
3. **For operations**: [Operations Runbook](docs/OPERATIONS-RUNBOOK.md)
4. **For reference**: [Technical Reference](docs/REFERENCE.md)

### Support

- **Slack**: #platform-support
- **Issues**: Contact platform team
- **Emergencies**: Escalate via Slack

### Common Questions

| Question | Answer |
|----------|--------|
| How do I deploy an app? | [Developer Guide - Deploying Your First Application](docs/DEVELOPER-GUIDE.md#deploying-your-first-application) |
| How do I manage secrets? | [Developer Guide - Working with Secrets](docs/DEVELOPER-GUIDE.md#working-with-secrets) |
| App won't sync? | [Developer Guide - Troubleshooting](docs/DEVELOPER-GUIDE.md#troubleshooting) |
| How do I bootstrap a cluster? | [Operations Runbook - Cluster Bootstrap](docs/OPERATIONS-RUNBOOK.md#cluster-bootstrap) |
| Where are the logs? | [Operations Runbook - Monitoring & Alerting](docs/OPERATIONS-RUNBOOK.md#monitoring--alerting) |

---

## 🤝 Contributing

### Adding a New Application

1. Read [Developer Guide - Deploying Your First Application](docs/DEVELOPER-GUIDE.md#deploying-your-first-application)
2. Create ArgoCD Application manifest in `apps/`
3. Create Helm values in `helm-prod-values/`
4. Create sealed secrets if needed
5. Commit and push - ArgoCD handles the rest!

### Modifying Infrastructure

1. Read [Operations Runbook](docs/OPERATIONS-RUNBOOK.md)
2. Update relevant files in `infra/` or `cluster-resources/`
3. Test changes in isolated namespace if possible
4. Commit and push
5. Monitor sync status in Slack/ArgoCD UI

### Updating Documentation

Documentation lives in `docs/`. To update:

1. Edit relevant markdown file
2. Update "Last Updated" date
3. Submit PR or push directly
4. Notify team of significant changes

---

## 📝 Notes

### Current Environment

- **Provider**: Multi-cloud (UpCloud, AWS EKS, Azure AKS, GCP GKE)
- **Active clusters**: UpCloud (upc-dev, upc-prod)
- **Environment**: Production (internal use only)
- **Auth**: Disabled for ArgoCD (internal access)
- **Backup**: Gitea daily backup to S3-compatible storage

### Known Limitations

- Secret rotation not automated
- DNS management is manual

**Future improvements**: See [Operations Runbook - Disaster Recovery](docs/OPERATIONS-RUNBOOK.md#disaster-recovery)

---

## 📚 Additional Resources

### External Documentation

- [ArgoCD Documentation](https://argo-cd.readthedocs.io/)
- [Kyverno Documentation](https://kyverno.io/docs/)
- [Traefik Documentation](https://doc.traefik.io/traefik/)
- [Cert-Manager Documentation](https://cert-manager.io/docs/)
- [Grafana Tempo Documentation](https://grafana.com/docs/tempo/)
- [Sealed Secrets](https://github.com/bitnami-labs/sealed-secrets)

### Related Repositories

- [forte-helm](https://git.forteapps.net/Forte/forte-helm) - Helm chart templates
- [helm-prod-values](git@github.com:fortedigital/helm-prod-values.git) - Application values

---

## 📄 License

Internal use only. Not for public distribution.

---

## 👥 Maintainers

**Platform Team**

- Contact: #platform-support on Slack
- Issues: Create issue in repository or contact team directly

---

**Last Updated**: 2026-04-22

**Documentation Version**: 1.0.0

**🚀 Ready to get started? Check out the [Documentation Index](docs/README.md)!**