diff --git a/ARGOCD_COMPREHENSIVE_ANALYSIS.md b/ARGOCD_COMPREHENSIVE_ANALYSIS.md
index d2ad989..ddeaef3 100644
--- a/ARGOCD_COMPREHENSIVE_ANALYSIS.md
+++ b/ARGOCD_COMPREHENSIVE_ANALYSIS.md
@@ -20,13 +20,7 @@ Analyzed 11 ArgoCD Application manifests in `/argocd/apps/`. This report details
 - Unpredictable application behavior
 - **Fix:** Pin to specific git tags or commit SHAs
 
-### 3. Placeholder URLs (HIGH)
-**Files:** fluent-bit.yaml, grafana.yaml
-- Second source still has `https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git`
-- Applications fail to deploy
-- **Fix:** Update to actual repository URL
-
-### 4. Undersized Resources (HIGH)
+### 3. Undersized Resources (HIGH)
 **Files:** cert-manager, loki, prometheus, trivy
 - cert-manager: 100m CPU limit (too tight for control plane)
 - loki: 200m CPU, 512Mi memory (drops logs under load)
@@ -34,7 +28,7 @@ Analyzed 11 ArgoCD Application manifests in `/argocd/apps/`. This report details
 - **Impact:** Performance degradation, OOM kills, dropped logs
 - **Fix:** Increase resource limits across all monitoring stack
 
-### 5. No Data Persistence (HIGH)
+### 4. No Data Persistence (HIGH)
 **Files:** loki.yaml (filesystem storage), prometheus.yaml
 - Loki using filesystem storage (ephemeral, lost on restart)
 - Prometheus likely ephemeral (no PVC visible)
diff --git a/README.md b/README.md
index 987e0c4..a8eab3e 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,6 @@
-# README.md
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 
 ## Overview
 
@@ -11,23 +13,25 @@ This is a **Kubernetes cluster bootstrapping and GitOps configuration repository
 ├── bootstrap.sh # Main bootstrap script to initialize ArgoCD and cluster
 ├── argocd/ # ArgoCD configuration (primary entrypoint)
 │   ├── _app-of-apps.yaml # App-of-apps pattern: parent Application that manages all child apps
-│   ├── apps/ # Individual ArgoCD Application resources
-│   │   ├── traefik-application.yaml # Ingress controller (Traefik)
-│   │   ├── cert-manager-application.yaml # TLS certificate management
-│   │   ├── kyverno.yaml # Policy engine for security
-│   │   ├── prometheus.yaml # Metrics & monitoring
-│   │   ├── grafana.yaml # Monitoring visualization
-│   │   ├── loki.yaml # Log aggregation
-│   │   ├── fluent-bit.yaml # Log shipping
-│   │   ├── trivy.yaml # Container scanning
-│   │   ├── sealedsecrets.yaml # Secret encryption
-│   │   └── cluster-resources-application.yaml # Cluster-wide resources
+│   ├── infra/ # Individual ArgoCD Application resources for infrastructure
+│   │   ├── traefik-application.yaml # Ingress controller (Traefik)
+│   │   ├── cert-manager-application.yaml # TLS certificate management
+│   │   ├── kyverno.yaml # Policy engine for security
+│   │   ├── prometheus.yaml # Metrics & monitoring
+│   │   ├── grafana.yaml # Monitoring visualization
+│   │   ├── loki.yaml # Log aggregation
+│   │   ├── fluent-bit.yaml # Log shipping
+│   │   ├── trivy.yaml # Container scanning
+│   │   ├── sealedsecrets.yaml # Secret encryption
+│   │   └── cluster-resources-application.yaml # Cluster-wide resources
+│   ├── apps/ # Application resources (currently unused/empty)
 │   └── values/ # Helm value overrides for ArgoCD and services
-│       ├── argocd-values.yaml # ArgoCD server configuration
+│       ├── argocd-values.yaml # ArgoCD server configuration
 │       ├── prometheus-values.yaml
 │       ├── grafana-values.yaml
-│       └── loki-values.yaml
-└── cluster-resources/ # Cluster-level configurations
+│       ├── loki-values.yaml
+│       └── fluent-bit-values.yaml
+└── cluster-resources/ # Cluster-level configurations managed by cluster-resources-application.yaml
     ├── cert-manager-namespace.yaml
     ├── letsencrypt-issuer.yaml # TLS certificate issuer
     └── kyverno-config.yaml # Security policies and secret syncing
@@ -37,7 +41,7 @@ This is a **Kubernetes cluster bootstrapping and GitOps configuration repository
 
 ### GitOps Model
 - **App-of-Apps Pattern**: `argocd/_app-of-apps.yaml` is the root Application that manages all child applications
-- **Source of Truth**: GitHub repository (`https://github.com/snothub/scaling-parakeet.git`) is the single source of truth
+- **Source of Truth**: GitHub repository (`https://github.com/fortedigital/sturdy-adventure.git`) is the single source of truth
 - **Auto-sync**: All Applications have automated sync enabled with auto-pruning and self-healing
 - **Namespace Creation**: `CreateNamespace=true` allows ArgoCD to create namespaces as needed
 
@@ -102,25 +106,38 @@ kubectl get secrets -n
 ```
 
 ### Deploy Changes
-- Changes to YAML files in `argocd/` or `cluster-resources/` are automatically synced by ArgoCD
+- Changes to YAML files in `argocd/infra/`, `argocd/values/`, or `cluster-resources/` are automatically synced by ArgoCD
 - Push changes to the GitHub repository for them to be reflected
 - ArgoCD reconciliation happens every 60s (`timeout.reconciliation: 60s`)
+- Each application has a 5-minute sync timeout to prevent stalled deployments
 
 ### Review Helm Values
-Application-specific Helm value overrides are in `argocd/values/` and referenced within each Application's `values` field.
+Application-specific Helm value overrides are in `argocd/values/` and referenced within each Application's Helm configuration. Each application manifest uses both external value files and inline overrides where needed.
+
+### Application Organization & Sync Ordering
+- Infrastructure applications use `argocd.argoproj.io/sync-wave` annotations for ordered deployment
+- Kyverno (sync-wave: 0) deploys before cluster-resources (sync-wave: 1) to ensure policies are ready
+- All applications have resource requests and limits configured to prevent resource starvation
+- Applications are labeled with `app.kubernetes.io/part-of` to indicate their component type (platform, monitoring-stack, application)
 
 ## Important Notes
 
 - **No admin auth in development**: ArgoCD has `admin.enabled: "false"` - suitable for local/dev only
 - **Insecure server mode**: `--insecure` and `--disable-auth` flags are set - not for production
+- **Folder organization**:
+  - `argocd/infra/` contains infrastructure/platform components (Traefik, Cert-Manager, Prometheus, Grafana, Loki, etc.)
+  - `argocd/apps/` is reserved for business applications (currently empty)
 - **Replica counts**: Traefik runs 2 replicas; other services run 1 replica
-- **Retry policy**: All applications retry up to 5 times with exponential backoff (max 3m)
+- **Retry policy**: All applications retry up to 5 times with exponential backoff (max 3m timeout per application)
 - **Ignore replica scaling**: Deployments ignore replica count differences to allow HPA/manual scaling
+- **Sync validation**: All applications validate manifests before applying (`Validate=true`)
+- **Server-side apply**: All applications use `ServerSideApply=true` for safer field ownership tracking
 
 ## Development Tips
 
 - **Check ArgoCD logs**: `kubectl logs -n argocd deployment/argocd-application-controller`
 - **Validate YAML**: Files are validated server-side (`Validate=true`) before applying
 - **Resource tracking**: Uses annotation-based method (`application.resourceTrackingMethod: annotation`)
-- **Modify applications**: Edit the corresponding YAML in `argocd/apps/` and push to trigger sync
-- **Add new services**: Create a new Application YAML following the pattern of existing ones, then reference it from `_app-of-apps.yaml`
+- **Modify applications**: Edit the corresponding YAML in `argocd/infra/` and push to trigger sync
+- **Add new services**: Create a new Application YAML in `argocd/infra/` following the pattern of existing ones, then it will be auto-discovered by the app-of-apps
+- **Application folder naming**: Infrastructure components are in `argocd/infra/`; `argocd/apps/` is reserved for business applications (currently empty)
diff --git a/argocd/_app-of-apps.yaml b/argocd/_app-of-apps.yaml
index 402bbd8..c37f427 100644
--- a/argocd/_app-of-apps.yaml
+++ b/argocd/_app-of-apps.yaml
@@ -1,22 +1,25 @@
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
-  name: musicman-app-of-apps
+  name: app-of-apps
   namespace: argocd
   labels:
-    scope: music-man
+    scope: infra
 spec:
   project: default
   source:
-    repoURL: https://github.com/snothub/scaling-parakeet.git
+    repoURL: https://github.com/fortedigital/sturdy-adventure.git
     targetRevision: HEAD
     path: argocd
   destination:
     server: https://kubernetes.default.svc
-    namespace: music-man
+    namespace: argocd
   syncPolicy:
     automated:
       prune: true
       selfHeal: true
     syncOptions:
       - CreateNamespace=true
+      - Validate=true
+      - ServerSideApply=true
+    timeout: 300s
diff --git a/argocd/apps/cert-manager-application.yaml b/argocd/infra/cert-manager-application.yaml
similarity index 100%
rename from argocd/apps/cert-manager-application.yaml
rename to argocd/infra/cert-manager-application.yaml
diff --git a/argocd/apps/cluster-resources-application.yaml b/argocd/infra/cluster-resources-application.yaml
similarity index 93%
rename from argocd/apps/cluster-resources-application.yaml
rename to argocd/infra/cluster-resources-application.yaml
index c38542e..b22a5c5 100644
--- a/argocd/apps/cluster-resources-application.yaml
+++ b/argocd/infra/cluster-resources-application.yaml
@@ -15,7 +15,7 @@ spec:
   project: default
 
   source:
-    repoURL: https://github.com/snothub/scaling-parakeet.git
+    repoURL: https://github.com/fortedigital/sturdy-adventure.git
     targetRevision: HEAD
     path: cluster-resources
 
diff --git a/argocd/apps/fluent-bit.yaml b/argocd/infra/fluent-bit.yaml
similarity index 91%
rename from argocd/apps/fluent-bit.yaml
rename to argocd/infra/fluent-bit.yaml
index 091ebb0..c484005 100644
--- a/argocd/apps/fluent-bit.yaml
+++ b/argocd/infra/fluent-bit.yaml
@@ -21,8 +21,8 @@ spec:
         valueFiles:
           - $values/argocd/values/fluent-bit-values.yaml
 
-    - repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git
-      targetRevision: main
+    - repoURL: https://github.com/fortedigital/sturdy-adventure.git
+      targetRevision: HEAD
       ref: values
 
   destination:
diff --git a/argocd/apps/grafana.yaml b/argocd/infra/grafana.yaml
similarity index 91%
rename from argocd/apps/grafana.yaml
rename to argocd/infra/grafana.yaml
index 0971fca..9c4fbf1 100644
--- a/argocd/apps/grafana.yaml
+++ b/argocd/infra/grafana.yaml
@@ -21,8 +21,8 @@ spec:
         valueFiles:
           - $values/argocd/values/grafana-values.yaml
 
-    - repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git
-      targetRevision: main
+    - repoURL: https://github.com/fortedigital/sturdy-adventure.git
+      targetRevision: HEAD
       ref: values
 
   destination:
diff --git a/argocd/apps/kyverno.yaml b/argocd/infra/kyverno.yaml
similarity index 100%
rename from argocd/apps/kyverno.yaml
rename to argocd/infra/kyverno.yaml
diff --git a/argocd/apps/loki.yaml b/argocd/infra/loki.yaml
similarity index 91%
rename from argocd/apps/loki.yaml
rename to argocd/infra/loki.yaml
index e41dd76..7584ca8 100644
--- a/argocd/apps/loki.yaml
+++ b/argocd/infra/loki.yaml
@@ -21,8 +21,8 @@ spec:
         valueFiles:
           - $values/argocd/values/loki-values.yaml
 
-    - repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git
-      targetRevision: main
+    - repoURL: https://github.com/fortedigital/sturdy-adventure.git
+      targetRevision: HEAD
       ref: values
 
   destination:
diff --git a/argocd/apps/prometheus.yaml b/argocd/infra/prometheus.yaml
similarity index 91%
rename from argocd/apps/prometheus.yaml
rename to argocd/infra/prometheus.yaml
index fa5f83d..557434e 100644
--- a/argocd/apps/prometheus.yaml
+++ b/argocd/infra/prometheus.yaml
@@ -21,8 +21,8 @@ spec:
         valueFiles:
           - $values/argocd/values/prometheus-values.yaml
 
-    - repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git
-      targetRevision: main
+    - repoURL: https://github.com/fortedigital/sturdy-adventure.git
+      targetRevision: HEAD
       ref: values
 
   destination:
diff --git a/argocd/apps/sealedsecrets.yaml b/argocd/infra/sealedsecrets.yaml
similarity index 100%
rename from argocd/apps/sealedsecrets.yaml
rename to argocd/infra/sealedsecrets.yaml
diff --git a/argocd/apps/traefik-application.yaml b/argocd/infra/traefik-application.yaml
similarity index 100%
rename from argocd/apps/traefik-application.yaml
rename to argocd/infra/traefik-application.yaml
diff --git a/argocd/apps/trivy.yaml b/argocd/infra/trivy.yaml
similarity index 100%
rename from argocd/apps/trivy.yaml
rename to argocd/infra/trivy.yaml
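
Review note on discovery of the moved manifests: the README hunk says a new Application YAML placed in `argocd/infra/` is auto-discovered by the app-of-apps, while `argocd/_app-of-apps.yaml` in this diff keeps `path: argocd` and shows no `directory` block. A directory-type Argo CD source reads only the top level of its path unless `directory.recurse` is enabled, so the renamed files may not be picked up as written. A minimal sketch of one possible follow-up, assuming the parent should manage only the infrastructure Applications, is to point the source at the new folder. The `path` value below is an assumption and is not part of this PR:

```yaml
# Hypothetical follow-up to argocd/_app-of-apps.yaml (not part of this diff).
# Only the source block is shown; the rest of the manifest stays as in the PR.
spec:
  project: default
  source:
    repoURL: https://github.com/fortedigital/sturdy-adventure.git
    targetRevision: HEAD
    # Assumption: narrow the path to the folder that now holds the child Applications.
    path: argocd/infra
```

An alternative would be to keep `path: argocd` and add `directory.recurse: true`, but that would likely also pull in `argocd/values/*.yaml`, which are Helm value files rather than Kubernetes manifests, so it would need an include or exclude filter.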