Danijel Simeunovic
2026-02-08 10:42:10 +01:00
parent a42e94672e
commit bec3b6310a
13 changed files with 56 additions and 42 deletions

@@ -20,13 +20,7 @@ Analyzed 11 ArgoCD Application manifests in `/argocd/apps/`. This report details
- Unpredictable application behavior - Unpredictable application behavior
- **Fix:** Pin to specific git tags or commit SHAs - **Fix:** Pin to specific git tags or commit SHAs
### 3. Placeholder URLs (HIGH) ### 3. Undersized Resources (HIGH)
**Files:** fluent-bit.yaml, grafana.yaml
- Second source still has `https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git`
- Applications fail to deploy
- **Fix:** Update to actual repository URL
### 4. Undersized Resources (HIGH)
**Files:** cert-manager, loki, prometheus, trivy **Files:** cert-manager, loki, prometheus, trivy
- cert-manager: 100m CPU limit (too tight for control plane) - cert-manager: 100m CPU limit (too tight for control plane)
- loki: 200m CPU, 512Mi memory (drops logs under load) - loki: 200m CPU, 512Mi memory (drops logs under load)
@@ -34,7 +28,7 @@ Analyzed 11 ArgoCD Application manifests in `/argocd/apps/`. This report details
- **Impact:** Performance degradation, OOM kills, dropped logs - **Impact:** Performance degradation, OOM kills, dropped logs
- **Fix:** Increase resource limits across all monitoring stack - **Fix:** Increase resource limits across all monitoring stack
### 5. No Data Persistence (HIGH) ### 4. No Data Persistence (HIGH)
**Files:** loki.yaml (filesystem storage), prometheus.yaml **Files:** loki.yaml (filesystem storage), prometheus.yaml
- Loki using filesystem storage (ephemeral, lost on restart) - Loki using filesystem storage (ephemeral, lost on restart)
- Prometheus likely ephemeral (no PVC visible) - Prometheus likely ephemeral (no PVC visible)

@@ -1,4 +1,6 @@
# README.md # CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview ## Overview
@@ -11,7 +13,7 @@ This is a **Kubernetes cluster bootstrapping and GitOps configuration repository
├── bootstrap.sh # Main bootstrap script to initialize ArgoCD and cluster ├── bootstrap.sh # Main bootstrap script to initialize ArgoCD and cluster
├── argocd/ # ArgoCD configuration (primary entrypoint) ├── argocd/ # ArgoCD configuration (primary entrypoint)
│ ├── _app-of-apps.yaml # App-of-apps pattern: parent Application that manages all child apps │ ├── _app-of-apps.yaml # App-of-apps pattern: parent Application that manages all child apps
│ ├── apps/ # Individual ArgoCD Application resources │ ├── infra/ # Individual ArgoCD Application resources for infrastructure
│ │ ├── traefik-application.yaml # Ingress controller (Traefik) │ │ ├── traefik-application.yaml # Ingress controller (Traefik)
│ │ ├── cert-manager-application.yaml # TLS certificate management │ │ ├── cert-manager-application.yaml # TLS certificate management
│ │ ├── kyverno.yaml # Policy engine for security │ │ ├── kyverno.yaml # Policy engine for security
@@ -22,12 +24,14 @@ This is a **Kubernetes cluster bootstrapping and GitOps configuration repository
│ │ ├── trivy.yaml # Container scanning │ │ ├── trivy.yaml # Container scanning
│ │ ├── sealedsecrets.yaml # Secret encryption │ │ ├── sealedsecrets.yaml # Secret encryption
│ │ └── cluster-resources-application.yaml # Cluster-wide resources │ │ └── cluster-resources-application.yaml # Cluster-wide resources
│ ├── apps/ # Application resources (currently unused/empty)
│ └── values/ # Helm value overrides for ArgoCD and services │ └── values/ # Helm value overrides for ArgoCD and services
│ ├── argocd-values.yaml # ArgoCD server configuration │ ├── argocd-values.yaml # ArgoCD server configuration
│ ├── prometheus-values.yaml │ ├── prometheus-values.yaml
│ ├── grafana-values.yaml │ ├── grafana-values.yaml
│ └── loki-values.yaml │ ├── loki-values.yaml
└── cluster-resources/ # Cluster-level configurations │ └── fluent-bit-values.yaml
└── cluster-resources/ # Cluster-level configurations managed by cluster-resources-application.yaml
├── cert-manager-namespace.yaml ├── cert-manager-namespace.yaml
├── letsencrypt-issuer.yaml # TLS certificate issuer ├── letsencrypt-issuer.yaml # TLS certificate issuer
└── kyverno-config.yaml # Security policies and secret syncing └── kyverno-config.yaml # Security policies and secret syncing
@@ -37,7 +41,7 @@ This is a **Kubernetes cluster bootstrapping and GitOps configuration repository
### GitOps Model ### GitOps Model
- **App-of-Apps Pattern**: `argocd/_app-of-apps.yaml` is the root Application that manages all child applications - **App-of-Apps Pattern**: `argocd/_app-of-apps.yaml` is the root Application that manages all child applications
- **Source of Truth**: GitHub repository (`https://github.com/snothub/scaling-parakeet.git`) is the single source of truth - **Source of Truth**: GitHub repository (`https://github.com/fortedigital/sturdy-adventure.git`) is the single source of truth
- **Auto-sync**: All Applications have automated sync enabled with auto-pruning and self-healing - **Auto-sync**: All Applications have automated sync enabled with auto-pruning and self-healing
- **Namespace Creation**: `CreateNamespace=true` allows ArgoCD to create namespaces as needed - **Namespace Creation**: `CreateNamespace=true` allows ArgoCD to create namespaces as needed
@@ -102,25 +106,38 @@ kubectl get secrets -n <namespace>
``` ```
### Deploy Changes ### Deploy Changes
- Changes to YAML files in `argocd/` or `cluster-resources/` are automatically synced by ArgoCD - Changes to YAML files in `argocd/infra/`, `argocd/values/`, or `cluster-resources/` are automatically synced by ArgoCD
- Push changes to the GitHub repository for them to be reflected - Push changes to the GitHub repository for them to be reflected
- ArgoCD reconciliation happens every 60s (`timeout.reconciliation: 60s`) - ArgoCD reconciliation happens every 60s (`timeout.reconciliation: 60s`)
- Each application has a 5-minute sync timeout to prevent stalled deployments
### Review Helm Values ### Review Helm Values
Application-specific Helm value overrides are in `argocd/values/` and referenced within each Application's `values` field. Application-specific Helm value overrides are in `argocd/values/` and referenced within each Application's Helm configuration. Each application manifest uses both external value files and inline overrides where needed.
### Application Organization & Sync Ordering
- Infrastructure applications use `argocd.argoproj.io/sync-wave` annotations for ordered deployment
- Kyverno (sync-wave: 0) deploys before cluster-resources (sync-wave: 1) to ensure policies are ready
- All applications have resource requests and limits configured to prevent resource starvation
- Applications are labeled with `app.kubernetes.io/part-of` to indicate their component type (platform, monitoring-stack, application)
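As a rough illustration of the ordering and labeling described above, two Application manifests might carry metadata like the following. This is a sketch under assumptions, not the repository's actual manifests: the names and wave numbers follow the bullets above, the `platform` label value is assumed for both, and the `spec` of each manifest is omitted.

```yaml
# Sketch only: metadata fragments, not complete Application manifests.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kyverno
  namespace: argocd
  annotations:
    # Annotation values are strings, so the wave number is quoted.
    argocd.argoproj.io/sync-wave: "0"    # lower waves are synced first
  labels:
    app.kubernetes.io/part-of: platform  # assumed label value
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster-resources
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"    # synced after wave 0 is healthy
  labels:
    app.kubernetes.io/part-of: platform  # assumed label value
```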
## Important Notes ## Important Notes
- **No admin auth in development**: ArgoCD has `admin.enabled: "false"` - suitable for local/dev only - **No admin auth in development**: ArgoCD has `admin.enabled: "false"` - suitable for local/dev only
- **Insecure server mode**: `--insecure` and `--disable-auth` flags are set - not for production - **Insecure server mode**: `--insecure` and `--disable-auth` flags are set - not for production
- **Folder organization**:
- `argocd/infra/` contains infrastructure/platform components (Traefik, Cert-Manager, Prometheus, Grafana, Loki, etc.)
- `argocd/apps/` is reserved for business applications (currently empty)
- **Replica counts**: Traefik runs 2 replicas; other services run 1 replica - **Replica counts**: Traefik runs 2 replicas; other services run 1 replica
- **Retry policy**: All applications retry up to 5 times with exponential backoff (max 3m) - **Retry policy**: All applications retry up to 5 times with exponential backoff (max 3m timeout per application)
- **Ignore replica scaling**: Deployments ignore replica count differences to allow HPA/manual scaling - **Ignore replica scaling**: Deployments ignore replica count differences to allow HPA/manual scaling
- **Sync validation**: All applications validate manifests before applying (`Validate=true`)
- **Server-side apply**: All applications use `ServerSideApply=true` for safer field ownership tracking
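The retry, replica-scaling, and sync-option notes above could map onto an Application `spec` roughly as follows. This is a hedged sketch: the retry limit, the 3m maximum, and the sync options come from the notes, while the initial backoff `duration` and `factor` are assumed values chosen only for illustration.

```yaml
# Sketch only: the parts of an Application spec relevant to the notes above.
spec:
  ignoreDifferences:
    # Ignore replica count drift so HPA or manual scaling is not reverted.
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - Validate=true
      - ServerSideApply=true
    retry:
      limit: 5             # retry up to 5 times
      backoff:
        duration: 5s       # assumed initial delay
        factor: 2          # assumed multiplier for exponential backoff
        maxDuration: 3m    # cap on the delay between retries
```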
## Development Tips ## Development Tips
- **Check ArgoCD logs**: `kubectl logs -n argocd deployment/argocd-application-controller` - **Check ArgoCD logs**: `kubectl logs -n argocd deployment/argocd-application-controller`
- **Validate YAML**: Files are validated server-side (`Validate=true`) before applying - **Validate YAML**: Files are validated server-side (`Validate=true`) before applying
- **Resource tracking**: Uses annotation-based method (`application.resourceTrackingMethod: annotation`) - **Resource tracking**: Uses annotation-based method (`application.resourceTrackingMethod: annotation`)
- **Modify applications**: Edit the corresponding YAML in `argocd/apps/` and push to trigger sync - **Modify applications**: Edit the corresponding YAML in `argocd/infra/` and push to trigger sync
- **Add new services**: Create a new Application YAML following the pattern of existing ones, then reference it from `_app-of-apps.yaml` - **Add new services**: Create a new Application YAML in `argocd/infra/` following the pattern of existing ones, then it will be auto-discovered by the app-of-apps
- **Application folder naming**: Infrastructure components are in `argocd/infra/`; `argocd/apps/` is reserved for business applications (currently empty)

@@ -1,22 +1,25 @@
apiVersion: argoproj.io/v1alpha1 apiVersion: argoproj.io/v1alpha1
kind: Application kind: Application
metadata: metadata:
name: musicman-app-of-apps name: app-of-apps
namespace: argocd namespace: argocd
labels: labels:
scope: music-man scope: infra
spec: spec:
project: default project: default
source: source:
repoURL: https://github.com/snothub/scaling-parakeet.git repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: HEAD targetRevision: HEAD
path: argocd path: argocd
destination: destination:
server: https://kubernetes.default.svc server: https://kubernetes.default.svc
namespace: music-man namespace: argocd
syncPolicy: syncPolicy:
automated: automated:
prune: true prune: true
selfHeal: true selfHeal: true
syncOptions: syncOptions:
- CreateNamespace=true - CreateNamespace=true
- Validate=true
- ServerSideApply=true
timeout: 300s

@@ -15,7 +15,7 @@ spec:
project: default project: default
source: source:
repoURL: https://github.com/snothub/scaling-parakeet.git repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: HEAD targetRevision: HEAD
path: cluster-resources path: cluster-resources

@@ -21,8 +21,8 @@ spec:
valueFiles: valueFiles:
- $values/argocd/values/fluent-bit-values.yaml - $values/argocd/values/fluent-bit-values.yaml
- repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git - repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: main targetRevision: HEAD
ref: values ref: values
destination: destination:

@@ -21,8 +21,8 @@ spec:
valueFiles: valueFiles:
- $values/argocd/values/grafana-values.yaml - $values/argocd/values/grafana-values.yaml
- repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git - repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: main targetRevision: HEAD
ref: values ref: values
destination: destination:

@@ -21,8 +21,8 @@ spec:
valueFiles: valueFiles:
- $values/argocd/values/loki-values.yaml - $values/argocd/values/loki-values.yaml
- repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git - repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: main targetRevision: HEAD
ref: values ref: values
destination: destination:

@@ -21,8 +21,8 @@ spec:
valueFiles: valueFiles:
- $values/argocd/values/prometheus-values.yaml - $values/argocd/values/prometheus-values.yaml
- repoURL: https://github.com/YOUR_ORG/YOUR_GITOPS_REPO.git - repoURL: https://github.com/fortedigital/sturdy-adventure.git
targetRevision: main targetRevision: HEAD
ref: values ref: values
destination: destination:
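
For context, the last four hunks each change the second entry of a multi-source Application, the one that supplies the `$values` reference. A minimal sketch of how such a manifest is typically laid out is shown below, assuming a Helm chart as the first source. The chart repository URL, chart name, chart version, and destination namespace are placeholders and are not taken from this commit.

```yaml
# Sketch only: a multi-source Application using a Git repo for Helm values.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus
  namespace: argocd
spec:
  project: default
  sources:
    # First source: the Helm chart itself (all three fields are placeholders).
    - repoURL: https://example.com/helm-charts
      chart: prometheus
      targetRevision: 1.2.3
      helm:
        valueFiles:
          # $values resolves to the root of the source declared with ref: values.
          - $values/argocd/values/prometheus-values.yaml
    # Second source: the GitOps repository that holds the value files.
    - repoURL: https://github.com/fortedigital/sturdy-adventure.git
      targetRevision: HEAD
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring  # placeholder
```

If the second source's `repoURL` cannot be resolved, as with the earlier `YOUR_ORG/YOUR_GITOPS_REPO` placeholder, the `$values/...` paths have nothing to resolve against and the Application fails to deploy.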