## Overview

This is a **Kubernetes cluster bootstrapping and GitOps configuration repository** using ArgoCD. It defines the infrastructure-as-code for deploying and managing applications, services, and policies on Kubernetes clusters.

## Repository Structure

```
.
├── bootstrap.sh                            # Main bootstrap script to initialize ArgoCD and cluster
├── apps/                                   # Application resources (currently unused/empty)
├── infra/                                  # Individual ArgoCD Application resources for infrastructure
│   ├── _app-of-apps.yaml                   # App-of-apps pattern: parent Infra Application that manages all infrastructure apps
│   ├── traefik-application.yaml            # Ingress controller (Traefik)
│   ├── cert-manager-application.yaml       # TLS certificate management
│   ├── kyverno.yaml                        # Policy engine for security
│   ├── prometheus.yaml                     # Metrics & monitoring
│   ├── grafana.yaml                        # Monitoring visualization
│   ├── loki.yaml                           # Log aggregation
│   ├── fluent-bit.yaml                     # Log shipping
│   ├── trivy.yaml                          # Container scanning
│   ├── sealedsecrets.yaml                  # Secret encryption
│   ├── cluster-resources-application.yaml  # Cluster-wide resources
│   └── values/                             # Helm value overrides for ArgoCD and services
│       ├── argocd-values.yaml              # ArgoCD server configuration
│       ├── prometheus-values.yaml
│       ├── grafana-values.yaml
│       ├── loki-values.yaml
│       └── fluent-bit-values.yaml
└── cluster-resources/                      # Cluster-level configurations managed by cluster-resources-application.yaml
    ├── cert-manager-namespace.yaml
    ├── letsencrypt-issuer.yaml             # TLS certificate issuer
    └── kyverno-config.yaml                 # Security policies and secret syncing
```

## Architecture & Key Concepts

### GitOps Model

- **App-of-Apps Pattern (infrastructure)**: `infra/_app-of-apps.yaml` is the root Application that manages all infrastructure applications
- **App-of-Apps Pattern (applications)**: `apps/_app-of-apps.yaml` is the root Application that manages all custom applications
- **Source of Truth**: The GitHub repository (`https://github.com/snothub/sturdy-adventure.git`) is the single source of truth
- **Auto-sync**: All Applications have automated sync enabled, with auto-pruning and self-healing
- **Namespace Creation**: The `CreateNamespace=true` sync option lets ArgoCD create target namespaces as needed

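The GitOps settings above could look roughly like this in a root Application. This is a minimal sketch, not the actual contents of `infra/_app-of-apps.yaml`: the `metadata.name`, `project`, `targetRevision`, `path`, and `destination` values are assumptions. Only the repository URL, automated sync with pruning and self-healing, and `CreateNamespace=true` come from this document.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: infra                # assumed name
  namespace: argocd
spec:
  project: default           # assumed project
  source:
    repoURL: https://github.com/snothub/sturdy-adventure.git
    targetRevision: HEAD     # assumed: track the default branch
    path: infra              # assumed: directory holding the child Application manifests
  destination:
    server: https://kubernetes.default.svc   # assumed: the cluster ArgoCD runs in
    namespace: argocd
  syncPolicy:
    automated:
      prune: true            # remove resources that were deleted from Git
      selfHeal: true         # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true
```
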
### Key Components

1. **Traefik** - Kubernetes Ingress controller that routes external traffic and redirects HTTP to HTTPS
2. **Cert-Manager** - Automates TLS certificate management with Let's Encrypt (see `letsencrypt-issuer.yaml`)
3. **Kyverno** - Policy engine that enforces security rules and syncs secrets across namespaces (via the `sync-secret-with-multi-clone` policy)
4. **Monitoring Stack** - Prometheus (metrics) + Grafana (visualization) + Loki (logs) + Fluent-Bit (log shipping)
5. **Trivy** - Container vulnerability scanning
6. **Sealed Secrets** - Encrypts secrets for safe storage in Git

### Secret Management

- **Kyverno ClusterPolicy**: Automatically clones secrets from the `secrets` namespace into new namespaces when they are created
- Only secrets labeled `allowedToBeCloned: "true"` are cloned
- `synchronize: true` in the policy keeps the clones in sync with their source secrets automatically

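A policy with this behavior could be written along the following lines. It is a sketch modeled on the upstream Kyverno sample of the same name, not a copy of `kyverno-config.yaml`: the rule name and the `match` block are assumptions, and the `registry-credentials` Secret is a hypothetical example of a source secret that opts in to cloning.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: sync-secret-with-multi-clone
spec:
  rules:
    - name: sync-secret              # assumed rule name
      match:
        any:
          - resources:
              kinds:
                - Namespace          # assumed trigger: a namespace is created
      generate:
        # Generate into the namespace that triggered the rule
        namespace: "{{request.object.metadata.name}}"
        synchronize: true            # keep clones updated when the source changes
        cloneList:
          namespace: secrets         # source namespace
          kinds:
            - v1/Secret
          selector:
            matchLabels:
              allowedToBeCloned: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials         # hypothetical secret name
  namespace: secrets
  labels:
    allowedToBeCloned: "true"        # opt in to cloning
type: Opaque
```
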
### Network Configuration

- ArgoCD UI: `argocd.127.0.0.1.nip.io` (local development)
- Server runs in insecure mode (`--insecure`, `--disable-auth`) - suitable for local/dev clusters
- Traefik routes to multiple services via Kubernetes Ingress

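For orientation, the ArgoCD settings this document mentions might be expressed in Helm values roughly as follows. This is a sketch that assumes the community `argo-cd` chart from argo-helm and key names from its recent versions; the real `infra/values/argocd-values.yaml` may be laid out differently, and the `ingressClassName` value is an assumption.

```yaml
# Sketch only: key names assume a recent argo-helm `argo-cd` chart
configs:
  cm:
    admin.enabled: "false"                           # no admin login (dev only)
    timeout.reconciliation: 60s                      # how often ArgoCD checks Git for changes
    application.resourceTrackingMethod: annotation   # annotation-based resource tracking
server:
  extraArgs:
    - --insecure                                     # serve plain HTTP; TLS is not terminated by argocd-server
    - --disable-auth                                 # no authentication (dev only)
  ingress:
    enabled: true
    ingressClassName: traefik                        # assumed ingress class
    hostname: argocd.127.0.0.1.nip.io
```
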
## Common Commands

### Bootstrap the Cluster

```bash
./bootstrap.sh
```

This runs the `Bootstrap()` function, which calls `ArgoCd()` to install ArgoCD using Helm.

### Monitor ArgoCD Applications

```bash
# View all ArgoCD applications
kubectl get applications -n argocd

# Watch sync status
kubectl get applications -n argocd -w

# Describe a specific application
kubectl describe app <app-name> -n argocd
```

### Manage ArgoCD

```bash
# Port forward to access UI
kubectl port-forward svc/argocd-server -n argocd 8080:443

# Access at: http://localhost:8080
# (the server runs with --insecure, so it serves plain HTTP; admin auth disabled in dev)
```

### Check Secret Syncing

```bash
# Verify Kyverno policy is applied
kubectl get clusterpolicy sync-secret-with-multi-clone

# Check if secrets are synced to a namespace
kubectl get secrets -n <namespace>
```

### Deploy Changes

- Changes to YAML files in `apps/`, `infra/`, `**/values/`, or `cluster-resources/` are synced automatically by ArgoCD
- Changes take effect only after they are pushed to the GitHub repository
- ArgoCD reconciles against the repository every 60s (`timeout.reconciliation: 60s`)
- Each application has a 5-minute sync timeout to prevent stalled deployments

### Review Helm Values

Application-specific Helm value overrides are in `**/values/` and referenced within each Application's Helm configuration. Each application manifest uses both external value files and inline overrides where needed.

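One way to combine an external value file with inline overrides is ArgoCD's multiple-sources feature, sketched below for Grafana. The document does not say which mechanism the manifests actually use, so treat everything here as an assumption: the upstream chart repository, the `"*"` version placeholder, the `monitoring` namespace, and the inline `replicas` override are illustrative only.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grafana
  namespace: argocd
spec:
  project: default                      # assumed project
  sources:
    # Source 1: the Helm chart (assumed to come from the upstream Grafana chart repo)
    - repoURL: https://grafana.github.io/helm-charts
      chart: grafana
      targetRevision: "*"               # placeholder: pin a real chart version in practice
      helm:
        valueFiles:
          # $values resolves to the root of the source that declares `ref: values`
          - $values/infra/values/grafana-values.yaml
        valuesObject:                   # inline overrides, applied on top of the value file
          replicas: 1
    # Source 2: this repository, used only to supply the value file
    - repoURL: https://github.com/snothub/sturdy-adventure.git
      targetRevision: HEAD              # assumed
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring               # assumed namespace
```
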
### Application Organization & Sync Ordering

- Infrastructure applications use `argocd.argoproj.io/sync-wave` annotations for ordered deployment
- Kyverno (sync-wave: 0) deploys before cluster-resources (sync-wave: 1) to ensure policies are ready
- All applications have resource requests and limits configured to prevent resource starvation
- Applications are labeled with `app.kubernetes.io/part-of` to indicate their component type (platform, monitoring-stack, application)

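The ordering and labeling described above live in each Application's metadata. The fragments below sketch how that might look; the `metadata.name` values and the choice of `platform` as the label value for both applications are assumptions. Note that annotation values must be quoted strings.

```yaml
# Fragment of infra/kyverno.yaml (sketch)
metadata:
  name: kyverno                              # assumed name
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "0"        # lower waves sync first
  labels:
    app.kubernetes.io/part-of: platform      # assumed label value
---
# Fragment of infra/cluster-resources-application.yaml (sketch)
metadata:
  name: cluster-resources                    # assumed name
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"        # syncs after Kyverno
  labels:
    app.kubernetes.io/part-of: platform      # assumed label value
```
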
## Important Notes

- **No admin auth in development**: ArgoCD has `admin.enabled: "false"` - suitable for local/dev only
- **Insecure server mode**: `--insecure` and `--disable-auth` flags are set - not for production
- **Folder organization**:
  - `infra/` contains infrastructure/platform components (Traefik, Cert-Manager, Prometheus, Grafana, Loki, etc.)
  - `apps/` is reserved for business applications (currently empty)
- **Replica counts**: Traefik runs 2 replicas; other services run 1 replica
- **Retry policy**: All applications retry up to 5 times with exponential backoff, capped at 3m
- **Ignore replica scaling**: Deployments ignore replica count differences to allow HPA/manual scaling
- **Sync validation**: All applications validate manifests before applying (`Validate=true`)
- **Server-side apply**: All applications use `ServerSideApply=true` for safer field ownership tracking

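The retry, validation, server-side apply, and replica-ignoring notes above correspond to fields in an Application spec. A sketch of how they might fit together follows; the initial backoff `duration`, the `factor`, and the `RespectIgnoreDifferences=true` option are assumptions not stated in this document.

```yaml
# Fragment of an Application manifest (sketch)
spec:
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=true
      - ServerSideApply=true
      - RespectIgnoreDifferences=true   # assumed: also honor ignoreDifferences during sync
    retry:
      limit: 5                          # retry a failed sync up to 5 times
      backoff:
        duration: 5s                    # assumed initial delay
        factor: 2                       # assumed multiplier (exponential backoff)
        maxDuration: 3m                 # upper bound on the delay between retries
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas                # let an HPA or manual scaling own the replica count
```
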
## Development Tips

- **Check ArgoCD logs**: `kubectl logs -n argocd deployment/argocd-application-controller`
- **Validate YAML**: Files are validated server-side (`Validate=true`) before applying
- **Resource tracking**: Uses the annotation-based method (`application.resourceTrackingMethod: annotation`)
- **Modify applications**: Edit the corresponding YAML in `infra/` and push to trigger a sync
- **Add new services**: Create a new Application YAML in `apps/`, following the pattern of the existing ones; the app-of-apps then discovers it automatically
- **Application folder naming**: Infrastructure components are in `infra/`; `apps/` is reserved for business applications (currently empty)