Tempo doc
@@ -57,10 +57,10 @@ This repository contains the complete GitOps configuration for our Kubernetes cl

 ### What's Inside

-- **Infrastructure Applications**: Traefik, Cert-Manager, Kyverno, Prometheus, Grafana, Loki, Sealed Secrets
+- **Infrastructure Applications**: Traefik, Cert-Manager, Kyverno, Prometheus, Grafana, Loki, Tempo, Sealed Secrets
 - **Business Applications**: MCP10X, MusicMan, Dot-AI Stack, ArgoCD MCP
 - **Policies**: Kyverno security policies for secret management, namespace controls, pod verification
-- **Monitoring**: Full observability stack with metrics, logs, and alerting
+- **Monitoring**: Full observability stack with metrics, logs, traces, and alerting
 - **Secrets**: Sealed Secrets for secure Git storage

 ### Key Features
@@ -72,7 +72,7 @@ This repository contains the complete GitOps configuration for our Kubernetes cl
 ✅ **Policy Enforcement**: Kyverno ensures security and compliance
 ✅ **Authentication**: Automatic sidecar injection (token & OIDC support)
 ✅ **TLS Everywhere**: Automatic Let's Encrypt certificates
-✅ **Full Observability**: Prometheus, Grafana, Loki integration
+✅ **Full Observability**: Prometheus, Grafana, Loki, Tempo integration

 ---

@@ -91,6 +91,7 @@ This repository contains the complete GitOps configuration for our Kubernetes cl
 │   ├── prometheus.yaml
 │   ├── grafana.yaml
 │   ├── loki.yaml
+│   ├── tempo.yaml
 │   ├── fluent-bit.yaml
 │   ├── trivy.yaml
 │   ├── sealedsecrets.yaml
@@ -331,6 +332,7 @@ kubectl patch application myapp -n argocd \
 | **Prometheus** | Metrics | `monitoring` | 1 |
 | **Grafana** | Dashboards | `monitoring` | 1 |
 | **Loki** | Logs | `monitoring` | 1 |
+| **Tempo** | Distributed tracing | `monitoring` | 1 |
 | **Fluent-Bit** | Log shipping | `monitoring` | DaemonSet |
 | **Trivy** | Vulnerability scanning | `trivy-system` | 1 |

@@ -470,6 +472,7 @@ Documentation lives in `docs/`. To update:
 - [Kyverno Documentation](https://kyverno.io/docs/)
 - [Traefik Documentation](https://doc.traefik.io/traefik/)
 - [Cert-Manager Documentation](https://cert-manager.io/docs/)
+- [Grafana Tempo Documentation](https://grafana.com/docs/tempo/)
 - [Sealed Secrets](https://github.com/bitnami-labs/sealed-secrets)

 ### Related Repositories

@@ -21,7 +21,7 @@ This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where
 - **Deployment Pattern**: App-of-Apps
 - **Secret Management**: Sealed Secrets (kubeseal)
 - **Ingress**: Traefik with Let's Encrypt TLS
-- **Monitoring**: Prometheus + Grafana + Loki + Fluent-Bit
+- **Monitoring**: Prometheus + Grafana + Loki + Tempo + Fluent-Bit
 - **Policy Engine**: Kyverno
 - **Notifications**: Slack integration for sync status

@@ -83,6 +83,7 @@ This Kubernetes cluster uses a **GitOps approach** powered by **ArgoCD**, where
 │ │ - Prometheus │ │
 │ │ - Grafana │ │
 │ │ - Loki │ │
+│ │ - Tempo │ │
 │ │ - Fluent-Bit │ │
 │ └──────────────────────────┘ │
 │ │
@@ -127,6 +128,7 @@ sturdy-adventure/
 │   ├── prometheus.yaml
 │   ├── grafana.yaml
 │   ├── loki.yaml
+│   ├── tempo.yaml
 │   ├── fluent-bit.yaml
 │   ├── trivy.yaml
 │   ├── sealedsecrets.yaml
@@ -136,6 +138,7 @@ sturdy-adventure/
 │   ├── prometheus-values.yaml
 │   ├── grafana-values.yaml
 │   ├── loki-values.yaml
+│   ├── tempo-values.yaml
 │   └── fluent-bit-values.yaml
 │
 ├── apps/ # Business Application ArgoCD manifests
@@ -301,6 +304,7 @@ _app-of-apps.yaml (Root)
 │   ├── kyverno
 │   ├── prometheus
 │   ├── grafana
+│   ├── tempo
 │   └── ... (other infra apps)
 │
 └── enterprise-apps (manages apps/)
@@ -526,8 +530,9 @@ annotations:
 1. **Prometheus**: Metrics collection and storage
 2. **Grafana**: Metrics visualization and dashboards
 3. **Loki**: Log aggregation
-4. **Fluent-Bit**: Log shipping from pods to Loki
-5. **Trivy**: Container vulnerability scanning
+4. **Tempo**: Distributed tracing (OTLP)
+5. **Fluent-Bit**: Log shipping from pods to Loki
+6. **Trivy**: Container vulnerability scanning

 ### Slack Notifications

@@ -954,6 +954,33 @@ curl -G -s 'http://localhost:3100/loki/api/v1/query_range' \
 --data-urlencode 'start=1h' | jq
 ```

+### Tempo Traces
+
+```bash
+# Port forward to Tempo query API
+kubectl port-forward -n monitoring svc/tempo 3200:3200
+
+# Access: http://localhost:3200
+```
+
+**Query traces via Grafana:**
+1. Open Grafana → Explore
+2. Select Tempo datasource
+3. Use TraceQL or search by service name
+
+**Verify Traefik is sending traces:**
+```bash
+# Check Traefik logs for OTLP export errors
+kubectl logs -n traefik-system -l app.kubernetes.io/name=traefik | grep -i "traces export"
+
+# Check Tempo is receiving data
+kubectl logs -n monitoring -l app.kubernetes.io/name=tempo | grep "receiver"
+```
+
+**Trace-to-log correlation:**
+- Click a trace span in Grafana → linked Loki logs appear (by namespace, pod, container)
+- Trace-to-metrics links to Prometheus by service name
+
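The Grafana steps can also be approximated from a shell once the port-forward to `svc/tempo` is running. The following is a minimal sketch, not part of the committed docs: the helper names `traceql_for_service` and `search_traces` are hypothetical, and the service name `traefik` in the usage comment is an assumption that should be checked against what Traefik actually reports.

```bash
#!/usr/bin/env bash
# Sketch: search Tempo for recent traces of one service through the
# port-forwarded query API (assumes localhost:3200 is forwarded to svc/tempo).

TEMPO_URL="${TEMPO_URL:-http://localhost:3200}"

# Build a TraceQL selector that matches spans by resource service name.
traceql_for_service() {
  local service="$1"
  printf '{ resource.service.name = "%s" }' "$service"
}

# Call the search endpoint; curl -G moves the url-encoded pairs into the query string.
search_traces() {
  local service="$1" limit="${2:-20}"
  curl -G -s "${TEMPO_URL}/api/search" \
    --data-urlencode "q=$(traceql_for_service "$service")" \
    --data-urlencode "limit=${limit}"
}

# Usage (service name is an assumption):
#   search_traces traefik 5 | jq '.traces[].traceID'
```

Keeping the query builder separate from the `curl` call makes it easy to print and inspect the TraceQL string before sending it.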
 ### Fluent-Bit Log Shipping

 Verify Fluent-Bit is shipping logs:

@@ -29,6 +29,7 @@
 | **Secret Management** | Sealed Secrets (Bitnami) |
 | **Monitoring** | Prometheus + Grafana |
 | **Logging** | Loki + Fluent-Bit |
+| **Tracing** | Tempo (OTLP) |
 | **Container Scanning** | Trivy |

 ### Network Architecture
@@ -81,6 +82,7 @@ sturdy-adventure/
 │   ├── prometheus.yaml
 │   ├── grafana.yaml
 │   ├── loki.yaml
+│   ├── tempo.yaml
 │   ├── fluent-bit.yaml
 │   ├── trivy.yaml
 │   ├── sealedsecrets.yaml
@@ -90,6 +92,7 @@ sturdy-adventure/
 │   ├── prometheus-values.yaml
 │   ├── grafana-values.yaml
 │   ├── loki-values.yaml
+│   ├── tempo-values.yaml
 │   └── fluent-bit-values.yaml
 │
 ├── apps/ # Business applications
@@ -703,6 +706,7 @@ kubeStateMetrics:
 **Datasources**:
 - Prometheus
 - Loki
+- Tempo

 ### Loki

@@ -720,6 +724,45 @@ promtail:
   enabled: false # Using Fluent-Bit instead
 ```

+### Tempo
+
+**Chart**: `grafana/tempo`
+**Version**: 1.24.4
+**Namespace**: `monitoring`
+
+**Purpose**: Distributed tracing backend receiving OTLP traces from Traefik and other instrumented services.
+
+**Configuration**:
+```yaml
+tempo:
+  storage:
+    trace:
+      backend: local
+      local:
+        path: /var/tempo/traces
+  receivers:
+    otlp:
+      protocols:
+        grpc:
+          endpoint: "0.0.0.0:4317"
+        http:
+          endpoint: "0.0.0.0:4318"
+
+persistence:
+  enabled: true
+  size: 10Gi
+```
+
+**Endpoints**:
+- gRPC OTLP receiver: `:4317`
+- HTTP OTLP receiver: `:4318`
+- Query API: `:3200`
+
+**Grafana Integration**:
+- Trace-to-logs correlation with Loki (by namespace, pod, container)
+- Trace-to-metrics correlation with Prometheus (by service name)
+- Service graph and node graph visualization
+
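One way to exercise the OTLP receivers listed above is to push a single hand-built span over OTLP/HTTP and then look it up through the query API. This is a rough sketch under several assumptions: port 4318 is forwarded locally (for example with `kubectl port-forward -n monitoring svc/tempo 4318:4318`, which presumes the service exposes that port), `/v1/traces` is the standard OTLP/HTTP traces path, and the service name `otlp-smoke-test` as well as all helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: send one test span to the OTLP/HTTP receiver to check ingestion.

OTLP_HTTP_URL="${OTLP_HTTP_URL:-http://localhost:4318}"

# Random lowercase hex string of N bytes (2N hex characters).
random_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

# Emit a minimal OTLP JSON payload holding a single one-second span.
otlp_test_payload() {
  local trace_id="$1" span_id="$2" service="${3:-otlp-smoke-test}"
  local now start_ns end_ns
  now="$(date +%s)"
  start_ns="$(( now - 1 ))000000000"
  end_ns="${now}000000000"
  cat <<EOF
{"resourceSpans":[{
  "resource":{"attributes":[
    {"key":"service.name","value":{"stringValue":"${service}"}}]},
  "scopeSpans":[{"spans":[{
    "traceId":"${trace_id}","spanId":"${span_id}",
    "name":"smoke-test","kind":1,
    "startTimeUnixNano":"${start_ns}","endTimeUnixNano":"${end_ns}"}]}]}]}
EOF
}

# POST the payload and print the trace ID so it can be looked up afterwards.
send_test_span() {
  local trace_id span_id
  trace_id="$(random_hex 16)"   # 128-bit trace ID
  span_id="$(random_hex 8)"     # 64-bit span ID
  if otlp_test_payload "$trace_id" "$span_id" |
    curl -sf -X POST "${OTLP_HTTP_URL}/v1/traces" \
      -H 'Content-Type: application/json' --data-binary @- >/dev/null; then
    echo "$trace_id"
  else
    return 1
  fi
}
```

If `send_test_span` prints an ID, the same ID can be requested from the query API on `:3200` a few seconds later to confirm the span was stored.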
 ### Fluent-Bit

 **Chart**: `fluent/fluent-bit`
@@ -1184,6 +1227,19 @@ GET /api/v1/query_range?query={promql}&start={time}&end={time}&step={duration}
 GET /api/v1/label/__name__/values
 ```

+### Tempo API
+
+```
+# Search traces
+GET /api/search?q={traceql}
+
+# Get trace by ID
+GET /api/traces/{traceID}
+
+# Service tag values
+GET /api/v2/search/tag/resource.service.name/values
+```
+
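The trace-by-ID and tag-values endpoints above can be wrapped in small shell helpers. A minimal sketch, assuming the query API is reachable at `http://localhost:3200`; the function names are hypothetical, and the check that a trace ID is a hex string of at most 32 characters is an assumption about what callers will pass rather than a rule taken from these docs.

```bash
#!/usr/bin/env bash
# Sketch: thin wrappers around the Tempo query API endpoints listed above.

TEMPO_URL="${TEMPO_URL:-http://localhost:3200}"

# True when the argument looks like a hex trace ID (1 to 32 hex characters).
is_trace_id() {
  [[ "$1" =~ ^[0-9a-fA-F]{1,32}$ ]]
}

# Print the URL for a trace lookup, rejecting malformed IDs before any request.
trace_url() {
  local id="$1"
  is_trace_id "$id" || { echo "not a hex trace ID: $id" >&2; return 1; }
  printf '%s/api/traces/%s' "$TEMPO_URL" "$id"
}

# Fetch a single trace as JSON.
get_trace() {
  local url
  url="$(trace_url "$1")" || return 1
  curl -s "$url"
}

# List the service names Tempo has seen.
list_services() {
  curl -s "${TEMPO_URL}/api/v2/search/tag/resource.service.name/values"
}
```

Validating the ID first avoids sending a request that could only fail, and keeps stray characters out of the URL path.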
 ### Loki API

 ```
@@ -1315,6 +1371,7 @@ team: platform
 | **Prometheus** | 2.47.0+ | Latest |
 | **Grafana** | 10.0.0+ | Latest |
 | **Loki** | 2.9.0+ | Latest |
+| **Tempo** | 2.6.0+ | 1.24.4 |
 | **Fluent-Bit** | 2.1.0+ | Latest |
 | **PostgreSQL** | 16-alpine | N/A |
 | **Trivy** | Latest | Latest |