rippled/docker/telemetry/docker-compose.yml
Pratik Mankawde 85330920ac feat(telemetry): add Loki service and filelog receiver for Phase 8 log ingestion
Cherry-pick Loki infrastructure from phase-10 back to where it belongs
(Phase 8, Tasks 8.2/8.3):

- Add Loki 3.4.2 service to docker-compose.yml (port 3100)
- Add filelog receiver to OTel Collector config (tails debug.log,
  regex_parser extracts trace_id/span_id/partition/severity)
- Add otlphttp/loki exporter (uses Loki 3.x native OTLP ingestion)
- Add logs pipeline: filelog -> batch -> otlphttp/loki
- Add health_check extension
- Mount xrpld log directory into collector container
- Add prometheus-data and loki-data persistent volumes

StatsD receiver intentionally excluded — Phase 7 migrated to native
OTLP metrics, making the StatsD receiver unnecessary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 14:55:45 +01:00

# Docker Compose stack for xrpld OpenTelemetry observability.
#
# Provides services for local development:
#   - otel-collector: receives OTLP traces and metrics from xrpld, batches
#     traces and forwards them to Tempo. Also tails xrpld log files via
#     the filelog receiver and exports them to Loki. Listens on ports
#     4317 (gRPC) and 4318 (HTTP).
#   - tempo: Grafana Tempo tracing backend, queryable via Grafana Explore
#     on port 3000. Recommended for production (S3/GCS storage, TraceQL).
#   - loki: Grafana Loki log aggregation backend for centralized log
#     ingestion and log-trace correlation (Phase 8).
#   - prometheus: Prometheus metrics backend on port 9090.
#   - grafana: dashboards on port 3000, pre-configured with Tempo,
#     Prometheus, and Loki datasources.
#
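# The collector side of the log path described above (filelog -> batch ->
# otlphttp/loki) lives in otel-collector-config.yaml. A minimal sketch of
# what that pipeline might look like follows. Assumptions, not taken from
# the real config: the include glob and the regex_parser pattern, which
# depends on the actual xrpld log line format.
#
#   extensions:
#     health_check: {}
#   receivers:
#     filelog:
#       # Assumed glob; matches debug.log at any depth under the mount.
#       include: [/var/log/rippled/**/debug.log]
#       operators:
#         - type: regex_parser
#           # Hypothetical pattern; adjust to the real log line layout.
#           regex: '^(?P<timestamp>\S+ \S+ \S+) (?P<partition>\w+):(?P<severity>\w+) .*trace_id=(?P<trace_id>[0-9a-f]{32}) span_id=(?P<span_id>[0-9a-f]{16})'
#   processors:
#     batch: {}
#   exporters:
#     otlphttp/loki:
#       # Loki 3.x native OTLP ingestion; the exporter appends /v1/logs.
#       endpoint: http://loki:3100/otlp
#   service:
#     extensions: [health_check]
#     pipelines:
#       logs:
#         receivers: [filelog]
#         processors: [batch]
#         exporters: [otlphttp/loki]
#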
# Usage:
#   docker compose -f docker/telemetry/docker-compose.yml up -d
#
# Configure xrpld to export traces by adding to xrpld.cfg:
#   [telemetry]
#   enabled=1
#   endpoint=http://localhost:4318/v1/traces
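#
# For the log-trace correlation mentioned in the service list, a Loki
# datasource with a derived field linking to Tempo could be provisioned
# roughly as follows. This is a sketch only: the file name, the "tempo"
# datasource uid, and the matcherRegex are assumptions, not the contents
# of ./grafana/provisioning.
#
#   # grafana/provisioning/datasources/loki.yaml (hypothetical file name)
#   apiVersion: 1
#   datasources:
#     - name: Loki
#       type: loki
#       access: proxy
#       url: http://loki:3100
#       jsonData:
#         derivedFields:
#           - name: TraceID
#             matcherRegex: 'trace_id=(\w+)'  # assumed log line format
#             datasourceUid: tempo            # assumed Tempo datasource uid
#             url: '$${__value.raw}'          # "$$" escapes "$" in provisioning files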
services:
  # OpenTelemetry Collector: receives spans from xrpld via OTLP,
  # batches them for efficiency, and forwards them to Tempo for storage.
  # Also tails xrpld log files and exports them to Loki (Phase 8).
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.121.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    ports:
      - "4317:4317" # OTLP gRPC
      - "4318:4318" # OTLP HTTP (traces + native OTel metrics)
      - "8889:8889" # Prometheus metrics (spanmetrics + OTLP)
      # StatsD UDP port removed: beast::insight now uses native OTLP.
      # Uncomment if using server=statsd fallback:
      # - "8125:8125/udp"
    volumes:
      # Mount collector pipeline config (receivers → processors → exporters)
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:ro
      # Phase 8: Mount xrpld log directory for the filelog receiver.
      # The integration test writes logs to /tmp/xrpld-integration/;
      # mount it read-only so the collector can tail debug.log files.
      - /tmp/xrpld-integration:/var/log/rippled:ro
    depends_on:
      - tempo
      - loki
    networks:
      - xrpld-telemetry

  # Grafana Tempo: distributed tracing backend that stores and indexes
  # spans. Queryable via TraceQL in Grafana Explore.
  tempo:
    image: grafana/tempo:2.7.2
    command: ["-config.file=/etc/tempo.yaml"]
    ports:
      - "3200:3200" # Tempo HTTP API (health check, query)
    volumes:
      # Mount Tempo storage and ingestion config
      - ./tempo.yaml:/etc/tempo.yaml:ro
      # Persistent volume for trace data (WAL + blocks)
      - tempo-data:/var/tempo
    networks:
      - xrpld-telemetry

  # Phase 8: Grafana Loki for centralized log ingestion and log-trace
  # correlation. Loki 3.x supports native OTLP ingestion, so the OTel
  # Collector exports via otlphttp to Loki's /otlp endpoint.
  # Query logs via Grafana Explore -> Loki at http://localhost:3000.
  loki:
    image: grafana/loki:3.4.2
    ports:
      - "3100:3100" # Loki HTTP API (OTLP ingestion, query)
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      # Persistent volume for log data
      - loki-data:/loki
    networks:
      - xrpld-telemetry

  # Prometheus: metrics backend on port 9090, configured via
  # ./prometheus.yml. Starts after the collector, which exposes
  # Prometheus metrics on port 8889.
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090" # Prometheus web UI and HTTP API
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      # Persistent volume for metrics data
      - prometheus-data:/prometheus
    depends_on:
      - otel-collector
    networks:
      - xrpld-telemetry

  # Grafana: visualization UI with Tempo, Prometheus, and Loki
  # pre-configured as datasources.
  # Anonymous admin access enabled for local development convenience.
  grafana:
    image: grafana/grafana:11.5.2
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true # No login required for local dev
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin # Full access without auth
    ports:
      - "3000:3000" # Grafana web UI
    volumes:
      # Auto-provision datasources and Tempo search filters on startup
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
      # Dashboard definitions
      - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
    depends_on:
      - tempo
      - prometheus
      - loki
    networks:
      - xrpld-telemetry

# Named volumes for Tempo trace storage (WAL and compacted blocks),
# Prometheus metrics data, and Loki log data.
# Data persists across container restarts. Remove with:
#   docker compose -f docker/telemetry/docker-compose.yml down -v
volumes:
  tempo-data:
  prometheus-data:
  loki-data:

# Isolated bridge network so services communicate by container name
# (e.g., the collector reaches Tempo at http://tempo:4317).
networks:
  xrpld-telemetry:
    driver: bridge