Phase 10: Synthetic workload generation and telemetry validation tools

Add comprehensive workload harness for end-to-end validation of the Phases 1-9 telemetry stack:

Task 10.1: Multi-node test harness
- docker-compose.workload.yaml with full OTel stack (Collector, Jaeger, Tempo, Prometheus, Loki, Grafana)
- generate-validator-keys.sh for automated key generation
- xrpld-validator.cfg.template for node configuration

Task 10.2: RPC load generator
- rpc_load_generator.py with WebSocket client, configurable rates, realistic command distribution (40% health, 30% wallet, 15% explorer, 10% tx lookups, 5% DEX), W3C traceparent injection

Task 10.3: Transaction submitter
- tx_submitter.py with 10 transaction types (Payment, OfferCreate, OfferCancel, TrustSet, NFTokenMint, NFTokenCreateOffer, EscrowCreate, EscrowFinish, AMMCreate, AMMDeposit), auto-funded test accounts

Task 10.4: Telemetry validation suite
- validate_telemetry.py checking spans (Jaeger), metrics (Prometheus), log-trace correlation (Loki), dashboards (Grafana)
- expected_spans.json (17 span types, 22 attributes, 3 hierarchies)
- expected_metrics.json (SpanMetrics, StatsD, Phase 9, dashboards)

Task 10.5: Performance benchmark suite
- benchmark.sh for baseline vs telemetry comparison
- collect_system_metrics.sh for CPU/memory/latency sampling
- Thresholds: <3% CPU, <5MB memory, <2ms RPC p99, <5% TPS, <1% consensus

Task 10.6: CI integration
- telemetry-validation.yml GitHub Actions workflow
- run-full-validation.sh orchestrator script
- Manual trigger + telemetry branch auto-trigger

Task 10.7: Documentation
- workload/README.md with quick start and tool reference
- Updated telemetry-runbook.md with validation and benchmark sections
- Updated 09-data-collection-reference.md with validation inventory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-29 15:37:57 +00:00 · 2026-03-10 16:15:55 +00:00
parent 010ac78fc3
commit 787b496484
18 changed files with 4330 additions and 10 deletions
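Task 10.2 mentions W3C traceparent injection. As a rough illustration of what such a header value looks like, the sketch below builds one in the W3C Trace Context layout (version `00`, a 32-hex-digit trace ID, a 16-hex-digit parent ID, and 2 hex digits of flags). The helper name `make_traceparent` is hypothetical and is not taken from rpc_load_generator.py; how the generator attaches the value to a WebSocket request is not shown in this excerpt.

```python
import secrets


def make_traceparent(sampled: bool = True) -> str:
    """Return a W3C Trace Context `traceparent` value (hypothetical helper)."""
    # All-zero IDs are invalid per the spec, so redraw in that (very rare) case.
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    while int(trace_id, 16) == 0:
        trace_id = secrets.token_hex(16)
    parent_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    while int(parent_id, 16) == 0:
        parent_id = secrets.token_hex(8)
    flags = "01" if sampled else "00"  # bit 0 of the flags byte means "sampled"
    return f"00-{trace_id}-{parent_id}-{flags}"


if __name__ == "__main__":
    # Prints one freshly generated header value.
    print(make_traceparent())
```

Generating a fresh trace ID per request keeps each synthetic RPC call in its own trace, which makes it easier to match a request to the spans it produced.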
--- a/.github/workflows/telemetry-validation.yml
+++ b/.github/workflows/telemetry-validation.yml
@@ -0,0 +1,164 @@
+# Telemetry Validation CI Workflow
+#
+# Builds rippled with telemetry enabled, runs the multi-node workload
+# harness, validates all telemetry data, and runs performance benchmarks.
+#
+# This is a separate workflow from the main CI. It runs:
+#   - On manual dispatch (workflow_dispatch)
+#   - On pushes to telemetry-related branches
+#
+# The workflow is intentionally heavyweight (builds rippled, starts Docker
+# services, runs a multi-node cluster): it validates the full telemetry
+# stack end-to-end rather than running individual unit tests.
+
+name: Telemetry Validation
+
+on:
+  workflow_dispatch:
+    inputs:
+      rpc_rate:
+        description: "RPC load rate (requests per second)"
+        required: false
+        default: "50"
+      rpc_duration:
+        description: "RPC load duration (seconds)"
+        required: false
+        default: "120"
+      tx_tps:
+        description: "Transaction submit rate (TPS)"
+        required: false
+        default: "5"
+      tx_duration:
+        description: "Transaction submit duration (seconds)"
+        required: false
+        default: "120"
+      run_benchmark:
+        description: "Run performance benchmarks"
+        required: false
+        type: boolean
+        default: false
+
+  push:
+    branches:
+      - "pratik/otel-phase*"
+      - "feature/otel-*"
+      - "feature/telemetry-*"
+    paths:
+      - "docker/telemetry/**"
+      - "include/xrpl/basics/Telemetry*.h"
+      - "src/xrpld/app/misc/Telemetry*"
+
+concurrency:
+  group: telemetry-validation-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  validate-telemetry:
+    name: Telemetry Stack Validation
+    runs-on: ubuntu-latest
+    timeout-minutes: 60
+
+    services:
+      # Docker-in-Docker not needed — we use docker compose directly.
+      # The runner has Docker pre-installed.
+      docker:
+        image: docker:dind
+        options: --privileged
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y curl jq bc python3 python3-pip
+
+      - name: Install Python dependencies
+        run: pip3 install -r docker/telemetry/workload/requirements.txt
+
+      - name: Set up Conan and build cache
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.conan2
+            .build
+          key: telemetry-build-${{ runner.os }}-${{ hashFiles('conanfile.py', 'CMakeLists.txt') }}
+          restore-keys: |
+            telemetry-build-${{ runner.os }}-
+
+      - name: Build rippled with telemetry
+        run: |
+          conan install . --build=missing -o telemetry=True
+          cmake --preset default -Dtelemetry=ON
+          cmake --build --preset default --parallel $(nproc)
+
+      - name: Make scripts executable
+        run: |
+          chmod +x docker/telemetry/workload/*.sh
+
+      - name: Run full telemetry validation
+        id: validation
+        env:
+          XRPLD: .build/xrpld
+        run: |
+          ARGS="--xrpld .build/xrpld --skip-loki"
+          ARGS="$ARGS --rpc-rate ${{ github.event.inputs.rpc_rate || '50' }}"
+          ARGS="$ARGS --rpc-duration ${{ github.event.inputs.rpc_duration || '120' }}"
+          ARGS="$ARGS --tx-tps ${{ github.event.inputs.tx_tps || '5' }}"
+          ARGS="$ARGS --tx-duration ${{ github.event.inputs.tx_duration || '120' }}"
+          if [ "${{ github.event.inputs.run_benchmark }}" = "true" ]; then
+            ARGS="$ARGS --with-benchmark"
+          fi
+          docker/telemetry/workload/run-full-validation.sh $ARGS
+        continue-on-error: true
+
+      - name: Upload validation reports
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: telemetry-validation-reports
+          path: /tmp/xrpld-validation/reports/
+          retention-days: 30
+
+      - name: Upload node logs
+        if: steps.validation.outcome == 'failure'
+        uses: actions/upload-artifact@v4
+        with:
+          name: xrpld-node-logs
+          path: /tmp/xrpld-validation/node*/debug.log
+          retention-days: 7
+
+      - name: Print validation summary
+        if: always()
+        run: |
+          REPORT="/tmp/xrpld-validation/reports/validation-report.json"
+          if [ -f "$REPORT" ]; then
+            echo "## Telemetry Validation Results" >> "$GITHUB_STEP_SUMMARY"
+            echo "" >> "$GITHUB_STEP_SUMMARY"
+            TOTAL=$(jq '.summary.total' "$REPORT")
+            PASSED=$(jq '.summary.passed' "$REPORT")
+            FAILED=$(jq '.summary.failed' "$REPORT")
+            echo "| Metric | Value |" >> "$GITHUB_STEP_SUMMARY"
+            echo "|--------|-------|" >> "$GITHUB_STEP_SUMMARY"
+            echo "| Total Checks | $TOTAL |" >> "$GITHUB_STEP_SUMMARY"
+            echo "| Passed | $PASSED |" >> "$GITHUB_STEP_SUMMARY"
+            echo "| Failed | $FAILED |" >> "$GITHUB_STEP_SUMMARY"
+            echo "" >> "$GITHUB_STEP_SUMMARY"
+            if [ "$FAILED" -gt 0 ]; then
+              echo "### Failed Checks" >> "$GITHUB_STEP_SUMMARY"
+              echo "" >> "$GITHUB_STEP_SUMMARY"
+              jq -r '.checks[] | select(.passed == false) | "- **\(.name)**: \(.message)"' "$REPORT" >> "$GITHUB_STEP_SUMMARY"
+            fi
+          fi
+
+      - name: Cleanup
+        if: always()
+        run: |
+          docker/telemetry/workload/run-full-validation.sh --cleanup 2>/dev/null || true
+
+      - name: Check validation result
+        if: steps.validation.outcome == 'failure'
+        run: |
+          echo "Telemetry validation failed. Check the uploaded reports for details."
+          exit 1
--- a/OpenTelemetryPlan/09-data-collection-reference.md
+++ b/OpenTelemetryPlan/09-data-collection-reference.md
@@ -665,23 +665,46 @@ Tracked types: `Transaction`, `Ledger`, `NodeObject`, `STTx`, `STLedgerEntry`, `

 ## 5c. Future: Synthetic Workload Generation & Telemetry Validation (Phase 10)

-> **Status**: Planned, not yet implemented.
 > **Plan details**: [06-implementation-phases.md §6.8.3](./06-implementation-phases.md) — motivation, architecture
 > **Task breakdown**: [Phase10_taskList.md](./Phase10_taskList.md) — per-task implementation details
+> **Tools**: [docker/telemetry/workload/](../docker/telemetry/workload/) — RPC load generator, transaction submitter, validation suite, benchmarks

 Phase 10 builds a 5-node validator docker-compose harness with RPC load generators, transaction submitters, and automated validation scripts that verify all spans, metrics, dashboards, and log-trace correlation work end-to-end. Includes a benchmark suite comparing telemetry-ON vs telemetry-OFF overhead.

+### Running the Validation Suite
+
+```bash
+# Full end-to-end validation (start cluster, generate load, validate):
+docker/telemetry/workload/run-full-validation.sh --xrpld .build/xrpld
+
+# Validation only (assumes stack and cluster are already running):
+python3 docker/telemetry/workload/validate_telemetry.py --report /tmp/report.json
+
+# Performance benchmark (baseline vs telemetry):
+docker/telemetry/workload/benchmark.sh --xrpld .build/xrpld --duration 300
+```
+
 ### Validated Telemetry Inventory

-| Category           | Expected Count | Validation Method                |
-| ------------------ | -------------- | -------------------------------- |
-| Trace spans        | 16             | Jaeger/Tempo API query           |
-| Span attributes    | 22             | Per-span attribute assertion     |
-| StatsD metrics     | 255+           | Prometheus query                 |
-| Phase 9 metrics    | 50+            | Prometheus query                 |
-| SpanMetrics RED    | 4 per span     | Prometheus query                 |
-| Grafana dashboards | 10             | Dashboard API "no data" check    |
-| Log-trace links    | Present        | Loki query + Tempo reverse check |
+| Category           | Expected Count | Validation Method                | Config File             |
+| ------------------ | -------------- | -------------------------------- | ----------------------- |
+| Trace spans        | 17             | Jaeger/Tempo API query           | `expected_spans.json`   |
+| Span attributes    | 22             | Per-span attribute assertion     | `expected_spans.json`   |
+| StatsD metrics     | 255+           | Prometheus query                 | `expected_metrics.json` |
+| Phase 9 metrics    | 50+            | Prometheus query                 | `expected_metrics.json` |
+| SpanMetrics RED    | 4 per span     | Prometheus query                 | `expected_metrics.json` |
+| Grafana dashboards | 10             | Dashboard API "no data" check    | `expected_metrics.json` |
+| Log-trace links    | Present        | Loki query + Tempo reverse check | —                       |
+
+### Performance Overhead Targets
+
+| Metric            | Target       | Measurement Method                  |
+| ----------------- | ------------ | ----------------------------------- |
+| CPU overhead      | < 3%         | ps avg CPU% baseline vs telemetry   |
+| Memory overhead   | < 5MB        | ps peak RSS baseline vs telemetry   |
+| RPC p99 latency   | < 2ms impact | server_info round-trip timing       |
+| Throughput impact | < 5%         | Ledger close rate comparison        |
+| Consensus impact  | < 1%         | Consensus round time p95 comparison |
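The targets in the table above can be checked mechanically once a baseline and a telemetry run have each produced a metrics snapshot. The following is a minimal sketch, assuming the snapshot keys that benchmark.sh reads (`cpu_pct_avg`, `memory_rss_mb_peak`, `rpc_p99_ms`, `tps`, `consensus_round_p95_ms`). The function names and the sample numbers are invented for illustration, and the comparison uses `<=` as benchmark.sh does.

```python
# Default limits from the overhead table; the dictionary keys are hypothetical.
THRESHOLDS = {
    "cpu_pct": 3.0,
    "memory_mb": 5.0,
    "rpc_p99_ms": 2.0,
    "tps_pct": 5.0,
    "consensus_pct": 1.0,
}


def pct_change(base: float, new: float) -> float:
    """Relative change in percent; 0.0 when the baseline is not positive."""
    return (new - base) * 100.0 / base if base > 0 else 0.0


def overhead(baseline: dict, telemetry: dict) -> dict:
    """Compute the five overhead figures from two metric snapshots."""
    return {
        "cpu_pct": telemetry["cpu_pct_avg"] - baseline["cpu_pct_avg"],
        "memory_mb": telemetry["memory_rss_mb_peak"] - baseline["memory_rss_mb_peak"],
        "rpc_p99_ms": telemetry["rpc_p99_ms"] - baseline["rpc_p99_ms"],
        # Throughput impact is a drop, so the sign is flipped.
        "tps_pct": -pct_change(baseline["tps"], telemetry["tps"]),
        "consensus_pct": pct_change(
            baseline["consensus_round_p95_ms"], telemetry["consensus_round_p95_ms"]
        ),
    }


def evaluate(baseline: dict, telemetry: dict, thresholds: dict = THRESHOLDS) -> dict:
    """Map each metric to (value, passed), where passed means value <= limit."""
    values = overhead(baseline, telemetry)
    return {name: (value, value <= thresholds[name]) for name, value in values.items()}


if __name__ == "__main__":
    # Invented sample snapshots, not measured results.
    base = {"cpu_pct_avg": 10.0, "memory_rss_mb_peak": 200.0, "rpc_p99_ms": 4.0,
            "tps": 100.0, "consensus_round_p95_ms": 3000.0}
    tele = {"cpu_pct_avg": 12.0, "memory_rss_mb_peak": 203.0, "rpc_p99_ms": 5.0,
            "tps": 97.0, "consensus_round_p95_ms": 3045.0}
    for name, (value, passed) in evaluate(base, tele).items():
        print(f"{name}: {value:.2f} {'PASS' if passed else 'FAIL'}")
```

Multiplying by 100 before dividing keeps sub-percent precision, which matters for the 1% consensus target.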

 ---

--- a/docker/telemetry/docker-compose.workload.yaml
+++ b/docker/telemetry/docker-compose.workload.yaml
@@ -0,0 +1,137 @@
+# Docker Compose workload harness for Phase 10 telemetry validation.
+#
+# Runs the full OTel telemetry stack for a 5-node validator cluster:
+#   - 5 rippled validator nodes (local processes started by run-full-validation.sh, not services in this file)
+#   - OTel Collector (traces + StatsD metrics)
+#   - Jaeger (trace search UI)
+#   - Tempo (production trace backend)
+#   - Prometheus (metrics)
+#   - Loki (log aggregation for log-trace correlation)
+#   - Grafana (dashboards + trace/log exploration)
+#
+# Usage:
+#   # Start the harness (requires pre-built xrpld image or mount binary):
+#   docker compose -f docker/telemetry/docker-compose.workload.yaml up -d
+#
+#   # Or use the orchestrator:
+#   docker/telemetry/workload/run-full-validation.sh
+#
+# Prerequisites:
+#   - xrpld binary built with -DXRPL_ENABLE_TELEMETRY=ON
+#   - Validator keys generated via generate-validator-keys.sh
+#   - Node configs generated by run-full-validation.sh
+
+version: "3.8"
+
+services:
+  # ---------------------------------------------------------------------------
+  # Telemetry Backend Stack
+  # ---------------------------------------------------------------------------
+
+  otel-collector:
+    image: otel/opentelemetry-collector-contrib:latest
+    command: ["--config=/etc/otel-collector-config.yaml"]
+    ports:
+      - "4317:4317" # OTLP gRPC
+      - "4318:4318" # OTLP HTTP
+      - "8125:8125/udp" # StatsD UDP (beast::insight metrics)
+      - "8889:8889" # Prometheus metrics endpoint
+      - "13133:13133" # Health check
+    volumes:
+      - ../otel-collector-config.yaml:/etc/otel-collector-config.yaml:ro
+    depends_on:
+      - jaeger
+      - tempo
+    networks:
+      - workload-net
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:13133/"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+  jaeger:
+    image: jaegertracing/all-in-one:latest
+    environment:
+      - COLLECTOR_OTLP_ENABLED=true
+    ports:
+      - "16686:16686" # Jaeger UI
+      - "14250:14250" # gRPC
+    networks:
+      - workload-net
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:16686/"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+  tempo:
+    image: grafana/tempo:2.7.2
+    command: ["-config.file=/etc/tempo.yaml"]
+    ports:
+      - "3200:3200" # Tempo HTTP API
+    volumes:
+      - ../tempo.yaml:/etc/tempo.yaml:ro
+      - tempo-data:/var/tempo
+    networks:
+      - workload-net
+
+  prometheus:
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    volumes:
+      - ../prometheus.yml:/etc/prometheus/prometheus.yml:ro
+    depends_on:
+      otel-collector:
+        condition: service_healthy
+    networks:
+      - workload-net
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:9090/-/healthy"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+  loki:
+    image: grafana/loki:2.9.4
+    ports:
+      - "3100:3100" # Loki HTTP API
+    command: ["-config.file=/etc/loki/local-config.yaml"]
+    networks:
+      - workload-net
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3100/ready"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+  grafana:
+    image: grafana/grafana:latest
+    environment:
+      - GF_AUTH_ANONYMOUS_ENABLED=true
+      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
+    ports:
+      - "3000:3000"
+    volumes:
+      - ../grafana/provisioning:/etc/grafana/provisioning:ro
+      - ../grafana/dashboards:/var/lib/grafana/dashboards:ro
+    depends_on:
+      - jaeger
+      - tempo
+      - prometheus
+      - loki
+    networks:
+      - workload-net
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+volumes:
+  tempo-data:
+
+networks:
+  workload-net:
+    driver: bridge
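Before generating load, a driver script may want to wait until the backends in this compose file respond on their health endpoints. The sketch below polls the URLs used by the healthchecks above from the host side. The function names are hypothetical and the retry defaults simply mirror the compose values (10 retries, 5 s interval, 3 s timeout); it is not taken from run-full-validation.sh.

```python
import http.client
import time
import urllib.request

# Host-side health URLs, matching the published ports and healthcheck paths.
HEALTH_URLS = {
    "otel-collector": "http://localhost:13133/",
    "jaeger": "http://localhost:16686/",
    "prometheus": "http://localhost:9090/-/healthy",
    "loki": "http://localhost:3100/ready",
    "grafana": "http://localhost:3000/api/health",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP 4xx/5xx all count as "not ready".
        return False


def wait_until_healthy(urls: dict, probe=http_ok, attempts: int = 10,
                       delay: float = 5.0) -> list:
    """Poll until every service passes; return the names still failing."""
    pending = dict(urls)
    for attempt in range(attempts):
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending:
            return []
        if attempt < attempts - 1:
            time.sleep(delay)
    return sorted(pending)
```

Calling `wait_until_healthy(HEALTH_URLS)` returns an empty list once all five services answer, or the names of the services that never became ready. The `probe` parameter exists so the polling logic can be exercised without a running stack.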
--- a/docker/telemetry/workload/README.md
+++ b/docker/telemetry/workload/README.md
@@ -0,0 +1,197 @@
+# Telemetry Workload Tools
+
+Synthetic workload generation and validation tools for rippled's OpenTelemetry telemetry stack. These tools validate that all spans, metrics, dashboards, and log-trace correlation work end-to-end under controlled load.
+
+## Quick Start
+
+```bash
+# Build rippled with telemetry enabled
+conan install . --build=missing -o telemetry=True
+cmake --preset default -Dtelemetry=ON
+cmake --build --preset default
+
+# Run full validation (starts everything, runs load, validates)
+docker/telemetry/workload/run-full-validation.sh --xrpld .build/xrpld
+
+# Cleanup when done
+docker/telemetry/workload/run-full-validation.sh --cleanup
+```
+
+## Architecture
+
+```
+run-full-validation.sh (orchestrator)
+  |
+  |-- docker-compose.workload.yaml
+  |     |-- otel-collector (traces + StatsD)
+  |     |-- jaeger (trace search)
+  |     |-- tempo (trace storage)
+  |     |-- prometheus (metrics)
+  |     |-- loki (log aggregation)
+  |     |-- grafana (dashboards)
+  |
+  |-- generate-validator-keys.sh
+  |     -> validator-keys.json, validators.txt
+  |
+  |-- 5x xrpld nodes (local processes, full telemetry)
+  |
+  |-- rpc_load_generator.py (WebSocket RPC traffic)
+  |-- tx_submitter.py (transaction diversity)
+  |
+  |-- validate_telemetry.py (pass/fail checks)
+  |     -> validation-report.json
+  |
+  |-- benchmark.sh (baseline vs telemetry comparison)
+        -> benchmark-report-*.md
+```
+
+## Tools Reference
+
+### run-full-validation.sh
+
+Orchestrates the complete validation pipeline. Starts the telemetry stack, starts a multi-node rippled cluster, generates load, and validates the results.
+
+```bash
+# Full validation with defaults
+./run-full-validation.sh --xrpld /path/to/xrpld
+
+# Custom load parameters
+./run-full-validation.sh --xrpld /path/to/xrpld \
+    --rpc-rate 100 --rpc-duration 300 \
+    --tx-tps 10 --tx-duration 300
+
+# Include performance benchmarks
+./run-full-validation.sh --xrpld /path/to/xrpld --with-benchmark
+
+# Skip Loki checks (if Phase 8 not deployed)
+./run-full-validation.sh --xrpld /path/to/xrpld --skip-loki
+```
+
+### rpc_load_generator.py
+
+Generates RPC traffic matching a realistic production distribution:
+
+- 40% health checks (server_info, fee)
+- 30% wallet queries (account_info, account_lines, account_objects)
+- 15% explorer queries (ledger, ledger_data)
+- 10% transaction lookups (tx, account_tx)
+- 5% DEX queries (book_offers, amm_info)
+
+```bash
+# Basic usage
+python3 rpc_load_generator.py --endpoints ws://localhost:6006 --rate 50 --duration 120
+
+# Multiple endpoints (round-robin)
+python3 rpc_load_generator.py \
+    --endpoints ws://localhost:6006 ws://localhost:6007 \
+    --rate 100 --duration 300
+
+# Custom weights
+python3 rpc_load_generator.py --endpoints ws://localhost:6006 \
+    --weights '{"server_info": 80, "account_info": 20}'
+```
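The category split listed for rpc_load_generator.py can be pictured as a two-stage draw: pick a category by weight, then pick one of its commands. The sketch below is only an illustration of that idea. Choosing uniformly within a category is an assumption, the names `CATEGORIES` and `pick_command` are hypothetical, and the real tool's `--weights` option appears to take per-command weights instead.

```python
import random
from collections import Counter

# Category weights and commands as listed in the README.
CATEGORIES = {
    "health": (40, ["server_info", "fee"]),
    "wallet": (30, ["account_info", "account_lines", "account_objects"]),
    "explorer": (15, ["ledger", "ledger_data"]),
    "tx_lookup": (10, ["tx", "account_tx"]),
    "dex": (5, ["book_offers", "amm_info"]),
}


def pick_command(rng: random.Random) -> tuple[str, str]:
    """Draw a (category, command) pair according to the category weights."""
    names = list(CATEGORIES)
    weights = [CATEGORIES[name][0] for name in names]
    category = rng.choices(names, weights=weights, k=1)[0]
    # Assumption: commands inside a category are equally likely.
    command = rng.choice(CATEGORIES[category][1])
    return category, command


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs draw the same sequence
    counts = Counter(pick_command(rng)[0] for _ in range(10_000))
    for name in CATEGORIES:
        print(f"{name}: {counts[name] / 100:.1f}%")
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which helps when comparing span counts between validation runs.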
+
+### tx_submitter.py
+
+Submits diverse transaction types to exercise the full span and metric surface:
+
+- Payment (XRP transfers)
+- OfferCreate / OfferCancel (DEX activity)
+- TrustSet (trust line creation)
+- NFTokenMint / NFTokenCreateOffer (NFT activity)
+- EscrowCreate / EscrowFinish (escrow lifecycle)
+- AMMCreate / AMMDeposit (AMM pool operations)
+
+```bash
+# Basic usage
+python3 tx_submitter.py --endpoint ws://localhost:6006 --tps 5 --duration 120
+
+# Custom mix
+python3 tx_submitter.py --endpoint ws://localhost:6006 \
+    --weights '{"Payment": 60, "OfferCreate": 20, "TrustSet": 20}'
+```
+
+### validate_telemetry.py
+
+Automated validation that all expected telemetry data exists:
+
+- **Span validation**: All 17 span types with required attributes
+- **Metric validation**: SpanMetrics, StatsD, Phase 9 metrics
+- **Log-trace correlation**: trace_id/span_id in Loki logs
+- **Dashboard validation**: All 10 Grafana dashboards accessible
+
+```bash
+# Run all validations
+python3 validate_telemetry.py --report /tmp/report.json
+
+# Skip Loki checks
+python3 validate_telemetry.py --skip-loki --report /tmp/report.json
+```
+
+### benchmark.sh
+
+Compares baseline (no telemetry) vs telemetry-enabled performance:
+
+```bash
+./benchmark.sh --xrpld /path/to/xrpld --duration 300
+```
+
+Thresholds (configurable via environment):
+
+| Metric            | Threshold | Env Variable                |
+| ----------------- | --------- | --------------------------- |
+| CPU overhead      | < 3%      | BENCH_CPU_OVERHEAD_PCT      |
+| Memory overhead   | < 5MB     | BENCH_MEM_OVERHEAD_MB       |
+| RPC p99 latency   | < 2ms     | BENCH_RPC_LATENCY_IMPACT_MS |
+| Throughput impact | < 5%      | BENCH_TPS_IMPACT_PCT        |
+| Consensus impact  | < 1%      | BENCH_CONSENSUS_IMPACT_PCT  |
+
+## Reading Validation Reports
+
+The validation report (`validation-report.json`) is structured as:
+
+```json
+{
+  "summary": {
+    "total": 45,
+    "passed": 42,
+    "failed": 3,
+    "all_passed": false
+  },
+  "checks": [
+    {
+      "name": "span.rpc.request",
+      "category": "span",
+      "passed": true,
+      "message": "rpc.request: 15 traces found",
+      "details": { "trace_count": 15 }
+    }
+  ]
+}
+```
+
+Categories:
+
+- **span**: Span type existence and attribute validation
+- **metric**: Prometheus metric existence
+- **log**: Log-trace correlation checks
+- **dashboard**: Grafana dashboard accessibility
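A report with the structure above can be turned into a short Markdown summary. The sketch below mirrors what the CI workflow's summary step does with `jq`, using only the fields shown in the example report (`summary.total`, `summary.passed`, `summary.failed`, and each check's `name`, `passed`, and `message`). The function name `render_summary` and the sample failing check are invented for illustration.

```python
def render_summary(report: dict) -> str:
    """Render a validation report as a Markdown summary."""
    summary = report["summary"]
    lines = [
        "## Telemetry Validation Results",
        "",
        "| Metric | Value |",
        "|--------|-------|",
        f"| Total Checks | {summary['total']} |",
        f"| Passed | {summary['passed']} |",
        f"| Failed | {summary['failed']} |",
    ]
    # List only checks whose "passed" field is explicitly false.
    failed = [c for c in report.get("checks", []) if c.get("passed") is False]
    if failed:
        lines += ["", "### Failed Checks", ""]
        lines += [f"- **{c['name']}**: {c['message']}" for c in failed]
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample report; the second check is a made-up failure.
    sample = {
        "summary": {"total": 2, "passed": 1, "failed": 1, "all_passed": False},
        "checks": [
            {"name": "span.rpc.request", "category": "span", "passed": True,
             "message": "rpc.request: 15 traces found"},
            {"name": "dashboard.example", "category": "dashboard", "passed": False,
             "message": "example failure message"},
        ],
    }
    print(render_summary(sample))
```

To use it on a real run, load the file with `json.load` and pass the resulting dictionary to `render_summary`.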
+
+## CI Integration
+
+The validation runs as a GitHub Actions workflow (`.github/workflows/telemetry-validation.yml`):
+
+- Triggered manually or on pushes to telemetry branches
+- Builds rippled, starts the full stack, runs load, validates
+- Uploads reports as artifacts
+- Writes a results summary to the workflow run's job summary (`GITHUB_STEP_SUMMARY`)
+
+## Configuration Files
+
+| File                           | Purpose                                         |
+| ------------------------------ | ----------------------------------------------- |
+| `expected_spans.json`          | Span inventory (names, attributes, hierarchies) |
+| `expected_metrics.json`        | Metric inventory (SpanMetrics, StatsD, Phase 9) |
+| `test_accounts.json`           | Test account roles (keys generated at runtime)  |
+| `xrpld-validator.cfg.template` | Node config template with placeholders          |
+| `requirements.txt`             | Python dependencies                             |
--- a/docker/telemetry/workload/benchmark-results/.gitkeep
+++ b/docker/telemetry/workload/benchmark-results/.gitkeep
--- a/docker/telemetry/workload/benchmark.sh
+++ b/docker/telemetry/workload/benchmark.sh
@@ -0,0 +1,379 @@
+#!/usr/bin/env bash
+# benchmark.sh — Performance benchmark for rippled telemetry overhead.
+#
+# Runs two identical workloads against a rippled cluster:
+#   1. Baseline: telemetry disabled ([telemetry] enabled=0)
+#   2. Telemetry: full telemetry enabled (traces + StatsD + all categories)
+#
+# Compares CPU, memory, RPC latency, TPS, and consensus round time.
+# Outputs a Markdown table with pass/fail against configured thresholds.
+#
+# Usage:
+#   ./benchmark.sh --xrpld /path/to/xrpld --duration 300
+#
+# Thresholds (configurable via environment variables):
+#   BENCH_CPU_OVERHEAD_PCT=3       CPU overhead < 3%
+#   BENCH_MEM_OVERHEAD_MB=5        Memory overhead < 5MB
+#   BENCH_RPC_LATENCY_IMPACT_MS=2  RPC p99 latency impact < 2ms
+#   BENCH_TPS_IMPACT_PCT=5         Throughput impact < 5%
+#   BENCH_CONSENSUS_IMPACT_PCT=1   Consensus round time impact < 1%
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Colored output helpers
+# ---------------------------------------------------------------------------
+log()   { printf "\033[1;34m[BENCH]\033[0m  %s\n" "$*"; }
+ok()    { printf "\033[1;32m[BENCH]\033[0m  %s\n" "$*"; }
+warn()  { printf "\033[1;33m[BENCH]\033[0m  %s\n" "$*"; }
+fail()  { printf "\033[1;31m[BENCH]\033[0m  %s\n" "$*"; }
+die()   { printf "\033[1;31m[BENCH]\033[0m  %s\n" "$*" >&2; exit 1; }
+
+# ---------------------------------------------------------------------------
+# Defaults and thresholds
+# ---------------------------------------------------------------------------
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../../.." && pwd)"
+
+# Configurable thresholds via environment variables.
+CPU_THRESHOLD="${BENCH_CPU_OVERHEAD_PCT:-3}"
+MEM_THRESHOLD="${BENCH_MEM_OVERHEAD_MB:-5}"
+RPC_THRESHOLD="${BENCH_RPC_LATENCY_IMPACT_MS:-2}"
+TPS_THRESHOLD="${BENCH_TPS_IMPACT_PCT:-5}"
+CONSENSUS_THRESHOLD="${BENCH_CONSENSUS_IMPACT_PCT:-1}"
+
+XRPLD="${BENCH_XRPLD:-$REPO_ROOT/.build/xrpld}"
+DURATION=300
+NUM_NODES=3
+WORKDIR="/tmp/xrpld-benchmark"
+RESULTS_DIR="$SCRIPT_DIR/benchmark-results"
+RPC_PORT_BASE=5020
+PEER_PORT_BASE=51250
+
+# ---------------------------------------------------------------------------
+# Argument parsing
+# ---------------------------------------------------------------------------
+usage() {
+    echo "Usage: $0 [OPTIONS]"
+    echo ""
+    echo "Options:"
+    echo "  --xrpld PATH      Path to xrpld binary (default: \$REPO_ROOT/.build/xrpld)"
+    echo "  --duration SECS   Benchmark duration per run (default: 300)"
+    echo "  --nodes NUM       Number of validator nodes (default: 3)"
+    echo "  --output DIR      Results output directory"
+    echo "  -h, --help        Show this help"
+    exit 0
+}
+
+while [ $# -gt 0 ]; do
+    case "$1" in
+        --xrpld)   XRPLD="$2"; shift 2 ;;
+        --duration) DURATION="$2"; shift 2 ;;
+        --nodes)   NUM_NODES="$2"; shift 2 ;;
+        --output)  RESULTS_DIR="$2"; shift 2 ;;
+        -h|--help) usage ;;
+        *) die "Unknown option: $1" ;;
+    esac
+done
+
+# Validate prerequisites.
+[ -x "$XRPLD" ] || die "xrpld not found at $XRPLD"
+command -v jq >/dev/null 2>&1 || die "jq not found"
+command -v bc >/dev/null 2>&1 || die "bc not found"
+command -v curl >/dev/null 2>&1 || die "curl not found"
+
+mkdir -p "$RESULTS_DIR"
+TIMESTAMP=$(date +%Y%m%d_%H%M%S)
+
+# ---------------------------------------------------------------------------
+# Node cluster management
+# ---------------------------------------------------------------------------
+start_cluster() {
+    local telemetry_enabled="$1"
+    local label="$2"
+
+    log "Starting $NUM_NODES-node cluster ($label, telemetry=$telemetry_enabled)..."
+
+    rm -rf "$WORKDIR"
+    mkdir -p "$WORKDIR"
+
+    # Generate keys using first node.
+    bash "$SCRIPT_DIR/generate-validator-keys.sh" "$XRPLD" "$NUM_NODES" "$WORKDIR"
+
+    # Build per-node configs.
+    for i in $(seq 1 "$NUM_NODES"); do
+        local node_dir="$WORKDIR/node$i"
+        mkdir -p "$node_dir/nudb" "$node_dir/db"
+
+        local rpc_port=$((RPC_PORT_BASE + i - 1))
+        local peer_port=$((PEER_PORT_BASE + i - 1))
+        local seed
+        seed=$(jq -r ".[$((i-1))].seed" "$WORKDIR/validator-keys.json")
+
+        # Build ips_fixed list.
+        local ips_fixed=""
+        for j in $(seq 1 "$NUM_NODES"); do
+            if [ "$j" -ne "$i" ]; then
+                ips_fixed="${ips_fixed}127.0.0.1 $((PEER_PORT_BASE + j - 1))
+"
+            fi
+        done
+
+        # Build telemetry section.
+        local telemetry_section=""
+        if [ "$telemetry_enabled" = "1" ]; then
+            telemetry_section="
+[telemetry]
+enabled=1
+service_instance_id=bench-node-${i}
+endpoint=http://localhost:4318/v1/traces
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=2000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=1
+trace_ledger=1
+
+[insight]
+server=statsd
+address=127.0.0.1:8125
+prefix=rippled"
+        else
+            telemetry_section="
+[telemetry]
+enabled=0"
+        fi
+
+        cat > "$node_dir/xrpld.cfg" <<EOCFG
+[server]
+port_rpc
+port_peer
+
+[port_rpc]
+port = $rpc_port
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[port_peer]
+port = $peer_port
+ip = 0.0.0.0
+protocol = peer
+
+[node_db]
+type=NuDB
+path=$node_dir/nudb
+online_delete=256
+
+[database_path]
+$node_dir/db
+
+[debug_logfile]
+$node_dir/debug.log
+
+[validation_seed]
+$seed
+
+[validators_file]
+$WORKDIR/validators.txt
+
+[ips_fixed]
+${ips_fixed}
+[peer_private]
+1
+${telemetry_section}
+
+[rpc_startup]
+{ "command": "log_level", "severity": "warning" }
+
+[ssl_verify]
+0
+EOCFG
+
+        "$XRPLD" --conf "$node_dir/xrpld.cfg" --start > "$node_dir/stdout.log" 2>&1 &
+        echo $! > "$node_dir/xrpld.pid"
+    done
+
+    # Wait for consensus.
+    log "Waiting for consensus..."
+    for attempt in $(seq 1 120); do
+        local ready=0
+        for i in $(seq 1 "$NUM_NODES"); do
+            local port=$((RPC_PORT_BASE + i - 1))
+            local state
+            state=$(curl -sf "http://localhost:$port" \
+                -d '{"method":"server_info"}' 2>/dev/null \
+                | jq -r '.result.info.server_state' 2>/dev/null || echo "")
+            if [ "$state" = "proposing" ]; then
+                ready=$((ready + 1))
+            fi
+        done
+        if [ "$ready" -ge "$NUM_NODES" ]; then
+            ok "All $NUM_NODES nodes proposing (attempt $attempt)"
+            break
+        fi
+        if [ "$attempt" -eq 120 ]; then
+            warn "Consensus timeout — $ready/$NUM_NODES nodes ready"
+        fi
+        sleep 1
+    done
+
+    # Let the cluster stabilize.
+    sleep 5
+}
+
+stop_cluster() {
+    log "Stopping cluster..."
+    for i in $(seq 1 "$NUM_NODES"); do
+        local pidfile="$WORKDIR/node$i/xrpld.pid"
+        if [ -f "$pidfile" ]; then
+            kill "$(cat "$pidfile")" 2>/dev/null || true
+        fi
+    done
+    pkill -f "$WORKDIR" 2>/dev/null || true
+    sleep 3
+}
+
+# Build RPC ports CSV string.
+rpc_ports_csv() {
+    local ports=""
+    for i in $(seq 1 "$NUM_NODES"); do
+        [ -n "$ports" ] && ports="$ports,"
+        ports="$ports$((RPC_PORT_BASE + i - 1))"
+    done
+    echo "$ports"
+}
+
+# ---------------------------------------------------------------------------
+# Run benchmark
+# ---------------------------------------------------------------------------
+log "="
+log "  rippled Telemetry Performance Benchmark"
+log "  Nodes: $NUM_NODES | Duration: ${DURATION}s | Binary: $XRPLD"
+log "="
+
+# --- Baseline run ---
+BASELINE_FILE="$RESULTS_DIR/baseline-${TIMESTAMP}.json"
+start_cluster "0" "baseline"
+bash "$SCRIPT_DIR/collect_system_metrics.sh" "$(rpc_ports_csv)" "$DURATION" "$BASELINE_FILE"
+stop_cluster
+
+# --- Telemetry run ---
+TELEMETRY_FILE="$RESULTS_DIR/telemetry-${TIMESTAMP}.json"
+start_cluster "1" "telemetry"
+bash "$SCRIPT_DIR/collect_system_metrics.sh" "$(rpc_ports_csv)" "$DURATION" "$TELEMETRY_FILE"
+stop_cluster
+
+# ---------------------------------------------------------------------------
+# Compare results
+# ---------------------------------------------------------------------------
+log "Comparing results..."
+
+read_metric() {
+    local file="$1"
+    local key="$2"
+    jq -r ".$key // 0" "$file"
+}
+
+BASE_CPU=$(read_metric "$BASELINE_FILE" "cpu_pct_avg")
+TELE_CPU=$(read_metric "$TELEMETRY_FILE" "cpu_pct_avg")
+CPU_DELTA=$(echo "scale=2; $TELE_CPU - $BASE_CPU" | bc 2>/dev/null || echo "0")
+
+BASE_MEM=$(read_metric "$BASELINE_FILE" "memory_rss_mb_peak")
+TELE_MEM=$(read_metric "$TELEMETRY_FILE" "memory_rss_mb_peak")
+MEM_DELTA=$(echo "scale=2; $TELE_MEM - $BASE_MEM" | bc 2>/dev/null || echo "0")
+
+BASE_RPC=$(read_metric "$BASELINE_FILE" "rpc_p99_ms")
+TELE_RPC=$(read_metric "$TELEMETRY_FILE" "rpc_p99_ms")
+RPC_DELTA=$(echo "scale=2; $TELE_RPC - $BASE_RPC" | bc 2>/dev/null || echo "0")
+
+BASE_TPS=$(read_metric "$BASELINE_FILE" "tps")
+TELE_TPS=$(read_metric "$TELEMETRY_FILE" "tps")
+if [ "$(echo "$BASE_TPS > 0" | bc 2>/dev/null)" = "1" ]; then
+    TPS_IMPACT=$(echo "scale=2; ($BASE_TPS - $TELE_TPS) * 100 / $BASE_TPS" | bc 2>/dev/null || echo "0")
+else
+    TPS_IMPACT="0"
+fi
+
+BASE_CONS=$(read_metric "$BASELINE_FILE" "consensus_round_p95_ms")
+TELE_CONS=$(read_metric "$TELEMETRY_FILE" "consensus_round_p95_ms")
+if [ "$(echo "$BASE_CONS > 0" | bc 2>/dev/null)" = "1" ]; then
+    CONS_IMPACT=$(echo "scale=2; ($TELE_CONS - $BASE_CONS) * 100 / $BASE_CONS" | bc 2>/dev/null || echo "0")
+else
+    CONS_IMPACT="0"
+fi
+
+# ---------------------------------------------------------------------------
+# Pass/fail checks
+# ---------------------------------------------------------------------------
+PASS_COUNT=0
+FAIL_COUNT=0
+
+check_threshold() {
+    local name="$1"
+    local actual="$2"
+    local threshold="$3"
+    local unit="$4"
+
+    # Sets CHECK_RESULT instead of echoing: a $(...) call would run in a subshell and drop the counter updates.
+    if [ "$(echo "$actual <= $threshold" | bc 2>/dev/null)" = "1" ]; then
+        ok "$name: ${actual}${unit} <= ${threshold}${unit} PASS"
+        PASS_COUNT=$((PASS_COUNT + 1))
+        CHECK_RESULT="PASS"
+    else
+        fail "$name: ${actual}${unit} > ${threshold}${unit} FAIL"
+        FAIL_COUNT=$((FAIL_COUNT + 1))
+        CHECK_RESULT="FAIL"
+    fi
+}
+
+check_threshold "CPU overhead" "$CPU_DELTA" "$CPU_THRESHOLD" "%"; CPU_RESULT="$CHECK_RESULT"
+check_threshold "Memory overhead" "$MEM_DELTA" "$MEM_THRESHOLD" "MB"; MEM_RESULT="$CHECK_RESULT"
+check_threshold "RPC p99 impact" "$RPC_DELTA" "$RPC_THRESHOLD" "ms"; RPC_RESULT="$CHECK_RESULT"
+check_threshold "TPS impact" "$TPS_IMPACT" "$TPS_THRESHOLD" "%"; TPS_RESULT="$CHECK_RESULT"
+check_threshold "Consensus impact" "$CONS_IMPACT" "$CONSENSUS_THRESHOLD" "%"; CONS_RESULT="$CHECK_RESULT"
+
+# ---------------------------------------------------------------------------
+# Output Markdown table
+# ---------------------------------------------------------------------------
+REPORT_FILE="$RESULTS_DIR/benchmark-report-${TIMESTAMP}.md"
+
+cat > "$REPORT_FILE" <<EOMD
+# Telemetry Performance Benchmark Report
+
+**Date**: $(date -u +"%Y-%m-%d %H:%M:%S UTC")
+**Nodes**: $NUM_NODES | **Duration**: ${DURATION}s per run
+**Binary**: $XRPLD
+
+## Results
+
+| Metric | Baseline | Telemetry | Delta | Threshold | Result |
+|--------|----------|-----------|-------|-----------|--------|
+| CPU (avg %) | ${BASE_CPU}% | ${TELE_CPU}% | ${CPU_DELTA}% | <= ${CPU_THRESHOLD}% | ${CPU_RESULT} |
+| Memory RSS (peak MB) | ${BASE_MEM} MB | ${TELE_MEM} MB | ${MEM_DELTA} MB | <= ${MEM_THRESHOLD} MB | ${MEM_RESULT} |
+| RPC p99 Latency (ms) | ${BASE_RPC} ms | ${TELE_RPC} ms | ${RPC_DELTA} ms | <= ${RPC_THRESHOLD} ms | ${RPC_RESULT} |
+| Throughput (TPS) | ${BASE_TPS} | ${TELE_TPS} | ${TPS_IMPACT}% | <= ${TPS_THRESHOLD}% | ${TPS_RESULT} |
+| Consensus Round p95 (ms) | ${BASE_CONS} ms | ${TELE_CONS} ms | ${CONS_IMPACT}% | <= ${CONSENSUS_THRESHOLD}% | ${CONS_RESULT} |
+
+## Summary
+
+- **Passed**: $PASS_COUNT / $((PASS_COUNT + FAIL_COUNT))
+- **Failed**: $FAIL_COUNT / $((PASS_COUNT + FAIL_COUNT))
+
+## Raw Data
+
+- Baseline: \`$(basename "$BASELINE_FILE")\`
+- Telemetry: \`$(basename "$TELEMETRY_FILE")\`
+EOMD
+
+ok "Benchmark report written to $REPORT_FILE"
+cat "$REPORT_FILE"
+
+# Exit with failure if any check failed.
+if [ "$FAIL_COUNT" -gt 0 ]; then
+    exit 1
+fi
--- a/docker/telemetry/workload/collect_system_metrics.sh
+++ b/docker/telemetry/workload/collect_system_metrics.sh
@@ -0,0 +1,233 @@
+#!/usr/bin/env bash
+# collect_system_metrics.sh — Collect CPU, memory, and RPC latency metrics
+# from running xrpld nodes for benchmark comparison.
+#
+# Samples system metrics at regular intervals and writes a JSON summary.
+# Used by benchmark.sh for baseline vs telemetry comparison.
+#
+# Usage:
+#   ./collect_system_metrics.sh <rpc_ports_csv> <duration_seconds> <output_file>
+#
+# Example:
+#   ./collect_system_metrics.sh "5005,5006,5007" 300 /tmp/metrics-baseline.json
+#
+# Output JSON format (abridged; the file also records duration, node count, and ledger sequences):
+#   {
+#     "cpu_pct_avg": 12.5,
+#     "memory_rss_mb_peak": 450.2,
+#     "rpc_p99_ms": 15.3,
+#     "tps": 4.8,
+#     "consensus_round_p95_ms": 3200,
+#     "samples": 60
+#   }
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Colored output helpers
+# ---------------------------------------------------------------------------
+log()  { printf "\033[1;34m[METRICS]\033[0m %s\n" "$*"; }
+ok()   { printf "\033[1;32m[METRICS]\033[0m %s\n" "$*"; }
+die()  { printf "\033[1;31m[METRICS]\033[0m %s\n" "$*" >&2; exit 1; }
+
+# ---------------------------------------------------------------------------
+# Argument parsing
+# ---------------------------------------------------------------------------
+usage() {
+    echo "Usage: $0 <rpc_ports_csv> <duration_seconds> <output_file>"
+    echo ""
+    echo "Arguments:"
+    echo "  rpc_ports_csv     Comma-separated RPC ports (e.g., 5005,5006,5007)"
+    echo "  duration_seconds  How long to collect metrics"
+    echo "  output_file       Path to write JSON results"
+    exit 1
+}
+
+if [ $# -lt 3 ]; then
+    usage
+fi
+
+RPC_PORTS_CSV="$1"
+DURATION="$2"
+OUTPUT_FILE="$3"
+
+IFS=',' read -ra RPC_PORTS <<< "$RPC_PORTS_CSV"
+SAMPLE_INTERVAL=5
+SAMPLES=$((DURATION / SAMPLE_INTERVAL))
+
+log "Collecting metrics for ${DURATION}s (${SAMPLES} samples, ${#RPC_PORTS[@]} nodes)..."
+
+# ---------------------------------------------------------------------------
+# Temporary files for aggregation
+# ---------------------------------------------------------------------------
+TMPDIR_METRICS="$(mktemp -d)"
+CPU_FILE="$TMPDIR_METRICS/cpu.txt"
+MEM_FILE="$TMPDIR_METRICS/mem.txt"
+RPC_FILE="$TMPDIR_METRICS/rpc.txt"
+LEDGER_FILE="$TMPDIR_METRICS/ledger.txt"
+
+touch "$CPU_FILE" "$MEM_FILE" "$RPC_FILE" "$LEDGER_FILE"
+
+cleanup() {
+    rm -rf "$TMPDIR_METRICS"
+}
+trap cleanup EXIT
+
+# ---------------------------------------------------------------------------
+# Get initial ledger sequence for TPS calculation
+# ---------------------------------------------------------------------------
+INITIAL_SEQ=0
+INITIAL_TIME=$(date +%s)
+for port in "${RPC_PORTS[@]}"; do
+    seq=$(curl -sf "http://localhost:$port" \
+        -d '{"method":"server_info"}' 2>/dev/null \
+        | jq -r '.result.info.validated_ledger.seq // 0' 2>/dev/null || echo 0)
+    if [ "$seq" -gt "$INITIAL_SEQ" ]; then
+        INITIAL_SEQ=$seq
+    fi
+done
+log "Initial validated ledger seq: $INITIAL_SEQ"
+
+# ---------------------------------------------------------------------------
+# Sampling loop
+# ---------------------------------------------------------------------------
+for sample in $(seq 1 "$SAMPLES"); do
+    # Collect CPU usage for xrpld processes.
+    # Uses ps to find all xrpld processes and averages their CPU%, skipping any that report 0.0.
+    cpu_sum=0
+    cpu_count=0
+    while IFS= read -r line; do
+        cpu_val=$(echo "$line" | awk '{print $1}')
+        if [ -n "$cpu_val" ] && [ "$cpu_val" != "0.0" ]; then
+            cpu_sum=$(echo "$cpu_sum + $cpu_val" | bc 2>/dev/null || echo "$cpu_sum")
+            cpu_count=$((cpu_count + 1))
+        fi
+    done < <(ps aux 2>/dev/null | grep '[x]rpld' | awk '{print $3}')
+
+    if [ "$cpu_count" -gt 0 ]; then
+        cpu_avg=$(echo "scale=2; $cpu_sum / $cpu_count" | bc 2>/dev/null || echo "0")
+        echo "$cpu_avg" >> "$CPU_FILE"
+    fi
+
+    # Collect memory RSS for xrpld processes.
+    while IFS= read -r line; do
+        rss_kb=$(echo "$line" | awk '{print $1}')
+        if [ -n "$rss_kb" ] && [ "$rss_kb" != "0" ]; then
+            rss_mb=$(echo "scale=2; $rss_kb / 1024" | bc 2>/dev/null || echo "0")
+            echo "$rss_mb" >> "$MEM_FILE"
+        fi
+    done < <(ps aux 2>/dev/null | grep '[x]rpld' | awk '{print $6}')
+
+    # Collect RPC latency from each node; failed requests are skipped so fast failures do not skew p99.
+    for port in "${RPC_PORTS[@]}"; do
+        start_ns=$(date +%s%N)
+        if curl -sf "http://localhost:$port" \
+            -d '{"method":"server_info"}' > /dev/null 2>&1; then
+            end_ns=$(date +%s%N)
+            echo "$(( (end_ns - start_ns) / 1000000 ))" >> "$RPC_FILE"
+        fi
+    done
+
+    # Record current validated ledger seq.
+    for port in "${RPC_PORTS[@]}"; do
+        seq=$(curl -sf "http://localhost:$port" \
+            -d '{"method":"server_info"}' 2>/dev/null \
+            | jq -r '.result.info.validated_ledger.seq // 0' 2>/dev/null || echo 0)
+        echo "$seq" >> "$LEDGER_FILE"
+        break  # Only need one node's seq per sample.
+    done
+
+    # Progress indicator.
+    if [ $((sample % 10)) -eq 0 ]; then
+        log "  Sample $sample/$SAMPLES..."
+    fi
+
+    sleep "$SAMPLE_INTERVAL"
+done
+
+# ---------------------------------------------------------------------------
+# Compute aggregated metrics
+# ---------------------------------------------------------------------------
+log "Computing aggregated metrics..."
+
+# CPU average.
+if [ -s "$CPU_FILE" ]; then
+    CPU_AVG=$(awk '{ sum += $1; n++ } END { if (n>0) printf "%.2f", sum/n; else print "0" }' "$CPU_FILE")
+else
+    CPU_AVG="0"
+fi
+
+# Memory peak RSS (MB).
+if [ -s "$MEM_FILE" ]; then
+    MEM_PEAK=$(sort -n "$MEM_FILE" | tail -1)
+else
+    MEM_PEAK="0"
+fi
+
+# RPC latency p99 (ms), nearest-rank: index = ceil(0.99 * N), so it is never 0.
+if [ -s "$RPC_FILE" ]; then
+    RPC_COUNT=$(wc -l < "$RPC_FILE")
+    P99_INDEX=$(( (RPC_COUNT * 99 + 99) / 100 ))
+    RPC_P99=$(sort -n "$RPC_FILE" | sed -n "${P99_INDEX}p")
+    [ -z "$RPC_P99" ] && RPC_P99="0"
+else
+    RPC_P99="0"
+fi
+
+# TPS calculation from ledger sequence advancement.
+FINAL_SEQ=0
+for port in "${RPC_PORTS[@]}"; do
+    seq=$(curl -sf "http://localhost:$port" \
+        -d '{"method":"server_info"}' 2>/dev/null \
+        | jq -r '.result.info.validated_ledger.seq // 0' 2>/dev/null || echo 0)
+    if [ "$seq" -gt "$FINAL_SEQ" ]; then
+        FINAL_SEQ=$seq
+    fi
+done
+FINAL_TIME=$(date +%s)
+ELAPSED=$((FINAL_TIME - INITIAL_TIME))
+LEDGER_ADVANCE=$((FINAL_SEQ - INITIAL_SEQ))
+if [ "$ELAPSED" -gt 0 ] && [ "$LEDGER_ADVANCE" -gt 0 ]; then
+    # No transaction count is available here, so the ledger close rate (ledgers/s)
+    # stands in for TPS. awk is used because bc prints ".25", which is invalid JSON.
+    TPS=$(awk -v n="$LEDGER_ADVANCE" -v t="$ELAPSED" 'BEGIN { printf "%.2f", n / t }')
+else
+    TPS="0"
+fi
+
+# Consensus round time (from ledger close interval).
+# This is the mean close interval over the run, not a true p95; the key name is
+# kept because benchmark.sh reads consensus_round_p95_ms.
+if [ -s "$LEDGER_FILE" ]; then
+    # Counting distinct sampled sequences undercounts closed ledgers whenever more
+    # than one closes between samples, so use the measured sequence advance.
+    # ELAPSED and LEDGER_ADVANCE come from the TPS calculation above.
+    if [ "$ELAPSED" -gt 0 ] && [ "$LEDGER_ADVANCE" -gt 0 ]; then
+        CONSENSUS_P95=$(( ELAPSED * 1000 / LEDGER_ADVANCE ))
+    else
+        CONSENSUS_P95="0"
+    fi
+else
+    CONSENSUS_P95="0"
+fi
+
+# ---------------------------------------------------------------------------
+# Write output JSON
+# ---------------------------------------------------------------------------
+cat > "$OUTPUT_FILE" <<EOJSON
+{
+  "cpu_pct_avg": $CPU_AVG,
+  "memory_rss_mb_peak": $MEM_PEAK,
+  "rpc_p99_ms": $RPC_P99,
+  "tps": $TPS,
+  "consensus_round_p95_ms": $CONSENSUS_P95,
+  "samples": $SAMPLES,
+  "duration_seconds": $DURATION,
+  "node_count": ${#RPC_PORTS[@]},
+  "initial_ledger_seq": $INITIAL_SEQ,
+  "final_ledger_seq": $FINAL_SEQ
+}
+EOJSON
+
+ok "Metrics written to $OUTPUT_FILE"
+cat "$OUTPUT_FILE"
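
Review note on collect_system_metrics.sh: the p99 step picks a line out of the sorted latency file by index. A nearest-rank percentile, sketched below in Python, shows the index arithmetic in isolation. The function name `nearest_rank` is hypothetical, and integer arithmetic is used on purpose so the rank does not depend on floating-point rounding.

```python
def nearest_rank(values, pct: int):
    """Nearest-rank percentile: the value at 1-based rank ceil(pct/100 * N).

    `pct` is an integer percentage (for example 99). Returns 0 for empty
    input, matching the script's fallback when no samples were recorded.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return 0
    # Integer ceiling of pct * n / 100; at least 1 for any pct >= 1.
    rank = (pct * n + 99) // 100
    rank = min(max(rank, 1), n)
    return ordered[rank - 1]
```

With a single sample the rank is 1, so the only value is returned instead of indexing position 0.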
--- a/docker/telemetry/workload/expected_metrics.json
+++ b/docker/telemetry/workload/expected_metrics.json
@@ -0,0 +1,101 @@
+{
+  "description": "Expected metric inventory for rippled telemetry validation. Sourced from 09-data-collection-reference.md.",
+  "spanmetrics": {
+    "description": "SpanMetrics-derived RED metrics from the OTel Collector spanmetrics connector.",
+    "metrics": [
+      "traces_span_metrics_calls_total",
+      "traces_span_metrics_duration_milliseconds_bucket",
+      "traces_span_metrics_duration_milliseconds_count",
+      "traces_span_metrics_duration_milliseconds_sum"
+    ],
+    "required_labels": [
+      "span_name",
+      "status_code",
+      "service_name",
+      "span_kind"
+    ],
+    "dimension_labels": [
+      "xrpl_rpc_command",
+      "xrpl_rpc_status",
+      "xrpl_consensus_mode",
+      "xrpl_tx_local",
+      "xrpl_peer_proposal_trusted",
+      "xrpl_peer_validation_trusted"
+    ]
+  },
+  "statsd_gauges": {
+    "description": "beast::insight gauges emitted via StatsD UDP.",
+    "metrics": [
+      "rippled_LedgerMaster_Validated_Ledger_Age",
+      "rippled_LedgerMaster_Published_Ledger_Age",
+      "rippled_State_Accounting_Full_duration",
+      "rippled_Peer_Finder_Active_Inbound_Peers",
+      "rippled_Peer_Finder_Active_Outbound_Peers",
+      "rippled_job_count"
+    ]
+  },
+  "statsd_counters": {
+    "description": "beast::insight counters emitted via StatsD UDP.",
+    "metrics": ["rippled_rpc_requests", "rippled_ledger_fetches"]
+  },
+  "statsd_histograms": {
+    "description": "beast::insight timers/histograms emitted via StatsD UDP.",
+    "metrics": ["rippled_rpc_time", "rippled_rpc_size", "rippled_ios_latency"]
+  },
+  "overlay_traffic": {
+    "description": "Overlay traffic metrics (subset — full list has 45+ categories).",
+    "metrics": [
+      "rippled_total_Bytes_In",
+      "rippled_total_Bytes_Out",
+      "rippled_total_Messages_In",
+      "rippled_total_Messages_Out"
+    ]
+  },
+  "phase9_nodestore": {
+    "description": "Phase 9 NodeStore I/O metrics (via beast::insight extensions).",
+    "metrics": [
+      "rippled_nodestore_reads_total",
+      "rippled_nodestore_writes",
+      "rippled_nodestore_read_bytes",
+      "rippled_nodestore_written_bytes"
+    ]
+  },
+  "phase9_cache": {
+    "description": "Phase 9 cache hit rate metrics (via OTel MetricsRegistry).",
+    "metrics": ["rippled_cache_SLE_hit_rate", "rippled_cache_treenode_size"]
+  },
+  "phase9_txq": {
+    "description": "Phase 9 transaction queue metrics (via OTel MetricsRegistry).",
+    "metrics": ["rippled_txq_count", "rippled_txq_max_size"]
+  },
+  "phase9_rpc_method": {
+    "description": "Phase 9 per-RPC-method metrics (via OTel Metrics SDK).",
+    "metrics": [
+      "rippled_rpc_method_started_total",
+      "rippled_rpc_method_finished_total"
+    ]
+  },
+  "phase9_objects": {
+    "description": "Phase 9 counted object instances.",
+    "metrics": ["rippled_object_count"]
+  },
+  "phase9_load": {
+    "description": "Phase 9 fee escalation and load factor metrics.",
+    "metrics": ["rippled_load_factor"]
+  },
+  "grafana_dashboards": {
+    "description": "All 10 Grafana dashboards that must render data.",
+    "uids": [
+      "rippled-rpc-perf",
+      "rippled-transactions",
+      "rippled-consensus",
+      "rippled-ledger-ops",
+      "rippled-peer-net",
+      "rippled-statsd-node-health",
+      "rippled-statsd-network",
+      "rippled-statsd-rpc",
+      "rippled-statsd-overlay-detail",
+      "rippled-statsd-ledger-sync"
+    ]
+  }
+}
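
Review note on expected_metrics.json: a validator needs the flat set of metric names spread across the sections of this file. The sketch below is one possible way to collect them and report what is missing; it is not taken from validate_telemetry.py, and the function names are hypothetical. It assumes only the layout visible above, where each section is an object that may carry a `metrics` array.

```python
def expected_metric_names(inventory: dict) -> set[str]:
    """Collect every name listed under a section's "metrics" array."""
    names: set[str] = set()
    for section in inventory.values():
        # The top-level "description" is a string; sections are objects.
        if isinstance(section, dict):
            names.update(section.get("metrics", []))
    return names


def missing_metrics(inventory: dict, observed) -> list[str]:
    """Return expected metric names that are absent from `observed`, sorted."""
    return sorted(expected_metric_names(inventory) - set(observed))
```

Sections without a `metrics` array, such as `grafana_dashboards`, contribute nothing, so dashboards would need a separate check.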
--- a/docker/telemetry/workload/expected_spans.json
+++ b/docker/telemetry/workload/expected_spans.json
@@ -0,0 +1,172 @@
+{
+  "description": "Expected span inventory for rippled telemetry validation. Sourced from 09-data-collection-reference.md.",
+  "spans": [
+    {
+      "name": "rpc.request",
+      "category": "rpc",
+      "parent": null,
+      "required_attributes": [],
+      "config_flag": "trace_rpc"
+    },
+    {
+      "name": "rpc.process",
+      "category": "rpc",
+      "parent": "rpc.request",
+      "required_attributes": [],
+      "config_flag": "trace_rpc"
+    },
+    {
+      "name": "rpc.ws_message",
+      "category": "rpc",
+      "parent": null,
+      "required_attributes": [],
+      "config_flag": "trace_rpc"
+    },
+    {
+      "name": "rpc.command.*",
+      "category": "rpc",
+      "parent": "rpc.process",
+      "required_attributes": [
+        "xrpl.rpc.command",
+        "xrpl.rpc.version",
+        "xrpl.rpc.role",
+        "xrpl.rpc.status",
+        "xrpl.rpc.duration_ms"
+      ],
+      "config_flag": "trace_rpc",
+      "note": "Wildcard — matches rpc.command.server_info, rpc.command.ledger, etc."
+    },
+    {
+      "name": "tx.process",
+      "category": "transaction",
+      "parent": null,
+      "required_attributes": ["xrpl.tx.hash", "xrpl.tx.local", "xrpl.tx.path"],
+      "config_flag": "trace_transactions"
+    },
+    {
+      "name": "tx.receive",
+      "category": "transaction",
+      "parent": null,
+      "required_attributes": [
+        "xrpl.peer.id",
+        "xrpl.tx.hash",
+        "xrpl.tx.suppressed",
+        "xrpl.tx.status"
+      ],
+      "config_flag": "trace_transactions"
+    },
+    {
+      "name": "tx.apply",
+      "category": "transaction",
+      "parent": "ledger.build",
+      "required_attributes": [
+        "xrpl.ledger.seq",
+        "xrpl.ledger.tx_count",
+        "xrpl.ledger.tx_failed"
+      ],
+      "config_flag": "trace_transactions"
+    },
+    {
+      "name": "consensus.proposal.send",
+      "category": "consensus",
+      "parent": null,
+      "required_attributes": ["xrpl.consensus.round"],
+      "config_flag": "trace_consensus"
+    },
+    {
+      "name": "consensus.ledger_close",
+      "category": "consensus",
+      "parent": null,
+      "required_attributes": [
+        "xrpl.consensus.ledger.seq",
+        "xrpl.consensus.mode"
+      ],
+      "config_flag": "trace_consensus"
+    },
+    {
+      "name": "consensus.accept",
+      "category": "consensus",
+      "parent": null,
+      "required_attributes": ["xrpl.consensus.proposers"],
+      "config_flag": "trace_consensus"
+    },
+    {
+      "name": "consensus.validation.send",
+      "category": "consensus",
+      "parent": null,
+      "required_attributes": [
+        "xrpl.consensus.ledger.seq",
+        "xrpl.consensus.proposing"
+      ],
+      "config_flag": "trace_consensus"
+    },
+    {
+      "name": "consensus.accept.apply",
+      "category": "consensus",
+      "parent": null,
+      "required_attributes": [
+        "xrpl.consensus.close_time",
+        "xrpl.consensus.ledger.seq"
+      ],
+      "config_flag": "trace_consensus"
+    },
+    {
+      "name": "ledger.build",
+      "category": "ledger",
+      "parent": null,
+      "required_attributes": [
+        "xrpl.ledger.seq",
+        "xrpl.ledger.tx_count",
+        "xrpl.ledger.tx_failed"
+      ],
+      "config_flag": "trace_ledger"
+    },
+    {
+      "name": "ledger.validate",
+      "category": "ledger",
+      "parent": null,
+      "required_attributes": ["xrpl.ledger.seq", "xrpl.ledger.validations"],
+      "config_flag": "trace_ledger"
+    },
+    {
+      "name": "ledger.store",
+      "category": "ledger",
+      "parent": null,
+      "required_attributes": ["xrpl.ledger.seq"],
+      "config_flag": "trace_ledger"
+    },
+    {
+      "name": "peer.proposal.receive",
+      "category": "peer",
+      "parent": null,
+      "required_attributes": ["xrpl.peer.id", "xrpl.peer.proposal.trusted"],
+      "config_flag": "trace_peer"
+    },
+    {
+      "name": "peer.validation.receive",
+      "category": "peer",
+      "parent": null,
+      "required_attributes": ["xrpl.peer.id", "xrpl.peer.validation.trusted"],
+      "config_flag": "trace_peer"
+    }
+  ],
+  "parent_child_relationships": [
+    {
+      "parent": "rpc.request",
+      "child": "rpc.process",
+      "description": "RPC request contains processing span"
+    },
+    {
+      "parent": "rpc.process",
+      "child": "rpc.command.*",
+      "description": "Processing span contains per-command span"
+    },
+    {
+      "parent": "ledger.build",
+      "child": "tx.apply",
+      "description": "Ledger build contains transaction application"
+    }
+  ],
+  "total_span_types": 17,
+  "total_unique_attributes": 22
+}
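
Review note on expected_spans.json: the file stores `total_span_types` and `total_unique_attributes` alongside the lists they summarise, so the two can drift apart. The sketch below recomputes both totals and flags relationships that name an undeclared span. The function names are hypothetical and nothing here is taken from validate_telemetry.py.

```python
def inventory_totals(inventory: dict) -> tuple[int, int]:
    """Return (span type count, unique required-attribute count)."""
    spans = inventory.get("spans", [])
    attributes = {
        attr for span in spans for attr in span.get("required_attributes", [])
    }
    return len(spans), len(attributes)


def total_mismatches(inventory: dict) -> list[str]:
    """Compare the recomputed totals with the declared ones."""
    span_count, attr_count = inventory_totals(inventory)
    problems = []
    if inventory.get("total_span_types") != span_count:
        problems.append(f"total_span_types: declared {inventory.get('total_span_types')}, found {span_count}")
    if inventory.get("total_unique_attributes") != attr_count:
        problems.append(f"total_unique_attributes: declared {inventory.get('total_unique_attributes')}, found {attr_count}")
    return problems


def dangling_relationships(inventory: dict) -> list[tuple[str, str]]:
    """Return (parent, child) pairs that reference a span name not in "spans"."""
    names = {span["name"] for span in inventory.get("spans", [])}
    return [
        (rel["parent"], rel["child"])
        for rel in inventory.get("parent_child_relationships", [])
        if rel["parent"] not in names or rel["child"] not in names
    ]
```

Running `total_mismatches` against the loaded file before a validation run would surface a stale count early.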
--- a/docker/telemetry/workload/generate-validator-keys.sh
+++ b/docker/telemetry/workload/generate-validator-keys.sh
@@ -0,0 +1,150 @@
+#!/usr/bin/env bash
+# generate-validator-keys.sh — Generate validator key pairs for the workload harness.
+#
+# Uses a temporary standalone xrpld instance to call `validation_create` RPC
+# for each node. Outputs a JSON file mapping node index to seed + public key.
+#
+# Usage:
+#   ./generate-validator-keys.sh <xrpld_binary> <num_nodes> <output_dir>
+#
+# Output:
+#   <output_dir>/validator-keys.json — JSON array of {index, seed, public_key}
+#   <output_dir>/validators.txt     — [validators] section for xrpld.cfg
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Colored output helpers
+# ---------------------------------------------------------------------------
+log()  { printf "\033[1;34m[KEYGEN]\033[0m %s\n" "$*"; }
+ok()   { printf "\033[1;32m[KEYGEN]\033[0m %s\n" "$*"; }
+die()  { printf "\033[1;31m[KEYGEN]\033[0m %s\n" "$*" >&2; exit 1; }
+
+# ---------------------------------------------------------------------------
+# Argument parsing
+# ---------------------------------------------------------------------------
+usage() {
+    echo "Usage: $0 <xrpld_binary> <num_nodes> <output_dir>"
+    echo ""
+    echo "Arguments:"
+    echo "  xrpld_binary  Path to xrpld binary (built with telemetry=ON)"
+    echo "  num_nodes     Number of validator key pairs to generate (1-20)"
+    echo "  output_dir    Directory to write validator-keys.json and validators.txt"
+    exit 1
+}
+
+if [ $# -lt 3 ]; then
+    usage
+fi
+
+XRPLD="$1"
+NUM_NODES="$2"
+OUTPUT_DIR="$3"
+
+# Validate arguments
+[ -x "$XRPLD" ] || die "xrpld binary not found or not executable: $XRPLD"
+[[ "$NUM_NODES" =~ ^[0-9]+$ ]] || die "num_nodes must be a positive integer"
+[ "$NUM_NODES" -ge 1 ] && [ "$NUM_NODES" -le 20 ] || die "num_nodes must be between 1 and 20"
+
+mkdir -p "$OUTPUT_DIR"
+
+# ---------------------------------------------------------------------------
+# Start a temporary standalone xrpld for key generation
+# ---------------------------------------------------------------------------
+TEMP_DIR="$(mktemp -d)"
+TEMP_PORT=5099
+TEMP_CFG="$TEMP_DIR/xrpld.cfg"
+
+log "Starting temporary xrpld for key generation (port $TEMP_PORT)..."
+
+cat > "$TEMP_CFG" <<EOCFG
+[server]
+port_rpc_keygen
+
+[port_rpc_keygen]
+port = $TEMP_PORT
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[node_db]
+type=NuDB
+path=$TEMP_DIR/nudb
+online_delete=256
+
+[database_path]
+$TEMP_DIR/db
+
+[debug_logfile]
+$TEMP_DIR/debug.log
+
+[ssl_verify]
+0
+EOCFG
+
+"$XRPLD" --conf "$TEMP_CFG" -a --start > "$TEMP_DIR/stdout.log" 2>&1 &
+TEMP_PID=$!
+
+# Ensure cleanup on exit
+cleanup_temp() {
+    kill "$TEMP_PID" 2>/dev/null || true
+    wait "$TEMP_PID" 2>/dev/null || true
+    rm -rf "$TEMP_DIR"
+}
+trap cleanup_temp EXIT
+
+# Wait for RPC to become available
+for attempt in $(seq 1 30); do
+    if curl -sf "http://localhost:$TEMP_PORT" \
+        -d '{"method":"server_info"}' >/dev/null 2>&1; then
+        log "Temporary xrpld RPC ready (attempt $attempt)."
+        break
+    fi
+    if [ "$attempt" -eq 30 ]; then
+        die "Temporary xrpld RPC not ready after 30s"
+    fi
+    sleep 1
+done
+
+# ---------------------------------------------------------------------------
+# Generate key pairs
+# ---------------------------------------------------------------------------
+log "Generating $NUM_NODES validator key pairs..."
+
+KEYS_JSON="["
+VALIDATORS_TXT="[validators]"
+
+for i in $(seq 1 "$NUM_NODES"); do
+    result=$(curl -sf "http://localhost:$TEMP_PORT" \
+        -d '{"method":"validation_create"}')
+    seed=$(echo "$result" | jq -r '.result.validation_seed')
+    pubkey=$(echo "$result" | jq -r '.result.validation_public_key')
+
+    if [ -z "$seed" ] || [ "$seed" = "null" ]; then
+        die "Failed to generate key pair for node $i"
+    fi
+
+    log "  Node $i: ${pubkey:0:20}..."
+
+    # Build JSON entry
+    entry="{\"index\": $i, \"seed\": \"$seed\", \"public_key\": \"$pubkey\"}"
+    if [ "$i" -gt 1 ]; then
+        KEYS_JSON="$KEYS_JSON,"
+    fi
+    KEYS_JSON="$KEYS_JSON$entry"
+
+    VALIDATORS_TXT="$VALIDATORS_TXT
+$pubkey"
+done
+
+KEYS_JSON="$KEYS_JSON]"
+
+# ---------------------------------------------------------------------------
+# Write output files
+# ---------------------------------------------------------------------------
+echo "$KEYS_JSON" | jq '.' > "$OUTPUT_DIR/validator-keys.json"
+echo "$VALIDATORS_TXT" > "$OUTPUT_DIR/validators.txt"
+
+ok "Generated $NUM_NODES key pairs:"
+ok "  Keys:       $OUTPUT_DIR/validator-keys.json"
+ok "  Validators: $OUTPUT_DIR/validators.txt"
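
Review note on generate-validator-keys.sh: the script assembles validator-keys.json by string concatenation. An alternative, sketched below in Python, is to build the structure first and serialise it once, which avoids quoting problems. The function name `render_outputs` is hypothetical, and the seeds and public keys in any call are placeholders supplied by the caller, not real key material.

```python
import json


def render_outputs(pairs: list[tuple[str, str]]) -> tuple[str, str]:
    """Build the two output documents from (seed, public_key) pairs.

    Returns (validator-keys.json text, validators.txt text). Node indices
    start at 1, matching the shell loop.
    """
    keys = [
        {"index": index, "seed": seed, "public_key": public_key}
        for index, (seed, public_key) in enumerate(pairs, start=1)
    ]
    keys_json = json.dumps(keys, indent=2)
    # One public key per line under the [validators] section header.
    lines = ["[validators]"] + [public_key for _, public_key in pairs]
    validators_txt = "\n".join(lines) + "\n"
    return keys_json, validators_txt
```

Because the JSON file contains validator seeds, a caller would normally write it with restrictive permissions.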
--- a/docker/telemetry/workload/requirements.txt
+++ b/docker/telemetry/workload/requirements.txt
@@ -0,0 +1,6 @@
+# Python dependencies for Phase 10 workload tools.
+#
+# Install: pip install -r requirements.txt
+
+websockets>=12.0
+aiohttp>=3.9.0
--- a/docker/telemetry/workload/rpc_load_generator.py
+++ b/docker/telemetry/workload/rpc_load_generator.py
@@ -0,0 +1,459 @@
+#!/usr/bin/env python3
+"""RPC Load Generator for rippled telemetry validation.
+
+Connects to one or more rippled WebSocket endpoints and sends the traced
+RPC commands at a configurable rate, using a realistic, production-like
+command distribution.
+
+Command distribution (default weights):
+  40%  Health checks:   server_info, fee
+  30%  Wallet queries:  account_info, account_lines, account_objects
+  15%  Explorer:        ledger, ledger_data
+  10%  TX lookups:      tx, account_tx
+   5%  DEX queries:     book_offers, amm_info
+
+Usage:
+    python3 rpc_load_generator.py --endpoints ws://localhost:6006 --rate 50 --duration 120
+
+    # Multiple endpoints (round-robin):
+    python3 rpc_load_generator.py \\
+        --endpoints ws://localhost:6006 ws://localhost:6007 \\
+        --rate 100 --duration 300
+
+    # Custom weights:
+    python3 rpc_load_generator.py --endpoints ws://localhost:6006 \\
+        --weights '{"server_info":60,"account_info":30,"ledger":10}'
+"""
+
+import argparse
+import asyncio
+import json
+import logging
+import random
+import sys
+import time
+import uuid
+from dataclasses import dataclass, field
+from typing import Any
+
+import websockets
+
+# ---------------------------------------------------------------------------
+# Configuration
+# ---------------------------------------------------------------------------
+
+# Default command distribution matching realistic production ratios.
+# Keys are RPC command names; values are relative weights.
+DEFAULT_WEIGHTS: dict[str, int] = {
+    # 40% health checks
+    "server_info": 25,
+    "fee": 15,
+    # 30% wallet queries
+    "account_info": 15,
+    "account_lines": 8,
+    "account_objects": 7,
+    # 15% explorer
+    "ledger": 10,
+    "ledger_data": 5,
+    # 10% tx lookups
+    "tx": 5,
+    "account_tx": 5,
+    # 5% DEX queries
+    "book_offers": 3,
+    "amm_info": 2,
+}
+
+# Well-known genesis account for queries that require an account parameter.
+GENESIS_ACCOUNT = "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"
+
+logger = logging.getLogger("rpc_load_generator")
+
+
+# ---------------------------------------------------------------------------
+# Data classes
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class LoadStats:
+    """Tracks request counts and latencies during a load run.
+
+    Attributes:
+        total_sent:     Total RPC requests dispatched.
+        total_success:  Requests that returned a valid result.
+        total_errors:   Requests that returned an error or timed out.
+        latencies:      Per-command list of round-trip times in seconds.
+        command_counts: Per-command request count.
+    """
+
+    total_sent: int = 0
+    total_success: int = 0
+    total_errors: int = 0
+    latencies: dict[str, list[float]] = field(default_factory=dict)
+    command_counts: dict[str, int] = field(default_factory=dict)
+
+    def record(self, command: str, latency: float, success: bool) -> None:
+        """Record the outcome of a single RPC call."""
+        self.total_sent += 1
+        if success:
+            self.total_success += 1
+        else:
+            self.total_errors += 1
+        self.latencies.setdefault(command, []).append(latency)
+        self.command_counts[command] = self.command_counts.get(command, 0) + 1
+
+    def summary(self) -> dict[str, Any]:
+        """Return a summary dict suitable for JSON serialization."""
+        per_command: dict[str, Any] = {}
+        for cmd, lats in self.latencies.items():
+            sorted_lats = sorted(lats)
+            n = len(sorted_lats)
+            per_command[cmd] = {
+                "count": self.command_counts.get(cmd, 0),
+                "p50_ms": round(sorted_lats[n // 2] * 1000, 2) if n else 0,
+                "p95_ms": (round(sorted_lats[int(n * 0.95)] * 1000, 2) if n else 0),
+                "p99_ms": (round(sorted_lats[int(n * 0.99)] * 1000, 2) if n else 0),
+            }
+        return {
+            "total_sent": self.total_sent,
+            "total_success": self.total_success,
+            "total_errors": self.total_errors,
+            "error_rate_pct": (
+                round(self.total_errors / self.total_sent * 100, 2)
+                if self.total_sent
+                else 0
+            ),
+            "per_command": per_command,
+        }
+
+
+# ---------------------------------------------------------------------------
+# RPC command builders
+# ---------------------------------------------------------------------------
+
+
+def build_rpc_request(command: str) -> dict[str, Any]:
+    """Build a JSON-RPC request object for the given command.
+
+    Args:
+        command: The rippled RPC command name.
+
+    Returns:
+        A dict representing the JSON-RPC request body.
+    """
+    base: dict[str, Any] = {"method": command, "params": [{}]}
+
+    if command == "server_info":
+        pass  # No params needed.
+    elif command == "fee":
+        pass  # No params needed.
+    elif command == "account_info":
+        base["params"] = [{"account": GENESIS_ACCOUNT}]
+    elif command == "account_lines":
+        base["params"] = [{"account": GENESIS_ACCOUNT}]
+    elif command == "account_objects":
+        base["params"] = [{"account": GENESIS_ACCOUNT, "limit": 10}]
+    elif command == "ledger":
+        base["params"] = [{"ledger_index": "validated"}]
+    elif command == "ledger_data":
+        base["params"] = [{"ledger_index": "validated", "limit": 5}]
+    elif command == "tx":
+        # Use a dummy hash — will return "txnNotFound" but still exercises
+        # the full RPC span pipeline (rpc.request -> rpc.process -> rpc.command.tx).
+        base["params"] = [{"transaction": "0" * 64, "binary": False}]
+    elif command == "account_tx":
+        base["params"] = [
+            {
+                "account": GENESIS_ACCOUNT,
+                "ledger_index_min": -1,
+                "ledger_index_max": -1,
+                "limit": 5,
+            }
+        ]
+    elif command == "book_offers":
+        base["params"] = [
+            {
+                "taker_pays": {"currency": "XRP"},
+                "taker_gets": {
+                    "currency": "USD",
+                    "issuer": GENESIS_ACCOUNT,
+                },
+                "limit": 5,
+            }
+        ]
+    elif command == "amm_info":
+        # AMM may not exist — the span is still created on the server side.
+        base["params"] = [
+            {
+                "asset": {"currency": "XRP"},
+                "asset2": {
+                    "currency": "USD",
+                    "issuer": GENESIS_ACCOUNT,
+                },
+            }
+        ]
+
+    return base
+
+
+def choose_command(weights: dict[str, int]) -> str:
+    """Select a random RPC command based on configured weights.
+
+    Args:
+        weights: Mapping of command name to relative weight.
+
+    Returns:
+        A command name string.
+    """
+    commands = list(weights.keys())
+    w = [weights[c] for c in commands]
+    return random.choices(commands, weights=w, k=1)[0]
+
+
+# ---------------------------------------------------------------------------
+# WebSocket RPC client
+# ---------------------------------------------------------------------------
+
+
+async def send_rpc(
+    ws: websockets.WebSocketClientProtocol,
+    command: str,
+    stats: LoadStats,
+    inject_traceparent: bool = True,
+) -> None:
+    """Send a single RPC request over WebSocket and record the result.
+
+    Args:
+        ws:                  Open WebSocket connection.
+        command:             RPC command name.
+        stats:               LoadStats instance to record results.
+        inject_traceparent:  If True, add a W3C traceparent field to the
+                             JSON request body for context propagation testing.
+    """
+    request = build_rpc_request(command)
+
+    # Inject W3C traceparent for context propagation testing.
+    # The rippled WebSocket handler extracts this from the JSON body
+    # when present (Phase 2 context propagation).
+    if inject_traceparent:
+        trace_id = uuid.uuid4().hex
+        span_id = uuid.uuid4().hex[:16]
+        request["traceparent"] = f"00-{trace_id}-{span_id}-01"
+
+    t0 = time.monotonic()
+    try:
+        await ws.send(json.dumps(request))
+        raw = await asyncio.wait_for(ws.recv(), timeout=10.0)
+        latency = time.monotonic() - t0
+        response = json.loads(raw)
+        success = "result" in response
+        stats.record(command, latency, success)
+    except (asyncio.TimeoutError, websockets.exceptions.WebSocketException, json.JSONDecodeError, RuntimeError) as exc:
+        latency = time.monotonic() - t0
+        stats.record(command, latency, False)
+        logger.debug("RPC %s failed: %s", command, exc)
+
+
+async def run_load(
+    endpoints: list[str],
+    rate: float,
+    duration: float,
+    weights: dict[str, int],
+    inject_traceparent: bool,
+) -> LoadStats:
+    """Run the RPC load generator against the given endpoints.
+
+    Distributes requests round-robin across endpoints at the specified
+    rate (requests per second) for the given duration.
+
+    Args:
+        endpoints:          List of WebSocket URLs (ws://host:port).
+        rate:               Target requests per second.
+        duration:           Total run time in seconds.
+        weights:            Command distribution weights.
+        inject_traceparent: Whether to inject a W3C traceparent field into each request.
+
+    Returns:
+        LoadStats with aggregated results.
+    """
+    stats = LoadStats()
+    interval = 1.0 / rate if rate > 0 else 0.1
+
+    # Open persistent connections to all endpoints.
+    connections: list[websockets.WebSocketClientProtocol] = []
+    for ep in endpoints:
+        try:
+            ws = await websockets.connect(ep, ping_interval=20, ping_timeout=10)
+            connections.append(ws)
+            logger.info("Connected to %s", ep)
+        except Exception as exc:
+            logger.error("Failed to connect to %s: %s", ep, exc)
+
+    if not connections:
+        logger.error("No connections established. Aborting.")
+        return stats
+
+    logger.info(
+        "Starting load: rate=%s RPS, duration=%ss, endpoints=%d",
+        rate,
+        duration,
+        len(connections),
+    )
+
+    start = time.monotonic()
+    conn_idx = 0
+
+    try:
+        while (time.monotonic() - start) < duration:
+            command = choose_command(weights)
+            ws = connections[conn_idx % len(connections)]
+            conn_idx += 1
+
+            # Fire-and-forget: the sleep paces sends but does not cap in-flight requests.
+            asyncio.create_task(send_rpc(ws, command, stats, inject_traceparent))
+            await asyncio.sleep(interval)
+
+            # Periodic progress log.
+            elapsed = time.monotonic() - start
+            if stats.total_sent % 100 == 0 and stats.total_sent > 0:
+                actual_rps = stats.total_sent / elapsed if elapsed > 0 else 0
+                logger.info(
+                    "Progress: %d sent, %d errors, %.1f RPS (%.0fs elapsed)",
+                    stats.total_sent,
+                    stats.total_errors,
+                    actual_rps,
+                    elapsed,
+                )
+    except asyncio.CancelledError:
+        logger.info("Load generation cancelled.")
+    finally:
+        # Allow in-flight requests to complete.
+        await asyncio.sleep(2)
+        for ws in connections:
+            await ws.close()
+
+    elapsed = time.monotonic() - start
+    logger.info(
+        "Load complete: %d sent, %d success, %d errors in %.1fs (%.1f RPS)",
+        stats.total_sent,
+        stats.total_success,
+        stats.total_errors,
+        elapsed,
+        stats.total_sent / elapsed if elapsed > 0 else 0,
+    )
+
+    return stats
+
+
+# ---------------------------------------------------------------------------
+# CLI entry point
+# ---------------------------------------------------------------------------
+
+
+def parse_args() -> argparse.Namespace:
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser(
+        description="RPC Load Generator for rippled telemetry validation",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  # Basic usage (50 RPS for 2 minutes):
+  python3 rpc_load_generator.py --endpoints ws://localhost:6006 --rate 50 --duration 120
+
+  # Multiple endpoints with custom weights:
+  python3 rpc_load_generator.py \\
+      --endpoints ws://localhost:6006 ws://localhost:6007 \\
+      --rate 100 --duration 300 \\
+      --weights '{"server_info": 80, "account_info": 20}'
+        """,
+    )
+    parser.add_argument(
+        "--endpoints",
+        nargs="+",
+        default=["ws://localhost:6006"],
+        help="WebSocket endpoints (default: ws://localhost:6006)",
+    )
+    parser.add_argument(
+        "--rate",
+        type=float,
+        default=50.0,
+        help="Target requests per second (default: 50)",
+    )
+    parser.add_argument(
+        "--duration",
+        type=float,
+        default=120.0,
+        help="Run duration in seconds (default: 120)",
+    )
+    parser.add_argument(
+        "--weights",
+        type=str,
+        default=None,
+        help="JSON object of command weights (replaces the defaults entirely)",
+    )
+    parser.add_argument(
+        "--no-traceparent",
+        action="store_true",
+        help="Disable W3C traceparent injection",
+    )
+    parser.add_argument(
+        "--output",
+        type=str,
+        default=None,
+        help="Write JSON summary to this file path",
+    )
+    parser.add_argument(
+        "--verbose",
+        action="store_true",
+        help="Enable debug logging",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    """Main entry point for the RPC load generator."""
+    args = parse_args()
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
+    )
+
+    # Parse custom weights if provided.
+    weights = DEFAULT_WEIGHTS.copy()
+    if args.weights:
+        try:
+            custom = json.loads(args.weights)
+            weights = {k: int(v) for k, v in custom.items()}
+            logger.info("Using custom weights: %s", weights)
+        except (json.JSONDecodeError, ValueError, TypeError, AttributeError) as exc:
+            logger.error("Invalid --weights JSON: %s", exc)
+            sys.exit(1)
+
+    # Run the load generator.
+    stats = asyncio.run(
+        run_load(
+            endpoints=args.endpoints,
+            rate=args.rate,
+            duration=args.duration,
+            weights=weights,
+            inject_traceparent=not args.no_traceparent,
+        )
+    )
+
+    summary = stats.summary()
+    print(json.dumps(summary, indent=2))
+
+    if args.output:
+        with open(args.output, "w") as f:
+            json.dump(summary, f, indent=2)
+        logger.info("Summary written to %s", args.output)
+
+    # Exit with error if error rate exceeds 50%.
+    if summary["error_rate_pct"] > 50:
+        logger.error("High error rate: %.1f%%", summary["error_rate_pct"])
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
--- a/docker/telemetry/workload/run-full-validation.sh
+++ b/docker/telemetry/workload/run-full-validation.sh
@@ -0,0 +1,413 @@
+#!/usr/bin/env bash
+# run-full-validation.sh — Orchestrates the full telemetry validation pipeline.
+#
+# Sequence:
+#   1. Start the observability stack (OTel Collector, Jaeger, Tempo, Prometheus, Loki, Grafana)
+#   2. Start a multi-node rippled cluster with full telemetry enabled
+#   3. Wait for consensus
+#   4. Run the RPC load generator
+#   5. Run the transaction submitter
+#   6. Wait for telemetry data to propagate
+#   7. Run the telemetry validation suite
+#   8. (Optional) Run the performance benchmark
+#
+# Usage:
+#   ./run-full-validation.sh --xrpld /path/to/xrpld
+#   ./run-full-validation.sh --xrpld /path/to/xrpld --with-benchmark
+#   ./run-full-validation.sh --cleanup
+#
+# Exit codes:
+#   0 — All validation checks passed
+#   1 — One or more validation checks failed
+#   2 — Infrastructure error (cluster/stack failed to start)
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Colored output helpers
+# ---------------------------------------------------------------------------
+log()   { printf "\033[1;34m[VALIDATE]\033[0m %s\n" "$*"; }
+ok()    { printf "\033[1;32m[VALIDATE]\033[0m %s\n" "$*"; }
+warn()  { printf "\033[1;33m[VALIDATE]\033[0m %s\n" "$*"; }
+fail()  { printf "\033[1;31m[VALIDATE]\033[0m %s\n" "$*"; }
+die()   { printf "\033[1;31m[VALIDATE]\033[0m %s\n" "$*" >&2; exit 2; }
+
+# ---------------------------------------------------------------------------
+# Configuration
+# ---------------------------------------------------------------------------
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+TELEMETRY_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
+REPO_ROOT="$(cd "$TELEMETRY_DIR/../.." && pwd)"
+COMPOSE_FILE="$TELEMETRY_DIR/docker-compose.workload.yaml"
+WORKDIR="/tmp/xrpld-validation"
+
+XRPLD="${XRPLD:-$REPO_ROOT/.build/xrpld}"
+NUM_NODES=5
+RPC_PORT_BASE=5005
+WS_PORT_BASE=6006
+PEER_PORT_BASE=51235
+RPC_RATE=50
+RPC_DURATION=120
+TX_TPS=5
+TX_DURATION=120
+WITH_BENCHMARK=false
+SKIP_LOKI=false
+REPORT_DIR="$WORKDIR/reports"
+
+GENESIS_ACCOUNT="rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"
+GENESIS_SEED="snoPBrXtMeMyMHUVTgbuqAfg1SUTb"
+
+# ---------------------------------------------------------------------------
+# Argument parsing
+# ---------------------------------------------------------------------------
+usage() {
+    echo "Usage: $0 [OPTIONS]"
+    echo ""
+    echo "Options:"
+    echo "  --xrpld PATH         Path to xrpld binary"
+    echo "  --nodes NUM          Number of validator nodes (default: 5)"
+    echo "  --rpc-rate RPS       RPC load rate (default: 50)"
+    echo "  --rpc-duration SECS  RPC load duration (default: 120)"
+    echo "  --tx-tps TPS         Transaction submit rate (default: 5)"
+    echo "  --tx-duration SECS   Transaction submit duration (default: 120)"
+    echo "  --with-benchmark     Also run performance benchmarks"
+    echo "  --skip-loki          Skip Loki log-trace correlation checks"
+    echo "  --cleanup            Tear down everything and exit"
+    echo "  -h, --help           Show this help"
+    exit 0
+}
+
+while [ $# -gt 0 ]; do
+    case "$1" in
+        --xrpld)         XRPLD="$2"; shift 2 ;;
+        --nodes)         NUM_NODES="$2"; shift 2 ;;
+        --rpc-rate)      RPC_RATE="$2"; shift 2 ;;
+        --rpc-duration)  RPC_DURATION="$2"; shift 2 ;;
+        --tx-tps)        TX_TPS="$2"; shift 2 ;;
+        --tx-duration)   TX_DURATION="$2"; shift 2 ;;
+        --with-benchmark) WITH_BENCHMARK=true; shift ;;
+        --skip-loki)     SKIP_LOKI=true; shift ;;
+        --cleanup)       # Cleanup mode
+            log "Cleaning up..."
+            pkill -f "$WORKDIR" 2>/dev/null || true
+            docker compose -f "$COMPOSE_FILE" down 2>/dev/null || true
+            rm -rf "$WORKDIR"
+            ok "Cleanup complete."
+            exit 0
+            ;;
+        -h|--help)       usage ;;
+        *)               die "Unknown option: $1" ;;
+    esac
+done
+
+# ---------------------------------------------------------------------------
+# Prerequisites
+# ---------------------------------------------------------------------------
+log "Checking prerequisites..."
+[ -x "$XRPLD" ] || die "xrpld binary not found: $XRPLD"
+command -v docker >/dev/null 2>&1 || die "docker not found"
+docker compose version >/dev/null 2>&1 || die "docker compose (v2) not found"
+command -v python3 >/dev/null 2>&1 || die "python3 not found"
+command -v curl >/dev/null 2>&1 || die "curl not found"
+command -v jq >/dev/null 2>&1 || die "jq not found"
+[ -f "$COMPOSE_FILE" ] || die "docker-compose.workload.yaml not found"
+
+# Install Python dependencies.
+log "Installing Python dependencies..."
+pip3 install -q -r "$SCRIPT_DIR/requirements.txt" 2>/dev/null || \
+    pip install -q -r "$SCRIPT_DIR/requirements.txt" 2>/dev/null || \
+    warn "Could not install Python dependencies — they may already be present"
+
+ok "Prerequisites verified."
+
+# ---------------------------------------------------------------------------
+# Cleanup previous run
+# ---------------------------------------------------------------------------
+log "Cleaning up previous run..."
+pkill -f "$WORKDIR" 2>/dev/null || true
+sleep 2
+rm -rf "$WORKDIR"
+mkdir -p "$WORKDIR" "$REPORT_DIR"
+
+# ---------------------------------------------------------------------------
+# Step 1: Start observability stack
+# ---------------------------------------------------------------------------
+log "Step 1: Starting observability stack..."
+docker compose -f "$COMPOSE_FILE" up -d
+
+log "Waiting for OTel Collector..."
+for attempt in $(seq 1 30); do
+    status=$(curl -so /dev/null -w '%{http_code}' http://localhost:4318/ 2>/dev/null || true)
+    if [ -n "$status" ] && [ "$status" != "000" ]; then
+        ok "OTel Collector ready (attempt $attempt)"
+        break
+    fi
+    [ "$attempt" -eq 30 ] && die "OTel Collector not ready after 30s"
+    sleep 1
+done
+
+log "Waiting for Jaeger..."
+for attempt in $(seq 1 30); do
+    if curl -sf "http://localhost:16686/" >/dev/null 2>&1; then
+        ok "Jaeger ready (attempt $attempt)"
+        break
+    fi
+    [ "$attempt" -eq 30 ] && die "Jaeger not ready after 30s"
+    sleep 1
+done
+
+log "Waiting for Prometheus..."
+for attempt in $(seq 1 30); do
+    if curl -sf "http://localhost:9090/-/healthy" >/dev/null 2>&1; then
+        ok "Prometheus ready (attempt $attempt)"
+        break
+    fi
+    [ "$attempt" -eq 30 ] && die "Prometheus not ready after 30s"
+    sleep 1
+done
+
+# ---------------------------------------------------------------------------
+# Step 2: Generate validator keys and start cluster
+# ---------------------------------------------------------------------------
+log "Step 2: Starting $NUM_NODES-node validator cluster..."
+
+bash "$SCRIPT_DIR/generate-validator-keys.sh" "$XRPLD" "$NUM_NODES" "$WORKDIR"
+
+for i in $(seq 1 "$NUM_NODES"); do
+    NODE_DIR="$WORKDIR/node$i"
+    mkdir -p "$NODE_DIR/nudb" "$NODE_DIR/db"
+
+    RPC_PORT=$((RPC_PORT_BASE + i - 1))
+    WS_PORT=$((WS_PORT_BASE + i - 1))
+    PEER_PORT=$((PEER_PORT_BASE + i - 1))
+    SEED=$(jq -r ".[$((i-1))].seed" "$WORKDIR/validator-keys.json")
+
+    # Build ips_fixed.
+    IPS_FIXED=""
+    for j in $(seq 1 "$NUM_NODES"); do
+        if [ "$j" -ne "$i" ]; then
+            IPS_FIXED="${IPS_FIXED}127.0.0.1 $((PEER_PORT_BASE + j - 1))
+"
+        fi
+    done
+
+    cat > "$NODE_DIR/xrpld.cfg" <<EOCFG
+[server]
+port_rpc
+port_ws
+port_peer
+
+[port_rpc]
+port = $RPC_PORT
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[port_ws]
+port = $WS_PORT
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = ws
+
+[port_peer]
+port = $PEER_PORT
+ip = 0.0.0.0
+protocol = peer
+
+[node_db]
+type=NuDB
+path=$NODE_DIR/nudb
+online_delete=256
+
+[database_path]
+$NODE_DIR/db
+
+[debug_logfile]
+$NODE_DIR/debug.log
+
+[validation_seed]
+$SEED
+
+[validators_file]
+$WORKDIR/validators.txt
+
+[ips_fixed]
+${IPS_FIXED}
+[peer_private]
+1
+
+[telemetry]
+enabled=1
+service_instance_id=validator-${i}
+endpoint=http://localhost:4318/v1/traces
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=2000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=1
+trace_ledger=1
+
+[insight]
+server=statsd
+address=127.0.0.1:8125
+prefix=rippled
+
+[rpc_startup]
+{ "command": "log_level", "severity": "warning" }
+
+[ssl_verify]
+0
+EOCFG
+
+    "$XRPLD" --conf "$NODE_DIR/xrpld.cfg" --start > "$NODE_DIR/stdout.log" 2>&1 &
+    echo $! > "$NODE_DIR/xrpld.pid"
+    log "  Node $i: RPC=$RPC_PORT WS=$WS_PORT Peer=$PEER_PORT PID=$!"
+done
+
+# ---------------------------------------------------------------------------
+# Step 3: Wait for consensus
+# ---------------------------------------------------------------------------
+log "Step 3: Waiting for consensus..."
+for attempt in $(seq 1 120); do
+    ready=0
+    for i in $(seq 1 "$NUM_NODES"); do
+        port=$((RPC_PORT_BASE + i - 1))
+        state=$(curl -sf "http://localhost:$port" \
+            -d '{"method":"server_info"}' 2>/dev/null \
+            | jq -r '.result.info.server_state' 2>/dev/null || echo "")
+        if [ "$state" = "proposing" ]; then
+            ready=$((ready + 1))
+        fi
+    done
+    if [ "$ready" -ge "$NUM_NODES" ]; then
+        ok "All $NUM_NODES nodes proposing (attempt $attempt)"
+        break
+    fi
+    if [ "$attempt" -eq 120 ]; then
+        warn "Consensus timeout — $ready/$NUM_NODES nodes ready"
+    fi
+    printf "\r  %d/%d nodes proposing..." "$ready" "$NUM_NODES"
+    sleep 1
+done
+echo ""
+
+# Wait for first validated ledger.
+log "Waiting for validated ledger..."
+for attempt in $(seq 1 60); do
+    val_seq=$(curl -sf "http://localhost:$RPC_PORT_BASE" \
+        -d '{"method":"server_info"}' 2>/dev/null \
+        | jq -r '.result.info.validated_ledger.seq // 0' 2>/dev/null || echo 0)
+    if [ "$val_seq" -gt 2 ] 2>/dev/null; then
+        ok "Validated ledger: seq $val_seq"
+        break
+    fi
+    [ "$attempt" -eq 60 ] && warn "No validated ledger after 60s"
+    sleep 1
+done
+
+# ---------------------------------------------------------------------------
+# Step 4: Run RPC load generator
+# ---------------------------------------------------------------------------
+log "Step 4: Running RPC load generator (${RPC_RATE} RPS for ${RPC_DURATION}s)..."
+
+WS_ENDPOINTS=""
+for i in $(seq 1 "$NUM_NODES"); do
+    WS_ENDPOINTS="$WS_ENDPOINTS ws://localhost:$((WS_PORT_BASE + i - 1))"
+done
+
+python3 "$SCRIPT_DIR/rpc_load_generator.py" \
+    --endpoints $WS_ENDPOINTS \
+    --rate "$RPC_RATE" \
+    --duration "$RPC_DURATION" \
+    --output "$REPORT_DIR/rpc-load-results.json" || \
+    warn "RPC load generator returned non-zero exit"
+
+ok "RPC load generation complete."
+
+# ---------------------------------------------------------------------------
+# Step 5: Run transaction submitter
+# ---------------------------------------------------------------------------
+log "Step 5: Running transaction submitter (${TX_TPS} TPS for ${TX_DURATION}s)..."
+
+python3 "$SCRIPT_DIR/tx_submitter.py" \
+    --endpoint "ws://localhost:$WS_PORT_BASE" \
+    --tps "$TX_TPS" \
+    --duration "$TX_DURATION" \
+    --output "$REPORT_DIR/tx-submit-results.json" || \
+    warn "Transaction submitter returned non-zero exit"
+
+ok "Transaction submission complete."
+
+# ---------------------------------------------------------------------------
+# Step 6: Wait for telemetry propagation
+# ---------------------------------------------------------------------------
+log "Step 6: Waiting 30s for telemetry data to propagate..."
+sleep 30
+
+# ---------------------------------------------------------------------------
+# Step 7: Run telemetry validation suite
+# ---------------------------------------------------------------------------
+log "Step 7: Running telemetry validation suite..."
+
+VALIDATION_ARGS="--report $REPORT_DIR/validation-report.json"
+if [ "$SKIP_LOKI" = true ]; then
+    VALIDATION_ARGS="$VALIDATION_ARGS --skip-loki"
+fi
+
+VALIDATION_EXIT=0
+python3 "$SCRIPT_DIR/validate_telemetry.py" $VALIDATION_ARGS || VALIDATION_EXIT=$?
+
+if [ "$VALIDATION_EXIT" -eq 0 ]; then
+    ok "All telemetry validation checks passed!"
+else
+    fail "Some telemetry validation checks failed (exit $VALIDATION_EXIT)"
+fi
+
+# ---------------------------------------------------------------------------
+# Step 8: (Optional) Run benchmark
+# ---------------------------------------------------------------------------
+if [ "$WITH_BENCHMARK" = true ]; then
+    log "Step 8: Running performance benchmark..."
+    bash "$SCRIPT_DIR/benchmark.sh" \
+        --xrpld "$XRPLD" \
+        --duration 120 \
+        --nodes 3 \
+        --output "$REPORT_DIR" || \
+        warn "Benchmark returned non-zero exit"
+fi
+
+# ---------------------------------------------------------------------------
+# Summary
+# ---------------------------------------------------------------------------
+echo ""
+echo "==========================================================="
+echo "  FULL VALIDATION RESULTS"
+echo "==========================================================="
+echo ""
+echo "  Reports directory: $REPORT_DIR"
+echo ""
+ls -la "$REPORT_DIR/" 2>/dev/null || true
+echo ""
+echo "  Observability stack is running:"
+echo "    Jaeger UI:     http://localhost:16686"
+echo "    Grafana:       http://localhost:3000"
+echo "    Prometheus:    http://localhost:9090"
+echo ""
+echo "  xrpld nodes ($NUM_NODES) are running:"
+for i in $(seq 1 "$NUM_NODES"); do
+    rpc=$((RPC_PORT_BASE + i - 1))
+    ws=$((WS_PORT_BASE + i - 1))
+    pid=$(cat "$WORKDIR/node$i/xrpld.pid" 2>/dev/null || echo 'unknown')
+    echo "    Node $i: RPC=$rpc WS=$ws PID=$pid"
+done
+echo ""
+echo "  To tear down:"
+echo "    $0 --cleanup"
+echo ""
+echo "==========================================================="
+
+exit "$VALIDATION_EXIT"
--- a/docker/telemetry/workload/test_accounts.json
+++ b/docker/telemetry/workload/test_accounts.json
@@ -0,0 +1,42 @@
+{
+  "genesis": {
+    "account": "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh",
+    "seed": "snoPBrXtMeMyMHUVTgbuqAfg1SUTb",
+    "description": "Genesis account with all XRP. Used to fund test accounts."
+  },
+  "test_accounts": [
+    {
+      "name": "alice",
+      "description": "Primary sender for Payment and OfferCreate transactions."
+    },
+    {
+      "name": "bob",
+      "description": "Primary receiver for Payment transactions."
+    },
+    {
+      "name": "carol",
+      "description": "TrustSet and issued currency counterparty."
+    },
+    {
+      "name": "dave",
+      "description": "NFToken operations (mint, offer, accept)."
+    },
+    {
+      "name": "eve",
+      "description": "Escrow operations (create, finish)."
+    },
+    {
+      "name": "frank",
+      "description": "AMM pool operations (create, deposit, withdraw)."
+    },
+    {
+      "name": "grace",
+      "description": "Additional sender for parallel transaction submission."
+    },
+    {
+      "name": "heidi",
+      "description": "Additional receiver for payment diversity."
+    }
+  ],
+  "note": "Test account keypairs are generated dynamically at runtime via wallet_propose RPC. This file defines the logical roles. Actual keys are stored in the workdir during execution."
+}
--- a/docker/telemetry/workload/tx_submitter.py
+++ b/docker/telemetry/workload/tx_submitter.py
@@ -0,0 +1,790 @@
+#!/usr/bin/env python3
+"""Transaction Submitter for rippled telemetry validation.
+
+Generates diverse transaction types against a rippled cluster to exercise
+the full span and metric surface: tx.process, tx.apply, ledger.build,
+consensus.*, and all associated attributes.
+
+Pre-funds test accounts from the genesis account, then submits a
+configurable mix of transaction types at a target TPS.
+
+Supported transaction types:
+  - Payment (XRP and issued currencies)
+  - OfferCreate / OfferCancel (DEX activity)
+  - TrustSet (trust line creation)
+  - NFTokenMint / NFTokenCreateOffer / NFTokenAcceptOffer
+  - EscrowCreate / EscrowFinish
+  - AMMCreate / AMMDeposit / AMMWithdraw (if amendment enabled)
+
+Usage:
+    python3 tx_submitter.py --endpoint ws://localhost:6006 --tps 5 --duration 120
+
+    # Custom transaction mix:
+    python3 tx_submitter.py --endpoint ws://localhost:6006 \\
+        --weights '{"Payment":50,"OfferCreate":20,"TrustSet":10,"NFTokenMint":10,"EscrowCreate":10}'
+"""
+
+import argparse
+import asyncio
+import json
+import logging
+import random
+import sys
+import time
+from dataclasses import dataclass, field
+from typing import Any
+
+import websockets
+
+logger = logging.getLogger("tx_submitter")
+
+# ---------------------------------------------------------------------------
+# Constants
+# ---------------------------------------------------------------------------
+
+GENESIS_ACCOUNT = "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"
+GENESIS_SEED = "snoPBrXtMeMyMHUVTgbuqAfg1SUTb"
+
+# Amount to fund each test account (100,000 XRP in drops).
+FUND_AMOUNT = "100000000000"
+
+# Default transaction mix weights (relative).
+DEFAULT_TX_WEIGHTS: dict[str, int] = {
+    "Payment": 40,
+    "OfferCreate": 15,
+    "OfferCancel": 5,
+    "TrustSet": 10,
+    "NFTokenMint": 10,
+    "NFTokenCreateOffer": 5,
+    "EscrowCreate": 5,
+    "EscrowFinish": 5,
+    "AMMCreate": 3,
+    "AMMDeposit": 2,
+}
+
+# Number of test accounts to create.
+NUM_TEST_ACCOUNTS = 8
+
+
+# ---------------------------------------------------------------------------
+# Data classes
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class Account:
+    """Represents a funded XRPL test account.
+
+    Attributes:
+        name:      Human-readable name (e.g., "alice").
+        account:   Classic address (rXXX...).
+        seed:      Secret seed for signing.
+        sequence:  Next available sequence number.
+    """
+
+    name: str
+    account: str
+    seed: str
+    sequence: int = 0
+
+
+@dataclass
+class TxStats:
+    """Tracks transaction submission results.
+
+    Attributes:
+        total_submitted: Total transactions sent to the network.
+        total_success:   Transactions that returned tesSUCCESS or terQUEUED.
+        total_errors:    Transactions that returned an error engine_result.
+        by_type:         Per-transaction-type count of submissions.
+        errors_by_type:  Per-transaction-type count of errors.
+    """
+
+    total_submitted: int = 0
+    total_success: int = 0
+    total_errors: int = 0
+    by_type: dict[str, int] = field(default_factory=dict)
+    errors_by_type: dict[str, int] = field(default_factory=dict)
+
+    def record(self, tx_type: str, success: bool) -> None:
+        """Record the result of a transaction submission."""
+        self.total_submitted += 1
+        self.by_type[tx_type] = self.by_type.get(tx_type, 0) + 1
+        if success:
+            self.total_success += 1
+        else:
+            self.total_errors += 1
+            self.errors_by_type[tx_type] = self.errors_by_type.get(tx_type, 0) + 1
+
+    def summary(self) -> dict[str, Any]:
+        """Return a summary dict suitable for JSON serialization."""
+        return {
+            "total_submitted": self.total_submitted,
+            "total_success": self.total_success,
+            "total_errors": self.total_errors,
+            "success_rate_pct": (
+                round(self.total_success / self.total_submitted * 100, 2)
+                if self.total_submitted
+                else 0
+            ),
+            "by_type": self.by_type,
+            "errors_by_type": self.errors_by_type,
+        }
+
+
+# ---------------------------------------------------------------------------
+# WebSocket RPC helpers
+# ---------------------------------------------------------------------------
+
+
+async def ws_request(
+    ws: websockets.WebSocketClientProtocol,
+    method: str,
+    params: list[dict[str, Any]] | None = None,
+) -> dict[str, Any]:
+    """Send a JSON-RPC request over WebSocket and return the result.
+
+    Args:
+        ws:     Open WebSocket connection.
+        method: RPC method name.
+        params: Optional list of parameter dicts.
+
+    Returns:
+        The parsed JSON response dict.
+
+    Raises:
+        asyncio.TimeoutError: If no response arrives within 30 seconds.
+    """
+    request: dict[str, Any] = {"method": method}
+    if params:
+        request["params"] = params
+    await ws.send(json.dumps(request))
+    raw = await asyncio.wait_for(ws.recv(), timeout=30.0)
+    return json.loads(raw)
+
+
+async def create_account(ws: websockets.WebSocketClientProtocol, name: str) -> Account:
+    """Create a new account via wallet_propose RPC.
+
+    Args:
+        ws:   Open WebSocket connection.
+        name: Human-readable name for the account.
+
+    Returns:
+        An Account instance with the generated keypair.
+    """
+    resp = await ws_request(ws, "wallet_propose")
+    result = resp.get("result", {})
+    return Account(
+        name=name,
+        account=result["account_id"],
+        seed=result["master_seed"],
+    )
+
+
+async def fund_account(
+    ws: websockets.WebSocketClientProtocol,
+    dest: Account,
+    genesis_seq: int,
+) -> tuple[bool, int]:
+    """Fund a test account from genesis.
+
+    Args:
+        ws:          Open WebSocket connection.
+        dest:        Destination account to fund.
+        genesis_seq: Current genesis account sequence number.
+
+    Returns:
+        Tuple of (success: bool, next_sequence: int).
+    """
+    resp = await ws_request(
+        ws,
+        "submit",
+        [
+            {
+                "secret": GENESIS_SEED,
+                "tx_json": {
+                    "TransactionType": "Payment",
+                    "Account": GENESIS_ACCOUNT,
+                    "Destination": dest.account,
+                    "Amount": FUND_AMOUNT,
+                    "Sequence": genesis_seq,
+                },
+            }
+        ],
+    )
+    engine_result = resp.get("result", {}).get("engine_result", "unknown")
+    success = engine_result in ("tesSUCCESS", "terQUEUED")
+    if not success:
+        logger.warning("Fund %s failed: %s", dest.name, engine_result)
+    return success, genesis_seq + (1 if success or engine_result.startswith("tec") else 0)  # tec* also consumes the sequence
+
+
+async def get_account_sequence(
+    ws: websockets.WebSocketClientProtocol, account: str
+) -> int:
+    """Get the current sequence number for an account.
+
+    Args:
+        ws:      Open WebSocket connection.
+        account: Classic address.
+
+    Returns:
+        Current sequence number.
+    """
+    resp = await ws_request(ws, "account_info", [{"account": account}])
+    return resp.get("result", {}).get("account_data", {}).get("Sequence", 0)
+
+
+# ---------------------------------------------------------------------------
+# Transaction builders
+# ---------------------------------------------------------------------------
+
+
+def build_payment(sender: Account, receiver: Account) -> dict[str, Any]:
+    """Build an XRP Payment transaction.
+
+    Args:
+        sender:   Source account.
+        receiver: Destination account.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    amount = str(random.randint(1000, 1000000))  # 0.001 - 1 XRP
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "Payment",
+            "Account": sender.account,
+            "Destination": receiver.account,
+            "Amount": amount,
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_offer_create(sender: Account) -> dict[str, Any]:
+    """Build an OfferCreate transaction (XRP/USD pair).
+
+    Args:
+        sender: Account placing the offer.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "OfferCreate",
+            "Account": sender.account,
+            "TakerPays": str(random.randint(100000, 10000000)),
+            "TakerGets": {
+                "currency": "USD",
+                "issuer": GENESIS_ACCOUNT,
+                "value": str(round(random.uniform(0.1, 100.0), 2)),
+            },
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_offer_cancel(sender: Account) -> dict[str, Any]:
+    """Build an OfferCancel transaction.
+
+    Targets the sender's previous sequence number, which is usually not an
+    open offer. The transaction still exercises the tx.process span pipeline.
+
+    Args:
+        sender: Account cancelling the offer.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "OfferCancel",
+            "Account": sender.account,
+            "OfferSequence": max(1, sender.sequence - 1),
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_trust_set(sender: Account) -> dict[str, Any]:
+    """Build a TrustSet transaction for a USD trust line.
+
+    Args:
+        sender: Account setting the trust line.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "TrustSet",
+            "Account": sender.account,
+            "LimitAmount": {
+                "currency": "USD",
+                "issuer": GENESIS_ACCOUNT,
+                "value": "1000000",
+            },
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_nftoken_mint(sender: Account) -> dict[str, Any]:
+    """Build an NFTokenMint transaction.
+
+    Args:
+        sender: Account minting the NFT.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "NFTokenMint",
+            "Account": sender.account,
+            "NFTokenTaxon": random.randint(0, 100),
+            "Flags": 8,  # tfTransferable
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_nftoken_create_offer(sender: Account) -> dict[str, Any]:
+    """Build an NFTokenCreateOffer transaction.
+
+    Uses a dummy NFTokenID — will fail but exercises the span pipeline.
+
+    Args:
+        sender: Account creating the NFT offer.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "NFTokenCreateOffer",
+            "Account": sender.account,
+            "NFTokenID": "0" * 64,
+            "Amount": str(random.randint(100000, 1000000)),
+            "Flags": 1,  # tfSellNFToken
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_escrow_create(sender: Account, receiver: Account) -> dict[str, Any]:
+    """Build an EscrowCreate transaction.
+
+    Creates a time-based escrow that finishes 10 seconds from now.
+
+    Args:
+        sender:   Account creating the escrow.
+        receiver: Destination account for escrow funds.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    # Ripple epoch offset: 946684800 seconds from Unix epoch
+    ripple_time = int(time.time()) - 946684800
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "EscrowCreate",
+            "Account": sender.account,
+            "Destination": receiver.account,
+            "Amount": str(random.randint(100000, 1000000)),
+            "FinishAfter": ripple_time + 10,
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_escrow_finish(sender: Account, owner: Account) -> dict[str, Any]:
+    """Build an EscrowFinish transaction.
+
+    Guesses the escrow's sequence as owner.sequence - 2, so it will likely
+    fail, but it still exercises spans.
+
+    Args:
+        sender: Account finishing the escrow.
+        owner:  Account that created the escrow.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "EscrowFinish",
+            "Account": sender.account,
+            "Owner": owner.account,
+            "OfferSequence": max(1, owner.sequence - 2),
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_amm_create(sender: Account) -> dict[str, Any]:
+    """Build an AMMCreate transaction (XRP/USD pool).
+
+    Requires the AMM amendment to be enabled on the network.
+
+    Args:
+        sender: Account creating the AMM pool.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "AMMCreate",
+            "Account": sender.account,
+            "Amount": str(random.randint(10000000, 100000000)),
+            "Amount2": {
+                "currency": "USD",
+                "issuer": GENESIS_ACCOUNT,
+                "value": str(round(random.uniform(10.0, 1000.0), 2)),
+            },
+            "TradingFee": 500,  # 0.5%
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+def build_amm_deposit(sender: Account) -> dict[str, Any]:
+    """Build an AMMDeposit transaction.
+
+    Args:
+        sender: Account depositing into the AMM pool.
+
+    Returns:
+        Transaction JSON and signing secret.
+    """
+    return {
+        "secret": sender.seed,
+        "tx_json": {
+            "TransactionType": "AMMDeposit",
+            "Account": sender.account,
+            "Asset": {"currency": "XRP"},
+            "Asset2": {
+                "currency": "USD",
+                "issuer": GENESIS_ACCOUNT,
+            },
+            "Amount": str(random.randint(1000000, 10000000)),
+            "Flags": 0x00080000,  # tfSingleAsset
+            "Sequence": sender.sequence,
+        },
+    }
+
+
+# Transaction type -> builder function mapping.
+# Each builder takes (accounts: list[Account]) and returns submit params.
+TX_BUILDERS: dict[str, Any] = {
+    "Payment": lambda accts: build_payment(accts[0], accts[1]),
+    "OfferCreate": lambda accts: build_offer_create(accts[0]),
+    "OfferCancel": lambda accts: build_offer_cancel(accts[0]),
+    "TrustSet": lambda accts: build_trust_set(accts[2]),
+    "NFTokenMint": lambda accts: build_nftoken_mint(accts[3]),
+    "NFTokenCreateOffer": lambda accts: build_nftoken_create_offer(accts[3]),
+    "EscrowCreate": lambda accts: build_escrow_create(accts[4], accts[1]),
+    "EscrowFinish": lambda accts: build_escrow_finish(accts[4], accts[4]),
+    "AMMCreate": lambda accts: build_amm_create(accts[5]),
+    "AMMDeposit": lambda accts: build_amm_deposit(accts[5]),
+}
+
+
+# ---------------------------------------------------------------------------
+# Main submission loop
+# ---------------------------------------------------------------------------
+
+
+async def setup_accounts(
+    ws: websockets.WebSocketClientProtocol,
+) -> list[Account]:
+    """Create and fund test accounts from genesis.
+
+    Generates NUM_TEST_ACCOUNTS accounts via wallet_propose, then funds
+    each with FUND_AMOUNT XRP from genesis.
+
+    Args:
+        ws: Open WebSocket connection to a rippled node.
+
+    Returns:
+        List of funded Account instances.
+    """
+    account_names = ["alice", "bob", "carol", "dave", "eve", "frank", "grace", "heidi"]
+
+    logger.info("Creating %d test accounts...", NUM_TEST_ACCOUNTS)
+    accounts: list[Account] = []
+    for name in account_names[:NUM_TEST_ACCOUNTS]:
+        acct = await create_account(ws, name)
+        accounts.append(acct)
+        logger.info("  Created %s: %s", name, acct.account)
+
+    # Get genesis sequence.
+    genesis_seq = await get_account_sequence(ws, GENESIS_ACCOUNT)
+    logger.info("Genesis sequence: %d", genesis_seq)
+
+    # Fund all accounts, keeping only those whose funding was accepted.
+    logger.info("Funding test accounts...")
+    funded: list[Account] = []
+    for acct in accounts:
+        success, genesis_seq = await fund_account(ws, acct, genesis_seq)
+        if success:
+            funded.append(acct)
+            logger.info("  Funded %s", acct.name)
+        else:
+            logger.warning("  Failed to fund %s", acct.name)
+    accounts = funded
+
+    # Wait for funding transactions to be validated.
+    logger.info("Waiting 10s for funding transactions to validate...")
+    await asyncio.sleep(10)
+
+    # Refresh sequence numbers for all accounts.
+    for acct in accounts:
+        try:
+            acct.sequence = await get_account_sequence(ws, acct.account)
+            logger.info("  %s sequence: %d", acct.name, acct.sequence)
+        except Exception as exc:
+            logger.warning("  Failed to get sequence for %s: %s", acct.name, exc)
+
+    return accounts
+
+
+async def submit_transaction(
+    ws: websockets.WebSocketClientProtocol,
+    tx_type: str,
+    accounts: list[Account],
+    stats: TxStats,
+) -> None:
+    """Submit a single transaction of the given type.
+
+    Selects the appropriate builder, constructs the transaction, submits
+    it via the submit RPC, and records the result.
+
+    Args:
+        ws:       Open WebSocket connection.
+        tx_type:  Transaction type name (e.g., "Payment").
+        accounts: List of funded test accounts.
+        stats:    TxStats instance to record results.
+    """
+    builder = TX_BUILDERS.get(tx_type)
+    if not builder:
+        logger.warning("Unknown transaction type: %s", tx_type)
+        return
+
+    try:
+        params = builder(accounts)
+        # Identify which account is the sender to bump its sequence.
+        sender_addr = params["tx_json"]["Account"]
+        sender = next((a for a in accounts if a.account == sender_addr), None)
+
+        resp = await ws_request(ws, "submit", [params])
+        engine_result = resp.get("result", {}).get("engine_result", "unknown")
+        success = engine_result in (
+            "tesSUCCESS",
+            "terQUEUED",
+            "tecUNFUNDED_OFFER",
+            "tecNO_DST_INSUF_XRP",
+        )
+        stats.record(tx_type, success)
+
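+        # Only results that are applied or queued consume the sequence
+        # number (tes*, tec*, terQUEUED). tem/tef/tel and other ter codes
+        # leave it unchanged, so bumping there would open a sequence gap.
+        if sender and (
+            engine_result.startswith(("tes", "tec")) or engine_result == "terQUEUED"
+        ):
+            sender.sequence += 1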
+
+        if not success:
+            logger.debug(
+                "%s result: %s (%s)",
+                tx_type,
+                engine_result,
+                resp.get("result", {}).get("engine_result_message", ""),
+            )
+    except Exception as exc:
+        stats.record(tx_type, False)
+        logger.debug("%s error: %s", tx_type, exc)
+
+
+async def run_submitter(
+    endpoint: str,
+    tps: float,
+    duration: float,
+    weights: dict[str, int],
+) -> TxStats:
+    """Run the transaction submitter against a single endpoint.
+
+    Args:
+        endpoint: WebSocket URL (ws://host:port).
+        tps:      Target transactions per second.
+        duration: Total run time in seconds.
+        weights:  Transaction type distribution weights.
+
+    Returns:
+        TxStats with aggregated results.
+    """
+    stats = TxStats()
+    interval = 1.0 / tps if tps > 0 else 0.5
+
+    ws = await websockets.connect(endpoint, ping_interval=20, ping_timeout=10)
+    logger.info("Connected to %s", endpoint)
+
+    try:
+        # Setup test accounts.
+        accounts = await setup_accounts(ws)
+        if len(accounts) < 6:
+            logger.error("Need at least 6 funded accounts, got %d", len(accounts))
+            return stats
+
+        # Build the weighted transaction-type list.
+        tx_types = list(weights.keys())
+        tx_weights = [weights[t] for t in tx_types]
+
+        logger.info(
+            "Starting TX submission: tps=%s, duration=%ss, types=%d",
+            tps,
+            duration,
+            len(tx_types),
+        )
+
+        start = time.monotonic()
+        while (time.monotonic() - start) < duration:
+            tx_type = random.choices(tx_types, weights=tx_weights, k=1)[0]
+            await submit_transaction(ws, tx_type, accounts, stats)
+            await asyncio.sleep(interval)
+
+            # Progress logging every 50 transactions.
+            if stats.total_submitted % 50 == 0 and stats.total_submitted > 0:
+                elapsed = time.monotonic() - start
+                actual_tps = stats.total_submitted / elapsed if elapsed > 0 else 0
+                logger.info(
+                    "Progress: %d submitted, %d success, %d errors, "
+                    "%.1f TPS (%.0fs elapsed)",
+                    stats.total_submitted,
+                    stats.total_success,
+                    stats.total_errors,
+                    actual_tps,
+                    elapsed,
+                )
+
+    finally:
+        await ws.close()
+
+    elapsed = time.monotonic() - start
+    logger.info(
+        "Submission complete: %d submitted, %d success, %d errors "
+        "in %.1fs (%.1f TPS)",
+        stats.total_submitted,
+        stats.total_success,
+        stats.total_errors,
+        elapsed,
+        stats.total_submitted / elapsed if elapsed > 0 else 0,
+    )
+
+    return stats
+
+
+# ---------------------------------------------------------------------------
+# CLI entry point
+# ---------------------------------------------------------------------------
+
+
+def parse_args() -> argparse.Namespace:
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser(
+        description="Transaction Submitter for rippled telemetry validation",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  # Basic usage (5 TPS for 2 minutes):
+  python3 tx_submitter.py --endpoint ws://localhost:6006 --tps 5 --duration 120
+
+  # Custom transaction mix:
+  python3 tx_submitter.py --endpoint ws://localhost:6006 \\
+      --weights '{"Payment": 60, "OfferCreate": 20, "TrustSet": 20}'
+        """,
+    )
+    parser.add_argument(
+        "--endpoint",
+        type=str,
+        default="ws://localhost:6006",
+        help="WebSocket endpoint (default: ws://localhost:6006)",
+    )
+    parser.add_argument(
+        "--tps",
+        type=float,
+        default=5.0,
+        help="Target transactions per second (default: 5)",
+    )
+    parser.add_argument(
+        "--duration",
+        type=float,
+        default=120.0,
+        help="Run duration in seconds (default: 120)",
+    )
+    parser.add_argument(
+        "--weights",
+        type=str,
+        default=None,
+        help="JSON string of transaction type weights (overrides defaults)",
+    )
+    parser.add_argument(
+        "--output",
+        type=str,
+        default=None,
+        help="Write JSON summary to this file path",
+    )
+    parser.add_argument(
+        "--verbose",
+        action="store_true",
+        help="Enable debug logging",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    """Main entry point for the transaction submitter."""
+    args = parse_args()
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
+    )
+
+    # Parse custom weights if provided.
+    weights = DEFAULT_TX_WEIGHTS.copy()
+    if args.weights:
+        try:
+            custom = json.loads(args.weights)
+            weights = {k: int(v) for k, v in custom.items()}
+            logger.info("Using custom weights: %s", weights)
+        except (json.JSONDecodeError, AttributeError, TypeError, ValueError) as exc:
+            logger.error("Invalid --weights JSON: %s", exc)
+            sys.exit(1)
+
+    # Run the submitter.
+    stats = asyncio.run(
+        run_submitter(
+            endpoint=args.endpoint,
+            tps=args.tps,
+            duration=args.duration,
+            weights=weights,
+        )
+    )
+
+    summary = stats.summary()
+    print(json.dumps(summary, indent=2))
+
+    if args.output:
+        with open(args.output, "w") as f:
+            json.dump(summary, f, indent=2)
+        logger.info("Summary written to %s", args.output)
+
+
+if __name__ == "__main__":
+    main()
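
Note on `build_escrow_create` in tx_submitter.py: `FinishAfter` is expressed in seconds since the Ripple epoch (2000-01-01 00:00:00 UTC), which is why the builder subtracts 946684800 from the Unix time. A minimal sketch of that conversion as standalone helpers follows. The names `RIPPLE_EPOCH_OFFSET`, `unix_to_ripple`, `ripple_to_unix`, and `finish_after` are hypothetical and are not part of this commit.

```python
import time
from typing import Optional

# Seconds between the Unix epoch (1970-01-01) and the Ripple epoch
# (2000-01-01), both at 00:00:00 UTC. Same constant as in the builder.
RIPPLE_EPOCH_OFFSET = 946684800


def unix_to_ripple(unix_seconds: float) -> int:
    """Convert a Unix timestamp to whole seconds since the Ripple epoch."""
    return int(unix_seconds) - RIPPLE_EPOCH_OFFSET


def ripple_to_unix(ripple_seconds: int) -> int:
    """Convert seconds since the Ripple epoch back to a Unix timestamp."""
    return ripple_seconds + RIPPLE_EPOCH_OFFSET


def finish_after(delay_seconds: int, now: Optional[float] = None) -> int:
    """Return a FinishAfter value `delay_seconds` after `now`.

    `now` is a Unix timestamp; it defaults to the current time and is a
    parameter only so the helper can be exercised deterministically.
    """
    base = time.time() if now is None else now
    return unix_to_ripple(base) + delay_seconds
```

Passing `now` explicitly keeps the helper deterministic in tests; the builder's inline arithmetic is equivalent to `finish_after(10)`.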
--- a/docker/telemetry/workload/validate_telemetry.py
+++ b/docker/telemetry/workload/validate_telemetry.py
@@ -0,0 +1,886 @@
+#!/usr/bin/env python3
+"""Telemetry Validation Suite for rippled.
+
+Validates that the full telemetry stack is emitting expected data after
+a workload run. Queries Jaeger (spans), Prometheus (metrics), Loki (logs),
+and Grafana (dashboards) APIs to produce a pass/fail report.
+
+Validation categories:
+  1. Span validation     — All 16+ span types present with required attributes
+  2. Metric validation   — SpanMetrics, StatsD, and Phase 9 metrics are non-zero
+  3. Log-trace correlation — Loki logs contain trace_id/span_id fields
+  4. Dashboard validation — All 10 Grafana dashboards render data
+
+Usage:
+    python3 validate_telemetry.py --report /tmp/validation-report.json
+
+    # Custom API endpoints:
+    python3 validate_telemetry.py \\
+        --jaeger http://localhost:16686 \\
+        --prometheus http://localhost:9090 \\
+        --loki http://localhost:3100 \\
+        --grafana http://localhost:3000
+"""
+
+import argparse
+import asyncio
+import json
+import logging
+import sys
+import time
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+import aiohttp
+
+logger = logging.getLogger("validate_telemetry")
+
+# ---------------------------------------------------------------------------
+# Configuration defaults
+# ---------------------------------------------------------------------------
+
+DEFAULT_JAEGER = "http://localhost:16686"
+DEFAULT_PROMETHEUS = "http://localhost:9090"
+DEFAULT_LOKI = "http://localhost:3100"
+DEFAULT_GRAFANA = "http://localhost:3000"
+
+SCRIPT_DIR = Path(__file__).parent
+EXPECTED_SPANS_FILE = SCRIPT_DIR / "expected_spans.json"
+EXPECTED_METRICS_FILE = SCRIPT_DIR / "expected_metrics.json"
+
+
+# ---------------------------------------------------------------------------
+# Data classes
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class CheckResult:
+    """Result of a single validation check.
+
+    Attributes:
+        name:     Check identifier (e.g., "span.rpc.request").
+        category: Validation category (span, metric, log, dashboard).
+        passed:   Whether the check passed.
+        message:  Human-readable description of the result.
+        details:  Optional additional data (counts, values, etc.).
+    """
+
+    name: str
+    category: str
+    passed: bool
+    message: str
+    details: dict[str, Any] = field(default_factory=dict)
+
+    def to_dict(self) -> dict[str, Any]:
+        """Serialize to a JSON-compatible dict."""
+        return {
+            "name": self.name,
+            "category": self.category,
+            "passed": self.passed,
+            "message": self.message,
+            "details": self.details,
+        }
+
+
+@dataclass
+class ValidationReport:
+    """Aggregated validation report.
+
+    Attributes:
+        checks:     List of all individual check results.
+        start_time: ISO timestamp when validation started.
+        end_time:   ISO timestamp when validation completed.
+    """
+
+    checks: list[CheckResult] = field(default_factory=list)
+    start_time: str = ""
+    end_time: str = ""
+
+    @property
+    def total_checks(self) -> int:
+        """Total number of checks executed."""
+        return len(self.checks)
+
+    @property
+    def passed(self) -> int:
+        """Number of checks that passed."""
+        return sum(1 for c in self.checks if c.passed)
+
+    @property
+    def failed(self) -> int:
+        """Number of checks that failed."""
+        return sum(1 for c in self.checks if not c.passed)
+
+    @property
+    def all_passed(self) -> bool:
+        """Whether all checks passed."""
+        return self.failed == 0
+
+    def add(self, check: CheckResult) -> None:
+        """Add a check result to the report."""
+        self.checks.append(check)
+        status = "PASS" if check.passed else "FAIL"
+        logger.info("[%s] %s: %s", status, check.name, check.message)
+
+    def to_dict(self) -> dict[str, Any]:
+        """Serialize to a JSON-compatible dict."""
+        return {
+            "summary": {
+                "total": self.total_checks,
+                "passed": self.passed,
+                "failed": self.failed,
+                "all_passed": self.all_passed,
+            },
+            "start_time": self.start_time,
+            "end_time": self.end_time,
+            "checks": [c.to_dict() for c in self.checks],
+        }
+
+
+# ---------------------------------------------------------------------------
+# Span Validation (Jaeger API)
+# ---------------------------------------------------------------------------
+
+
+async def validate_spans(
+    session: aiohttp.ClientSession,
+    jaeger_url: str,
+    report: ValidationReport,
+) -> None:
+    """Validate that all expected spans appear in Jaeger.
+
+    Queries the Jaeger HTTP API for each expected span name and checks
+    that traces exist. Also validates required attributes on spans and
+    parent-child relationships.
+
+    Args:
+        session:    aiohttp client session.
+        jaeger_url: Base URL for Jaeger API (e.g., http://localhost:16686).
+        report:     ValidationReport to accumulate results.
+    """
+    logger.info("--- Span Validation (Jaeger) ---")
+
+    # Load expected spans.
+    with open(EXPECTED_SPANS_FILE) as f:
+        expected = json.load(f)
+
+    # Check service registration.
+    try:
+        async with session.get(f"{jaeger_url}/api/services") as resp:
+            data = await resp.json()
+            services = data.get("data", [])
+            has_rippled = "rippled" in services
+            report.add(
+                CheckResult(
+                    name="span.service_registration",
+                    category="span",
+                    passed=has_rippled,
+                    message=(
+                        f"Service 'rippled' registered (found: {services})"
+                        if has_rippled
+                        else f"Service 'rippled' NOT found (found: {services})"
+                    ),
+                )
+            )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name="span.service_registration",
+                category="span",
+                passed=False,
+                message=f"Jaeger API unreachable: {exc}",
+            )
+        )
+        return
+
+    # Check each expected span.
+    for span_def in expected["spans"]:
+        span_name = span_def["name"]
+        check_name = f"span.{span_name}"
+        # Wildcard spans (rpc.command.*) are checked through one concrete
+        # example, since the query below filters on a single operation name.
+        if "*" in span_name:
+            operation = "rpc.command.server_info"
+        else:
+            operation = span_name
+
+        try:
+            params = {
+                "service": "rippled",
+                "operation": operation,
+                "limit": 5,
+                "lookback": "1h",
+            }
+            async with session.get(f"{jaeger_url}/api/traces", params=params) as resp:
+                data = await resp.json()
+                traces = data.get("data", [])
+                count = len(traces)
+                report.add(
+                    CheckResult(
+                        name=check_name,
+                        category="span",
+                        passed=count > 0,
+                        message=(
+                            f"{span_name}: {count} traces found"
+                            if count > 0
+                            else f"{span_name}: 0 traces (expected > 0)"
+                        ),
+                        details={"trace_count": count},
+                    )
+                )
+
+                # Validate required attributes on first trace.
+                if count > 0 and span_def.get("required_attributes"):
+                    await _validate_span_attributes(traces[0], span_def, report)
+        except Exception as exc:
+            report.add(
+                CheckResult(
+                    name=check_name,
+                    category="span",
+                    passed=False,
+                    message=f"{span_name}: query failed ({exc})",
+                )
+            )
+
+    # Validate parent-child relationships.
+    for rel in expected.get("parent_child_relationships", []):
+        await _validate_parent_child(session, jaeger_url, rel, report)
+
+
+async def _validate_span_attributes(
+    trace: dict[str, Any],
+    span_def: dict[str, Any],
+    report: ValidationReport,
+) -> None:
+    """Check that a trace's spans contain expected attributes.
+
+    Args:
+        trace:    A Jaeger trace object (from /api/traces).
+        span_def: Span definition from expected_spans.json.
+        report:   ValidationReport to accumulate results.
+    """
+    required_attrs = span_def.get("required_attributes", [])
+    if not required_attrs:
+        return
+
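+    span_name = span_def["name"]
+    # Collect tag keys only from spans matching this definition, so that
+    # attributes on unrelated spans in the trace cannot satisfy the check.
+    # A "*" in the name (rpc.command.*) matches by prefix.
+    prefix = span_name.split("*", 1)[0]
+    found_attrs: set[str] = set()
+    for span in trace.get("spans", []):
+        op = span.get("operationName", "")
+        if op != span_name and not ("*" in span_name and op.startswith(prefix)):
+            continue
+        for tag in span.get("tags", []):
+            found_attrs.add(tag.get("key", ""))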
+
+    missing = [a for a in required_attrs if a not in found_attrs]
+    report.add(
+        CheckResult(
+            name=f"span.attrs.{span_name}",
+            category="span",
+            passed=len(missing) == 0,
+            message=(
+                f"{span_name}: all {len(required_attrs)} attributes present"
+                if not missing
+                else f"{span_name}: missing attributes: {missing}"
+            ),
+            details={
+                "required": required_attrs,
+                "found": list(found_attrs),
+                "missing": missing,
+            },
+        )
+    )
+
+
+async def _validate_parent_child(
+    session: aiohttp.ClientSession,
+    jaeger_url: str,
+    relationship: dict[str, Any],
+    report: ValidationReport,
+) -> None:
+    """Validate a parent-child span relationship in Jaeger traces.
+
+    Args:
+        session:      aiohttp client session.
+        jaeger_url:   Base URL for Jaeger API.
+        relationship: Dict with 'parent' and 'child' span names.
+        report:       ValidationReport to accumulate results.
+    """
+    parent_name = relationship["parent"]
+    child_name = relationship["child"]
+
+    try:
+        # Query traces for the parent span.
+        params = {
+            "service": "rippled",
+            "operation": parent_name,
+            "limit": 3,
+            "lookback": "1h",
+        }
+        async with session.get(f"{jaeger_url}/api/traces", params=params) as resp:
+            data = await resp.json()
+            traces = data.get("data", [])
+
+        if not traces:
+            report.add(
+                CheckResult(
+                    name=f"span.hierarchy.{parent_name}->{child_name}",
+                    category="span",
+                    passed=False,
+                    message=f"No {parent_name} traces to check hierarchy",
+                )
+            )
+            return
+
+        # Check whether the child span appears in any of the parent's traces.
+        # This confirms co-occurrence in the same trace, not a direct
+        # CHILD_OF reference. Wildcards use a concrete child name.
+        concrete_child = child_name.replace("*", "server_info")
+        found_child = any(
+            concrete_child in span.get("operationName", "")
+            for trace in traces
+            for span in trace.get("spans", [])
+        )
+
+        report.add(
+            CheckResult(
+                name=f"span.hierarchy.{parent_name}->{child_name}",
+                category="span",
+                passed=found_child,
+                message=(
+                    f"Found {child_name} as child of {parent_name}"
+                    if found_child
+                    else f"{child_name} not found in {parent_name} traces"
+                ),
+            )
+        )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name=f"span.hierarchy.{parent_name}->{child_name}",
+                category="span",
+                passed=False,
+                message=f"Hierarchy check failed: {exc}",
+            )
+        )
+
+
+# ---------------------------------------------------------------------------
+# Metric Validation (Prometheus API)
+# ---------------------------------------------------------------------------
+
+
+async def validate_metrics(
+    session: aiohttp.ClientSession,
+    prometheus_url: str,
+    report: ValidationReport,
+) -> None:
+    """Validate that each expected metric has at least one series in Prometheus.
+
+    Args:
+        session:        aiohttp client session.
+        prometheus_url: Base URL for Prometheus API (e.g., http://localhost:9090).
+        report:         ValidationReport to accumulate results.
+    """
+    logger.info("--- Metric Validation (Prometheus) ---")
+
+    with open(EXPECTED_METRICS_FILE) as f:
+        expected = json.load(f)
+
+    # Check each metric category.
+    for category_key, category_data in expected.items():
+        if category_key in ("description", "grafana_dashboards"):
+            continue
+
+        metrics = category_data.get("metrics", [])
+        for metric_name in metrics:
+            await _check_prometheus_metric(
+                session, prometheus_url, metric_name, category_key, report
+            )
+
+
+async def _check_prometheus_metric(
+    session: aiohttp.ClientSession,
+    prometheus_url: str,
+    metric_name: str,
+    category: str,
+    report: ValidationReport,
+) -> None:
+    """Query Prometheus for a specific metric and check it exists.
+
+    Args:
+        session:        aiohttp client session.
+        prometheus_url: Prometheus base URL.
+        metric_name:    Prometheus metric name.
+        category:       Metric category for the report.
+        report:         ValidationReport to accumulate results.
+    """
+    try:
+        params = {"query": metric_name}
+        async with session.get(f"{prometheus_url}/api/v1/query", params=params) as resp:
+            data = await resp.json()
+            results = data.get("data", {}).get("result", [])
+            series_count = len(results)
+            report.add(
+                CheckResult(
+                    name=f"metric.{category}.{metric_name}",
+                    category="metric",
+                    passed=series_count > 0,
+                    message=(
+                        f"{metric_name}: {series_count} series"
+                        if series_count > 0
+                        else f"{metric_name}: 0 series (expected > 0)"
+                    ),
+                    details={"series_count": series_count},
+                )
+            )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name=f"metric.{category}.{metric_name}",
+                category="metric",
+                passed=False,
+                message=f"{metric_name}: query failed ({exc})",
+            )
+        )
+
+
+# ---------------------------------------------------------------------------
+# Log-Trace Correlation Validation (Loki API)
+# ---------------------------------------------------------------------------
+
+
+async def validate_log_trace_correlation(
+    session: aiohttp.ClientSession,
+    loki_url: str,
+    jaeger_url: str,
+    report: ValidationReport,
+) -> None:
+    """Validate that Loki logs contain trace_id/span_id for correlation.
+
+    Checks:
+      1. Logs with trace_id= field exist in Loki.
+      2. A recent trace_id from Jaeger can be found in Loki logs.
+
+    Args:
+        session:    aiohttp client session.
+        loki_url:   Base URL for Loki API (e.g., http://localhost:3100).
+        jaeger_url: Base URL for Jaeger API.
+        report:     ValidationReport to accumulate results.
+    """
+    logger.info("--- Log-Trace Correlation Validation (Loki) ---")
+
+    # Check 1: Any logs with trace_id exist.
+    try:
+        params = {
+            "query": '{job="rippled"} |= "trace_id="',
+            "limit": 5,
+            "direction": "backward",
+        }
+        async with session.get(
+            f"{loki_url}/loki/api/v1/query_range", params=params
+        ) as resp:
+            data = await resp.json()
+            streams = data.get("data", {}).get("result", [])
+            total_entries = sum(len(s.get("values", [])) for s in streams)
+            report.add(
+                CheckResult(
+                    name="log.trace_id_present",
+                    category="log",
+                    passed=total_entries > 0,
+                    message=(
+                        f"Found {total_entries} log entries with trace_id"
+                        if total_entries > 0
+                        else "No log entries with trace_id found"
+                    ),
+                    details={"log_count": total_entries},
+                )
+            )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name="log.trace_id_present",
+                category="log",
+                passed=False,
+                message=f"Loki query failed: {exc}",
+            )
+        )
+
+    # Check 2: Cross-reference a trace_id from Jaeger to Loki.
+    try:
+        # Get a recent trace from Jaeger.
+        params = {
+            "service": "rippled",
+            "limit": 1,
+            "lookback": "1h",
+        }
+        async with session.get(f"{jaeger_url}/api/traces", params=params) as resp:
+            data = await resp.json()
+            traces = data.get("data", [])
+
+        if traces:
+            trace_id = traces[0].get("traceID", "")
+            if trace_id:
+                # Search Loki for this trace_id.
+                loki_params = {
+                    "query": f'{{job="rippled"}} |= "{trace_id}"',
+                    "limit": 5,
+                    "direction": "backward",
+                }
+                async with session.get(
+                    f"{loki_url}/loki/api/v1/query_range",
+                    params=loki_params,
+                ) as loki_resp:
+                    loki_data = await loki_resp.json()
+                    loki_streams = loki_data.get("data", {}).get("result", [])
+                    loki_count = sum(len(s.get("values", [])) for s in loki_streams)
+                    report.add(
+                        CheckResult(
+                            name="log.trace_id_cross_reference",
+                            category="log",
+                            passed=loki_count > 0,
+                            message=(
+                                f"trace_id {trace_id[:16]}... found in "
+                                f"{loki_count} Loki entries"
+                                if loki_count > 0
+                                else f"trace_id {trace_id[:16]}... not found in Loki"
+                            ),
+                            details={
+                                "trace_id": trace_id,
+                                "loki_count": loki_count,
+                            },
+                        )
+                    )
+        else:
+            report.add(
+                CheckResult(
+                    name="log.trace_id_cross_reference",
+                    category="log",
+                    passed=False,
+                    message="No traces in Jaeger to cross-reference",
+                )
+            )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name="log.trace_id_cross_reference",
+                category="log",
+                passed=False,
+                message=f"Cross-reference check failed: {exc}",
+            )
+        )
+
+
+# ---------------------------------------------------------------------------
+# Dashboard Validation (Grafana API)
+# ---------------------------------------------------------------------------
+
+
+async def validate_dashboards(
+    session: aiohttp.ClientSession,
+    grafana_url: str,
+    report: ValidationReport,
+) -> None:
+    """Validate that all expected Grafana dashboards exist and are loadable.
+
+    For each expected dashboard UID, queries the Grafana API to verify
+    the dashboard exists and is loadable.
+
+    Args:
+        session:     aiohttp client session.
+        grafana_url: Base URL for Grafana API (e.g., http://localhost:3000).
+        report:      ValidationReport to accumulate results.
+    """
+    logger.info("--- Dashboard Validation (Grafana) ---")
+
+    with open(EXPECTED_METRICS_FILE) as f:
+        expected = json.load(f)
+
+    dashboard_uids = expected.get("grafana_dashboards", {}).get("uids", [])
+
+    for uid in dashboard_uids:
+        try:
+            async with session.get(f"{grafana_url}/api/dashboards/uid/{uid}") as resp:
+                if resp.status == 200:
+                    data = await resp.json()
+                    dashboard = data.get("dashboard", {})
+                    panel_count = len(dashboard.get("panels", []))
+                    report.add(
+                        CheckResult(
+                            name=f"dashboard.{uid}",
+                            category="dashboard",
+                            passed=True,
+                            message=(f"{uid}: loaded ({panel_count} panels)"),
+                            details={"panel_count": panel_count},
+                        )
+                    )
+                else:
+                    report.add(
+                        CheckResult(
+                            name=f"dashboard.{uid}",
+                            category="dashboard",
+                            passed=False,
+                            message=f"{uid}: HTTP {resp.status}",
+                        )
+                    )
+        except Exception as exc:
+            report.add(
+                CheckResult(
+                    name=f"dashboard.{uid}",
+                    category="dashboard",
+                    passed=False,
+                    message=f"{uid}: query failed ({exc})",
+                )
+            )
+
+
+# ---------------------------------------------------------------------------
+# Span duration validation
+# ---------------------------------------------------------------------------
+
+
+async def validate_span_durations(
+    session: aiohttp.ClientSession,
+    jaeger_url: str,
+    report: ValidationReport,
+) -> None:
+    """Validate that span durations are within reasonable bounds.
+
+    Checks that spans have duration > 0 and <= 60s, flagging any anomalies.
+
+    Args:
+        session:    aiohttp client session.
+        jaeger_url: Base URL for Jaeger API.
+        report:     ValidationReport to accumulate results.
+    """
+    logger.info("--- Span Duration Validation ---")
+
+    try:
+        params = {
+            "service": "rippled",
+            "limit": 20,
+            "lookback": "1h",
+        }
+        async with session.get(f"{jaeger_url}/api/traces", params=params) as resp:
+            data = await resp.json()
+            traces = data.get("data", [])
+
+        if not traces:
+            report.add(
+                CheckResult(
+                    name="span.duration_bounds",
+                    category="span",
+                    passed=False,
+                    message="No traces available for duration check",
+                )
+            )
+            return
+
+        total_spans = 0
+        invalid_spans = 0
+        max_duration_us = 0
+
+        for trace in traces:
+            for span in trace.get("spans", []):
+                duration = span.get("duration", 0)  # microseconds
+                total_spans += 1
+                max_duration_us = max(max_duration_us, duration)
+                if duration <= 0 or duration > 60_000_000:
+                    invalid_spans += 1
+
+        report.add(
+            CheckResult(
+                name="span.duration_bounds",
+                category="span",
+                passed=invalid_spans == 0,
+                message=(
+                    f"All {total_spans} spans have valid durations "
+                    f"(max: {max_duration_us / 1000:.1f}ms)"
+                    if invalid_spans == 0
+                    else f"{invalid_spans}/{total_spans} spans have invalid "
+                    "durations (<=0 or >60s)"
+                ),
+                details={
+                    "total_spans": total_spans,
+                    "invalid_spans": invalid_spans,
+                    "max_duration_ms": round(max_duration_us / 1000, 2),
+                },
+            )
+        )
+    except Exception as exc:
+        report.add(
+            CheckResult(
+                name="span.duration_bounds",
+                category="span",
+                passed=False,
+                message=f"Duration check failed: {exc}",
+            )
+        )
+
+
+# ---------------------------------------------------------------------------
+# Main validation orchestrator
+# ---------------------------------------------------------------------------
+
+
+async def run_validation(
+    jaeger_url: str,
+    prometheus_url: str,
+    loki_url: str,
+    grafana_url: str,
+    skip_loki: bool = False,
+) -> ValidationReport:
+    """Run all validation checks and return a report.
+
+    Args:
+        jaeger_url:     Jaeger API base URL.
+        prometheus_url: Prometheus API base URL.
+        loki_url:       Loki API base URL.
+        grafana_url:    Grafana API base URL.
+        skip_loki:      If True, skip log-trace correlation checks.
+
+    Returns:
+        ValidationReport with all check results.
+    """
+    report = ValidationReport()
+    report.start_time = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+
+    async with aiohttp.ClientSession() as session:
+        await validate_spans(session, jaeger_url, report)
+        await validate_span_durations(session, jaeger_url, report)
+        await validate_metrics(session, prometheus_url, report)
+        if not skip_loki:
+            await validate_log_trace_correlation(session, loki_url, jaeger_url, report)
+        await validate_dashboards(session, grafana_url, report)
+
+    report.end_time = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+    return report
+
+
+# ---------------------------------------------------------------------------
+# CLI entry point
+# ---------------------------------------------------------------------------
+
+
+def parse_args() -> argparse.Namespace:
+    """Parse command-line arguments."""
+    parser = argparse.ArgumentParser(
+        description="Telemetry Validation Suite for rippled",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  # Run all validations with defaults:
+  python3 validate_telemetry.py
+
+  # Write report to file:
+  python3 validate_telemetry.py --report /tmp/validation-report.json
+
+  # Custom endpoints:
+  python3 validate_telemetry.py \\
+      --jaeger http://jaeger:16686 --prometheus http://prom:9090
+
+  # Skip Loki checks (if log-trace correlation is not set up):
+  python3 validate_telemetry.py --skip-loki
+        """,
+    )
+    parser.add_argument(
+        "--jaeger",
+        type=str,
+        default=DEFAULT_JAEGER,
+        help=f"Jaeger API URL (default: {DEFAULT_JAEGER})",
+    )
+    parser.add_argument(
+        "--prometheus",
+        type=str,
+        default=DEFAULT_PROMETHEUS,
+        help=f"Prometheus API URL (default: {DEFAULT_PROMETHEUS})",
+    )
+    parser.add_argument(
+        "--loki",
+        type=str,
+        default=DEFAULT_LOKI,
+        help=f"Loki API URL (default: {DEFAULT_LOKI})",
+    )
+    parser.add_argument(
+        "--grafana",
+        type=str,
+        default=DEFAULT_GRAFANA,
+        help=f"Grafana API URL (default: {DEFAULT_GRAFANA})",
+    )
+    parser.add_argument(
+        "--skip-loki",
+        action="store_true",
+        help="Skip log-trace correlation validation",
+    )
+    parser.add_argument(
+        "--report",
+        type=str,
+        default=None,
+        help="Write JSON report to this file path",
+    )
+    parser.add_argument(
+        "--verbose",
+        action="store_true",
+        help="Enable debug logging",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    """Main entry point for the telemetry validation suite."""
+    args = parse_args()
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
+    )
+
+    report = asyncio.run(
+        run_validation(
+            jaeger_url=args.jaeger,
+            prometheus_url=args.prometheus,
+            loki_url=args.loki,
+            grafana_url=args.grafana,
+            skip_loki=args.skip_loki,
+        )
+    )
+
+    # Print summary.
+    print("")
+    print("=" * 60)
+    print("  TELEMETRY VALIDATION REPORT")
+    print("=" * 60)
+    print(f"  Total checks: {report.total_checks}")
+    print(f"  Passed:       {report.passed}")
+    print(f"  Failed:       {report.failed}")
+    print("=" * 60)
+    print("")
+
+    # Print failures.
+    if report.failed > 0:
+        print("FAILED CHECKS:")
+        for check in report.checks:
+            if not check.passed:
+                print(f"  [{check.category}] {check.name}: {check.message}")
+        print("")
+
+    # Write report file.
+    report_dict = report.to_dict()
+    if args.report:
+        with open(args.report, "w") as f:
+            json.dump(report_dict, f, indent=2)
+        logger.info("Report written to %s", args.report)
+    else:
+        print(json.dumps(report_dict, indent=2))
+
+    # Exit with appropriate code for CI.
+    sys.exit(0 if report.all_passed else 1)
+
+
+if __name__ == "__main__":
+    main()
--- a/docker/telemetry/workload/xrpld-validator.cfg.template
+++ b/docker/telemetry/workload/xrpld-validator.cfg.template
@@ -0,0 +1,94 @@
+# xrpld validator node configuration template for workload harness.
+#
+# Placeholders (replaced by docker-compose entrypoint):
+#   {{NODE_INDEX}}        — Node number (1-based)
+#   {{RPC_PORT}}          — HTTP RPC port
+#   {{WS_PORT}}           — WebSocket port
+#   {{PEER_PORT}}         — Peer protocol port
+#   {{DATA_DIR}}          — Node data directory
+#   {{VALIDATION_SEED}}   — Validator seed from key generation
+#   {{VALIDATORS_FILE}}   — Path to shared validators.txt
+#   {{IPS_FIXED}}         — Peer addresses (one per line)
+#   {{OTEL_ENDPOINT}}     — OTel Collector OTLP/HTTP endpoint
+#   {{STATSD_ADDRESS}}    — StatsD UDP address (host:port)
+#   {{LOG_LEVEL}}         — Log level (debug, info, warning, error)
+
+[server]
+port_rpc
+port_ws
+port_peer
+
+[port_rpc]
+port = {{RPC_PORT}}
+ip = 0.0.0.0
+admin = 0.0.0.0
+protocol = http
+
+[port_ws]
+port = {{WS_PORT}}
+ip = 0.0.0.0
+admin = 0.0.0.0
+protocol = ws
+
+[port_peer]
+port = {{PEER_PORT}}
+ip = 0.0.0.0
+protocol = peer
+
+[node_db]
+type=NuDB
+path={{DATA_DIR}}/nudb
+online_delete=256
+
+[database_path]
+{{DATA_DIR}}/db
+
+[debug_logfile]
+{{DATA_DIR}}/debug.log
+
+[validation_seed]
+{{VALIDATION_SEED}}
+
+[validators_file]
+{{VALIDATORS_FILE}}
+
+[ips_fixed]
+{{IPS_FIXED}}
+
+[peer_private]
+1
+
+# --- OpenTelemetry tracing (all categories enabled) ---
+[telemetry]
+enabled=1
+service_instance_id=validator-{{NODE_INDEX}}
+endpoint={{OTEL_ENDPOINT}}
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=2000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=1
+trace_ledger=1
+
+# --- StatsD metrics (beast::insight) ---
+[insight]
+server=statsd
+address={{STATSD_ADDRESS}}
+prefix=rippled
+
+[rpc_startup]
+{ "command": "log_level", "severity": "{{LOG_LEVEL}}" }
+
+[ssl_verify]
+0
+
+# --- Network tuning for local cluster ---
+[network_id]
+0
+
+[sntp_servers]
+time.google.com
--- a/docs/telemetry-runbook.md
+++ b/docs/telemetry-runbook.md
@@ -530,3 +530,77 @@ cmake --preset default -Dtelemetry=OFF
 ```

 When telemetry is compiled out, all trace macros expand to no-ops with zero overhead.
+
+## Validating Telemetry Stack
+
+After deploying telemetry, use the Phase 10 workload tools to validate the full stack end-to-end.
+
+### Quick Validation
+
+```bash
+# Run the full validation suite (starts cluster, generates load, validates):
+docker/telemetry/workload/run-full-validation.sh --xrpld .build/xrpld
+
+# Check the report:
+jq '.summary' /tmp/xrpld-validation/reports/validation-report.json
+```
+
+### What Gets Validated
+
+| Category   | Checks         | Description                                              |
+| ---------- | -------------- | -------------------------------------------------------- |
+| Spans      | 16+ span types | All span names appear in Jaeger with required attributes |
+| Metrics    | 30+ metrics    | SpanMetrics, StatsD gauges/counters, Phase 9 metrics     |
+| Logs       | 2 checks       | trace_id present in Loki; Jaeger cross-reference works   |
+| Dashboards | 10 dashboards  | All Grafana dashboards load without errors               |
+
+### Running Individual Tools
+
+```bash
+# RPC load only:
+python3 docker/telemetry/workload/rpc_load_generator.py \
+    --endpoints ws://localhost:6006 --rate 50 --duration 120
+
+# Transaction mix only:
+python3 docker/telemetry/workload/tx_submitter.py \
+    --endpoint ws://localhost:6006 --tps 5 --duration 120
+
+# Validation only (assumes load already ran):
+python3 docker/telemetry/workload/validate_telemetry.py \
+    --report /tmp/report.json
+```
+
+### Interpreting Failures
+
+- **Span failures**: Check that the relevant trace category is enabled in `[telemetry]` config (e.g., `trace_rpc=1`).
+- **Metric failures**: Verify the OTel Collector is running and Prometheus is scraping port 8889. Check `docker compose logs otel-collector`.
+- **Log failures**: Confirm that Loki is ingesting logs under `job="rippled"` and that log lines include a `trace_id=` field. If log-trace correlation is not set up, pass `--skip-loki` to skip these checks.
+- **Dashboard failures**: Ensure Grafana provisioning is mounted correctly. Check `docker compose logs grafana`.
+
+## Performance Benchmarking
+
+Measure the overhead of the telemetry stack against a baseline:
+
+```bash
+docker/telemetry/workload/benchmark.sh --xrpld .build/xrpld --duration 300
+```
+
+### Benchmark Thresholds
+
+| Metric            | Target | Description                            |
+| ----------------- | ------ | -------------------------------------- |
+| CPU overhead      | < 3%   | Average CPU increase across nodes      |
+| Memory overhead   | < 5 MB | Peak RSS increase per node             |
+| RPC p99 latency   | < 2 ms | Additional p99 latency for server_info |
+| Throughput impact | < 5%   | Reduction in ledger close rate         |
+| Consensus impact  | < 1%   | Increase in consensus round time       |
+
+### Tuning for Production
+
+If benchmarks exceed thresholds:
+
+1. **Reduce sampling**: `sampling_ratio=0.01` (1% of traces)
+2. **Disable peer tracing**: `trace_peer=0` (highest volume category)
+3. **Increase batch delay**: `batch_delay_ms=10000` (less frequent exports)
+4. **Reduce queue size**: `max_queue_size=1024` (bounds memory used for buffered spans; excess spans are dropped sooner under load)
+
+See `docker/telemetry/workload/README.md` for full documentation.