feat(telemetry): add Phase 5 documentation, deployment configs, and integration tests

Add the observability stack deployment infrastructure and integration test framework for verifying end-to-end trace export.

- Add Grafana dashboards: RPC performance, transaction overview, consensus health (pre-provisioned via dashboards.yaml)
- Add Prometheus config for spanmetrics collection from OTel Collector
- Update OTel Collector config with spanmetrics connector and prometheus exporter for RED metrics
- Add docker-compose services: prometheus, dashboard provisioning
- Add integration-test.sh with Tempo API-based span verification (replaces previous Jaeger-based approach)
- Add TESTING.md with step-by-step deployment and verification guide
- Add telemetry-runbook.md for production operations reference
- Add xrpld-telemetry.cfg sample configuration
- Add toDisplayString() for ConsensusMode (human-readable span values)
- Update Phase 2/3 task lists with known issues sections
- Add Phase 5 integration test task list
- Add TraceContext protobuf fields for future relay propagation
- Wire telemetry lifecycle (setServiceInstanceId/start/stop) in Application.cpp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 15:37:57 +00:00 · 2026-04-24 21:43:17 +01:00
parent ccd3ceac84
commit 360ecbdb44
25 changed files with 2387 additions and 24 deletions
--- a/docker/telemetry/.gitignore
+++ b/docker/telemetry/.gitignore
@@ -0,0 +1,2 @@
+# Runtime data generated by xrpld and telemetry stack
+data/
--- a/docker/telemetry/TESTING.md
+++ b/docker/telemetry/TESTING.md
@@ -0,0 +1,512 @@
+# OpenTelemetry Integration Testing Guide
+
+This document describes how to verify the rippled OpenTelemetry pipeline
+end-to-end, from span generation through the observability stack
+(otel-collector, Tempo, Prometheus, Grafana).
+
+---
+
+## Prerequisites
+
+### Build xrpld with telemetry
+
+```bash
+conan install . --build=missing -o telemetry=True
+cmake --preset default -Dtelemetry=ON
+cmake --build --preset default --target xrpld
+```
+
+The binary is at `.build/xrpld`.
+
+### Required tools
+
+- **Docker** with `docker compose` (v2)
+- **curl**
+- **jq** (JSON processor)
+
+### Verify binary
+
+```bash
+.build/xrpld --version
+```
+
+---
+
+## Test 1: Single-Node Standalone (Quick Verification)
+
+This test verifies RPC and transaction spans in standalone mode. Consensus
+spans will not fire because standalone mode does not run consensus.
+
+### Step 1: Start the observability stack
+
+```bash
+docker compose -f docker/telemetry/docker-compose.yml up -d
+```
+
+Wait for services to be ready:
+
+```bash
+# otel-collector health
+curl -sf http://localhost:13133/ && echo "collector ready"
+
+# Tempo readiness
+curl -sf http://localhost:3200/ready > /dev/null && echo "tempo ready"
+```
+
+### Step 2: Start xrpld in standalone mode
+
+```bash
+.build/xrpld --conf docker/telemetry/xrpld-telemetry.cfg -a --start
+```
+
+Wait a few seconds for the node to initialize.
+
+### Step 3: Exercise RPC spans
+
+```bash
+# server_info
+curl -s http://localhost:5005 \
+  -d '{"method":"server_info"}' | jq .result.info.server_state
+
+# server_state
+curl -s http://localhost:5005 \
+  -d '{"method":"server_state"}' | jq .result.state.server_state
+
+# ledger
+curl -s http://localhost:5005 \
+  -d '{"method":"ledger","params":[{"ledger_index":"current"}]}' \
+  | jq .result.ledger_current_index
+```
+
+### Step 4: Submit a transaction
+
+Close the ledger first (required in standalone mode):
+
+```bash
+curl -s http://localhost:5005 -d '{"method":"ledger_accept"}'
+```
+
+Submit a Payment from the genesis account:
+
+```bash
+curl -s http://localhost:5005 -d '{
+  "method": "submit",
+  "params": [{
+    "secret": "snoPBrXtMeMyMHUVTgbuqAfg1SUTb",
+    "tx_json": {
+      "TransactionType": "Payment",
+      "Account": "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh",
+      "Destination": "rPMh7Pi9ct699iZUTWzJaUMR1o42VEfGqF",
+      "Amount": "10000000"
+    }
+  }]
+}' | jq .result.engine_result
+```
+
+Expected result: `"tesSUCCESS"`.
+
+Close the ledger again to finalize:
+
+```bash
+curl -s http://localhost:5005 -d '{"method":"ledger_accept"}'
+```
+
+### Step 5: Verify traces in Tempo
+
+Wait 5 seconds for the batch export, then:
+
+```bash
+TEMPO="http://localhost:3200"
+
+# Check rippled service is registered
+curl -s "$TEMPO/api/v2/search/tag/resource.service.name/values" | jq '.tagValues[].value'
+
+# Check RPC spans
+curl -sG "$TEMPO/api/search" \
+  --data-urlencode 'q={resource.service.name="rippled" && name="rpc.request"}' \
+  --data-urlencode 'limit=5' | jq '.traces | length'
+
+curl -sG "$TEMPO/api/search" \
+  --data-urlencode 'q={resource.service.name="rippled" && name="rpc.process"}' \
+  --data-urlencode 'limit=5' | jq '.traces | length'
+
+curl -sG "$TEMPO/api/search" \
+  --data-urlencode 'q={resource.service.name="rippled" && name="rpc.command.server_info"}' \
+  --data-urlencode 'limit=5' | jq '.traces | length'
+
+# Check transaction spans
+curl -sG "$TEMPO/api/search" \
+  --data-urlencode 'q={resource.service.name="rippled" && name="tx.process"}' \
+  --data-urlencode 'limit=5' | jq '.traces | length'
+```
+
+Alternatively, open Grafana Explore at http://localhost:3000 and select the Tempo datasource.
+
+### Step 6: Teardown
+
+```bash
+# Stop xrpld (Ctrl+C in its terminal, or kill it by process pattern)
+kill $(pgrep -f 'xrpld.*xrpld-telemetry')
+
+# Stop observability stack
+docker compose -f docker/telemetry/docker-compose.yml down
+
+# Clean xrpld data
+rm -rf data/
+```
+
+### Expected spans (standalone mode)
+
+| Span Name                   | Expected | Notes                         |
+| --------------------------- | -------- | ----------------------------- |
+| `rpc.request`               | Yes      | Every HTTP RPC call           |
+| `rpc.process`               | Yes      | Every RPC processing          |
+| `rpc.command.server_info`   | Yes      | server_info RPC               |
+| `rpc.command.server_state`  | Yes      | server_state RPC              |
+| `rpc.command.ledger`        | Yes      | ledger RPC                    |
+| `rpc.command.submit`        | Yes      | submit RPC                    |
+| `rpc.command.ledger_accept` | Yes      | ledger_accept RPC             |
+| `tx.process`                | Yes      | Transaction submission        |
+| `tx.receive`                | No       | No peers in standalone        |
+| `consensus.*`               | No       | No consensus in standalone    |
+
+---
+
+## Test 2: 6-Node Consensus Network (Full Verification)
+
+This test verifies ALL span categories including consensus and peer
+transaction relay, using a 6-node validator network.
+
+### Automated
+
+Run the integration test script:
+
+```bash
+bash docker/telemetry/integration-test.sh
+```
+
+The script will:
+
+1. Start the observability stack
+2. Generate 6 validator key pairs
+3. Create config files for each node
+4. Start all 6 nodes
+5. Wait for consensus ("proposing" state)
+6. Exercise RPC, submit transactions
+7. Verify all span categories in Tempo
+8. Verify spanmetrics in Prometheus
+9. Print results and leave the stack running
+
+### Manual
+
+If you prefer to run the steps manually:
+
+#### Step 1: Start observability stack
+
+```bash
+docker compose -f docker/telemetry/docker-compose.yml up -d
+```
+
+#### Step 2: Generate validator keys
+
+Start a temporary standalone xrpld:
+
+```bash
+.build/xrpld --conf docker/telemetry/xrpld-telemetry.cfg -a --start &
+TEMP_PID=$!
+sleep 5
+```
+
+Generate 6 key pairs:
+
+```bash
+for i in $(seq 1 6); do
+  curl -s http://localhost:5005 \
+    -d '{"method":"validation_create"}' | jq '.result'
+done
+```
+
+Record the `validation_seed` and `validation_public_key` for each.
+Kill the temporary node:
+
+```bash
+kill $TEMP_PID
+rm -rf data/
+```
+
+#### Step 3: Create node configs
+
+For each node (1-6), create a config file. Template:
+
+```ini
+[server]
+port_rpc
+port_peer
+
+[port_rpc]
+port = {5004 + node_number}
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[port_peer]
+port = {51234 + node_number}
+ip = 0.0.0.0
+protocol = peer
+
+[node_db]
+type=NuDB
+path=/tmp/xrpld-integration/node{N}/nudb
+online_delete=256
+
+[database_path]
+/tmp/xrpld-integration/node{N}/db
+
+[debug_logfile]
+/tmp/xrpld-integration/node{N}/debug.log
+
+[validation_seed]
+{seed from step 2}
+
+[validators_file]
+/tmp/xrpld-integration/validators.txt
+
+[ips_fixed]
+127.0.0.1 51235
+127.0.0.1 51236
+127.0.0.1 51237
+127.0.0.1 51238
+127.0.0.1 51239
+127.0.0.1 51240
+
+[peer_private]
+1
+
+[telemetry]
+enabled=1
+endpoint=http://localhost:4318/v1/traces
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=2000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=0
+trace_ledger=1
+
+[rpc_startup]
+{ "command": "log_level", "severity": "warning" }
+
+[ssl_verify]
+0
+```
+
+#### Step 4: Create validators.txt
+
+```ini
+[validators]
+{public_key_1}
+{public_key_2}
+{public_key_3}
+{public_key_4}
+{public_key_5}
+{public_key_6}
+```
+
+#### Step 5: Start all 6 nodes
+
+```bash
+for i in $(seq 1 6); do
+  .build/xrpld --conf /tmp/xrpld-integration/node$i/xrpld.cfg --start &
+  echo $! > /tmp/xrpld-integration/node$i/xrpld.pid
+done
+```
+
+#### Step 6: Wait for consensus
+
+Poll each node until `server_state` = `"proposing"`:
+
+```bash
+for port in 5005 5006 5007 5008 5009 5010; do
+  while true; do
+    state=$(curl -s http://localhost:$port \
+      -d '{"method":"server_info"}' \
+      | jq -r '.result.info.server_state')
+    echo "Port $port: $state"
+    [ "$state" = "proposing" ] && break
+    sleep 5
+  done
+done
+```
+
+#### Step 7: Exercise RPC and submit transaction
+
+```bash
+# RPC calls
+curl -s http://localhost:5005 -d '{"method":"server_info"}'
+curl -s http://localhost:5005 -d '{"method":"server_state"}'
+curl -s http://localhost:5005 -d '{"method":"ledger","params":[{"ledger_index":"current"}]}'
+
+# Submit transaction
+curl -s http://localhost:5005 -d '{
+  "method": "submit",
+  "params": [{
+    "secret": "snoPBrXtMeMyMHUVTgbuqAfg1SUTb",
+    "tx_json": {
+      "TransactionType": "Payment",
+      "Account": "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh",
+      "Destination": "rPMh7Pi9ct699iZUTWzJaUMR1o42VEfGqF",
+      "Amount": "10000000"
+    }
+  }]
+}'
+```
+
+Wait 15 seconds for consensus and batch export.
+
+#### Step 8: Verify in Tempo
+
+See the "Verification Queries" section below.
+
+---
+
+## Expected Span Catalog
+
+The 11 production span names instrumented across Phases 2-4 (`rpc.command.<name>` covers one span name per RPC command):
+
+| Span Name                   | Source File           | Phase | Key Attributes                                                                    | How to Trigger            |
+| --------------------------- | --------------------- | ----- | --------------------------------------------------------------------------------- | ------------------------- |
+| `rpc.request`               | ServerHandler.cpp:271 | 2     | --                                                                                | Any HTTP RPC call         |
+| `rpc.process`               | ServerHandler.cpp:573 | 2     | --                                                                                | Any HTTP RPC call         |
+| `rpc.ws_message`            | ServerHandler.cpp:384 | 2     | --                                                                                | WebSocket RPC message     |
+| `rpc.command.<name>`        | RPCHandler.cpp:161    | 2     | `xrpl.rpc.command`, `xrpl.rpc.version`, `xrpl.rpc.role`                           | Any RPC command           |
+| `tx.process`                | NetworkOPs.cpp:1227   | 3     | `xrpl.tx.hash`, `xrpl.tx.local`, `xrpl.tx.path`                                   | Submit transaction        |
+| `tx.receive`                | PeerImp.cpp:1273      | 3     | `xrpl.peer.id`                                                                    | Peer relays transaction   |
+| `consensus.proposal.send`   | RCLConsensus.cpp:177  | 4     | `xrpl.consensus.round`                                                            | Consensus proposing phase |
+| `consensus.ledger_close`    | RCLConsensus.cpp:282  | 4     | `xrpl.consensus.ledger.seq`, `xrpl.consensus.mode`                                | Ledger close event        |
+| `consensus.accept`          | RCLConsensus.cpp:395  | 4     | `xrpl.consensus.proposers`, `xrpl.consensus.round_time_ms`                        | Ledger accepted           |
+| `consensus.validation.send` | RCLConsensus.cpp:753  | 4     | `xrpl.consensus.ledger.seq`, `xrpl.consensus.proposing`                           | Validation sent           |
+| `consensus.accept.apply`    | RCLConsensus.cpp:453  | 4     | `xrpl.consensus.close_time`, `close_time_correct`, `close_resolution_ms`, `state` | Ledger apply + close time |
+
+---
+
+## Verification Queries
+
+### Tempo API
+
+Base URL: `http://localhost:3200`
+
+```bash
+TEMPO="http://localhost:3200"
+
+# List all services
+curl -s "$TEMPO/api/v2/search/tag/resource.service.name/values" | jq '.tagValues[].value'
+
+# Query traces by operation
+for op in "rpc.request" "rpc.process" \
+          "rpc.command.server_info" "rpc.command.server_state" "rpc.command.ledger" \
+          "tx.process" "tx.receive" \
+          "consensus.proposal.send" "consensus.ledger_close" \
+          "consensus.accept" "consensus.accept.apply" \
+          "consensus.validation.send"; do
+  count=$(curl -sG "$TEMPO/api/search" \
+    --data-urlencode "q={resource.service.name=\"rippled\" && name=\"$op\"}" \
+    --data-urlencode "limit=5" \
+    | jq '.traces | length')
+  printf "%-35s %s traces\n" "$op" "$count"
+done
+```
+
+### Prometheus API
+
+Base URL: `http://localhost:9090`
+
+```bash
+PROM="http://localhost:9090"
+
+# Span call counts (from spanmetrics connector)
+curl -s "$PROM/api/v1/query?query=traces_span_metrics_calls_total" \
+  | jq '.data.result[] | {span: .metric.span_name, count: .value[1]}'
+
+# Latency histogram sample counts
+curl -s "$PROM/api/v1/query?query=traces_span_metrics_duration_milliseconds_count" \
+  | jq '.data.result[] | {span: .metric.span_name, count: .value[1]}'
+
+# RPC calls by command (-G with --data-urlencode avoids curl URL globbing on the braces)
+curl -sG "$PROM/api/v1/query" --data-urlencode 'query=traces_span_metrics_calls_total{span_name=~"rpc.command.*"}' \
+  | jq '.data.result[] | {command: .metric.xrpl_rpc_command, count: .value[1]}'
+```
+
+### Grafana
+
+Open http://localhost:3000 (anonymous admin access enabled).
+
+Pre-configured dashboards:
+
+- **RPC Performance**: Request rates, latency percentiles by command
+- **Transaction Overview**: Transaction processing rates and paths
+- **Consensus Health**: Consensus round duration and proposer counts
+
+Pre-configured datasources:
+
+- **Tempo**: Trace data at `http://tempo:3200`
+- **Prometheus**: Metrics at `http://prometheus:9090`
+
+---
+
+## Troubleshooting
+
+### No traces in Tempo
+
+1. Check otel-collector logs:
+   ```bash
+   docker compose -f docker/telemetry/docker-compose.yml logs otel-collector
+   ```
+2. Verify xrpld telemetry config has `enabled=1` and correct endpoint
+3. Check that otel-collector port 4318 is accessible:
+   ```bash
+   curl -s -o /dev/null http://localhost:4318 && echo "reachable"
+   ```
+4. Decrease `batch_delay_ms` or `batch_size` in the xrpld config so spans are exported sooner
+
+### Nodes not reaching "proposing" state
+
+1. Check that none of the peer ports (51235-51240) is already in use by another process:
+   ```bash
+   for p in 51235 51236 51237 51238 51239 51240; do
+     ss -tlnp | grep ":$p " && echo "port $p in use"
+   done
+   ```
+2. Verify `[ips_fixed]` lists all 6 peer ports
+3. Verify `validators.txt` has all 6 public keys
+4. Check node debug logs: `tail -50 /tmp/xrpld-integration/node1/debug.log`
+5. Ensure `[peer_private]` is set to `1` (prevents reaching out to public network)
+
+### Transaction not processing
+
+1. Verify genesis account exists:
+   ```bash
+   curl -s http://localhost:5005 \
+     -d '{"method":"account_info","params":[{"account":"rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"}]}' \
+     | jq .result.account_data.Balance
+   ```
+2. Check submit response for error codes
+3. In standalone mode, remember to call `ledger_accept` after submitting
+
+### Spanmetrics not appearing in Prometheus
+
+1. Verify otel-collector config has `spanmetrics` connector
+2. Check that the metrics pipeline is configured:
+   ```yaml
+   service:
+     pipelines:
+       metrics:
+         receivers: [spanmetrics]
+         exporters: [prometheus]
+   ```
+3. Verify Prometheus can reach collector:
+   ```bash
+   curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets'
+   ```
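
Review note on TESTING.md (not part of the commit): the readiness checks in Test 1, Step 1 probe each service once, and the polling loop in Test 2, Step 6 has no upper bound. The sketch below shows one way a bounded retry could look. The helper names `wait_until`, `collector_ready`, and `tempo_ready` are hypothetical and do not come from `integration-test.sh`; only the two URLs are taken from the guide.

```shell
#!/usr/bin/env bash
# wait_until ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to ATTEMPTS times, sleeping one second
# after each failure. Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  wu_attempts="$1"
  shift
  wu_count=0
  while [ "$wu_count" -lt "$wu_attempts" ]; do
    if "$@"; then
      return 0
    fi
    wu_count=$((wu_count + 1))
    sleep 1
  done
  return 1
}

# Readiness probes built from the endpoints used in Test 1, Step 1.
collector_ready() { curl -sf http://localhost:13133/ > /dev/null; }
tempo_ready() { curl -sf http://localhost:3200/ready > /dev/null; }

# Example usage (commented out so that sourcing this file has no side effects):
# wait_until 60 collector_ready && echo "collector ready"
# wait_until 60 tempo_ready && echo "tempo ready"
```

A caller can treat a non-zero return as a startup failure and stop before launching xrpld, rather than discovering missing traces later.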
--- a/docker/telemetry/docker-compose.yml
+++ b/docker/telemetry/docker-compose.yml
@@ -7,7 +7,7 @@
 #   - tempo: Grafana Tempo tracing backend, queryable via Grafana Explore
 #     on port 3000. Recommended for production (S3/GCS storage, TraceQL).
 #   - grafana: dashboards on port 3000, pre-configured with Tempo
-#     datasource.
+#     and Prometheus datasources.
 #
 # Usage:
 #   docker compose -f docker/telemetry/docker-compose.yml up -d
@@ -26,6 +26,7 @@ services:
    ports:
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP HTTP receiver (xrpld sends traces here)
+      - "8889:8889" # Prometheus metrics (spanmetrics)
      - "13133:13133" # Health check endpoint
    volumes:
      # Mount collector pipeline config (receivers → processors → exporters)
@@ -50,6 +51,17 @@ services:
    networks:
      - xrpld-telemetry

+  prometheus:
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
+    depends_on:
+      - otel-collector
+    networks:
+      - xrpld-telemetry
+
  # Grafana: visualization UI with Tempo pre-configured as a datasource.
  # Anonymous admin access enabled for local development convenience.
  grafana:
@@ -62,8 +74,10 @@ services:
    volumes:
      # Auto-provision Tempo datasource and search filters on startup
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
+      - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
    depends_on:
      - tempo
+      - prometheus
    networks:
      - xrpld-telemetry

--- a/docker/telemetry/grafana/dashboards/consensus-health.json
+++ b/docker/telemetry/grafana/dashboards/consensus-health.json
@@ -0,0 +1,244 @@
+{
+  "annotations": {
+    "list": []
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 1,
+  "id": null,
+  "links": [],
+  "panels": [
+    {
+      "title": "Consensus Round Duration",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept\"}[5m])))",
+          "legendFormat": "P95 Round Duration"
+        },
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.50, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept\"}[5m])))",
+          "legendFormat": "P50 Round Duration"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ms"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Consensus Proposals Sent Rate",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.proposal.send\"}[5m]))",
+          "legendFormat": "Proposals / Sec"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ops"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Ledger Close Duration",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.ledger_close\"}[5m])))",
+          "legendFormat": "P95 Close Duration"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ms"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Validation Send Rate",
+      "type": "stat",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.validation.send\"}[5m]))",
+          "legendFormat": "Validations / Sec"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ops"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Ledger Apply Duration (doAccept)",
+      "description": "Time spent applying the consensus result to build a new ledger. Measured by the consensus.accept.apply span in doAccept().",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 16
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m])))",
+          "legendFormat": "P95 Apply Duration"
+        },
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.50, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m])))",
+          "legendFormat": "P50 Apply Duration"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ms",
+          "custom": {
+            "axisLabel": "Duration (ms)",
+            "spanNulls": true,
+            "insertNulls": false,
+            "showPoints": "auto",
+            "pointSize": 3
+          }
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Close Time Agreement",
+      "description": "Rate of close time agreement vs disagreement across consensus rounds. Based on xrpl.consensus.close_time_correct attribute (true = validators agreed, false = agreed to disagree per avCT_CONSENSUS_PCT).",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 16
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m]))",
+          "legendFormat": "Total Rounds / Sec"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ops",
+          "custom": {
+            "axisLabel": "Rounds / Sec",
+            "spanNulls": true,
+            "insertNulls": false,
+            "showPoints": "auto",
+            "pointSize": 3
+          }
+        },
+        "overrides": []
+      }
+    }
+  ],
+  "schemaVersion": 39,
+  "tags": ["rippled", "consensus", "telemetry"],
+  "templating": {
+    "list": [
+      {
+        "name": "node",
+        "label": "Node",
+        "description": "Filter by rippled node (service.instance.id \u2014 e.g. Node-1)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total, exported_instance)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      },
+      {
+        "name": "consensus_mode",
+        "label": "Consensus Mode",
+        "description": "Filter by consensus mode (Proposing, Observing, Wrong Ledger, Switched Ledger)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total{span_name=\"consensus.ledger_close\"}, xrpl_consensus_mode)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      }
+    ]
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "title": "rippled Consensus Health",
+  "uid": "rippled-consensus"
+}
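
Review note on consensus-health.json (not part of the commit): the duration panels above all use the same PromQL shape, differing only in quantile and span name. As a rough sketch for checking a panel expression outside Grafana, the function below rebuilds that expression and a second one sends it to Prometheus. The names `latency_quantile_query` and `prom_query` are hypothetical, and the default base URL `http://localhost:9090` is assumed from TESTING.md.

```shell
#!/usr/bin/env bash
# latency_quantile_query QUANTILE SPAN_NAME [NODE_REGEX]
# Hypothetical helper: prints the PromQL used by the dashboard's duration
# panels. NODE_REGEX defaults to ".*", matching the dashboard's "All" value.
latency_quantile_query() {
  printf 'histogram_quantile(%s, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~"%s", span_name="%s"}[5m])))' \
    "$1" "${3:-.*}" "$2"
}

# prom_query EXPR
# Sends EXPR to the Prometheus instant-query API. -G plus --data-urlencode
# keeps braces and quotes in the expression from being mangled by the shell
# or by curl's URL globbing.
prom_query() {
  curl -sG "${PROM:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example usage (commented out; requires the stack to be running):
# prom_query "$(latency_quantile_query 0.95 consensus.accept)"
```

An empty `result` array from Prometheus most likely means no `consensus.accept` spans have been exported yet, not that the expression is wrong.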
--- a/docker/telemetry/grafana/dashboards/rpc-performance.json
+++ b/docker/telemetry/grafana/dashboards/rpc-performance.json
@@ -0,0 +1,189 @@
+{
+  "annotations": {
+    "list": []
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 1,
+  "id": null,
+  "links": [],
+  "panels": [
+    {
+      "title": "RPC Request Rate by Command",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum by (xrpl_rpc_command) (rate(traces_span_metrics_calls_total{xrpl_rpc_command=~\"$command\", exported_instance=~\"$node\", span_name=~\"rpc.command.*\"}[5m]))",
+          "legendFormat": "{{xrpl_rpc_command}}"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "reqps",
+          "custom": {
+            "axisLabel": "Requests / Sec",
+            "spanNulls": true,
+            "insertNulls": false,
+            "showPoints": "auto",
+            "pointSize": 3
+          }
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "RPC Latency P95 by Command",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le, xrpl_rpc_command) (rate(traces_span_metrics_duration_milliseconds_bucket{xrpl_rpc_command=~\"$command\", exported_instance=~\"$node\", span_name=~\"rpc.command.*\"}[5m])))",
+          "legendFormat": "P95 {{xrpl_rpc_command}}"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ms",
+          "custom": {
+            "axisLabel": "Latency (ms)",
+            "spanNulls": true,
+            "insertNulls": false,
+            "showPoints": "auto",
+            "pointSize": 3
+          }
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "RPC Error Rate",
+      "type": "bargauge",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum by (xrpl_rpc_command) (rate(traces_span_metrics_calls_total{xrpl_rpc_command=~\"$command\", exported_instance=~\"$node\", span_name=~\"rpc.command.*\", status_code=\"STATUS_CODE_ERROR\"}[5m])) / sum by (xrpl_rpc_command) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", xrpl_rpc_command=~\"$command\", span_name=~\"rpc.command.*\"}[5m])) * 100",
+          "legendFormat": "{{xrpl_rpc_command}}"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "percent",
+          "thresholds": {
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "yellow",
+                "value": 1
+              },
+              {
+                "color": "red",
+                "value": 5
+              }
+            ]
+          }
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "RPC Latency Heatmap",
+      "type": "heatmap",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(increase(traces_span_metrics_duration_milliseconds_bucket{xrpl_rpc_command=~\"$command\", exported_instance=~\"$node\", span_name=~\"rpc.command.*\"}[5m])) by (le)",
+          "legendFormat": "{{le}}",
+          "format": "heatmap"
+        }
+      ]
+    }
+  ],
+  "schemaVersion": 39,
+  "tags": ["rippled", "rpc", "telemetry"],
+  "templating": {
+    "list": [
+      {
+        "name": "node",
+        "label": "Node",
+        "description": "Filter by rippled node (service.instance.id \u2014 e.g. Node-1)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total, exported_instance)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      },
+      {
+        "name": "command",
+        "label": "RPC Command",
+        "description": "Filter by RPC command name (e.g., server_info, submit)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total{span_name=~\"rpc.command.*\"}, xrpl_rpc_command)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      }
+    ]
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "title": "rippled RPC Performance",
+  "uid": "rippled-rpc-perf"
+}
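
Review note on rpc-performance.json (not part of the commit): the "RPC Error Rate" panel divides the error-call rate by the total-call rate, multiplies by 100, and colors the result with thresholds at 1 and 5 percent. The sketch below reproduces that arithmetic for a pair of raw counts, assuming each threshold value is an inclusive lower bound. The function name `error_rate_color` and the `nodata` label for a zero denominator are illustrative assumptions.

```shell
#!/usr/bin/env bash
# error_rate_color ERRORS TOTAL
# Hypothetical helper: prints the threshold color the panel would assign to
# an error percentage of ERRORS / TOTAL * 100.
#   green  : below 1 percent
#   yellow : 1 percent up to (but not including) 5 percent
#   red    : 5 percent or more
# Prints "nodata" when TOTAL is zero, since the ratio is undefined.
error_rate_color() {
  awk -v errors="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "nodata"; exit }
    pct = errors * 100 / total
    if (pct >= 5) { print "red" } else if (pct >= 1) { print "yellow" } else { print "green" }
  }'
}

# Example usage:
# error_rate_color 2 100
```

The multiplication happens before the division so that whole-number percentages such as 5 out of 100 compare exactly against the thresholds.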
--- a/docker/telemetry/grafana/dashboards/transaction-overview.json
+++ b/docker/telemetry/grafana/dashboards/transaction-overview.json
@@ -0,0 +1,172 @@
+{
+  "annotations": {
+    "list": []
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 1,
+  "id": null,
+  "links": [],
+  "panels": [
+    {
+      "title": "Transaction Processing Rate",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"tx.process\"}[5m]))",
+          "legendFormat": "tx.process/sec"
+        },
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"tx.receive\"}[5m]))",
+          "legendFormat": "tx.receive/sec"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ops"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Transaction Processing Latency",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"tx.process\"}[5m])))",
+          "legendFormat": "p95"
+        },
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "histogram_quantile(0.50, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"tx.process\"}[5m])))",
+          "legendFormat": "p50"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ms"
+        },
+        "overrides": []
+      }
+    },
+    {
+      "title": "Transaction Path Distribution",
+      "type": "piechart",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum by (xrpl_tx_local) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", xrpl_tx_local=~\"$tx_origin\", span_name=\"tx.process\"}[5m]))",
+          "legendFormat": "local={{xrpl_tx_local}}"
+        }
+      ]
+    },
+    {
+      "title": "Transaction Receive vs Suppressed",
+      "type": "timeseries",
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus"
+          },
+          "expr": "sum(rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"tx.receive\"}[5m]))",
+          "legendFormat": "total received"
+        }
+      ],
+      "fieldConfig": {
+        "defaults": {
+          "unit": "ops"
+        },
+        "overrides": []
+      }
+    }
+  ],
+  "schemaVersion": 39,
+  "tags": ["rippled", "transactions", "telemetry"],
+  "templating": {
+    "list": [
+      {
+        "name": "node",
+        "label": "Node",
+        "description": "Filter by rippled node (service.instance.id \u2014 e.g. Node-1)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total, exported_instance)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      },
+      {
+        "name": "tx_origin",
+        "label": "TX Origin",
+        "description": "Filter by transaction origin (true = local submit, false = peer relay)",
+        "type": "query",
+        "query": "label_values(traces_span_metrics_calls_total{span_name=\"tx.process\"}, xrpl_tx_local)",
+        "datasource": {
+          "type": "prometheus",
+          "uid": "prometheus"
+        },
+        "includeAll": true,
+        "allValue": ".*",
+        "current": {
+          "text": "All",
+          "value": "$__all"
+        },
+        "multi": true,
+        "refresh": 2,
+        "sort": 1
+      }
+    ]
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "title": "rippled Transaction Overview",
+  "uid": "rippled-transactions"
+}
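
The latency panels in this dashboard derive quantiles from the spanmetrics duration histogram. As a rough sketch, the same expression can be built in a shell and sent to the Prometheus HTTP API for a quick check outside Grafana. The helper names `span_quantile_query` and `query_prom`, and the `PROM` default, are assumptions for illustration and are not part of this change.

```shell
# Hypothetical helpers (not part of this PR).
PROM="${PROM:-http://localhost:9090}"

# Print the PromQL used by the "Transaction Processing Latency" panel for a
# given quantile and span name. The optional third argument is a regex for
# the exported_instance label and defaults to all nodes.
span_quantile_query() {
    local quantile="$1"
    local span="$2"
    local node_re="${3:-.*}"
    printf 'histogram_quantile(%s, sum by (le) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~"%s", span_name="%s"}[5m])))' \
        "$quantile" "$node_re" "$span"
}

# Send a PromQL expression to the instant-query endpoint. -G makes curl put
# the url-encoded expression into the query string of a GET request.
query_prom() {
    curl -sfG "$PROM/api/v1/query" --data-urlencode "query=$1"
}
```

With the stack running, `query_prom "$(span_quantile_query 0.95 tx.process)"` should return the same series the p95 panel plots.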
--- a/docker/telemetry/grafana/provisioning/dashboards/dashboards.yaml
+++ b/docker/telemetry/grafana/provisioning/dashboards/dashboards.yaml
@@ -0,0 +1,12 @@
+apiVersion: 1
+
+providers:
+  - name: rippled-telemetry
+    orgId: 1
+    folder: rippled
+    type: file
+    disableDeletion: false
+    editable: true
+    options:
+      path: /var/lib/grafana/dashboards
+      foldersFromFilesStructure: false
--- a/docker/telemetry/grafana/provisioning/datasources/prometheus.yaml
+++ b/docker/telemetry/grafana/provisioning/datasources/prometheus.yaml
@@ -0,0 +1,10 @@
+apiVersion: 1
+
+datasources:
+  - name: Prometheus
+    type: prometheus
+    uid: prometheus
+    access: proxy
+    url: http://prometheus:9090
+    isDefault: true
+    editable: true
--- a/docker/telemetry/grafana/provisioning/datasources/tempo.yaml
+++ b/docker/telemetry/grafana/provisioning/datasources/tempo.yaml
@@ -40,9 +40,9 @@ datasources:
            operator: "="
            scope: resource
            type: static
-          # service.instance.id: unique node identifier — defaults to the
-          #   node's public key (e.g., nHB1X37...). Distinguishes individual
-          #   nodes in a multi-node cluster or network.
+          # service.instance.id: unique node identifier — configurable via
+          #   the service_instance_id setting in [telemetry], defaults to the
+          #   node's public key. E.g. "Node-1" or "nHB1X37...".
          - id: node-id
            tag: service.instance.id
            operator: "="
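
Since `service.instance.id` is a resource attribute, the same per-node filter can be expressed as a TraceQL query and run against Tempo directly. The following is a minimal sketch, assuming Tempo's HTTP API is reachable at `localhost:3200` and the service name is `rippled`, as in the integration test; `node_traceql` and `search_node_traces` are illustrative names only.

```shell
# Hypothetical helpers (not part of this PR).
TEMPO="${TEMPO:-http://localhost:3200}"

# Build a TraceQL selector for one node, optionally narrowed to a span name.
node_traceql() {
    local node="$1"
    local span="${2:-}"
    local q="resource.service.name=\"rippled\" && resource.service.instance.id=\"$node\""
    if [ -n "$span" ]; then
        q="$q && name=\"$span\""
    fi
    printf '{%s}' "$q"
}

# Search Tempo for up to five matching traces (requires curl and a running stack).
search_node_traces() {
    curl -sfG "$TEMPO/api/search" \
        --data-urlencode "q=$(node_traceql "$@")" \
        --data-urlencode "limit=5"
}
```

For example, `search_node_traces Node-1 tx.process` would list recent `tx.process` traces emitted by the node configured with `service_instance_id=Node-1`.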
--- a/docker/telemetry/integration-test.sh
+++ b/docker/telemetry/integration-test.sh
@@ -0,0 +1,558 @@
+#!/usr/bin/env bash
+# Integration test for rippled OpenTelemetry instrumentation.
+#
+# Launches a 6-node xrpld consensus network with telemetry enabled,
+# exercises RPC / transaction / consensus code paths, then verifies
+# that the expected spans and metrics appear in Tempo and Prometheus.
+#
+# Usage:
+#   bash docker/telemetry/integration-test.sh
+#
+# Prerequisites:
+#   - .build/xrpld built with telemetry=ON
+#   - docker compose (v2)
+#   - curl, jq
+#
+# The script leaves the observability stack and xrpld nodes running
+# so you can manually inspect Tempo (localhost:3200) and Grafana
+# (localhost:3000). Run with --cleanup to tear down instead.
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Configuration
+# ---------------------------------------------------------------------------
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
+XRPLD="$REPO_ROOT/.build/xrpld"
+COMPOSE_FILE="$SCRIPT_DIR/docker-compose.yml"
+STANDALONE_CFG="$SCRIPT_DIR/xrpld-telemetry.cfg"
+WORKDIR="/tmp/xrpld-integration"
+NUM_NODES=6
+PEER_PORT_BASE=51235
+RPC_PORT_BASE=5005
+CONSENSUS_TIMEOUT=120
+GENESIS_ACCOUNT="rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"
+GENESIS_SEED="snoPBrXtMeMyMHUVTgbuqAfg1SUTb"
+DEST_ACCOUNT=""  # Generated dynamically via wallet_propose
+TEMPO="http://localhost:3200"
+PROM="http://localhost:9090"
+
+# Counters for pass/fail
+PASS=0
+FAIL=0
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+log()  { printf "\033[1;34m[INFO]\033[0m  %s\n" "$*"; }
+ok()   { printf "\033[1;32m[PASS]\033[0m  %s\n" "$*"; PASS=$((PASS + 1)); }
+fail() { printf "\033[1;31m[FAIL]\033[0m  %s\n" "$*"; FAIL=$((FAIL + 1)); }
+die()  { printf "\033[1;31m[ERROR]\033[0m %s\n" "$*" >&2; exit 1; }
+
+check_span() {
+    local op="$1"
+    local count
+    count=$(curl -sfG "$TEMPO/api/search" \
+        --data-urlencode "q={resource.service.name=\"rippled\" && name=\"$op\"}" \
+        --data-urlencode "limit=5" \
+        | jq '.traces | length' 2>/dev/null || echo 0)
+    if [ "$count" -gt 0 ]; then
+        ok "$op  ($count traces)"
+    else
+        fail "$op  (0 traces)"
+    fi
+}
+
+cleanup() {
+    log "Cleaning up..."
+    # Kill xrpld nodes
+    for i in $(seq 1 "$NUM_NODES"); do
+        local pidfile="$WORKDIR/node$i/xrpld.pid"
+        if [ -f "$pidfile" ]; then
+            kill "$(cat "$pidfile")" 2>/dev/null || true
+            rm -f "$pidfile"
+        fi
+    done
+    # Also kill any straggling xrpld processes from our workdir
+    pkill -f "$WORKDIR" 2>/dev/null || true
+    # Stop docker stack
+    docker compose -f "$COMPOSE_FILE" down 2>/dev/null || true
+    # Remove workdir
+    rm -rf "$WORKDIR"
+    log "Cleanup complete."
+}
+
+# Handle --cleanup flag
+if [ "${1:-}" = "--cleanup" ]; then
+    cleanup
+    exit 0
+fi
+
+# ---------------------------------------------------------------------------
+# Step 0: Prerequisites
+# ---------------------------------------------------------------------------
+log "Checking prerequisites..."
+
+command -v docker >/dev/null 2>&1 || die "docker not found"
+docker compose version >/dev/null 2>&1 || die "docker compose (v2) not found"
+command -v curl >/dev/null 2>&1 || die "curl not found"
+command -v jq >/dev/null 2>&1 || die "jq not found"
+[ -x "$XRPLD" ] || die "xrpld binary not found at $XRPLD (build with telemetry=ON)"
+[ -f "$COMPOSE_FILE" ] || die "docker-compose.yml not found at $COMPOSE_FILE"
+[ -f "$STANDALONE_CFG" ] || die "xrpld-telemetry.cfg not found at $STANDALONE_CFG"
+
+log "All prerequisites met."
+
+# ---------------------------------------------------------------------------
+# Step 1: Clean previous run
+# ---------------------------------------------------------------------------
+log "Cleaning previous run data..."
+for i in $(seq 1 "$NUM_NODES"); do
+    pidfile="$WORKDIR/node$i/xrpld.pid"
+    if [ -f "$pidfile" ]; then
+        kill "$(cat "$pidfile")" 2>/dev/null || true
+    fi
+done
+pkill -f "$WORKDIR" 2>/dev/null || true
+# Kill any xrpld left running with the standalone sample config
+pkill -f "xrpld-telemetry.cfg" 2>/dev/null || true
+sleep 2
+rm -rf "$WORKDIR"
+mkdir -p "$WORKDIR"
+
+# ---------------------------------------------------------------------------
+# Step 2: Start observability stack
+# ---------------------------------------------------------------------------
+log "Starting observability stack..."
+docker compose -f "$COMPOSE_FILE" up -d
+
+log "Waiting for otel-collector to be ready..."
+for attempt in $(seq 1 30); do
+    # The OTLP HTTP endpoint expects POST and rejects a plain GET, so any
+    # HTTP status other than 000 means it is listening.  curl -sf would
+    # fail on such a status, so we check the status code explicitly.
+    status=$(curl -so /dev/null -w '%{http_code}' http://localhost:4318/ 2>/dev/null || echo 000)
+    if [ "$status" != "000" ]; then
+        log "otel-collector ready (attempt $attempt, HTTP $status)."
+        break
+    fi
+    if [ "$attempt" -eq 30 ]; then
+        die "otel-collector not ready after 30s"
+    fi
+    sleep 1
+done
+
+log "Waiting for Tempo to be ready..."
+for attempt in $(seq 1 30); do
+    if curl -sf "$TEMPO/ready" >/dev/null 2>&1; then
+        log "Tempo ready (attempt $attempt)."
+        break
+    fi
+    if [ "$attempt" -eq 30 ]; then
+        die "Tempo not ready after 30s"
+    fi
+    sleep 1
+done
+
+# ---------------------------------------------------------------------------
+# Step 3: Generate validator keys
+# ---------------------------------------------------------------------------
+log "Generating $NUM_NODES validator key pairs..."
+
+# Start a temporary standalone xrpld for key generation
+TEMP_DATA="$WORKDIR/temp-keygen"
+mkdir -p "$TEMP_DATA"
+
+# Create a minimal temp config for key generation
+TEMP_CFG="$TEMP_DATA/xrpld.cfg"
+cat > "$TEMP_CFG" <<EOCFG
+[server]
+port_rpc_temp
+
+[port_rpc_temp]
+port = 5099
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[node_db]
+type=NuDB
+path=$TEMP_DATA/nudb
+online_delete=256
+
+[database_path]
+$TEMP_DATA/db
+
+[debug_logfile]
+$TEMP_DATA/debug.log
+
+[ssl_verify]
+0
+EOCFG
+
+"$XRPLD" --conf "$TEMP_CFG" -a --start > "$TEMP_DATA/stdout.log" 2>&1 &
+TEMP_PID=$!
+log "Temporary xrpld started (PID $TEMP_PID), waiting for RPC..."
+
+for attempt in $(seq 1 30); do
+    if curl -sf http://localhost:5099 -d '{"method":"server_info"}' >/dev/null 2>&1; then
+        log "Temporary xrpld RPC ready (attempt $attempt)."
+        break
+    fi
+    if [ "$attempt" -eq 30 ]; then
+        kill "$TEMP_PID" 2>/dev/null || true
+        die "Temporary xrpld RPC not ready after 30s"
+    fi
+    sleep 1
+done
+
+declare -a SEEDS
+declare -a PUBKEYS
+
+for i in $(seq 1 "$NUM_NODES"); do
+    result=$(curl -sf http://localhost:5099 -d '{"method":"validation_create"}')
+    seed=$(echo "$result" | jq -r '.result.validation_seed')
+    pubkey=$(echo "$result" | jq -r '.result.validation_public_key')
+    if [ -z "$seed" ] || [ "$seed" = "null" ]; then
+        kill "$TEMP_PID" 2>/dev/null || true
+        die "Failed to generate key pair $i"
+    fi
+    SEEDS+=("$seed")
+    PUBKEYS+=("$pubkey")
+    log "  Node $i: $pubkey"
+done
+
+kill "$TEMP_PID" 2>/dev/null || true
+wait "$TEMP_PID" 2>/dev/null || true
+rm -rf "$TEMP_DATA"
+log "Key generation complete."
+
+# ---------------------------------------------------------------------------
+# Step 4: Generate node configs and validators.txt
+# ---------------------------------------------------------------------------
+log "Generating node configs..."
+
+# Create shared validators.txt
+VALIDATORS_FILE="$WORKDIR/validators.txt"
+{
+    echo "[validators]"
+    for i in $(seq 0 $((NUM_NODES - 1))); do
+        echo "${PUBKEYS[$i]}"
+    done
+} > "$VALIDATORS_FILE"
+
+# Create per-node configs
+for i in $(seq 1 "$NUM_NODES"); do
+    NODE_DIR="$WORKDIR/node$i"
+    mkdir -p "$NODE_DIR/nudb" "$NODE_DIR/db"
+
+    RPC_PORT=$((RPC_PORT_BASE + i - 1))
+    PEER_PORT=$((PEER_PORT_BASE + i - 1))
+    SEED="${SEEDS[$((i - 1))]}"
+
+    # Build ips_fixed list (all peers except self)
+    IPS_FIXED=""
+    for j in $(seq 1 "$NUM_NODES"); do
+        if [ "$j" -ne "$i" ]; then
+            IPS_FIXED="${IPS_FIXED}127.0.0.1 $((PEER_PORT_BASE + j - 1))
+"
+        fi
+    done
+
+    cat > "$NODE_DIR/xrpld.cfg" <<EOCFG
+[server]
+port_rpc
+port_peer
+
+[port_rpc]
+port = $RPC_PORT
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[port_peer]
+port = $PEER_PORT
+ip = 0.0.0.0
+protocol = peer
+
+[node_db]
+type=NuDB
+path=$NODE_DIR/nudb
+online_delete=256
+
+[database_path]
+$NODE_DIR/db
+
+[debug_logfile]
+$NODE_DIR/debug.log
+
+[validation_seed]
+$SEED
+
+[validators_file]
+$VALIDATORS_FILE
+
+[ips_fixed]
+${IPS_FIXED}
+[peer_private]
+1
+
+[telemetry]
+enabled=1
+service_instance_id=Node-${i}
+endpoint=http://localhost:4318/v1/traces
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=2000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=0
+trace_ledger=1
+
+[rpc_startup]
+{ "command": "log_level", "severity": "warning" }
+
+[ssl_verify]
+0
+EOCFG
+
+    log "  Node $i config: RPC=$RPC_PORT, Peer=$PEER_PORT"
+done
+
+# ---------------------------------------------------------------------------
+# Step 5: Start all 6 nodes
+# ---------------------------------------------------------------------------
+log "Starting $NUM_NODES xrpld nodes..."
+
+for i in $(seq 1 "$NUM_NODES"); do
+    NODE_DIR="$WORKDIR/node$i"
+    "$XRPLD" --conf "$NODE_DIR/xrpld.cfg" --start > "$NODE_DIR/stdout.log" 2>&1 &
+    echo $! > "$NODE_DIR/xrpld.pid"
+    log "  Node $i started (PID $(cat "$NODE_DIR/xrpld.pid"))"
+done
+
+# Give nodes a moment to initialize
+sleep 5
+
+# ---------------------------------------------------------------------------
+# Step 6: Wait for consensus
+# ---------------------------------------------------------------------------
+log "Waiting for nodes to reach 'proposing' state (timeout: ${CONSENSUS_TIMEOUT}s)..."
+
+start_time=$(date +%s)
+nodes_ready=0
+
+while [ "$nodes_ready" -lt "$NUM_NODES" ]; do
+    elapsed=$(( $(date +%s) - start_time ))
+    if [ "$elapsed" -ge "$CONSENSUS_TIMEOUT" ]; then
+        log "Consensus timeout after ${CONSENSUS_TIMEOUT}s ($nodes_ready/$NUM_NODES nodes ready)"
+        log "Continuing with partial consensus..."
+        break
+    fi
+
+    nodes_ready=0
+    for i in $(seq 1 "$NUM_NODES"); do
+        RPC_PORT=$((RPC_PORT_BASE + i - 1))
+        state=$(curl -sf "http://localhost:$RPC_PORT" \
+            -d '{"method":"server_info"}' 2>/dev/null \
+            | jq -r '.result.info.server_state' 2>/dev/null || echo "unreachable")
+        if [ "$state" = "proposing" ]; then
+            nodes_ready=$((nodes_ready + 1))
+        fi
+    done
+    printf "\r  %d/%d nodes proposing (%ds elapsed)..." "$nodes_ready" "$NUM_NODES" "$elapsed"
+    if [ "$nodes_ready" -lt "$NUM_NODES" ]; then
+        sleep 3
+    fi
+done
+echo ""
+
+if [ "$nodes_ready" -eq "$NUM_NODES" ]; then
+    ok "All $NUM_NODES nodes reached 'proposing' state"
+else
+    fail "Only $nodes_ready/$NUM_NODES nodes reached 'proposing' state"
+fi
+
+# ---------------------------------------------------------------------------
+# Step 6b: Wait for validated ledger
+# ---------------------------------------------------------------------------
+log "Waiting for first validated ledger..."
+for attempt in $(seq 1 60); do
+    val_seq=$(curl -sf "http://localhost:$RPC_PORT_BASE" \
+        -d '{"method":"server_info"}' 2>/dev/null \
+        | jq -r '.result.info.validated_ledger.seq // 0' 2>/dev/null || echo 0)
+    if [ "$val_seq" -gt 2 ] 2>/dev/null; then
+        ok "First validated ledger: seq $val_seq"
+        break
+    fi
+    if [ "$attempt" -eq 60 ]; then
+        fail "No validated ledger after 60s"
+    fi
+    sleep 1
+done
+
+# ---------------------------------------------------------------------------
+# Step 7: Exercise RPC spans (Phase 2)
+# ---------------------------------------------------------------------------
+log "Exercising RPC spans..."
+
+curl -sf "http://localhost:$RPC_PORT_BASE" \
+    -d '{"method":"server_info"}' > /dev/null
+curl -sf "http://localhost:$RPC_PORT_BASE" \
+    -d '{"method":"server_state"}' > /dev/null
+curl -sf "http://localhost:$RPC_PORT_BASE" \
+    -d '{"method":"ledger","params":[{"ledger_index":"current"}]}' > /dev/null
+
+log "RPC commands sent. Waiting 5s for batch export..."
+sleep 5
+
+# ---------------------------------------------------------------------------
+# Step 8: Submit transaction (Phase 3)
+# ---------------------------------------------------------------------------
+log "Submitting Payment transaction..."
+
+# Generate a destination wallet
+log "  Generating destination wallet..."
+wallet_result=$(curl -sf "http://localhost:$RPC_PORT_BASE" \
+    -d '{"method":"wallet_propose"}')
+DEST_ACCOUNT=$(echo "$wallet_result" | jq -r '.result.account_id' 2>/dev/null)
+if [ -z "$DEST_ACCOUNT" ] || [ "$DEST_ACCOUNT" = "null" ]; then
+    fail "Could not generate destination wallet"
+    DEST_ACCOUNT="rrrrrrrrrrrrrrrrrrrrrhoLvTp"  # ACCOUNT_ZERO fallback
+fi
+log "  Destination: $DEST_ACCOUNT"
+
+# Get genesis account info
+acct_result=$(curl -sf "http://localhost:$RPC_PORT_BASE" \
+    -d "{\"method\":\"account_info\",\"params\":[{\"account\":\"$GENESIS_ACCOUNT\"}]}")
+seq_num=$(echo "$acct_result" | jq -r '.result.account_data.Sequence' 2>/dev/null || echo "unknown")
+log "  Genesis account sequence: $seq_num"
+
+# Submit payment
+submit_result=$(curl -sf "http://localhost:$RPC_PORT_BASE" -d "{
+  \"method\": \"submit\",
+  \"params\": [{
+    \"secret\": \"$GENESIS_SEED\",
+    \"tx_json\": {
+      \"TransactionType\": \"Payment\",
+      \"Account\": \"$GENESIS_ACCOUNT\",
+      \"Destination\": \"$DEST_ACCOUNT\",
+      \"Amount\": \"10000000\"
+    }
+  }]
+}")
+
+engine_result=$(echo "$submit_result" | jq -r '.result.engine_result' 2>/dev/null || echo "unknown")
+tx_hash=$(echo "$submit_result" | jq -r '.result.tx_json.hash' 2>/dev/null || echo "unknown")
+
+if [ "$engine_result" = "tesSUCCESS" ] || [ "$engine_result" = "terQUEUED" ]; then
+    ok "Transaction submitted: $engine_result (hash: ${tx_hash:0:16}...)"
+else
+    fail "Transaction submission: $engine_result"
+    log "  Full response: $(echo "$submit_result" | jq -c .result 2>/dev/null)"
+fi
+
+log "Waiting 15s for consensus round + batch export..."
+sleep 15
+
+# ---------------------------------------------------------------------------
+# Step 9: Verify Tempo traces
+# ---------------------------------------------------------------------------
+log "Verifying spans in Tempo..."
+
+# Check service registration
+services=$(curl -sf "$TEMPO/api/v2/search/tag/resource.service.name/values" \
+    | jq -r '.tagValues[].value' 2>/dev/null || echo "")
+if echo "$services" | grep -q "rippled"; then
+    ok "Service 'rippled' registered in Tempo"
+else
+    fail "Service 'rippled' NOT found in Tempo (found: $services)"
+fi
+
+log ""
+log "--- Phase 2: RPC Spans ---"
+check_span "rpc.request"
+check_span "rpc.process"
+check_span "rpc.command.server_info"
+check_span "rpc.command.server_state"
+check_span "rpc.command.ledger"
+
+log ""
+log "--- Phase 3: Transaction Spans ---"
+check_span "tx.process"
+check_span "tx.receive"
+
+log ""
+log "--- Phase 4: Consensus Spans ---"
+check_span "consensus.proposal.send"
+check_span "consensus.ledger_close"
+check_span "consensus.accept"
+check_span "consensus.validation.send"
+
+# ---------------------------------------------------------------------------
+# Step 10: Verify Prometheus spanmetrics
+# ---------------------------------------------------------------------------
+log ""
+log "--- Phase 5: Spanmetrics ---"
+log "Waiting 20s for Prometheus scrape cycle..."
+sleep 20
+
+calls_count=$(curl -sf "$PROM/api/v1/query?query=traces_span_metrics_calls_total" \
+    | jq '.data.result | length' 2>/dev/null || echo 0)
+if [ "$calls_count" -gt 0 ]; then
+    ok "Prometheus: traces_span_metrics_calls_total ($calls_count series)"
+else
+    fail "Prometheus: traces_span_metrics_calls_total (0 series)"
+fi
+
+duration_count=$(curl -sf "$PROM/api/v1/query?query=traces_span_metrics_duration_milliseconds_count" \
+    | jq '.data.result | length' 2>/dev/null || echo 0)
+if [ "$duration_count" -gt 0 ]; then
+    ok "Prometheus: duration histogram ($duration_count series)"
+else
+    fail "Prometheus: duration histogram (0 series)"
+fi
+
+# Check Grafana
+if curl -sf http://localhost:3000/api/health > /dev/null 2>&1; then
+    ok "Grafana: healthy at localhost:3000"
+else
+    fail "Grafana: not reachable at localhost:3000"
+fi
+
+# ---------------------------------------------------------------------------
+# Step 11: Summary
+# ---------------------------------------------------------------------------
+echo ""
+echo "==========================================================="
+echo "  INTEGRATION TEST RESULTS"
+echo "==========================================================="
+printf "  \033[1;32mPASSED: %d\033[0m\n" "$PASS"
+printf "  \033[1;31mFAILED: %d\033[0m\n" "$FAIL"
+echo "==========================================================="
+echo ""
+echo "  Observability stack is running:"
+echo ""
+echo "    Tempo:         http://localhost:3200"
+echo "    Grafana:       http://localhost:3000"
+echo "    Prometheus:    http://localhost:9090"
+echo ""
+echo "  xrpld nodes ($NUM_NODES) are running:"
+for i in $(seq 1 "$NUM_NODES"); do
+    RPC_PORT=$((RPC_PORT_BASE + i - 1))
+    PEER_PORT=$((PEER_PORT_BASE + i - 1))
+    echo "    Node $i: RPC=localhost:$RPC_PORT  Peer=:$PEER_PORT  PID=$(cat "$WORKDIR/node$i/xrpld.pid" 2>/dev/null || echo 'unknown')"
+done
+echo ""
+echo "  To tear down:"
+echo "    bash docker/telemetry/integration-test.sh --cleanup"
+echo ""
+echo "==========================================================="
+
+if [ "$FAIL" -gt 0 ]; then
+    exit 1
+fi
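
The script repeats the same poll-until-ready loop for the collector, Tempo, and the temporary xrpld. One way to factor that pattern out is a small retry helper. This is only a sketch: `wait_until` is a hypothetical name and is not defined anywhere in this change.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

Under these assumptions the Tempo readiness check would shrink to `wait_until 30 1 curl -sf "$TEMPO/ready" || die "Tempo not ready after 30s"`.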
--- a/docker/telemetry/otel-collector-config.yaml
+++ b/docker/telemetry/otel-collector-config.yaml
@@ -1,9 +1,12 @@
 # OpenTelemetry Collector configuration for xrpld development.
 #
-# Pipeline: OTLP receiver -> batch processor -> debug + Tempo.
+# Pipelines:
+#   traces: OTLP receiver -> batch processor -> debug + Tempo + spanmetrics
+#   metrics: spanmetrics connector -> Prometheus exporter
+#
 # xrpld sends traces via OTLP/HTTP to port 4318. The collector batches
-# them and forwards to Tempo via OTLP/gRPC on the Docker network. Tempo
-# is queryable via Grafana Explore using TraceQL.
+# them, forwards to Tempo, and derives RED metrics via the spanmetrics
+# connector, which Prometheus scrapes on port 8889.

 receivers:
  otlp:
@@ -18,6 +21,21 @@ processors:
    timeout: 1s
    send_batch_size: 100

+connectors:
+  spanmetrics:
+    # Expose service.instance.id (the configured service_instance_id, or the
+    # node public key by default) so dashboards can filter metrics by node.
+    resource_metrics_key_attributes:
+      - service.instance.id
+    histogram:
+      explicit:
+        buckets: [1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 5s]
+    dimensions:
+      - name: xrpl.rpc.command
+      - name: xrpl.rpc.status
+      - name: xrpl.consensus.mode
+      - name: xrpl.tx.local
+
 exporters:
  debug:
    verbosity: detailed
@@ -25,6 +43,8 @@ exporters:
    endpoint: tempo:4317
    tls:
      insecure: true
+  prometheus:
+    endpoint: 0.0.0.0:8889

 extensions:
  health_check:
@@ -36,4 +56,7 @@ service:
    traces:
      receivers: [otlp]
      processors: [batch]
-      exporters: [debug, otlp/tempo]
+      exporters: [debug, otlp/tempo, spanmetrics]
+    metrics:
+      receivers: [spanmetrics]
+      exporters: [prometheus]
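
The `dimensions` above use dotted OpenTelemetry attribute names, while the dashboards in this change query labels such as `xrpl_rpc_command` and `xrpl_tx_local`. The Prometheus exporter replaces characters that are not valid in label names with underscores. The sketch below approximates that mapping for the simple dotted names used here; `otel_to_prom_label` is an illustrative name and does not reproduce the exporter's full normalization rules.

```shell
# Hypothetical helper: approximate the label name Prometheus sees for a
# dotted span attribute by replacing each dot with an underscore.
otel_to_prom_label() {
    printf '%s\n' "$1" | tr '.' '_'
}

# Print the mapping for the dimensions configured on the spanmetrics connector.
for attr in xrpl.rpc.command xrpl.rpc.status xrpl.consensus.mode xrpl.tx.local; do
    printf '%s -> %s\n' "$attr" "$(otel_to_prom_label "$attr")"
done
```

This is handy when writing new panel queries: take the attribute name set on the span, convert it, and use the result as the label in PromQL.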
--- a/docker/telemetry/prometheus.yml
+++ b/docker/telemetry/prometheus.yml
@@ -0,0 +1,9 @@
+# Prometheus configuration for scraping spanmetrics from OTel Collector.
+global:
+  scrape_interval: 15s
+  evaluation_interval: 15s
+
+scrape_configs:
+  - job_name: otel-collector
+    static_configs:
+      - targets: ["otel-collector:8889"]
--- a/docker/telemetry/xrpld-telemetry.cfg
+++ b/docker/telemetry/xrpld-telemetry.cfg
@@ -0,0 +1,60 @@
+# Standalone xrpld configuration with OpenTelemetry enabled.
+#
+# Usage:
+#   1. Start the observability stack:
+#        docker compose -f docker/telemetry/docker-compose.yml up -d
+#   2. Run xrpld in standalone mode:
+#        ./xrpld --conf docker/telemetry/xrpld-telemetry.cfg -a --start
+#   3. Send RPC commands to exercise tracing:
+#        curl -s http://localhost:5005 -d '{"method":"server_info"}'
+#   4. View traces in Grafana Explore (Tempo datasource): http://localhost:3000
+
+[server]
+port_rpc_admin_local
+port_ws_admin_local
+
+[port_rpc_admin_local]
+port = 5005
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = http
+
+[port_ws_admin_local]
+port = 6006
+ip = 127.0.0.1
+admin = 127.0.0.1
+protocol = ws
+
+[node_db]
+type=NuDB
+path=docker/telemetry/data/nudb
+online_delete=256
+advisory_delete=0
+
+[database_path]
+docker/telemetry/data
+
+[debug_logfile]
+docker/telemetry/data/debug.log
+
+[rpc_startup]
+{ "command": "log_level", "severity": "debug" }
+
+[ssl_verify]
+0
+
+# --- OpenTelemetry tracing ---
+[telemetry]
+enabled=1
+service_instance_id=rippled-standalone
+endpoint=http://localhost:4318/v1/traces
+exporter=otlp_http
+sampling_ratio=1.0
+batch_size=512
+batch_delay_ms=5000
+max_queue_size=2048
+trace_rpc=1
+trace_transactions=1
+trace_consensus=1
+trace_peer=0
+trace_ledger=1
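
When debugging why spans are slow to appear, it helps to read the effective `[telemetry]` values straight from the config, for example `batch_delay_ms`, which is 5000 here but 2000 in the integration test's generated configs. The following is a minimal sketch of such a lookup; `cfg_get` is a hypothetical helper and only handles the simple `key=value` lines used in this file.

```shell
# Hypothetical helper: print the value of KEY inside [SECTION] of an
# xrpld-style config file. Prints nothing if the key is absent.
# Usage: cfg_get <file> <section> <key>
cfg_get() {
    local file="$1"
    local section="$2"
    local key="$3"
    awk -v section="[$section]" -v key="$key" '
        /^\[/ { in_section = ($0 == section); next }  # a new section starts
        !in_section { next }                          # ignore other sections
        /^[ \t]*#/ { next }                           # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0) next                         # not a key=value line
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)            # trim key
            gsub(/^[ \t]+|[ \t]+$/, "", v)            # trim value
            if (k == key) { print v; exit }
        }
    ' "$file"
}
```

For instance, `cfg_get docker/telemetry/xrpld-telemetry.cfg telemetry batch_delay_ms` reports the export delay to allow for before querying Tempo.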