Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Pratik Mankawde
2026-04-29 20:24:52 +01:00
1526 changed files with 75760 additions and 25422 deletions

View File

@@ -1,6 +1,6 @@
# OpenTelemetry Integration Testing Guide
This document describes how to verify the rippled OpenTelemetry telemetry
This document describes how to verify the xrpld OpenTelemetry telemetry
pipeline end-to-end, from span generation through the observability stack
(otel-collector, Tempo, Prometheus, Grafana).
@@ -118,25 +118,25 @@ Wait 5 seconds for the batch export, then:
```bash
TEMPO="http://localhost:3200"
# Check rippled service is registered
# Check xrpld service is registered
curl -s "$TEMPO/api/v2/search/tag/resource.service.name/values" | jq '.tagValues[].value'
# Check RPC spans
curl -s "$TEMPO/api/search" \
--data-urlencode 'q={resource.service.name="rippled" && name="rpc.request"}' \
--data-urlencode 'q={resource.service.name="xrpld" && name="rpc.request"}' \
--data-urlencode 'limit=5' | jq '.traces | length'
curl -s "$TEMPO/api/search" \
--data-urlencode 'q={resource.service.name="rippled" && name="rpc.process"}' \
--data-urlencode 'q={resource.service.name="xrpld" && name="rpc.process"}' \
--data-urlencode 'limit=5' | jq '.traces | length'
curl -s "$TEMPO/api/search" \
--data-urlencode 'q={resource.service.name="rippled" && name="rpc.command.server_info"}' \
--data-urlencode 'q={resource.service.name="xrpld" && name="rpc.command.server_info"}' \
--data-urlencode 'limit=5' | jq '.traces | length'
# Check transaction spans
curl -s "$TEMPO/api/search" \
--data-urlencode 'q={resource.service.name="rippled" && name="tx.process"}' \
--data-urlencode 'q={resource.service.name="xrpld" && name="tx.process"}' \
--data-urlencode 'limit=5' | jq '.traces | length'
```
@@ -374,21 +374,27 @@ See the "Verification Queries" section below.
## Expected Span Catalog
All 12 production span names instrumented across Phases 2-4:
All 16 production span names instrumented across Phases 2-5:
| Span Name | Source File | Phase | Key Attributes | How to Trigger |
| --------------------------- | --------------------- | ----- | --------------------------------------------------------------------------------- | ------------------------- |
| `rpc.request` | ServerHandler.cpp:271 | 2 | -- | Any HTTP RPC call |
| `rpc.process` | ServerHandler.cpp:573 | 2 | -- | Any HTTP RPC call |
| `rpc.ws_message` | ServerHandler.cpp:384 | 2 | -- | WebSocket RPC message |
| `rpc.command.<name>` | RPCHandler.cpp:161 | 2 | `xrpl.rpc.command`, `xrpl.rpc.version`, `xrpl.rpc.role` | Any RPC command |
| `tx.process` | NetworkOPs.cpp:1227 | 3 | `xrpl.tx.hash`, `xrpl.tx.local`, `xrpl.tx.path` | Submit transaction |
| `tx.receive` | PeerImp.cpp:1273 | 3 | `xrpl.peer.id` | Peer relays transaction |
| `consensus.proposal.send` | RCLConsensus.cpp:177 | 4 | `xrpl.consensus.round` | Consensus proposing phase |
| `consensus.ledger_close` | RCLConsensus.cpp:282 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.mode` | Ledger close event |
| `consensus.accept` | RCLConsensus.cpp:395 | 4 | `xrpl.consensus.proposers`, `xrpl.consensus.round_time_ms` | Ledger accepted |
| `consensus.validation.send` | RCLConsensus.cpp:753 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.proposing` | Validation sent |
| `consensus.accept.apply` | RCLConsensus.cpp:453 | 4 | `xrpl.consensus.close_time`, `close_time_correct`, `close_resolution_ms`, `state` | Ledger apply + close time |
| Span Name | Source File | Phase | Key Attributes | How to Trigger |
| --------------------------- | --------------------- | ----- | ---------------------------------------------------------------------------------------- | ------------------------- |
| `rpc.request` | ServerHandler.cpp:271 | 2 | -- | Any HTTP RPC call |
| `rpc.process` | ServerHandler.cpp:573 | 2 | -- | Any HTTP RPC call |
| `rpc.ws_message` | ServerHandler.cpp:384 | 2 | -- | WebSocket RPC message |
| `rpc.command.<name>` | RPCHandler.cpp:161 | 2 | `xrpl.rpc.command`, `xrpl.rpc.version`, `xrpl.rpc.role` | Any RPC command |
| `tx.process` | NetworkOPs.cpp:1227 | 3 | `xrpl.tx.hash`, `xrpl.tx.local`, `xrpl.tx.path` | Submit transaction |
| `tx.receive` | PeerImp.cpp:1273 | 3 | `xrpl.peer.id` | Peer relays transaction |
| `consensus.proposal.send` | RCLConsensus.cpp:177 | 4 | `xrpl.consensus.round` | Consensus proposing phase |
| `consensus.ledger_close` | RCLConsensus.cpp:282 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.mode` | Ledger close event |
| `consensus.accept` | RCLConsensus.cpp:395 | 4 | `xrpl.consensus.proposers`, `xrpl.consensus.round_time_ms` | Ledger accepted |
| `consensus.validation.send` | RCLConsensus.cpp:753 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.proposing` | Validation sent |
| `consensus.accept.apply` | RCLConsensus.cpp:453 | 4 | `xrpl.consensus.close_time`, `close_time_correct`, `close_resolution_ms`, `state` | Ledger apply + close time |
| `tx.apply` | BuildLedger.cpp:88 | 5 | `xrpl.ledger.tx_count`, `xrpl.ledger.tx_failed` | Ledger close (tx set) |
| `ledger.build` | BuildLedger.cpp:31 | 5 | `xrpl.ledger.seq`, `xrpl.ledger.close_time`, `close_time_correct`, `close_resolution_ms` | Ledger build |
| `ledger.validate` | LedgerMaster.cpp:915 | 5 | `xrpl.ledger.seq`, `xrpl.ledger.validations` | Ledger validated |
| `ledger.store` | LedgerMaster.cpp:409 | 5 | `xrpl.ledger.seq` | Ledger stored |
| `peer.proposal.receive` | PeerImp.cpp:1667 | 5 | `xrpl.peer.id`, `xrpl.peer.proposal.trusted` | Peer sends proposal |
| `peer.validation.receive` | PeerImp.cpp:2264 | 5 | `xrpl.peer.id`, `xrpl.peer.validation.trusted` | Peer sends validation |
---
@@ -407,12 +413,14 @@ curl -s "$TEMPO/api/v2/search/tag/resource.service.name/values" | jq '.tagValues
# Query traces by operation
for op in "rpc.request" "rpc.process" \
"rpc.command.server_info" "rpc.command.server_state" "rpc.command.ledger" \
"tx.process" "tx.receive" \
"tx.process" "tx.receive" "tx.apply" \
"consensus.proposal.send" "consensus.ledger_close" \
"consensus.accept" "consensus.accept.apply" \
"consensus.validation.send"; do
"consensus.validation.send" \
"ledger.build" "ledger.validate" "ledger.store" \
"peer.proposal.receive" "peer.validation.receive"; do
count=$(curl -s "$TEMPO/api/search" \
--data-urlencode "q={resource.service.name=\"rippled\" && name=\"$op\"}" \
--data-urlencode "q={resource.service.name=\"xrpld\" && name=\"$op\"}" \
--data-urlencode "limit=5" \
| jq '.traces | length')
printf "%-35s %s traces\n" "$op" "$count"
@@ -445,9 +453,11 @@ Open http://localhost:3000 (anonymous admin access enabled).
Pre-configured dashboards:
- **RPC Performance**: Request rates, latency percentiles by command
- **Transaction Overview**: Transaction processing rates and paths
- **Consensus Health**: Consensus round duration and proposer counts
- **RPC Performance**: Request rates, latency percentiles by command, top commands, WebSocket rate
- **Transaction Overview**: Transaction processing rates, apply duration, peer relay, failed tx rate
- **Consensus Health**: Consensus round duration, proposer counts, mode tracking, accept heatmap
- **Ledger Operations**: Build/validate/store rates and durations, TX apply metrics
- **Peer Network**: Proposal/validation receive rates, trusted vs untrusted breakdown (requires `trace_peer=1`)
Pre-configured datasources:
@@ -459,14 +469,14 @@ Pre-configured datasources:
## Test 3: Log-Trace Correlation (Phase 8)
Phase 8 injects `trace_id` and `span_id` into rippled's log output when
Phase 8 injects `trace_id` and `span_id` into xrpld's log output when
a log line is emitted within an active OTel span. This test verifies the
end-to-end log-trace correlation pipeline.
### Step 1: Verify trace_id in log output
After running Test 1 or Test 2 (which generate RPC spans), check the
rippled debug.log for trace context:
xrpld debug.log for trace context:
```bash
grep 'trace_id=[a-f0-9]\{32\} span_id=[a-f0-9]\{16\}' /path/to/debug.log
@@ -496,13 +506,13 @@ Expected result: `1` (the trace exists in Jaeger).
### Step 3: Verify Loki log ingestion
The OTel Collector's filelog receiver tails rippled's debug.log and
The OTel Collector's filelog receiver tails xrpld's debug.log and
exports parsed entries to Loki. Verify Loki has received entries:
```bash
# Query Loki for any rippled logs
# Query Loki for any xrpld logs
curl -sG "http://localhost:3100/loki/api/v1/query" \
--data-urlencode 'query={job="rippled"}' \
--data-urlencode 'query={job="xrpld"}' \
--data-urlencode 'limit=5' | jq '.data.result | length'
```
@@ -519,7 +529,7 @@ Expected: > 0 results.
### Step 5: Verify Grafana Loki-to-Tempo correlation
1. In Grafana **Explore**, select **Loki** datasource
2. Query: `{job="rippled"} |= "trace_id="`
2. Query: `{job="xrpld"} |= "trace_id="`
3. In the log results, click the **TraceID** derived field link
4. Verify it navigates to the full trace in Tempo
@@ -578,7 +588,7 @@ Expected: > 0 results.
### No trace_id in log output (Phase 8)
1. Verify rippled was built with `telemetry=ON` (`-Dtelemetry=ON` in CMake)
1. Verify xrpld was built with `telemetry=ON` (`-Dtelemetry=ON` in CMake)
2. Verify `enabled=1` in the `[telemetry]` config section
3. Log lines only contain trace context when emitted inside an active span.
Background logs (startup, periodic tasks outside spans) will not have

View File

@@ -1,7 +1,7 @@
# Docker Compose stack for rippled OpenTelemetry observability.
# Docker Compose stack for xrpld OpenTelemetry observability.
#
# Provides services for local development:
# - otel-collector: receives OTLP traces from rippled, batches and
# - otel-collector: receives OTLP traces from xrpld, batches and
# forwards them to Tempo. Listens on ports 4317 (gRPC)
# and 4318 (HTTP).
# - tempo: Grafana Tempo tracing backend, queryable via Grafana Explore
@@ -12,42 +12,45 @@
# Usage:
# docker compose -f docker/telemetry/docker-compose.yml up -d
#
# Configure rippled to export traces by adding to xrpld.cfg:
# Configure xrpld to export traces by adding to xrpld.cfg:
# [telemetry]
# enabled=1
# endpoint=http://localhost:4318/v1/traces
version: "3.8"
services:
# OpenTelemetry Collector: receives spans from xrpld via OTLP protocol,
# batches them for efficiency, and forwards to Tempo for storage.
otel-collector:
image: otel/opentelemetry-collector-contrib:latest
image: otel/opentelemetry-collector-contrib:0.121.0
command: ["--config=/etc/otel-collector-config.yaml"]
ports:
- "4317:4317" # OTLP gRPC
- "4318:4318" # OTLP HTTP (traces + native OTel metrics)
- "8889:8889" # Prometheus metrics (spanmetrics + OTLP)
- "4318:4318" # OTLP HTTP
- "8125:8125/udp" # StatsD UDP (beast::insight metrics)
- "8889:8889" # Prometheus metrics (spanmetrics + statsd)
- "13133:13133" # Health check
# StatsD UDP port removed — beast::insight now uses native OTLP.
# Uncomment if using server=statsd fallback:
# - "8125:8125/udp"
volumes:
# Mount collector pipeline config (receivers → processors → exporters)
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:ro
depends_on:
- tempo
networks:
- rippled-telemetry
- xrpld-telemetry
# Grafana Tempo: distributed tracing backend that stores and indexes
# spans. Queryable via TraceQL in Grafana Explore.
tempo:
image: grafana/tempo:2.7.2
command: ["-config.file=/etc/tempo.yaml"]
ports:
- "3200:3200" # Tempo HTTP API (health, query)
- "3200:3200" # Tempo HTTP API (health check, query)
volumes:
# Mount Tempo storage and ingestion config
- ./tempo.yaml:/etc/tempo.yaml:ro
# Persistent volume for trace data (WAL + blocks)
- tempo-data:/var/tempo
networks:
- rippled-telemetry
- xrpld-telemetry
prometheus:
image: prom/prometheus:latest
@@ -58,27 +61,35 @@ services:
depends_on:
- otel-collector
networks:
- rippled-telemetry
- xrpld-telemetry
# Grafana: visualization UI with Tempo pre-configured as a datasource.
# Anonymous admin access enabled for local development convenience.
grafana:
image: grafana/grafana:latest
image: grafana/grafana:11.5.2
environment:
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
- GF_AUTH_ANONYMOUS_ENABLED=true # No login required for local dev
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin # Full access without auth
ports:
- "3000:3000"
- "3000:3000" # Grafana web UI
volumes:
# Auto-provision Tempo datasource and search filters on startup
- ./grafana/provisioning:/etc/grafana/provisioning:ro
- ./grafana/dashboards:/var/lib/grafana/dashboards:ro
depends_on:
- tempo
- prometheus
networks:
- rippled-telemetry
- xrpld-telemetry
# Named volume for Tempo trace storage (WAL and compacted blocks).
# Data persists across container restarts. Remove with:
# docker compose -f docker/telemetry/docker-compose.yml down -v
volumes:
tempo-data:
# Isolated bridge network so services communicate by container name
# (e.g., the collector reaches Tempo at http://tempo:4317).
networks:
rippled-telemetry:
xrpld-telemetry:
driver: bridge

View File

@@ -29,14 +29,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept\"}[5m])))",
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.accept\"}[5m])))",
"legendFormat": "P95 Round Duration [{{exported_instance}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "histogram_quantile(0.50, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept\"}[5m])))",
"expr": "histogram_quantile(0.50, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.accept\"}[5m])))",
"legendFormat": "P50 Round Duration [{{exported_instance}}]"
}
],
@@ -75,7 +75,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.proposal.send\"}[5m]))",
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.proposal.send\"}[5m]))",
"legendFormat": "Proposals / Sec [{{exported_instance}}]"
}
],
@@ -114,7 +114,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.ledger_close\"}[5m])))",
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.ledger_close\"}[5m])))",
"legendFormat": "P95 Close Duration [{{exported_instance}}]"
}
],
@@ -153,7 +153,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.validation.send\"}[5m]))",
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.validation.send\"}[5m]))",
"legendFormat": "Validations / Sec [{{exported_instance}}]"
}
],
@@ -179,14 +179,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m])))",
"expr": "histogram_quantile(0.95, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.accept.apply\"}[5m])))",
"legendFormat": "P95 Apply Duration [{{exported_instance}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "histogram_quantile(0.50, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m])))",
"expr": "histogram_quantile(0.50, sum by (le, exported_instance) (rate(traces_span_metrics_duration_milliseconds_bucket{exported_instance=~\"$node\", xrpl_consensus_mode=~\"$consensus_mode\", span_name=\"consensus.accept.apply\"}[5m])))",
"legendFormat": "P50 Apply Duration [{{exported_instance}}]"
}
],
@@ -219,8 +219,8 @@
"datasource": {
"type": "prometheus"
},
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"consensus.accept.apply\"}[5m]))",
"legendFormat": "Total Rounds / Sec [{{exported_instance}}]"
"expr": "sum by (xrpl_consensus_close_time_correct, exported_instance) (rate(traces_span_metrics_calls_total{span_name=\"consensus.accept.apply\", xrpl_consensus_mode=~\"$consensus_mode\", exported_instance=~\"$node\"}[$__rate_interval]))",
"legendFormat": "Close Time Correct={{xrpl_consensus_close_time_correct}} [{{exported_instance}}]"
}
],
"fieldConfig": {
@@ -397,6 +397,263 @@
"format": "heatmap"
}
]
},
{
"title": "Close Time: Raw Proposals (Per Node)",
"description": "Each node's raw proposed close time (xrpl.consensus.close_time_self) \u2014 the unrounded wall clock value at the moment the node closed its ledger. Compare across nodes to see clock drift.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 40
},
"fieldConfig": {
"defaults": {
"unit": "dateTimeFromNow",
"custom": {
"drawStyle": "points",
"pointSize": 6,
"showPoints": "always"
}
},
"overrides": []
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "bottom",
"calcs": ["lastNotNull"]
}
},
"targets": [
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\"} | select(span.xrpl.consensus.close_time_self)",
"refId": "A"
}
]
},
{
"title": "Close Time: Effective / Quantized",
"description": "The consensus-agreed close time after rounding to the current resolution bin (xrpl.consensus.close_time). This is the value written to the ledger header. All nodes in agreement produce the same value.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 40
},
"fieldConfig": {
"defaults": {
"unit": "dateTimeFromNow",
"custom": {
"drawStyle": "points",
"pointSize": 6,
"showPoints": "always"
}
},
"overrides": []
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "bottom",
"calcs": ["lastNotNull"]
}
},
"targets": [
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\"} | select(span.xrpl.consensus.close_time)",
"refId": "A"
}
]
},
{
"title": "Close Time Vote Bins & Resolution",
"description": "Number of distinct close time vote bins (xrpl.consensus.close_time_vote_bins) and the bin size / resolution in ms (xrpl.consensus.close_resolution_ms). More bins = more clock disagreement. Resolution adapts: finer (10s) when validators agree, coarser (120s) when they disagree.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 48
},
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "line",
"lineInterpolation": "stepAfter",
"pointSize": 5,
"showPoints": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Vote Bins"
},
"properties": [
{
"id": "unit",
"value": "short"
},
{
"id": "custom.axisPlacement",
"value": "left"
}
]
},
{
"matcher": {
"id": "byName",
"options": "Resolution"
},
"properties": [
{
"id": "unit",
"value": "ms"
},
{
"id": "custom.axisPlacement",
"value": "right"
}
]
}
]
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "bottom",
"calcs": ["mean", "max"]
}
},
"targets": [
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\"} | select(span.xrpl.consensus.close_time_vote_bins)",
"refId": "A"
},
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\"} | select(span.xrpl.consensus.close_resolution_ms)",
"refId": "B"
}
]
},
{
"title": "Close Time Resolution Direction",
"description": "Whether close time resolution increased (coarser bins, more disagreement), decreased (finer bins, better agreement), or stayed unchanged relative to the previous ledger. Based on xrpl.consensus.resolution_direction attribute.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 48
},
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "bars",
"fillOpacity": 40,
"pointSize": 5,
"showPoints": "auto"
}
},
"overrides": []
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "bottom",
"calcs": ["lastNotNull"]
}
},
"targets": [
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\" && span.xrpl.consensus.resolution_direction=~\"$resolution_direction\"} | select(span.xrpl.consensus.resolution_direction)",
"refId": "A"
}
]
},
{
"title": "Close Time Bin Distribution",
"description": "Distribution of raw proposed close times across quantized bins. Shows how many nodes' proposals landed in each resolution bin per consensus round. A single dominant bin indicates good clock agreement; spread across bins indicates drift or network latency.",
"type": "barchart",
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 56
},
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"fillOpacity": 60
}
},
"overrides": []
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "bottom",
"calcs": ["sum"]
},
"xTickLabelRotation": -45,
"barWidth": 0.8,
"stacking": "normal"
},
"targets": [
{
"datasource": {
"type": "tempo"
},
"queryType": "traceql",
"query": "{name=\"consensus.accept.apply\" && resource.service.instance.id=~\"$node\" && span.xrpl.consensus.close_time_correct=~\"$close_time_correct\"} | select(span.xrpl.consensus.close_time, span.xrpl.consensus.close_time_vote_bins)",
"refId": "A"
}
]
}
],
"schemaVersion": 39,
@@ -406,7 +663,7 @@
{
"name": "node",
"label": "Node",
"description": "Filter by rippled node (service.instance.id e.g. Node-1)",
"description": "Filter by rippled node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(traces_span_metrics_calls_total, exported_instance)",
"datasource": {
@@ -442,6 +699,71 @@
"multi": true,
"refresh": 2,
"sort": 1
},
{
"name": "close_time_correct",
"label": "Close Time Agreed",
"type": "custom",
"query": "true,false",
"current": {
"text": "All",
"value": "$__all"
},
"includeAll": true,
"allValue": ".*",
"multi": true,
"options": [
{
"text": "All",
"value": "$__all",
"selected": true
},
{
"text": "true",
"value": "true",
"selected": false
},
{
"text": "false",
"value": "false",
"selected": false
}
]
},
{
"name": "resolution_direction",
"label": "Resolution Direction",
"type": "custom",
"query": "increased,decreased,unchanged",
"current": {
"text": "All",
"value": "$__all"
},
"includeAll": true,
"allValue": ".*",
"multi": true,
"options": [
{
"text": "All",
"value": "$__all",
"selected": true
},
{
"text": "increased",
"value": "increased",
"selected": false
},
{
"text": "decreased",
"value": "decreased",
"selected": false
},
{
"text": "unchanged",
"value": "unchanged",
"selected": false
}
]
}
]
},

View File

@@ -0,0 +1,506 @@
{
"annotations": {
"list": []
},
"description": "Ledger data exchange and object fetch traffic from beast::insight StatsD. Covers ledger sync, node data retrieval, and transaction set exchange. Requires [insight] server=statsd in rippled config.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"title": "Ledger Data Exchange (Bytes In)",
"description": "Inbound bytes for ledger data sub-categories. 'ledger_data' = aggregated ledger data, sub-types include Transaction_Set_candidate (proposed tx sets), Transaction_Node (tx tree nodes), and Account_State_Node (state tree nodes). High Account_State_Node traffic indicates state sync; high Transaction_Set_candidate indicates consensus catch-up. Sourced from TrafficCount.h ledger_data_* categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_get_Bytes_In",
"legendFormat": "Ledger Data Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_share_Bytes_In",
"legendFormat": "Ledger Data Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Set_candidate_get_Bytes_In",
"legendFormat": "TX Set Candidate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Set_candidate_share_Bytes_In",
"legendFormat": "TX Set Candidate Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Node_get_Bytes_In",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Node_share_Bytes_In",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Account_State_Node_get_Bytes_In",
"legendFormat": "Account State Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Account_State_Node_share_Bytes_In",
"legendFormat": "Account State Node Share"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes In",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Ledger Share/Get Traffic (Bytes)",
"description": "Legacy ledger share and get traffic by sub-type. These are the older ledger fetch protocol categories (as opposed to ledger_data_* which is the newer protocol). Sub-types: Transaction_Set_candidate, Transaction_node, Account_State_node, plus aggregate ledger_share and ledger_get. Sourced from TrafficCount.h ledger_* categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_share_Bytes_In",
"legendFormat": "Ledger Share In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_get_Bytes_In",
"legendFormat": "Ledger Get In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_Set_candidate_share_Bytes_In",
"legendFormat": "TX Set Candidate Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_Set_candidate_get_Bytes_In",
"legendFormat": "TX Set Candidate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_node_share_Bytes_In",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_node_get_Bytes_In",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Account_State_node_share_Bytes_In",
"legendFormat": "Account State Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Account_State_node_get_Bytes_In",
"legendFormat": "Account State Get"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes In",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "GetObject Traffic by Type (Bytes In)",
"description": "Object fetch traffic by object type. GetObject is the protocol for fetching specific SHAMap nodes. Types: Ledger (full ledger headers), Transaction (individual txs), Transaction_node (tx tree nodes), Account_State_node (state tree nodes), CAS (Content Addressable Storage objects), Fetch_Pack (batch fetch during catch-up), Transactions (bulk tx fetch). High Fetch_Pack traffic indicates a node is catching up. Sourced from TrafficCount.h getobject_* categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_get_Bytes_In",
"legendFormat": "Ledger Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_share_Bytes_In",
"legendFormat": "Ledger Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_get_Bytes_In",
"legendFormat": "Transaction Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_share_Bytes_In",
"legendFormat": "Transaction Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_get_Bytes_In",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_share_Bytes_In",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_get_Bytes_In",
"legendFormat": "Account State Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_share_Bytes_In",
"legendFormat": "Account State Share"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes In",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "GetObject Aggregate & Special Types (Bytes In)",
"description": "Aggregate getobject traffic plus special categories: CAS (Content Addressable Storage) for SHAMap node fetch, Fetch_Pack for bulk batch downloads during catch-up, Transactions for bulk tx fetch, and the aggregate getobject_get/getobject_share totals. Sourced from TrafficCount.h getobject_* categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_get_Bytes_In",
"legendFormat": "CAS Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_share_Bytes_In",
"legendFormat": "CAS Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_share_Bytes_In",
"legendFormat": "Fetch Pack Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_get_Bytes_In",
"legendFormat": "Fetch Pack Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transactions_get_Bytes_In",
"legendFormat": "Transactions Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_get_Bytes_In",
"legendFormat": "Aggregate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_share_Bytes_In",
"legendFormat": "Aggregate Share"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes In",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "GetObject Messages by Type",
"description": "Message counts for object fetch operations. Shows how many individual fetch requests and responses are exchanged per type. High message counts with low byte counts indicate small object fetches; the inverse indicates large batch transfers. Sourced from TrafficCount.h getobject_* categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 16
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_get_Messages_In",
"legendFormat": "Ledger Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_get_Messages_In",
"legendFormat": "Transaction Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_get_Messages_In",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_get_Messages_In",
"legendFormat": "Account State Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_get_Messages_In",
"legendFormat": "CAS Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_get_Messages_In",
"legendFormat": "Fetch Pack Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transactions_get_Messages_In",
"legendFormat": "Transactions Get"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Messages In",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Overlay Traffic Heatmap (All Categories, Bytes In)",
"description": "Bar gauge showing all overlay traffic categories ranked by inbound bytes. Provides a complete at-a-glance view of which protocol message types consume the most bandwidth across all 57+ traffic categories. Sourced from all TrafficCount.h categories via wildcard match.",
"type": "bargauge",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 16
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"displayMode": "gradient",
"orientation": "horizontal",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "topk(20, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"})",
"legendFormat": "{{__name__}}"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1048576
},
{
"color": "red",
"value": 104857600
}
]
}
},
"overrides": []
}
}
],
"schemaVersion": 39,
"tags": ["rippled", "statsd", "ledger", "sync", "telemetry"],
"templating": {
"list": []
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Ledger Data & Sync (StatsD)",
"uid": "rippled-statsd-ledger-sync"
}

View File

@@ -1,5 +1,7 @@
{
"annotations": { "list": [] },
"annotations": {
"list": []
},
"description": "Network traffic and peer metrics from beast::insight StatsD. Requires [insight] server=statsd in rippled config.",
"editable": true,
"fiscalYearStartMonth": 0,
@@ -11,18 +13,30 @@
"title": "Active Peers",
"description": "Number of active inbound and outbound peer connections. Sourced from Peer_Finder.Active_Inbound_Peers and Peer_Finder.Active_Outbound_Peers gauges (PeerfinderManager.cpp:214-215). A healthy mainnet node typically has 10-21 outbound and 0-85 inbound peers depending on configuration.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Peer_Finder_Active_Inbound_Peers",
"legendFormat": "Inbound Peers"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Peer_Finder_Active_Outbound_Peers",
"legendFormat": "Outbound Peers"
}
@@ -45,13 +59,23 @@
"title": "Peer Disconnects",
"description": "Cumulative count of peer disconnections. Sourced from the Overlay.Peer_Disconnects gauge (OverlayImpl.h:557). A rising trend indicates network instability, aggressive peer management, or resource exhaustion causing connection drops.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Overlay_Peer_Disconnects",
"legendFormat": "Disconnects"
}
@@ -72,29 +96,41 @@
},
{
"title": "Total Network Bytes",
"description": "Total bytes sent and received across all peer connections. Sourced from the total.Bytes_In and total.Bytes_Out traffic category gauges (OverlayImpl.h:535-548). Provides a high-level view of network bandwidth consumption.",
"description": "Rate of total bytes sent and received across all peer connections. Sourced from the total.Bytes_In and total.Bytes_Out traffic category gauges (OverlayImpl.h:535-548). Wrapped in rate() to show throughput rather than cumulative counter values.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 8 },
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"expr": "rippled_total_Bytes_In",
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_total_Bytes_In[5m])",
"legendFormat": "Bytes In"
},
{
"datasource": { "type": "prometheus" },
"expr": "rippled_total_Bytes_Out",
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_total_Bytes_Out[5m])",
"legendFormat": "Bytes Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"unit": "Bps",
"custom": {
"axisLabel": "Bytes",
"axisLabel": "Throughput",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
@@ -108,18 +144,30 @@
"title": "Total Network Messages",
"description": "Total messages sent and received across all peer connections. Sourced from the total.Messages_In and total.Messages_Out traffic category gauges (OverlayImpl.h:535-548). Shows the overall message throughput of the overlay network.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 8 },
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_total_Messages_In",
"legendFormat": "Messages In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_total_Messages_Out",
"legendFormat": "Messages Out"
}
@@ -142,23 +190,37 @@
"title": "Transaction Traffic",
"description": "Bytes and messages for transaction-related overlay traffic. Includes the transactions traffic category (OverlayImpl/TrafficCount.h). Spikes indicate high transaction volume on the network or transaction flooding.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 16 },
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 16
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_Messages_In",
"legendFormat": "TX Messages In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_Messages_Out",
"legendFormat": "TX Messages Out"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_duplicate_Messages_In",
"legendFormat": "TX Duplicate In"
}
@@ -181,28 +243,44 @@
"title": "Proposal Traffic",
"description": "Messages for consensus proposal overlay traffic. Includes proposals, proposals_untrusted, and proposals_duplicate categories (TrafficCount.h). High untrusted or duplicate counts may indicate UNL misconfiguration or network spam.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 16 },
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 16
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_Messages_In",
"legendFormat": "Proposals In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_Messages_Out",
"legendFormat": "Proposals Out"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_untrusted_Messages_In",
"legendFormat": "Untrusted In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_duplicate_Messages_In",
"legendFormat": "Duplicate In"
}
@@ -225,28 +303,44 @@
"title": "Validation Traffic",
"description": "Messages for validation overlay traffic. Includes validations, validations_untrusted, and validations_duplicate categories (TrafficCount.h). Monitoring trusted vs untrusted validation traffic helps detect UNL health issues.",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 24 },
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 24
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_Messages_In",
"legendFormat": "Validations In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_Messages_Out",
"legendFormat": "Validations Out"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_untrusted_Messages_In",
"legendFormat": "Untrusted In"
},
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_duplicate_Messages_In",
"legendFormat": "Duplicate In"
}
@@ -269,13 +363,23 @@
"title": "Overlay Traffic by Category (Bytes In)",
"description": "Top traffic categories by inbound bytes. Includes all 57 overlay traffic categories from TrafficCount.h. Shows which protocol message types consume the most bandwidth. Categories include transactions, proposals, validations, ledger data, getobject, and overlay overhead.",
"type": "bargauge",
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 24 },
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 24
},
"options": {
"tooltip": { "mode": "multi", "sort": "desc" }
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": { "type": "prometheus" },
"datasource": {
"type": "prometheus"
},
"expr": "topk(10, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"})",
"legendFormat": "{{__name__}}"
}
@@ -290,78 +394,144 @@
"id": "byName",
"options": "rippled_transactions_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Transactions" }]
"properties": [
{
"id": "displayName",
"value": "Transactions"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_proposals_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Proposals" }]
"properties": [
{
"id": "displayName",
"value": "Proposals"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_validations_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Validations" }]
"properties": [
{
"id": "displayName",
"value": "Validations"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_overhead_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Overhead" }]
"properties": [
{
"id": "displayName",
"value": "Overhead"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_overhead_overlay_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Overhead Overlay" }]
"properties": [
{
"id": "displayName",
"value": "Overhead Overlay"
}
]
},
{
"matcher": { "id": "byName", "options": "rippled_ping_Bytes_In" },
"properties": [{ "id": "displayName", "value": "Ping" }]
"matcher": {
"id": "byName",
"options": "rippled_ping_Bytes_In"
},
"properties": [
{
"id": "displayName",
"value": "Ping"
}
]
},
{
"matcher": { "id": "byName", "options": "rippled_status_Bytes_In" },
"properties": [{ "id": "displayName", "value": "Status" }]
"matcher": {
"id": "byName",
"options": "rippled_status_Bytes_In"
},
"properties": [
{
"id": "displayName",
"value": "Status"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_getObject_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Get Object" }]
"properties": [
{
"id": "displayName",
"value": "Get Object"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_haveTxSet_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Have Tx Set" }]
"properties": [
{
"id": "displayName",
"value": "Have Tx Set"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_ledgerData_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Ledger Data" }]
"properties": [
{
"id": "displayName",
"value": "Ledger Data"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_ledger_share_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Ledger Share" }]
"properties": [
{
"id": "displayName",
"value": "Ledger Share"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rippled_ledger_data_get_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Ledger Data Get" }]
"properties": [
{
"id": "displayName",
"value": "Ledger Data Get"
}
]
},
{
"matcher": {
@@ -369,7 +539,10 @@
"options": "rippled_ledger_data_share_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Ledger Data Share" }
{
"id": "displayName",
"value": "Ledger Data Share"
}
]
},
{
@@ -378,7 +551,10 @@
"options": "rippled_ledger_data_Account_State_Node_get_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Account State Node Get" }
{
"id": "displayName",
"value": "Account State Node Get"
}
]
},
{
@@ -387,7 +563,10 @@
"options": "rippled_ledger_data_Account_State_Node_share_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Account State Node Share" }
{
"id": "displayName",
"value": "Account State Node Share"
}
]
},
{
@@ -396,7 +575,10 @@
"options": "rippled_ledger_data_Transaction_Node_get_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Transaction Node Get" }
{
"id": "displayName",
"value": "Transaction Node Get"
}
]
},
{
@@ -405,7 +587,10 @@
"options": "rippled_ledger_data_Transaction_Node_share_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Transaction Node Share" }
{
"id": "displayName",
"value": "Transaction Node Share"
}
]
},
{
@@ -414,7 +599,10 @@
"options": "rippled_ledger_data_Transaction_Set_candidate_get_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Tx Set Candidate Get" }
{
"id": "displayName",
"value": "Tx Set Candidate Get"
}
]
},
{
@@ -435,7 +623,10 @@
"options": "rippled_ledger_Transaction_Set_candidate_share_Bytes_In"
},
"properties": [
{ "id": "displayName", "value": "Tx Set Candidate Share" }
{
"id": "displayName",
"value": "Tx Set Candidate Share"
}
]
},
{
@@ -455,16 +646,139 @@
"id": "byName",
"options": "rippled_set_get_Bytes_In"
},
"properties": [{ "id": "displayName", "value": "Set Get" }]
"properties": [
{
"id": "displayName",
"value": "Set Get"
}
]
}
]
}
},
{
"title": "Duplicate Traffic (Wasted Bandwidth)",
"description": "Rate of duplicate overlay traffic across transaction, proposal, and validation categories. Duplicate messages are messages the node has already seen and discards. High duplicate rates indicate inefficient message routing or network topology issues causing redundant relays.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 32
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_transactions_duplicate_Bytes_In[5m])",
"legendFormat": "TX Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_transactions_duplicate_Bytes_Out[5m])",
"legendFormat": "TX Duplicate Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_proposals_duplicate_Bytes_In[5m])",
"legendFormat": "Proposals Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_proposals_duplicate_Bytes_Out[5m])",
"legendFormat": "Proposals Duplicate Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_validations_duplicate_Bytes_In[5m])",
"legendFormat": "Validations Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_validations_duplicate_Bytes_Out[5m])",
"legendFormat": "Validations Duplicate Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "Bps",
"custom": {
"axisLabel": "Throughput",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "All Traffic Categories (Detail)",
"description": "Top 15 traffic categories by inbound byte rate, excluding the total aggregate. Provides a detailed timeseries view of which overlay message types are consuming the most bandwidth over time. Complements the bar gauge snapshot view in the Overlay Traffic panel.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 32
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "topk(15, rate({__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"}[5m]))",
"legendFormat": "{{__name__}}"
}
],
"fieldConfig": {
"defaults": {
"unit": "Bps",
"custom": {
"axisLabel": "Throughput",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
}
],
"schemaVersion": 39,
"tags": ["rippled", "statsd", "network", "telemetry"],
"templating": { "list": [] },
"time": { "from": "now-1h", "to": "now" },
"title": "rippled Network Traffic (StatsD)",
"templating": {
"list": []
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Network Traffic (StatsD)",
"uid": "rippled-statsd-network"
}

View File

@@ -287,7 +287,7 @@
},
{
"title": "Job Queue Depth",
"description": "Current number of jobs waiting in the job queue. Sourced from the job_count gauge (JobQueue.cpp:26). A sustained high value indicates the node cannot process work fast enough \u2014 common during ledger replay or heavy RPC load.",
"description": "Current number of jobs waiting in the job queue. Sourced from the job_count gauge (JobQueue.cpp:26). A sustained high value indicates the node cannot process work fast enough common during ledger replay or heavy RPC load.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -399,17 +399,532 @@
},
"overrides": []
}
},
{
"title": "Key Jobs Execution Time",
"description": "Execution time for critical job types at the selected quantile. Sourced from per-job-type events in JobTypeData (JobTypeData.h:48). Shows how long key consensus, transaction, and maintenance jobs take to execute. Spikes indicate processing bottlenecks.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 32
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_acceptLedger{quantile=\"$quantile\"}",
"legendFormat": "Accept Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_advanceLedger{quantile=\"$quantile\"}",
"legendFormat": "Advance Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transaction{quantile=\"$quantile\"}",
"legendFormat": "Transaction [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_writeObjects{quantile=\"$quantile\"}",
"legendFormat": "Write Objects [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_heartbeat{quantile=\"$quantile\"}",
"legendFormat": "Heartbeat [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_sweep{quantile=\"$quantile\"}",
"legendFormat": "Sweep [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedValidation{quantile=\"$quantile\"}",
"legendFormat": "Trusted Validation [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedProposal{quantile=\"$quantile\"}",
"legendFormat": "Trusted Proposal [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_publishNewLedger{quantile=\"$quantile\"}",
"legendFormat": "Publish New Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_clientRPC{quantile=\"$quantile\"}",
"legendFormat": "Client RPC [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledgerData{quantile=\"$quantile\"}",
"legendFormat": "Ledger Data [{{quantile}}]"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"custom": {
"axisLabel": "Duration (ms)",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Key Jobs Dequeue Wait Time",
"description": "Time spent waiting in the job queue before execution for critical job types. Sourced from per-job-type dequeue events (JobTypeData.h:47). High dequeue times indicate the job queue is backlogged and jobs are waiting too long to be scheduled.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 32
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_acceptLedger_q{quantile=\"$quantile\"}",
"legendFormat": "Accept Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_advanceLedger_q{quantile=\"$quantile\"}",
"legendFormat": "Advance Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transaction_q{quantile=\"$quantile\"}",
"legendFormat": "Transaction [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_writeObjects_q{quantile=\"$quantile\"}",
"legendFormat": "Write Objects [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_heartbeat_q{quantile=\"$quantile\"}",
"legendFormat": "Heartbeat [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_sweep_q{quantile=\"$quantile\"}",
"legendFormat": "Sweep [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedValidation_q{quantile=\"$quantile\"}",
"legendFormat": "Trusted Validation [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedProposal_q{quantile=\"$quantile\"}",
"legendFormat": "Trusted Proposal [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_publishNewLedger_q{quantile=\"$quantile\"}",
"legendFormat": "Publish New Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_clientRPC_q{quantile=\"$quantile\"}",
"legendFormat": "Client RPC [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledgerData_q{quantile=\"$quantile\"}",
"legendFormat": "Ledger Data [{{quantile}}]"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"custom": {
"axisLabel": "Wait Time (ms)",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "FullBelowCache Size",
"description": "Number of entries in the FullBelowCache. Sourced from the TaggedCache size gauge (TaggedCache.h:183) for the Node family full below cache (NodeFamily.cpp:29). This cache tracks which SHAMap nodes have all children present locally, avoiding redundant fetches during ledger acquisition.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 40
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Node_family_full_below_cache_size",
"legendFormat": "FullBelowCache Size"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Entries",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "FullBelowCache Hit Rate",
"description": "Hit rate percentage for the FullBelowCache. Sourced from the TaggedCache hit_rate gauge (TaggedCache.h:184). A high hit rate means the node is efficiently reusing cached knowledge about complete SHAMap subtrees. Low hit rates during steady state warrant investigation.",
"type": "gauge",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 40
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Node_family_full_below_cache_hit_rate",
"legendFormat": "Hit Rate"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"min": 0,
"max": 100,
"thresholds": {
"steps": [
{
"color": "red",
"value": null
},
{
"color": "yellow",
"value": 25
},
{
"color": "green",
"value": 50
}
]
}
},
"overrides": []
}
},
{
"title": "Ledger Publish Gap",
"description": "Difference between published and validated ledger ages. Computed as Published_Ledger_Age minus Validated_Ledger_Age. A value near zero means the publish pipeline keeps up with validation. A growing gap indicates the publish pipeline is falling behind, potentially causing stale data for subscribers.",
"type": "stat",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 48
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_LedgerMaster_Published_Ledger_Age - rippled_LedgerMaster_Validated_Ledger_Age",
"legendFormat": "Publish Gap"
}
],
"fieldConfig": {
"defaults": {
"unit": "s",
"thresholds": {
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 10
}
]
}
},
"overrides": []
}
},
{
"title": "State Duration Rate (Full vs Tracking)",
"description": "Rate of change of time spent in Full and Tracking operating modes, normalized to seconds. Sourced from State_Accounting duration gauges (NetworkOPs.cpp:774-778). In steady state the Full duration rate should be close to 1.0 (gaining one second of Full-mode time per wall-clock second). A drop below 1.0 means the node is spending time in other modes.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 48
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_State_Accounting_Full_duration[5m]) / 1000000",
"legendFormat": "Full Mode Rate"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_State_Accounting_Tracking_duration[5m]) / 1000000",
"legendFormat": "Tracking Mode Rate"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Rate (s/s)",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "All Jobs Execution Time (Detail)",
"description": "Execution time for ALL non-special job types at the selected quantile. Shows the complete picture of job execution performance. Use the Key Jobs panel for a focused view of the most critical jobs.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 56
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "{__name__=~\"rippled_(makeFetchPack|publishAcqLedger|untrustedValidation|manifest|localTransaction|ledgerReplayRequest|ledgerRequest|untrustedProposal|ledgerReplayTask|ledgerData|clientCommand|clientSubscribe|clientFeeChange|clientConsensus|clientAccountHistory|clientRPC|clientWebsocket|RPC|updatePaths|transaction|batch|advanceLedger|publishNewLedger|fetchTxnData|writeAhead|trustedValidation|writeObjects|acceptLedger|trustedProposal|sweep|clusterReport|heartbeat|administration|handleHaveTransactions|doTransactions)\", quantile=\"$quantile\"}",
"legendFormat": "{{__name__}} [{{quantile}}]"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"custom": {
"axisLabel": "Duration (ms)",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "All Jobs Dequeue Wait (Detail)",
"description": "Dequeue wait time for ALL non-special job types at the selected quantile. Shows the complete picture of job queue waiting times. High wait times across many job types indicate systemic job queue congestion.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 64
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "{__name__=~\"rippled_(makeFetchPack_q|publishAcqLedger_q|untrustedValidation_q|manifest_q|localTransaction_q|ledgerReplayRequest_q|ledgerRequest_q|untrustedProposal_q|ledgerReplayTask_q|ledgerData_q|clientCommand_q|clientSubscribe_q|clientFeeChange_q|clientConsensus_q|clientAccountHistory_q|clientRPC_q|clientWebsocket_q|RPC_q|updatePaths_q|transaction_q|batch_q|advanceLedger_q|publishNewLedger_q|fetchTxnData_q|writeAhead_q|trustedValidation_q|writeObjects_q|acceptLedger_q|trustedProposal_q|sweep_q|clusterReport_q|heartbeat_q|administration_q|handleHaveTransactions_q|doTransactions_q)\", quantile=\"$quantile\"}",
"legendFormat": "{{__name__}} [{{quantile}}]"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"custom": {
"axisLabel": "Wait Time (ms)",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
}
],
"schemaVersion": 39,
"tags": ["rippled", "statsd", "node-health", "telemetry"],
"templating": {
"list": []
"list": [
{
"name": "quantile",
"label": "Quantile",
"type": "custom",
"query": "0.5,0.9,0.95,0.99",
"current": {
"selected": true,
"text": "0.95",
"value": "0.95"
},
"options": [
{
"selected": false,
"text": "0.5",
"value": "0.5"
},
{
"selected": false,
"text": "0.9",
"value": "0.9"
},
{
"selected": true,
"text": "0.95",
"value": "0.95"
},
{
"selected": false,
"text": "0.99",
"value": "0.99"
}
],
"multi": false,
"includeAll": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "rippled Node Health (StatsD)",
"title": "Node Health (StatsD)",
"uid": "rippled-statsd-node-health"
}

View File

@@ -0,0 +1,566 @@
{
"annotations": {
"list": []
},
"description": "Detailed overlay traffic breakdown for categories not covered by the main Network Traffic dashboard. Includes squelch, overhead, validator lists, object fetch, ledger sync, and protocol negotiation traffic. Requires [insight] server=statsd in rippled config.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"title": "Squelch Traffic (Messages)",
"description": "Squelch-related overlay messages. Squelch is the peer traffic management protocol that suppresses redundant message forwarding. 'squelch' = squelch control messages, 'squelch_suppressed' = messages suppressed by squelch, 'squelch_ignored' = squelch directives that were ignored. High suppressed counts indicate effective bandwidth savings; high ignored counts may indicate misconfigured peers. Sourced from TrafficCount.h squelch categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_Messages_In",
"legendFormat": "Squelch In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_Messages_Out",
"legendFormat": "Squelch Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_suppressed_Messages_In",
"legendFormat": "Suppressed In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_suppressed_Messages_Out",
"legendFormat": "Suppressed Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_ignored_Messages_In",
"legendFormat": "Ignored In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_ignored_Messages_Out",
"legendFormat": "Ignored Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Messages",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Overhead Traffic Breakdown (Bytes)",
"description": "Overlay protocol overhead by sub-category. 'overhead' = base protocol overhead (ping, status, etc.), 'overhead_cluster' = intra-cluster communication overhead, 'overhead_manifest' = validator manifest distribution overhead. High cluster overhead may indicate frequent cluster state syncs; high manifest overhead occurs during UNL changes. Sourced from TrafficCount.h overhead categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_Bytes_In",
"legendFormat": "Base Overhead In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_Bytes_Out",
"legendFormat": "Base Overhead Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_cluster_Bytes_In",
"legendFormat": "Cluster In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_cluster_Bytes_Out",
"legendFormat": "Cluster Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_manifest_Bytes_In",
"legendFormat": "Manifest In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_manifest_Bytes_Out",
"legendFormat": "Manifest Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Validator List Traffic",
"description": "Validator list (UNL) distribution traffic. Validator lists are exchanged when peers share their trusted validator configurations. Spikes occur during UNL updates or when new peers connect. Sourced from TrafficCount.h validator_lists category.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Bytes_In",
"legendFormat": "Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Bytes_Out",
"legendFormat": "Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Messages_In",
"legendFormat": "Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Messages_Out",
"legendFormat": "Messages Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Count",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/Bytes/"
},
"properties": [
{
"id": "custom.axisPlacement",
"value": "right"
},
{
"id": "unit",
"value": "decbytes"
}
]
}
]
}
},
{
"title": "Set Get/Share Traffic (Bytes)",
"description": "Transaction set get and share traffic. 'set_get' = requests to fetch transaction sets (sent during ledger close), 'set_share' = responses sharing transaction sets. High set_get traffic indicates peers frequently requesting missing transaction sets, which may signal sync delays. Sourced from TrafficCount.h set_get/set_share categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_get_Bytes_In",
"legendFormat": "Set Get In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_get_Bytes_Out",
"legendFormat": "Set Get Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_share_Bytes_In",
"legendFormat": "Set Share In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_share_Bytes_Out",
"legendFormat": "Set Share Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Have/Requested Transactions (Messages)",
"description": "Transaction availability protocol messages. 'have_transactions' = advertisements that a peer has specific transactions available, 'requested_transactions' = explicit requests for transaction data. A high ratio of requested to have may indicate peers are behind on transaction propagation. Sourced from TrafficCount.h have_transactions/requested_transactions categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 16
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_have_transactions_Messages_In",
"legendFormat": "Have TX In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_have_transactions_Messages_Out",
"legendFormat": "Have TX Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_requested_transactions_Messages_In",
"legendFormat": "Requested TX In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_requested_transactions_Messages_Out",
"legendFormat": "Requested TX Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Messages",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Unknown / Unclassified Traffic",
"description": "Traffic that does not match any known overlay message category. Non-zero values may indicate protocol version mismatches, corrupted messages, or new message types not yet classified. Sourced from TrafficCount.h unknown category.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 16
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Bytes_In",
"legendFormat": "Unknown Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Bytes_Out",
"legendFormat": "Unknown Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Messages_In",
"legendFormat": "Unknown Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Messages_Out",
"legendFormat": "Unknown Messages Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "short",
"custom": {
"axisLabel": "Count",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/Bytes/"
},
"properties": [
{
"id": "custom.axisPlacement",
"value": "right"
},
{
"id": "unit",
"value": "decbytes"
}
]
}
]
}
},
{
"title": "Proof Path Traffic",
"description": "Proof path request/response traffic for ledger state proof exchange. Used by peers to verify specific ledger entries without downloading the full ledger. High request volume may indicate peers validating state during catch-up. Sourced from TrafficCount.h proof_path_request/proof_path_response categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 24
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_request_Bytes_In",
"legendFormat": "Request Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_request_Bytes_Out",
"legendFormat": "Request Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_response_Bytes_In",
"legendFormat": "Response Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_response_Bytes_Out",
"legendFormat": "Response Bytes Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
},
{
"title": "Replay Delta Traffic",
"description": "Replay delta request/response traffic for ledger replay protocol. Used during catch-up to efficiently replay ledger state changes. Sourced from TrafficCount.h replay_delta_request/replay_delta_response categories.",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 24
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_request_Bytes_In",
"legendFormat": "Request Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_request_Bytes_Out",
"legendFormat": "Request Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_response_Bytes_In",
"legendFormat": "Response Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_response_Bytes_Out",
"legendFormat": "Response Bytes Out"
}
],
"fieldConfig": {
"defaults": {
"unit": "decbytes",
"custom": {
"axisLabel": "Bytes",
"spanNulls": true,
"insertNulls": false,
"showPoints": "auto",
"pointSize": 3
}
},
"overrides": []
}
}
],
"schemaVersion": 39,
"tags": ["rippled", "statsd", "overlay", "network", "telemetry"],
"templating": {
"list": []
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Overlay Traffic Detail (StatsD)",
"uid": "rippled-statsd-overlay-detail"
}

View File

@@ -391,6 +391,6 @@
"from": "now-1h",
"to": "now"
},
"title": "rippled RPC & Pathfinding (StatsD)",
"title": "RPC & Pathfinding (StatsD)",
"uid": "rippled-statsd-rpc"
}

View File

@@ -147,8 +147,8 @@
"datasource": {
"type": "prometheus"
},
"expr": "sum by (exported_instance) (rate(traces_span_metrics_calls_total{exported_instance=~\"$node\", span_name=\"tx.receive\"}[5m]))",
"legendFormat": "Total Received [{{exported_instance}}]"
"expr": "sum by (xrpl_tx_suppressed, exported_instance) (rate(traces_span_metrics_calls_total{span_name=\"tx.receive\", exported_instance=~\"$node\"}[$__rate_interval]))",
"legendFormat": "Suppressed={{xrpl_tx_suppressed}} [{{exported_instance}}]"
}
],
"fieldConfig": {

View File

@@ -1,9 +1,9 @@
apiVersion: 1
providers:
- name: rippled-telemetry
- name: xrpld-telemetry
orgId: 1
folder: rippled
folder: xrpld
type: file
disableDeletion: false
editable: true

View File

@@ -1,7 +1,7 @@
# Grafana datasource provisioning for Grafana Tempo.
# Auto-configures Tempo as a trace data source on Grafana startup.
# Access Grafana at http://localhost:3000, then use Explore -> Tempo
# to browse rippled traces using TraceQL.
# to browse xrpld traces using TraceQL.
#
# Search filters provide pre-configured dropdowns in the Explore UI.
# Each phase adds filters for the span attributes it introduces.
@@ -21,6 +21,9 @@ datasources:
jsonData:
nodeGraph:
enabled: true
# Service map and traces-to-metrics require a Prometheus datasource
# (not included in this stack). These features are inactive until a
# Prometheus service is added to docker-compose.yml.
serviceMap:
datasourceUid: prometheus
# Phase 8: Trace-to-log correlation — enables one-click navigation
@@ -38,7 +41,7 @@ datasources:
search:
filters:
# --- Node identification filters ---
# service.name: logical service name (default: "rippled").
# service.name: logical service name (default: "xrpld").
# Useful when running multiple service types in the same collector.
- id: service-name
tag: service.name
@@ -53,7 +56,7 @@ datasources:
operator: "="
scope: resource
type: static
# service.version: rippled build version (e.g., "2.4.0-b1").
# service.version: xrpld build version (e.g., "2.4.0-b1").
# Filter traces from specific software releases.
- id: node-version
tag: service.version
@@ -62,29 +65,36 @@ datasources:
type: dynamic
# xrpl.network.id: numeric network identifier
# (0 = mainnet, 1 = testnet, 2 = devnet, etc.).
# Derived from the [network_id] config section.
- id: network-id
tag: xrpl.network.id
operator: "="
scope: resource
type: dynamic
# xrpl.network.type: human-readable network name
# ("mainnet", "testnet", "devnet", "standalone").
# xrpl.network.type: human-readable network name derived from
# network ID ("mainnet", "testnet", "devnet", "unknown").
- id: network-type
tag: xrpl.network.type
operator: "="
scope: resource
type: static
# --- Span intrinsic filters ---
# name: the span operation name (e.g., "rpc.command.server_info").
# Use to find traces for a specific RPC command or subsystem.
- id: span-name
tag: name
operator: "="
scope: intrinsic
type: static
# status: span completion status ("ok", "error", "unset").
# Filter for failed operations to diagnose errors.
- id: span-status
tag: status
operator: "="
scope: intrinsic
type: static
# duration: span wall-clock duration. Use with ">" operator
# to find slow operations (e.g., duration > 500ms).
- id: span-duration
tag: duration
operator: ">"
@@ -106,6 +116,17 @@ datasources:
operator: "="
scope: span
type: dynamic
# Phase 2: Node health filters (Task 2.8)
- id: node-amendment-blocked
tag: xrpl.node.amendment_blocked
operator: "="
scope: span
type: static
- id: node-server-state
tag: xrpl.node.server_state
operator: "="
scope: span
type: dynamic
# Phase 3: Transaction tracing filters
- id: tx-hash
tag: xrpl.tx.hash
@@ -153,3 +174,54 @@ datasources:
operator: "="
scope: span
type: dynamic
- id: consensus-proposers
tag: xrpl.consensus.proposers
operator: "="
scope: span
type: dynamic
- id: consensus-result
tag: xrpl.consensus.result
operator: "="
scope: span
type: dynamic
- id: consensus-mode-old
tag: xrpl.consensus.mode.old
operator: "="
scope: span
type: dynamic
- id: consensus-mode-new
tag: xrpl.consensus.mode.new
operator: "="
scope: span
type: dynamic
- id: consensus-ledger-id
tag: xrpl.consensus.ledger_id
operator: "="
scope: span
type: static
# Phase 3/4: Additional transaction and queue filters
- id: tx-path
tag: xrpl.tx.path
operator: "="
scope: span
type: dynamic
- id: tx-suppressed
tag: xrpl.tx.suppressed
operator: "="
scope: span
type: dynamic
- id: peer-version
tag: xrpl.peer.version
operator: "="
scope: span
type: dynamic
- id: txq-status
tag: xrpl.txq.status
operator: "="
scope: span
type: dynamic
- id: txq-ter-code
tag: xrpl.txq.ter_code
operator: "="
scope: span
type: dynamic

View File

@@ -358,8 +358,8 @@ trace_ledger=1
metrics_endpoint=http://localhost:4318/v1/metrics
[insight]
server=otel
endpoint=http://localhost:4318/v1/metrics
server=statsd
address=127.0.0.1:8125
prefix=rippled
[rpc_startup]
@@ -482,18 +482,8 @@ seq_num=$(echo "$acct_result" | jq -r '.result.account_data.Sequence' 2>/dev/nul
log " Genesis account sequence: $seq_num"
# Submit payment
submit_result=$(curl -sf "http://localhost:$RPC_PORT_BASE" -d "{
\"method\": \"submit\",
\"params\": [{
\"secret\": \"$GENESIS_SEED\",
\"tx_json\": {
\"TransactionType\": \"Payment\",
\"Account\": \"$GENESIS_ACCOUNT\",
\"Destination\": \"$DEST_ACCOUNT\",
\"Amount\": \"10000000\"
}
}]
}")
submit_result=$(curl -sf "http://localhost:$RPC_PORT_BASE" \
-d "{\"method\":\"submit\",\"params\":[{\"secret\":\"$GENESIS_SEED\",\"tx_json\":{\"TransactionType\":\"Payment\",\"Account\":\"$GENESIS_ACCOUNT\",\"Destination\":\"$DEST_ACCOUNT\",\"Amount\":\"10000000\"}}]}")
engine_result=$(echo "$submit_result" | jq -r '.result.engine_result' 2>/dev/null || echo "unknown")
tx_hash=$(echo "$submit_result" | jq -r '.result.tx_json.hash' 2>/dev/null || echo "unknown")
@@ -593,52 +583,42 @@ else
fi
# ---------------------------------------------------------------------------
# Step 10b: Verify native OTel metrics in Prometheus (beast::insight)
# Step 10b: Verify StatsD metrics in Prometheus
# ---------------------------------------------------------------------------
log ""
log "--- Phase 7: Native OTel Metrics (beast::insight via OTLP) ---"
log "Waiting 20s for OTLP metric export + Prometheus scrape..."
log "--- Phase 6: StatsD Metrics (beast::insight) ---"
log "Waiting 20s for StatsD aggregation + Prometheus scrape..."
sleep 20
check_otel_metric() {
check_statsd_metric() {
local metric_name="$1"
local result
result=$(curl -sf "$PROM/api/v1/query?query=$metric_name" \
| jq '.data.result | length' 2>/dev/null || echo 0)
if [ "$result" -gt 0 ]; then
ok "OTel: $metric_name ($result series)"
ok "StatsD: $metric_name ($result series)"
else
fail "OTel: $metric_name (0 series)"
fail "StatsD: $metric_name (0 series)"
fi
}
# Node health gauges (ObservableGauge — no _total suffix)
check_otel_metric "rippled_LedgerMaster_Validated_Ledger_Age"
check_otel_metric "rippled_LedgerMaster_Published_Ledger_Age"
check_otel_metric "rippled_job_count"
# Node health gauges
check_statsd_metric "rippled_LedgerMaster_Validated_Ledger_Age"
check_statsd_metric "rippled_LedgerMaster_Published_Ledger_Age"
check_statsd_metric "rippled_job_count"
# State accounting
check_otel_metric "rippled_State_Accounting_Full_duration"
check_statsd_metric "rippled_State_Accounting_Full_duration"
# Peer finder
check_otel_metric "rippled_Peer_Finder_Active_Inbound_Peers"
check_otel_metric "rippled_Peer_Finder_Active_Outbound_Peers"
check_statsd_metric "rippled_Peer_Finder_Active_Inbound_Peers"
check_statsd_metric "rippled_Peer_Finder_Active_Outbound_Peers"
# RPC counters (Counter — Prometheus adds _total suffix automatically)
check_otel_metric "rippled_rpc_requests_total"
# RPC counters (only if RPC was exercised — should be true from Steps 5-8)
check_statsd_metric "rippled_rpc_requests"
# Overlay traffic
check_otel_metric "rippled_total_Bytes_In"
# Verify StatsD receiver is NOT required (no statsd receiver in pipeline)
log ""
log "--- Verify StatsD receiver is not required ---"
statsd_port_check=$(curl -sf "http://localhost:8125" 2>&1 || echo "refused")
if echo "$statsd_port_check" | grep -qi "refused\|error\|connection"; then
ok "StatsD port 8125 is not listening (not required)"
else
fail "StatsD port 8125 appears to be listening (should not be needed)"
fi
check_statsd_metric "rippled_total_Bytes_In"
# ---------------------------------------------------------------------------
# Step 10c: Verify Phase 9 OTel SDK Metrics

View File

@@ -1,12 +1,16 @@
# OpenTelemetry Collector configuration for rippled development.
# OpenTelemetry Collector configuration for xrpld development.
#
# Pipelines:
# traces: OTLP receiver -> batch processor -> debug + Tempo + spanmetrics
# metrics: spanmetrics connector -> Prometheus exporter
# metrics: StatsD receiver + spanmetrics connector -> Prometheus exporter
#
# rippled sends traces via OTLP/HTTP to port 4318. The collector batches
# xrpld sends traces via OTLP/HTTP to port 4318. The collector batches
# them, forwards to Tempo, and derives RED metrics via the spanmetrics
# connector, which Prometheus scrapes on port 8889.
#
# xrpld's beast::insight framework sends StatsD UDP metrics to port 8125.
# The StatsD receiver aggregates them and exports to Prometheus alongside
# the span-derived metrics.
receivers:
otlp:
@@ -15,6 +19,20 @@ receivers:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
statsd:
endpoint: "0.0.0.0:8125"
aggregation_interval: 15s
enable_metric_type: true
is_monotonic_counter: true
timer_histogram_mapping:
- statsd_type: "timing"
observer_type: "summary"
summary:
percentiles: [0, 50, 90, 95, 99, 100]
- statsd_type: "histogram"
observer_type: "summary"
summary:
percentiles: [0, 50, 90, 95, 99, 100]
processors:
batch:
@@ -34,7 +52,9 @@ connectors:
- name: xrpl.rpc.command
- name: xrpl.rpc.status
- name: xrpl.consensus.mode
- name: xrpl.consensus.close_time_correct
- name: xrpl.tx.local
- name: xrpl.tx.suppressed
- name: xrpl.peer.proposal.trusted
- name: xrpl.peer.validation.trusted
@@ -48,12 +68,17 @@ exporters:
prometheus:
endpoint: 0.0.0.0:8889
extensions:
health_check:
endpoint: 0.0.0.0:13133
service:
extensions: [health_check]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [debug, otlp/tempo, spanmetrics]
metrics:
receivers: [spanmetrics]
receivers: [spanmetrics, statsd]
exporters: [prometheus]

View File

@@ -1,4 +1,4 @@
# Grafana Tempo configuration for rippled telemetry stack.
# Grafana Tempo configuration for xrpld telemetry stack.
#
# Runs in single-binary mode for local development.
# Receives traces via OTLP/gRPC from the OTel Collector and stores
@@ -40,8 +40,10 @@ metrics_generator:
source: tempo
storage:
path: /var/tempo/generator/wal
remote_write:
- url: http://prometheus:9090/api/v1/write
# Uncomment and add a Prometheus service to docker-compose.yml
# to enable remote_write for service graph metrics:
# remote_write:
# - url: http://prometheus:9090/api/v1/write
overrides:
defaults:

View File

@@ -46,7 +46,7 @@ docker/telemetry/data/debug.log
# --- OpenTelemetry tracing ---
[telemetry]
enabled=1
service_instance_id=rippled-standalone
service_instance_id=xrpld-standalone
endpoint=http://localhost:4318/v1/traces
exporter=otlp_http
sampling_ratio=1.0