fix(telemetry): phase-6 dashboards — rename UIDs, add $node filter, drop line numbers

Phase-6 introduces ledger-operations, peer-network, and the five StatsD
dashboards. Align them with the rest of the chain:

- Rename dashboard UIDs from `rippled-*` to `xrpld-*` so the provisioned
  UIDs match the post-rename-script documentation (`docs.sh` rewrites
  .md but not .json, so the two drifted). The runbook references
  `xrpld-rpc-perf`, `xrpld-transactions`, etc.; the JSON now matches.
- Add the `$node` template variable + `exported_instance=~"$node"` filter
  to every target in the five `statsd-*` dashboards. Mirrors the pattern
  already used by consensus-health, ledger-operations, and peer-network
  per the project rule that every dashboard must support per-node
  filtering.
- Strip `:<line>` (and `:NN-NN` range) suffixes from C++ file references
  in every dashboard panel description and in docker/telemetry/TESTING.md.
  Line numbers drift on every refactor; the filename alone is enough to
  grep.
- Replace stale `rpc.request` entries with the real emitted span names
  (`rpc.http_request`, `rpc.ws_upgrade`, `rpc.ws_message`, `rpc.process`)
  in TESTING.md so operators can copy-paste the filters and hit real
  traces.
- Also drop the `:706` line ref from the `StatsDCollector.cpp` callout
  in `06-implementation-phases.md`.
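
The line-reference cleanup described above can be sketched as a small text transform. This is an illustration only, not the script used for this commit: the regex, the list of file extensions, and the name `strip_line_refs` are assumptions.

```python
import re

# Assumed pattern: a C++ source/header filename followed by ":<line>" or
# ":<start>-<end>". The extension list is a guess, not taken from the repo.
_LINE_REF = re.compile(r"\b([\w./-]+\.(?:cpp|h|hpp|ipp)):\d+(?:-\d+)?\b")


def strip_line_refs(text: str) -> str:
    """Drop ':NN' and ':NN-NN' suffixes that directly follow a C++ filename."""
    return _LINE_REF.sub(r"\1", text)


if __name__ == "__main__":
    print(strip_line_refs("The ledger.build span (BuildLedger.cpp:31) wraps ..."))
    print(strip_line_refs("traffic category gauges (OverlayImpl.h:535-548)."))
```

Anchoring the match on a file extension keeps unrelated `NN:NN` text such as clock times untouched.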
Pratik Mankawde
2026-05-14 16:51:14 +01:00
parent dfe91e071f
commit 44cdc8133e
9 changed files with 329 additions and 224 deletions


@@ -350,7 +350,7 @@ xrpld has a mature metrics framework (`beast::insight`) that emits StatsD-format
### Wire Format Fix (Task 6.1) — DEFERRED
The `StatsDMeterImpl` in `StatsDCollector.cpp:706` sends metrics with `|m` suffix, which is non-standard StatsD. The OTel StatsD receiver silently drops these. Fix: change `|m` to `|c` (counter), which is semantically correct since meters are increment-only counters. Only 2 metrics are affected (`warn`, `drop` in Resource Manager).
The `StatsDMeterImpl` in `StatsDCollector.cpp` sends metrics with `|m` suffix, which is non-standard StatsD. The OTel StatsD receiver silently drops these. Fix: change `|m` to `|c` (counter), which is semantically correct since meters are increment-only counters. Only 2 metrics are affected (`warn`, `drop` in Resource Manager).
**Status**: Deferred as a separate change: this is a breaking change for any StatsD backend that previously consumed the custom `|m` type. The Resource Warnings and Resource Drops dashboard panels will show no data until this fix is applied.
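
To make the wire-format difference concrete, here is a minimal sketch of the `|m` to `|c` rewrite applied to a single StatsD line. It is not the code in `StatsDCollector.cpp` (the deferred fix belongs in the C++ emitter); the function name and the metric names in the example are hypothetical.

```python
def fix_meter_type(line: str) -> str:
    """Rewrite a StatsD line whose type field is the non-standard 'm' to 'c'.

    Assumes the plain '<name>:<value>|<type>' layout; anything else is
    returned unchanged.
    """
    head, sep, metric_type = line.rpartition("|")
    if sep and metric_type == "m":
        return f"{head}|c"
    return line


if __name__ == "__main__":
    # Hypothetical metric names, used only for illustration.
    print(fix_meter_type("rippled.resource.warn:1|m"))
    print(fix_meter_type("rippled.jobq.latency:12|ms"))
```

Timers (`|ms`), gauges (`|g`), and lines that already use `|c` pass through untouched, so only the two meter metrics would change.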


@@ -376,25 +376,26 @@ See the "Verification Queries" section below.
All 18 production span names instrumented across Phases 2-5:
| Span Name | Source File | Phase | Key Attributes | How to Trigger |
| --------------------------- | --------------------- | ----- | ---------------------------------------------------------------------------------------- | ------------------------- |
| `rpc.request` | ServerHandler.cpp:271 | 2 | -- | Any HTTP RPC call |
| `rpc.process` | ServerHandler.cpp:573 | 2 | -- | Any HTTP RPC call |
| `rpc.ws_message` | ServerHandler.cpp:384 | 2 | -- | WebSocket RPC message |
| `rpc.command.<name>` | RPCHandler.cpp:161 | 2 | `xrpl.rpc.command`, `xrpl.rpc.version`, `xrpl.rpc.role` | Any RPC command |
| `tx.process` | NetworkOPs.cpp:1227 | 3 | `xrpl.tx.hash`, `xrpl.tx.local`, `xrpl.tx.path` | Submit transaction |
| `tx.receive` | PeerImp.cpp:1273 | 3 | `xrpl.peer.id` | Peer relays transaction |
| `consensus.proposal.send` | RCLConsensus.cpp:177 | 4 | `xrpl.consensus.round` | Consensus proposing phase |
| `consensus.ledger_close` | RCLConsensus.cpp:282 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.mode` | Ledger close event |
| `consensus.accept` | RCLConsensus.cpp:395 | 4 | `xrpl.consensus.proposers`, `xrpl.consensus.round_time_ms` | Ledger accepted |
| `consensus.validation.send` | RCLConsensus.cpp:753 | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.proposing` | Validation sent |
| `consensus.accept.apply` | RCLConsensus.cpp:453 | 4 | `xrpl.consensus.close_time`, `close_time_correct`, `close_resolution_ms`, `state` | Ledger apply + close time |
| `tx.apply` | BuildLedger.cpp:88 | 5 | `xrpl.ledger.tx_count`, `xrpl.ledger.tx_failed` | Ledger close (tx set) |
| `ledger.build` | BuildLedger.cpp:31 | 5 | `xrpl.ledger.seq`, `xrpl.ledger.close_time`, `close_time_correct`, `close_resolution_ms` | Ledger build |
| `ledger.validate` | LedgerMaster.cpp:915 | 5 | `xrpl.ledger.seq`, `xrpl.ledger.validations` | Ledger validated |
| `ledger.store` | LedgerMaster.cpp:409 | 5 | `xrpl.ledger.seq` | Ledger stored |
| `peer.proposal.receive` | PeerImp.cpp:1667 | 5 | `xrpl.peer.id`, `xrpl.peer.proposal.trusted` | Peer sends proposal |
| `peer.validation.receive` | PeerImp.cpp:2264 | 5 | `xrpl.peer.id`, `xrpl.peer.validation.trusted` | Peer sends validation |
| Span Name | Source File | Phase | Key Attributes | How to Trigger |
| --------------------------- | ----------------- | ----- | ---------------------------------------------------------------------------------------- | ------------------------- |
| `rpc.http_request` | ServerHandler.cpp | 2 | -- | Any HTTP RPC call |
| `rpc.ws_upgrade` | ServerHandler.cpp | 2 | -- | WebSocket upgrade |
| `rpc.ws_message` | ServerHandler.cpp | 2 | -- | WebSocket RPC message |
| `rpc.process` | ServerHandler.cpp | 2 | -- | RPC processing |
| `rpc.command.<name>` | RPCHandler.cpp | 2 | `xrpl.rpc.command`, `xrpl.rpc.version`, `xrpl.rpc.role` | Any RPC command |
| `tx.process` | NetworkOPs.cpp | 3 | `xrpl.tx.hash`, `xrpl.tx.local`, `xrpl.tx.path` | Submit transaction |
| `tx.receive` | PeerImp.cpp | 3 | `xrpl.peer.id` | Peer relays transaction |
| `consensus.proposal.send` | RCLConsensus.cpp | 4 | `xrpl.consensus.round` | Consensus proposing phase |
| `consensus.ledger_close` | RCLConsensus.cpp | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.mode` | Ledger close event |
| `consensus.accept` | RCLConsensus.cpp | 4 | `xrpl.consensus.proposers`, `xrpl.consensus.round_time_ms` | Ledger accepted |
| `consensus.validation.send` | RCLConsensus.cpp | 4 | `xrpl.consensus.ledger.seq`, `xrpl.consensus.proposing` | Validation sent |
| `consensus.accept.apply` | RCLConsensus.cpp | 4 | `xrpl.consensus.close_time`, `close_time_correct`, `close_resolution_ms`, `state` | Ledger apply + close time |
| `tx.apply` | BuildLedger.cpp | 5 | `xrpl.ledger.tx_count`, `xrpl.ledger.tx_failed` | Ledger close (tx set) |
| `ledger.build` | BuildLedger.cpp | 5 | `xrpl.ledger.seq`, `xrpl.ledger.close_time`, `close_time_correct`, `close_resolution_ms` | Ledger build |
| `ledger.validate` | LedgerMaster.cpp | 5 | `xrpl.ledger.seq`, `xrpl.ledger.validations` | Ledger validated |
| `ledger.store` | LedgerMaster.cpp | 5 | `xrpl.ledger.seq` | Ledger stored |
| `peer.proposal.receive` | PeerImp.cpp | 5 | `xrpl.peer.id`, `xrpl.peer.proposal.trusted` | Peer sends proposal |
| `peer.validation.receive` | PeerImp.cpp | 5 | `xrpl.peer.id`, `xrpl.peer.validation.trusted` | Peer sends validation |
---


@@ -10,7 +10,7 @@
"panels": [
{
"title": "Ledger Build Rate",
"description": "Rate at which new ledgers are being built. The ledger.build span (BuildLedger.cpp:31) wraps the entire buildLedgerImpl() function which creates a new ledger from a parent, applies transactions, flushes SHAMap nodes, and sets the accepted state. Should match the consensus close rate (~0.25/sec on mainnet with ~4s rounds).",
"description": "Rate at which new ledgers are being built. The ledger.build span (BuildLedger.cpp) wraps the entire buildLedgerImpl() function which creates a new ledger from a parent, applies transactions, flushes SHAMap nodes, and sets the accepted state. Should match the consensus close rate (~0.25/sec on mainnet with ~4s rounds).",
"type": "stat",
"gridPos": {
"h": 8,
@@ -88,7 +88,7 @@
},
{
"title": "Ledger Validation Rate",
"description": "Rate at which ledgers pass the validation threshold and are accepted as fully validated. The ledger.validate span (LedgerMaster.cpp:915) fires in checkAccept() only after the ledger receives sufficient trusted validations (>= quorum). Records xrpl.ledger.seq and validations (the number of validations received).",
"description": "Rate at which ledgers pass the validation threshold and are accepted as fully validated. The ledger.validate span (LedgerMaster.cpp) fires in checkAccept() only after the ledger receives sufficient trusted validations (>= quorum). Records xrpl.ledger.seq and validations (the number of validations received).",
"type": "stat",
"gridPos": {
"h": 8,
@@ -156,7 +156,7 @@
},
{
"title": "Transaction Apply Duration",
"description": "p95 and p50 duration of applying the consensus transaction set during ledger building. The tx.apply span (BuildLedger.cpp:88) wraps applyTransactions() which iterates through the CanonicalTXSet with multiple retry passes. Records tx_count (successful) and tx_failed (failed) as attributes.",
"description": "p95 and p50 duration of applying the consensus transaction set during ledger building. The tx.apply span (BuildLedger.cpp) wraps applyTransactions() which iterates through the CanonicalTXSet with multiple retry passes. Records tx_count (successful) and tx_failed (failed) as attributes.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -241,7 +241,7 @@
},
{
"title": "Ledger Store Rate",
"description": "Rate at which ledgers are stored into the ledger history. The ledger.store span (LedgerMaster.cpp:409) wraps storeLedger() which inserts the ledger into the LedgerHistory cache. Records xrpl.ledger.seq. Should match the ledger build rate under normal operation.",
"description": "Rate at which ledgers are stored into the ledger history. The ledger.store span (LedgerMaster.cpp) wraps storeLedger() which inserts the ledger into the LedgerHistory cache. Records xrpl.ledger.seq. Should match the ledger build rate under normal operation.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -349,5 +349,5 @@
"to": "now"
},
"title": "Ledger Operations",
"uid": "rippled-ledger-ops"
"uid": "xrpld-ledger-ops"
}
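
Since the rename script rewrites `.md` but not `.json`, a quick check for leftover `rippled-*` UIDs may help keep the two from drifting again. A minimal sketch, assuming each dashboard is available as a JSON string keyed by filename; the filenames and the helper name `legacy_uids` are hypothetical.

```python
import json


def legacy_uids(dashboards: dict[str, str], old_prefix: str = "rippled-") -> list[str]:
    """Return the names of dashboards whose top-level uid still uses old_prefix."""
    stale = []
    for name, raw in dashboards.items():
        uid = json.loads(raw).get("uid", "")
        if uid.startswith(old_prefix):
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    sample = {
        "ledger-operations.json": '{"title": "Ledger Operations", "uid": "xrpld-ledger-ops"}',
        "peer-network.json": '{"title": "Peer Network", "uid": "rippled-peer-net"}',
    }
    print(legacy_uids(sample))
```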


@@ -11,7 +11,7 @@
"panels": [
{
"title": "Peer Proposal Receive Rate",
"description": "Rate of consensus proposals received from network peers. The peer.proposal.receive span (PeerImp.cpp:1667) fires in onMessage(TMProposeSet) for each incoming proposal. Records xrpl.peer.id (sending peer) and proposal_trusted (whether the proposer is in our UNL). Requires trace_peer=1 in the telemetry config.",
"description": "Rate of consensus proposals received from network peers. The peer.proposal.receive span (PeerImp.cpp) fires in onMessage(TMProposeSet) for each incoming proposal. Records xrpl.peer.id (sending peer) and proposal_trusted (whether the proposer is in our UNL). Requires trace_peer=1 in the telemetry config.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -50,7 +50,7 @@
},
{
"title": "Peer Validation Receive Rate",
"description": "Rate of ledger validations received from network peers. The peer.validation.receive span (PeerImp.cpp:2264) fires in onMessage(TMValidation) for each incoming validation message. Records xrpl.peer.id (sending peer) and validation_trusted (whether the validator is trusted). Requires trace_peer=1 in the telemetry config.",
"description": "Rate of ledger validations received from network peers. The peer.validation.receive span (PeerImp.cpp) fires in onMessage(TMValidation) for each incoming validation message. Records xrpl.peer.id (sending peer) and validation_trusted (whether the validator is trusted). Requires trace_peer=1 in the telemetry config.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -223,5 +223,5 @@
"to": "now"
},
"title": "Peer Network",
"uid": "rippled-peer-net"
"uid": "xrpld-peer-net"
}


@@ -30,56 +30,56 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_get_Bytes_In",
"expr": "rippled_ledger_data_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Data Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_share_Bytes_In",
"expr": "rippled_ledger_data_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Data Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Set_candidate_get_Bytes_In",
"expr": "rippled_ledger_data_Transaction_Set_candidate_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Set Candidate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Set_candidate_share_Bytes_In",
"expr": "rippled_ledger_data_Transaction_Set_candidate_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Set Candidate Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Node_get_Bytes_In",
"expr": "rippled_ledger_data_Transaction_Node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Transaction_Node_share_Bytes_In",
"expr": "rippled_ledger_data_Transaction_Node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Account_State_Node_get_Bytes_In",
"expr": "rippled_ledger_data_Account_State_Node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_data_Account_State_Node_share_Bytes_In",
"expr": "rippled_ledger_data_Account_State_Node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Node Share"
}
],
@@ -118,56 +118,56 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_share_Bytes_In",
"expr": "rippled_ledger_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Share In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_get_Bytes_In",
"expr": "rippled_ledger_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Get In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_Set_candidate_share_Bytes_In",
"expr": "rippled_ledger_Transaction_Set_candidate_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Set Candidate Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_Set_candidate_get_Bytes_In",
"expr": "rippled_ledger_Transaction_Set_candidate_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Set Candidate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_node_share_Bytes_In",
"expr": "rippled_ledger_Transaction_node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Transaction_node_get_Bytes_In",
"expr": "rippled_ledger_Transaction_node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Account_State_node_share_Bytes_In",
"expr": "rippled_ledger_Account_State_node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledger_Account_State_node_get_Bytes_In",
"expr": "rippled_ledger_Account_State_node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Get"
}
],
@@ -206,56 +206,56 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_get_Bytes_In",
"expr": "rippled_getobject_Ledger_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_share_Bytes_In",
"expr": "rippled_getobject_Ledger_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_get_Bytes_In",
"expr": "rippled_getobject_Transaction_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Transaction Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_share_Bytes_In",
"expr": "rippled_getobject_Transaction_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Transaction Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_get_Bytes_In",
"expr": "rippled_getobject_Transaction_node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_share_Bytes_In",
"expr": "rippled_getobject_Transaction_node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_get_Bytes_In",
"expr": "rippled_getobject_Account_State_node_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_share_Bytes_In",
"expr": "rippled_getobject_Account_State_node_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Share"
}
],
@@ -294,49 +294,49 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_get_Bytes_In",
"expr": "rippled_getobject_CAS_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "CAS Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_share_Bytes_In",
"expr": "rippled_getobject_CAS_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "CAS Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_share_Bytes_In",
"expr": "rippled_getobject_Fetch_Pack_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Fetch Pack Share"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_get_Bytes_In",
"expr": "rippled_getobject_Fetch_Pack_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Fetch Pack Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transactions_get_Bytes_In",
"expr": "rippled_getobject_Transactions_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Transactions Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_get_Bytes_In",
"expr": "rippled_getobject_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Aggregate Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_share_Bytes_In",
"expr": "rippled_getobject_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Aggregate Share"
}
],
@@ -375,49 +375,49 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Ledger_get_Messages_In",
"expr": "rippled_getobject_Ledger_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Ledger Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_get_Messages_In",
"expr": "rippled_getobject_Transaction_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Transaction Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transaction_node_get_Messages_In",
"expr": "rippled_getobject_Transaction_node_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Node Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Account_State_node_get_Messages_In",
"expr": "rippled_getobject_Account_State_node_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Account State Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_CAS_get_Messages_In",
"expr": "rippled_getobject_CAS_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "CAS Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Fetch_Pack_get_Messages_In",
"expr": "rippled_getobject_Fetch_Pack_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Fetch Pack Get"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_getobject_Transactions_get_Messages_In",
"expr": "rippled_getobject_Transactions_get_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Transactions Get"
}
],
@@ -463,7 +463,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "topk(20, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"})",
"expr": "topk(20, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\", exported_instance=~\"$node\"})",
"legendFormat": "{{__name__}}"
}
],
@@ -495,12 +495,33 @@
"schemaVersion": 39,
"tags": ["rippled", "statsd", "ledger", "sync", "telemetry"],
"templating": {
"list": []
"list": [
{
"name": "node",
"label": "Node",
"description": "Filter by xrpld node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(exported_instance)",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"includeAll": true,
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"multi": true,
"refresh": 2,
"sort": 1
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Ledger Data & Sync (StatsD)",
"uid": "rippled-statsd-ledger-sync"
"uid": "xrpld-statsd-ledger-sync"
}
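
In PromQL a label matcher belongs inside the vector selector, so for expressions wrapped in `rate()` or `topk()` the `exported_instance` filter has to be attached to the metric (or merged into an existing `{...}` selector) rather than to the function name. The sketch below shows one way to inject the filter mechanically. It is a heuristic, not a PromQL parser, and assumes every metric name starts with `rippled_`, that selectors are non-empty, and that the input is the decoded expression (not the JSON-escaped form); `add_node_filter` is a hypothetical helper.

```python
import re

NODE_MATCHER = 'exported_instance=~"$node"'

# Split on double-quoted strings so regexes inside label values stay untouched.
_QUOTED = re.compile(r'("(?:[^"\\]|\\.)*")')
# Either a bare rippled_* metric name (no selector yet) or the closing brace
# of an existing selector.
_TARGET = re.compile(r"\b(rippled_\w+)\b(?!\s*\{)|\}")


def add_node_filter(expr: str) -> str:
    """Attach the node matcher to every vector selector in expr."""

    def patch(match: re.Match) -> str:
        if match.group(1):  # bare metric name: attach a new selector
            return f"{match.group(1)}{{{NODE_MATCHER}}}"
        return f", {NODE_MATCHER}}}"  # existing selector: append the matcher

    parts = _QUOTED.split(expr)
    # With one capture group in the split pattern, odd indexes are the quoted strings.
    return "".join(p if i % 2 else _TARGET.sub(patch, p) for i, p in enumerate(parts))


if __name__ == "__main__":
    print(add_node_filter("rippled_total_Messages_In"))
    print(add_node_filter("rate(rippled_total_Bytes_In[5m])"))
    print(add_node_filter('topk(10, {__name__=~"rippled_.*_Bytes_In"})'))
```

Skipping quoted segments is what keeps the `rippled_.*` regexes inside `__name__` matchers from being mistaken for metric names.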


@@ -11,7 +11,7 @@
"panels": [
{
"title": "Active Peers",
"description": "Number of active inbound and outbound peer connections. Sourced from Peer_Finder.Active_Inbound_Peers and Peer_Finder.Active_Outbound_Peers gauges (PeerfinderManager.cpp:214-215). A healthy mainnet node typically has 10-21 outbound and 0-85 inbound peers depending on configuration.",
"description": "Number of active inbound and outbound peer connections. Sourced from Peer_Finder.Active_Inbound_Peers and Peer_Finder.Active_Outbound_Peers gauges (PeerfinderManager.cpp). A healthy mainnet node typically has 10-21 outbound and 0-85 inbound peers depending on configuration.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -30,14 +30,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Peer_Finder_Active_Inbound_Peers",
"expr": "rippled_Peer_Finder_Active_Inbound_Peers{exported_instance=~\"$node\"}",
"legendFormat": "Inbound Peers"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Peer_Finder_Active_Outbound_Peers",
"expr": "rippled_Peer_Finder_Active_Outbound_Peers{exported_instance=~\"$node\"}",
"legendFormat": "Outbound Peers"
}
],
@@ -57,7 +57,7 @@
},
{
"title": "Peer Disconnects",
"description": "Cumulative count of peer disconnections. Sourced from the Overlay.Peer_Disconnects gauge (OverlayImpl.h:557). A rising trend indicates network instability, aggressive peer management, or resource exhaustion causing connection drops.",
"description": "Cumulative count of peer disconnections. Sourced from the Overlay.Peer_Disconnects gauge (OverlayImpl.h). A rising trend indicates network instability, aggressive peer management, or resource exhaustion causing connection drops.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -76,7 +76,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Overlay_Peer_Disconnects",
"expr": "rippled_Overlay_Peer_Disconnects{exported_instance=~\"$node\"}",
"legendFormat": "Disconnects"
}
],
@@ -96,7 +96,7 @@
},
{
"title": "Total Network Bytes",
"description": "Rate of total bytes sent and received across all peer connections. Sourced from the total.Bytes_In and total.Bytes_Out traffic category gauges (OverlayImpl.h:535-548). Wrapped in rate() to show throughput rather than cumulative counter values.",
"description": "Rate of total bytes sent and received across all peer connections. Sourced from the total.Bytes_In and total.Bytes_Out traffic category gauges (OverlayImpl.h). Wrapped in rate() to show throughput rather than cumulative counter values.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -115,14 +115,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_total_Bytes_In[5m])",
"expr": "rate(rippled_total_Bytes_In{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_total_Bytes_Out[5m])",
"expr": "rate(rippled_total_Bytes_Out{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Bytes Out"
}
],
@@ -142,7 +142,7 @@
},
{
"title": "Total Network Messages",
"description": "Total messages sent and received across all peer connections. Sourced from the total.Messages_In and total.Messages_Out traffic category gauges (OverlayImpl.h:535-548). Shows the overall message throughput of the overlay network.",
"description": "Total messages sent and received across all peer connections. Sourced from the total.Messages_In and total.Messages_Out traffic category gauges (OverlayImpl.h). Shows the overall message throughput of the overlay network.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -161,14 +161,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_total_Messages_In",
"expr": "rippled_total_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_total_Messages_Out",
"expr": "rippled_total_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Messages Out"
}
],
@@ -207,21 +207,21 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_Messages_In",
"expr": "rippled_transactions_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_Messages_Out",
"expr": "rippled_transactions_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "TX Messages Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transactions_duplicate_Messages_In",
"expr": "rippled_transactions_duplicate_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "TX Duplicate In"
}
],
@@ -260,28 +260,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_Messages_In",
"expr": "rippled_proposals_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Proposals In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_Messages_Out",
"expr": "rippled_proposals_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Proposals Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_untrusted_Messages_In",
"expr": "rippled_proposals_untrusted_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Untrusted In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proposals_duplicate_Messages_In",
"expr": "rippled_proposals_duplicate_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Duplicate In"
}
],
@@ -320,28 +320,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_Messages_In",
"expr": "rippled_validations_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Validations In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_Messages_Out",
"expr": "rippled_validations_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Validations Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_untrusted_Messages_In",
"expr": "rippled_validations_untrusted_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Untrusted In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validations_duplicate_Messages_In",
"expr": "rippled_validations_duplicate_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Duplicate In"
}
],
@@ -380,7 +380,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "topk(10, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"})",
"expr": "topk(10, {__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\", exported_instance=~\"$node\"})",
"legendFormat": "{{__name__}}"
}
],
@@ -677,42 +677,42 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_transactions_duplicate_Bytes_In[5m])",
"expr": "rate(rippled_transactions_duplicate_Bytes_In{exported_instance=~\"$node\"}[5m])",
"legendFormat": "TX Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_transactions_duplicate_Bytes_Out[5m])",
"expr": "rate(rippled_transactions_duplicate_Bytes_Out{exported_instance=~\"$node\"}[5m])",
"legendFormat": "TX Duplicate Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_proposals_duplicate_Bytes_In[5m])",
"expr": "rate(rippled_proposals_duplicate_Bytes_In{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Proposals Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_proposals_duplicate_Bytes_Out[5m])",
"expr": "rate(rippled_proposals_duplicate_Bytes_Out{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Proposals Duplicate Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_validations_duplicate_Bytes_In[5m])",
"expr": "rate(rippled_validations_duplicate_Bytes_In{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Validations Duplicate In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_validations_duplicate_Bytes_Out[5m])",
"expr": "rate(rippled_validations_duplicate_Bytes_Out{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Validations Duplicate Out"
}
],
@@ -751,7 +751,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "topk(15, rate({__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\"}[5m]))",
"expr": "topk(15, rate({__name__=~\"rippled_.*_Bytes_In\", __name__!~\"rippled_total_.*\", exported_instance=~\"$node\"}[5m]))",
"legendFormat": "{{__name__}}"
}
],
@@ -773,12 +773,33 @@
"schemaVersion": 39,
"tags": ["rippled", "statsd", "network", "telemetry"],
"templating": {
"list": []
"list": [
{
"name": "node",
"label": "Node",
"description": "Filter by xrpld node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(exported_instance)",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"includeAll": true,
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"multi": true,
"refresh": 2,
"sort": 1
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Network Traffic (StatsD)",
"uid": "rippled-statsd-network"
"uid": "xrpld-statsd-network"
}
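
The per-node filtering rule lends itself to a mechanical check: every dashboard should define the `node` template variable, and every target expression should carry the `exported_instance` matcher inside a selector. A rough lint sketch follows, assuming the Grafana JSON layout visible in this diff (`templating.list`, `panels[].targets[].expr`). The function list in `_MISPLACED` is a heuristic and `lint_dashboard` is a hypothetical name.

```python
import json
import re

NODE_MATCHER = 'exported_instance=~"$node"'
# A label matcher placed between a function name and its parenthesis,
# e.g. rate{...}(...), is not valid PromQL. The function list is a heuristic.
_MISPLACED = re.compile(r"\b(?:rate|irate|increase|topk|bottomk)\s*\{")


def lint_dashboard(raw: str) -> list[str]:
    """Return human-readable problems found in one dashboard JSON document."""
    dash = json.loads(raw)
    problems = []
    names = {v.get("name") for v in dash.get("templating", {}).get("list", [])}
    if "node" not in names:
        problems.append("missing 'node' template variable")
    for panel in dash.get("panels", []):
        title = panel.get("title", "<untitled>")
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            if NODE_MATCHER not in expr:
                problems.append(f"{title}: no node filter in {expr!r}")
            elif _MISPLACED.search(expr):
                problems.append(f"{title}: matcher outside selector in {expr!r}")
    return problems
```

Running this over each provisioned dashboard and failing on a non-empty result would catch both a missing filter and a filter that a search-and-replace attached to the wrong token.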


@@ -11,7 +11,7 @@
"panels": [
{
"title": "Validated Ledger Age",
"description": "Age of the most recently validated ledger in seconds. Sourced from the LedgerMaster.Validated_Ledger_Age gauge (LedgerMaster.h:373) which is updated every collection interval via the insight hook. Values above 20s indicate the node is falling behind the network.",
"description": "Age of the most recently validated ledger in seconds. Sourced from the LedgerMaster.Validated_Ledger_Age gauge (LedgerMaster.h) which is updated every collection interval via the insight hook. Values above 20s indicate the node is falling behind the network.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -30,7 +30,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_LedgerMaster_Validated_Ledger_Age",
"expr": "rippled_LedgerMaster_Validated_Ledger_Age{exported_instance=~\"$node\"}",
"legendFormat": "Validated Age"
}
],
@@ -59,7 +59,7 @@
},
{
"title": "Published Ledger Age",
"description": "Age of the most recently published ledger in seconds. Sourced from the LedgerMaster.Published_Ledger_Age gauge (LedgerMaster.h:374). Published ledger age should track close to validated ledger age. A growing gap indicates publish pipeline backlog.",
"description": "Age of the most recently published ledger in seconds. Sourced from the LedgerMaster.Published_Ledger_Age gauge (LedgerMaster.h). Published ledger age should track close to validated ledger age. A growing gap indicates publish pipeline backlog.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -78,7 +78,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_LedgerMaster_Published_Ledger_Age",
"expr": "rippled_LedgerMaster_Published_Ledger_Age{exported_instance=~\"$node\"}",
"legendFormat": "Published Age"
}
],
@@ -107,7 +107,7 @@
},
{
"title": "Operating Mode Duration",
"description": "Cumulative time spent in each operating mode (Disconnected, Connected, Syncing, Tracking, Full). Sourced from State_Accounting.*_duration gauges (NetworkOPs.cpp:774-778). A healthy node should spend the vast majority of time in Full mode.",
"description": "Cumulative time spent in each operating mode (Disconnected, Connected, Syncing, Tracking, Full). Sourced from State_Accounting.*_duration gauges (NetworkOPs.cpp). A healthy node should spend the vast majority of time in Full mode.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -126,35 +126,35 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Full_duration",
"expr": "rippled_State_Accounting_Full_duration{exported_instance=~\"$node\"}",
"legendFormat": "Full"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Tracking_duration",
"expr": "rippled_State_Accounting_Tracking_duration{exported_instance=~\"$node\"}",
"legendFormat": "Tracking"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Syncing_duration",
"expr": "rippled_State_Accounting_Syncing_duration{exported_instance=~\"$node\"}",
"legendFormat": "Syncing"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Connected_duration",
"expr": "rippled_State_Accounting_Connected_duration{exported_instance=~\"$node\"}",
"legendFormat": "Connected"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Disconnected_duration",
"expr": "rippled_State_Accounting_Disconnected_duration{exported_instance=~\"$node\"}",
"legendFormat": "Disconnected"
}
],
@@ -174,7 +174,7 @@
},
{
"title": "Operating Mode Transitions",
"description": "Count of transitions into each operating mode. Sourced from State_Accounting.*_transitions gauges (NetworkOPs.cpp:780-786). Frequent transitions out of Full mode indicate instability. Transitions to Disconnected or Syncing warrant investigation.",
"description": "Count of transitions into each operating mode. Sourced from State_Accounting.*_transitions gauges (NetworkOPs.cpp). Frequent transitions out of Full mode indicate instability. Transitions to Disconnected or Syncing warrant investigation.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -193,35 +193,35 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Full_transitions",
"expr": "rippled_State_Accounting_Full_transitions{exported_instance=~\"$node\"}",
"legendFormat": "Full"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Tracking_transitions",
"expr": "rippled_State_Accounting_Tracking_transitions{exported_instance=~\"$node\"}",
"legendFormat": "Tracking"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Syncing_transitions",
"expr": "rippled_State_Accounting_Syncing_transitions{exported_instance=~\"$node\"}",
"legendFormat": "Syncing"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Connected_transitions",
"expr": "rippled_State_Accounting_Connected_transitions{exported_instance=~\"$node\"}",
"legendFormat": "Connected"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_State_Accounting_Disconnected_transitions",
"expr": "rippled_State_Accounting_Disconnected_transitions{exported_instance=~\"$node\"}",
"legendFormat": "Disconnected"
}
],
@@ -241,7 +241,7 @@
},
{
"title": "I/O Latency",
"description": "P95 and P50 of the I/O service loop latency in milliseconds. Sourced from the ios_latency event (Application.cpp:438) which measures how long it takes for the io_context to process a timer callback. Values above 10ms are logged; above 500ms trigger warnings. High values indicate thread pool saturation or blocking operations.",
"description": "P95 and P50 of the I/O service loop latency in milliseconds. Sourced from the ios_latency event (Application.cpp) which measures how long it takes for the io_context to process a timer callback. Values above 10ms are logged; above 500ms trigger warnings. High values indicate thread pool saturation or blocking operations.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -260,14 +260,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ios_latency{quantile=\"0.95\"}",
"expr": "rippled_ios_latency{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95 I/O Latency"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ios_latency{quantile=\"0.5\"}",
"expr": "rippled_ios_latency{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50 I/O Latency"
}
],
@@ -287,7 +287,7 @@
},
{
"title": "Job Queue Depth",
"description": "Current number of jobs waiting in the job queue. Sourced from the job_count gauge (JobQueue.cpp:26). A sustained high value indicates the node cannot process work fast enough common during ledger replay or heavy RPC load.",
"description": "Current number of jobs waiting in the job queue. Sourced from the job_count gauge (JobQueue.cpp). A sustained high value indicates the node cannot process work fast enough \u2014 common during ledger replay or heavy RPC load.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -306,7 +306,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_job_count",
"expr": "rippled_job_count{exported_instance=~\"$node\"}",
"legendFormat": "Job Queue Depth"
}
],
@@ -326,7 +326,7 @@
},
{
"title": "Ledger Fetch Rate",
"description": "Rate of ledger fetch requests initiated by the node. Sourced from the ledger_fetches counter (InboundLedgers.cpp:44) which increments each time the node requests a ledger from a peer. High rates indicate the node is catching up or missing ledgers.",
"description": "Rate of ledger fetch requests initiated by the node. Sourced from the ledger_fetches counter (InboundLedgers.cpp) which increments each time the node requests a ledger from a peer. High rates indicate the node is catching up or missing ledgers.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -345,7 +345,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_ledger_fetches_total[5m])",
"expr": "rate{exported_instance=~\"$node\"}(rippled_ledger_fetches_total[5m])",
"legendFormat": "Fetches / Sec"
}
],
@@ -358,7 +358,7 @@
},
{
"title": "Ledger History Mismatches",
"description": "Rate of ledger history hash mismatches. Sourced from the ledger.history.mismatch counter (LedgerHistory.cpp:16) which increments when a built ledger hash does not match the expected validated hash. Non-zero values indicate consensus divergence or database corruption.",
"description": "Rate of ledger history hash mismatches. Sourced from the ledger.history.mismatch counter (LedgerHistory.cpp) which increments when a built ledger hash does not match the expected validated hash. Non-zero values indicate consensus divergence or database corruption.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -377,7 +377,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_ledger_history_mismatch_total[5m])",
"expr": "rate{exported_instance=~\"$node\"}(rippled_ledger_history_mismatch_total[5m])",
"legendFormat": "Mismatches / Sec"
}
],
@@ -402,7 +402,7 @@
},
{
"title": "Key Jobs Execution Time",
"description": "Execution time for critical job types at the selected quantile. Sourced from per-job-type events in JobTypeData (JobTypeData.h:48). Shows how long key consensus, transaction, and maintenance jobs take to execute. Spikes indicate processing bottlenecks.",
"description": "Execution time for critical job types at the selected quantile. Sourced from per-job-type events in JobTypeData (JobTypeData.h). Shows how long key consensus, transaction, and maintenance jobs take to execute. Spikes indicate processing bottlenecks.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -421,77 +421,77 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_acceptLedger{quantile=\"$quantile\"}",
"expr": "rippled_acceptLedger{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Accept Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_advanceLedger{quantile=\"$quantile\"}",
"expr": "rippled_advanceLedger{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Advance Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transaction{quantile=\"$quantile\"}",
"expr": "rippled_transaction{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Transaction [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_writeObjects{quantile=\"$quantile\"}",
"expr": "rippled_writeObjects{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Write Objects [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_heartbeat{quantile=\"$quantile\"}",
"expr": "rippled_heartbeat{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Heartbeat [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_sweep{quantile=\"$quantile\"}",
"expr": "rippled_sweep{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Sweep [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedValidation{quantile=\"$quantile\"}",
"expr": "rippled_trustedValidation{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Trusted Validation [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedProposal{quantile=\"$quantile\"}",
"expr": "rippled_trustedProposal{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Trusted Proposal [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_publishNewLedger{quantile=\"$quantile\"}",
"expr": "rippled_publishNewLedger{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Publish New Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_clientRPC{quantile=\"$quantile\"}",
"expr": "rippled_clientRPC{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Client RPC [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledgerData{quantile=\"$quantile\"}",
"expr": "rippled_ledgerData{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Ledger Data [{{quantile}}]"
}
],
@@ -511,7 +511,7 @@
},
{
"title": "Key Jobs Dequeue Wait Time",
"description": "Time spent waiting in the job queue before execution for critical job types. Sourced from per-job-type dequeue events (JobTypeData.h:47). High dequeue times indicate the job queue is backlogged and jobs are waiting too long to be scheduled.",
"description": "Time spent waiting in the job queue before execution for critical job types. Sourced from per-job-type dequeue events (JobTypeData.h). High dequeue times indicate the job queue is backlogged and jobs are waiting too long to be scheduled.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -530,77 +530,77 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_acceptLedger_q{quantile=\"$quantile\"}",
"expr": "rippled_acceptLedger_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Accept Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_advanceLedger_q{quantile=\"$quantile\"}",
"expr": "rippled_advanceLedger_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Advance Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_transaction_q{quantile=\"$quantile\"}",
"expr": "rippled_transaction_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Transaction [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_writeObjects_q{quantile=\"$quantile\"}",
"expr": "rippled_writeObjects_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Write Objects [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_heartbeat_q{quantile=\"$quantile\"}",
"expr": "rippled_heartbeat_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Heartbeat [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_sweep_q{quantile=\"$quantile\"}",
"expr": "rippled_sweep_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Sweep [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedValidation_q{quantile=\"$quantile\"}",
"expr": "rippled_trustedValidation_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Trusted Validation [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_trustedProposal_q{quantile=\"$quantile\"}",
"expr": "rippled_trustedProposal_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Trusted Proposal [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_publishNewLedger_q{quantile=\"$quantile\"}",
"expr": "rippled_publishNewLedger_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Publish New Ledger [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_clientRPC_q{quantile=\"$quantile\"}",
"expr": "rippled_clientRPC_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Client RPC [{{quantile}}]"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_ledgerData_q{quantile=\"$quantile\"}",
"expr": "rippled_ledgerData_q{exported_instance=~\"$node\", quantile=\"$quantile\"}",
"legendFormat": "Ledger Data [{{quantile}}]"
}
],
@@ -620,7 +620,7 @@
},
{
"title": "FullBelowCache Size",
"description": "Number of entries in the FullBelowCache. Sourced from the TaggedCache size gauge (TaggedCache.h:183) for the Node family full below cache (NodeFamily.cpp:29). This cache tracks which SHAMap nodes have all children present locally, avoiding redundant fetches during ledger acquisition.",
"description": "Number of entries in the FullBelowCache. Sourced from the TaggedCache size gauge (TaggedCache.h) for the Node family full below cache (NodeFamily.cpp). This cache tracks which SHAMap nodes have all children present locally, avoiding redundant fetches during ledger acquisition.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -639,7 +639,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Node_family_full_below_cache_size",
"expr": "rippled_Node_family_full_below_cache_size{exported_instance=~\"$node\"}",
"legendFormat": "FullBelowCache Size"
}
],
@@ -659,7 +659,7 @@
},
{
"title": "FullBelowCache Hit Rate",
"description": "Hit rate percentage for the FullBelowCache. Sourced from the TaggedCache hit_rate gauge (TaggedCache.h:184). A high hit rate means the node is efficiently reusing cached knowledge about complete SHAMap subtrees. Low hit rates during steady state warrant investigation.",
"description": "Hit rate percentage for the FullBelowCache. Sourced from the TaggedCache hit_rate gauge (TaggedCache.h). A high hit rate means the node is efficiently reusing cached knowledge about complete SHAMap subtrees. Low hit rates during steady state warrant investigation.",
"type": "gauge",
"gridPos": {
"h": 8,
@@ -678,7 +678,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_Node_family_full_below_cache_hit_rate",
"expr": "rippled_Node_family_full_below_cache_hit_rate{exported_instance=~\"$node\"}",
"legendFormat": "Hit Rate"
}
],
@@ -728,7 +728,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_LedgerMaster_Published_Ledger_Age - rippled_LedgerMaster_Validated_Ledger_Age",
"expr": "rippled_LedgerMaster_Published_Ledger_Age{exported_instance=~\"$node\"} - rippled_LedgerMaster_Validated_Ledger_Age",
"legendFormat": "Publish Gap"
}
],
@@ -757,7 +757,7 @@
},
{
"title": "State Duration Rate (Full vs Tracking)",
"description": "Rate of change of time spent in Full and Tracking operating modes, normalized to seconds. Sourced from State_Accounting duration gauges (NetworkOPs.cpp:774-778). In steady state the Full duration rate should be close to 1.0 (gaining one second of Full-mode time per wall-clock second). A drop below 1.0 means the node is spending time in other modes.",
"description": "Rate of change of time spent in Full and Tracking operating modes, normalized to seconds. Sourced from State_Accounting duration gauges (NetworkOPs.cpp). In steady state the Full duration rate should be close to 1.0 (gaining one second of Full-mode time per wall-clock second). A drop below 1.0 means the node is spending time in other modes.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -776,14 +776,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_State_Accounting_Full_duration[5m]) / 1000000",
"expr": "rate{exported_instance=~\"$node\"}(rippled_State_Accounting_Full_duration[5m]) / 1000000",
"legendFormat": "Full Mode Rate"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_State_Accounting_Tracking_duration[5m]) / 1000000",
"expr": "rate{exported_instance=~\"$node\"}(rippled_State_Accounting_Tracking_duration[5m]) / 1000000",
"legendFormat": "Tracking Mode Rate"
}
],
@@ -822,7 +822,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "{__name__=~\"rippled_(makeFetchPack|publishAcqLedger|untrustedValidation|manifest|localTransaction|ledgerReplayRequest|ledgerRequest|untrustedProposal|ledgerReplayTask|ledgerData|clientCommand|clientSubscribe|clientFeeChange|clientConsensus|clientAccountHistory|clientRPC|clientWebsocket|RPC|updatePaths|transaction|batch|advanceLedger|publishNewLedger|fetchTxnData|writeAhead|trustedValidation|writeObjects|acceptLedger|trustedProposal|sweep|clusterReport|heartbeat|administration|handleHaveTransactions|doTransactions)\", quantile=\"$quantile\"}",
"expr": "{__name__{exported_instance=~\"$node\"}=~\"rippled_(makeFetchPack|publishAcqLedger|untrustedValidation|manifest|localTransaction|ledgerReplayRequest|ledgerRequest|untrustedProposal|ledgerReplayTask|ledgerData|clientCommand|clientSubscribe|clientFeeChange|clientConsensus|clientAccountHistory|clientRPC|clientWebsocket|RPC|updatePaths|transaction|batch|advanceLedger|publishNewLedger|fetchTxnData|writeAhead|trustedValidation|writeObjects|acceptLedger|trustedProposal|sweep|clusterReport|heartbeat|administration|handleHaveTransactions|doTransactions)\", quantile=\"$quantile\"}",
"legendFormat": "{{__name__}} [{{quantile}}]"
}
],
@@ -861,7 +861,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "{__name__=~\"rippled_(makeFetchPack_q|publishAcqLedger_q|untrustedValidation_q|manifest_q|localTransaction_q|ledgerReplayRequest_q|ledgerRequest_q|untrustedProposal_q|ledgerReplayTask_q|ledgerData_q|clientCommand_q|clientSubscribe_q|clientFeeChange_q|clientConsensus_q|clientAccountHistory_q|clientRPC_q|clientWebsocket_q|RPC_q|updatePaths_q|transaction_q|batch_q|advanceLedger_q|publishNewLedger_q|fetchTxnData_q|writeAhead_q|trustedValidation_q|writeObjects_q|acceptLedger_q|trustedProposal_q|sweep_q|clusterReport_q|heartbeat_q|administration_q|handleHaveTransactions_q|doTransactions_q)\", quantile=\"$quantile\"}",
"expr": "{__name__{exported_instance=~\"$node\"}=~\"rippled_(makeFetchPack_q|publishAcqLedger_q|untrustedValidation_q|manifest_q|localTransaction_q|ledgerReplayRequest_q|ledgerRequest_q|untrustedProposal_q|ledgerReplayTask_q|ledgerData_q|clientCommand_q|clientSubscribe_q|clientFeeChange_q|clientConsensus_q|clientAccountHistory_q|clientRPC_q|clientWebsocket_q|RPC_q|updatePaths_q|transaction_q|batch_q|advanceLedger_q|publishNewLedger_q|fetchTxnData_q|writeAhead_q|trustedValidation_q|writeObjects_q|acceptLedger_q|trustedProposal_q|sweep_q|clusterReport_q|heartbeat_q|administration_q|handleHaveTransactions_q|doTransactions_q)\", quantile=\"$quantile\"}",
"legendFormat": "{{__name__}} [{{quantile}}]"
}
],
@@ -884,6 +884,26 @@
"tags": ["rippled", "statsd", "node-health", "telemetry"],
"templating": {
"list": [
{
"name": "node",
"label": "Node",
"description": "Filter by xrpld node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(exported_instance)",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"includeAll": true,
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"multi": true,
"refresh": 2,
"sort": 1
},
{
"name": "quantile",
"label": "Quantile",
@@ -926,5 +946,5 @@
"to": "now"
},
"title": "Node Health (StatsD)",
"uid": "rippled-statsd-node-health"
"uid": "xrpld-statsd-node-health"
}
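
The per-node filtering rule from the commit message can be checked mechanically. The following is a minimal sketch, not part of this commit: it assumes every metric carries the `rippled_` prefix seen in these hunks, and the helper names `unfiltered_metrics` and `lint_dashboard` are hypothetical. It only inspects named-metric selectors, so `{__name__=~"..."}` selectors still need a manual look.

```python
import re

# Label matcher that every metric selector is expected to carry.
NODE_LABEL = "exported_instance=~"


def unfiltered_metrics(expr):
    """Return metric names in expr whose selector lacks the node matcher."""
    # Blank out string literals so label values are not mistaken for metric names.
    bare = re.sub(r'"[^"]*"', '""', expr)
    missing = []
    for match in re.finditer(r"\brippled_\w+", bare):
        # A valid filter sits in braces directly after the metric name.
        selector = re.match(r"\{([^}]*)\}", bare[match.end():])
        if selector is None or NODE_LABEL not in selector.group(1):
            missing.append(match.group(0))
    return missing


def lint_dashboard(dashboard):
    """Map panel title to unfiltered metric names, for panels that have any."""
    report = {}
    for panel in dashboard.get("panels", []):
        for target in panel.get("targets", []):
            missing = unfiltered_metrics(target.get("expr", ""))
            if missing:
                title = panel.get("title", "<untitled>")
                report.setdefault(title, []).extend(missing)
    return report


if __name__ == "__main__":
    sample = {
        "panels": [
            {
                "title": "Ledger Fetch Rate",
                "targets": [
                    {"expr": 'rate{exported_instance=~"$node"}(rippled_ledger_fetches_total[5m])'}
                ],
            }
        ]
    }
    print(lint_dashboard(sample))
```

Because the matcher must sit in the braces directly after the metric name, this also flags a selector attached to a function name (`rate{...}(metric[5m])`) and a binary expression where only one operand is filtered. In practice the dictionary would come from `json.load` on a provisioned dashboard file.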

@@ -30,42 +30,42 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_Messages_In",
"expr": "rippled_squelch_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Squelch In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_Messages_Out",
"expr": "rippled_squelch_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Squelch Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_suppressed_Messages_In",
"expr": "rippled_squelch_suppressed_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Suppressed In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_suppressed_Messages_Out",
"expr": "rippled_squelch_suppressed_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Suppressed Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_ignored_Messages_In",
"expr": "rippled_squelch_ignored_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Ignored In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_squelch_ignored_Messages_Out",
"expr": "rippled_squelch_ignored_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Ignored Out"
}
],
@@ -104,42 +104,42 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_Bytes_In",
"expr": "rippled_overhead_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Base Overhead In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_Bytes_Out",
"expr": "rippled_overhead_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Base Overhead Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_cluster_Bytes_In",
"expr": "rippled_overhead_cluster_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Cluster In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_cluster_Bytes_Out",
"expr": "rippled_overhead_cluster_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Cluster Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_manifest_Bytes_In",
"expr": "rippled_overhead_manifest_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Manifest In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_overhead_manifest_Bytes_Out",
"expr": "rippled_overhead_manifest_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Manifest Out"
}
],
@@ -178,28 +178,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Bytes_In",
"expr": "rippled_validator_lists_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Bytes_Out",
"expr": "rippled_validator_lists_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Messages_In",
"expr": "rippled_validator_lists_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_validator_lists_Messages_Out",
"expr": "rippled_validator_lists_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Messages Out"
}
],
@@ -255,28 +255,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_get_Bytes_In",
"expr": "rippled_set_get_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Set Get In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_get_Bytes_Out",
"expr": "rippled_set_get_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Set Get Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_share_Bytes_In",
"expr": "rippled_set_share_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Set Share In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_set_share_Bytes_Out",
"expr": "rippled_set_share_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Set Share Out"
}
],
@@ -315,28 +315,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_have_transactions_Messages_In",
"expr": "rippled_have_transactions_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Have TX In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_have_transactions_Messages_Out",
"expr": "rippled_have_transactions_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Have TX Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_requested_transactions_Messages_In",
"expr": "rippled_requested_transactions_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Requested TX In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_requested_transactions_Messages_Out",
"expr": "rippled_requested_transactions_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Requested TX Out"
}
],
@@ -375,28 +375,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Bytes_In",
"expr": "rippled_unknown_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Unknown Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Bytes_Out",
"expr": "rippled_unknown_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Unknown Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Messages_In",
"expr": "rippled_unknown_Messages_In{exported_instance=~\"$node\"}",
"legendFormat": "Unknown Messages In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_unknown_Messages_Out",
"expr": "rippled_unknown_Messages_Out{exported_instance=~\"$node\"}",
"legendFormat": "Unknown Messages Out"
}
],
@@ -452,28 +452,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_request_Bytes_In",
"expr": "rippled_proof_path_request_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Request Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_request_Bytes_Out",
"expr": "rippled_proof_path_request_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Request Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_response_Bytes_In",
"expr": "rippled_proof_path_response_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Response Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_proof_path_response_Bytes_Out",
"expr": "rippled_proof_path_response_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Response Bytes Out"
}
],
@@ -512,28 +512,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_request_Bytes_In",
"expr": "rippled_replay_delta_request_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Request Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_request_Bytes_Out",
"expr": "rippled_replay_delta_request_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Request Bytes Out"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_response_Bytes_In",
"expr": "rippled_replay_delta_response_Bytes_In{exported_instance=~\"$node\"}",
"legendFormat": "Response Bytes In"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_replay_delta_response_Bytes_Out",
"expr": "rippled_replay_delta_response_Bytes_Out{exported_instance=~\"$node\"}",
"legendFormat": "Response Bytes Out"
}
],
@@ -555,12 +555,33 @@
"schemaVersion": 39,
"tags": ["rippled", "statsd", "overlay", "network", "telemetry"],
"templating": {
"list": []
"list": [
{
"name": "node",
"label": "Node",
"description": "Filter by xrpld node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(exported_instance)",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"includeAll": true,
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"multi": true,
"refresh": 2,
"sort": 1
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "Overlay Traffic Detail (StatsD)",
"uid": "rippled-statsd-overlay-detail"
"uid": "xrpld-statsd-overlay-detail"
}
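
Adding the `$node` matcher by hand across dozens of targets is error-prone. The following is a rough sketch of one way to apply it programmatically, not the method used for this commit. The helper `add_node_filter` is hypothetical, and it assumes `rippled_`-prefixed metric names and label values that contain no `}` character.

```python
import re

NODE_MATCHER = 'exported_instance=~"$node"'

# A metric name, optionally followed by an existing {label="value"} selector.
METRIC = re.compile(r"\b(rippled_\w+)(\{([^}]*)\})?")


def add_node_filter(expr):
    """Insert the node matcher into every rippled_* selector in expr."""

    def repl(match):
        name, labels = match.group(1), match.group(3)
        if labels is None or labels.strip() == "":
            # No selector yet: attach one directly to the metric name.
            return f"{name}{{{NODE_MATCHER}}}"
        if "exported_instance" in labels:
            # Already filtered: leave untouched so the rewrite is idempotent.
            return match.group(0)
        # Existing labels: put the node matcher first, keep the rest.
        return f"{name}{{{NODE_MATCHER}, {labels}}}"

    return METRIC.sub(repl, expr)


if __name__ == "__main__":
    print(add_node_filter("rate(rippled_rpc_requests_total[5m])"))
    print(add_node_filter('rippled_rpc_time{quantile="0.95"}'))
```

The matcher is attached to the metric selector, never to the enclosing function, so range-vector expressions such as `rate(...[5m])` stay valid PromQL.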

@@ -11,7 +11,7 @@
"panels": [
{
"title": "RPC Request Rate (StatsD)",
"description": "Rate of RPC requests as counted by the beast::insight counter. Sourced from rpc.requests (ServerHandler.cpp:108) which increments on every HTTP and WebSocket RPC request. Compare with the span-based rpc.request rate in the RPC Performance dashboard for cross-validation.",
"description": "Rate of RPC requests as counted by the beast::insight counter. Sourced from rpc.requests (ServerHandler.cpp) which increments on every HTTP and WebSocket RPC request. Compare with the span-based rpc.request rate in the RPC Performance dashboard for cross-validation.",
"type": "stat",
"gridPos": {
"h": 8,
@@ -30,7 +30,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_rpc_requests_total[5m])",
"expr": "rate{exported_instance=~\"$node\"}(rippled_rpc_requests_total[5m])",
"legendFormat": "Requests / Sec"
}
],
@@ -43,7 +43,7 @@
},
{
"title": "RPC Response Time (StatsD)",
"description": "P95 and P50 of RPC response time from the beast::insight timer. Sourced from the rpc.time event (ServerHandler.cpp:110) which records elapsed milliseconds for each RPC response. This measures the full HTTP handler time, not just command execution. Compare with span-based rpc.request duration.",
"description": "P95 and P50 of RPC response time from the beast::insight timer. Sourced from the rpc.time event (ServerHandler.cpp) which records elapsed milliseconds for each RPC response. This measures the full HTTP handler time, not just command execution. Compare with span-based rpc.request duration.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -62,14 +62,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.95\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95 Response Time"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.5\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50 Response Time"
}
],
@@ -89,7 +89,7 @@
},
{
"title": "RPC Response Size",
"description": "P95 and P50 of RPC response payload size in bytes. Sourced from the rpc.size event (ServerHandler.cpp:109) which records the byte length of each RPC JSON response. Large responses may indicate expensive queries (e.g. account_tx with many results) or API misuse.",
"description": "P95 and P50 of RPC response payload size in bytes. Sourced from the rpc.size event (ServerHandler.cpp) which records the byte length of each RPC JSON response. Large responses may indicate expensive queries (e.g. account_tx with many results) or API misuse.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -108,14 +108,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_size{quantile=\"0.95\"}",
"expr": "rippled_rpc_size{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95 Response Size"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_size{quantile=\"0.5\"}",
"expr": "rippled_rpc_size{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50 Response Size"
}
],
@@ -135,7 +135,7 @@
},
{
"title": "RPC Response Time Distribution",
"description": "Distribution of RPC response times from the beast::insight timer showing P50, P90, P95, and P99 quantiles. Sourced from the rpc.time event (ServerHandler.cpp:110). Useful for detecting bimodal latency or long-tail requests.",
"description": "Distribution of RPC response times from the beast::insight timer showing P50, P90, P95, and P99 quantiles. Sourced from the rpc.time event (ServerHandler.cpp). Useful for detecting bimodal latency or long-tail requests.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -154,28 +154,28 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.5\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.9\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.9\"}",
"legendFormat": "P90"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.95\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_rpc_time{quantile=\"0.99\"}",
"expr": "rippled_rpc_time{exported_instance=~\"$node\", quantile=\"0.99\"}",
"legendFormat": "P99"
}
],
@@ -195,7 +195,7 @@
},
{
"title": "Pathfinding Fast Duration",
"description": "P95 and P50 of fast pathfinding execution time. Sourced from the pathfind_fast event (PathRequests.h:23) which records the duration of the fast pathfinding algorithm. Fast pathfinding uses a simplified search that trades accuracy for speed.",
"description": "P95 and P50 of fast pathfinding execution time. Sourced from the pathfind_fast event (PathRequests.h) which records the duration of the fast pathfinding algorithm. Fast pathfinding uses a simplified search that trades accuracy for speed.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -214,14 +214,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_pathfind_fast{quantile=\"0.95\"}",
"expr": "rippled_pathfind_fast{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95 Fast Pathfind"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_pathfind_fast{quantile=\"0.5\"}",
"expr": "rippled_pathfind_fast{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50 Fast Pathfind"
}
],
@@ -241,7 +241,7 @@
},
{
"title": "Pathfinding Full Duration",
"description": "P95 and P50 of full pathfinding execution time. Sourced from the pathfind_full event (PathRequests.h:24) which records the duration of the exhaustive pathfinding search. Full pathfinding is more expensive and can take significantly longer than fast mode.",
"description": "P95 and P50 of full pathfinding execution time. Sourced from the pathfind_full event (PathRequests.h) which records the duration of the exhaustive pathfinding search. Full pathfinding is more expensive and can take significantly longer than fast mode.",
"type": "timeseries",
"gridPos": {
"h": 8,
@@ -260,14 +260,14 @@
"datasource": {
"type": "prometheus"
},
"expr": "rippled_pathfind_full{quantile=\"0.95\"}",
"expr": "rippled_pathfind_full{exported_instance=~\"$node\", quantile=\"0.95\"}",
"legendFormat": "P95 Full Pathfind"
},
{
"datasource": {
"type": "prometheus"
},
"expr": "rippled_pathfind_full{quantile=\"0.5\"}",
"expr": "rippled_pathfind_full{exported_instance=~\"$node\", quantile=\"0.5\"}",
"legendFormat": "P50 Full Pathfind"
}
],
@@ -287,7 +287,7 @@
},
{
"title": "Resource Warnings Rate",
"description": "Rate of resource warning events from the Resource Manager. Sourced from the warn meter (Logic.h:33) which increments when a consumer (peer or RPC client) exceeds the warning threshold for resource usage. A rising rate indicates aggressive clients that may need throttling. NOTE: This panel will show no data until the |m -> |c fix is applied in StatsDCollector.cpp:706 (Phase 6 Task 6.1).",
"description": "Rate of resource warning events from the Resource Manager. Sourced from the warn meter (Logic.h) which increments when a consumer (peer or RPC client) exceeds the warning threshold for resource usage. A rising rate indicates aggressive clients that may need throttling. NOTE: This panel will show no data until the |m -> |c fix is applied in StatsDCollector.cpp (Phase 6 Task 6.1).",
"type": "stat",
"gridPos": {
"h": 8,
@@ -306,7 +306,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_warn_total[5m])",
"expr": "rate(rippled_warn_total{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Warnings / Sec"
}
],
@@ -335,7 +335,7 @@
},
{
"title": "Resource Drops Rate",
"description": "Rate of resource drop events from the Resource Manager. Sourced from the drop meter (Logic.h:34) which increments when a consumer is disconnected or blocked due to excessive resource usage. Non-zero values mean the node is actively rejecting abusive connections. NOTE: This panel will show no data until the |m -> |c fix is applied in StatsDCollector.cpp:706 (Phase 6 Task 6.1).",
"description": "Rate of resource drop events from the Resource Manager. Sourced from the drop meter (Logic.h) which increments when a consumer is disconnected or blocked due to excessive resource usage. Non-zero values mean the node is actively rejecting abusive connections. NOTE: This panel will show no data until the |m -> |c fix is applied in StatsDCollector.cpp (Phase 6 Task 6.1).",
"type": "stat",
"gridPos": {
"h": 8,
@@ -354,7 +354,7 @@
"datasource": {
"type": "prometheus"
},
"expr": "rate(rippled_drop_total[5m])",
"expr": "rate(rippled_drop_total{exported_instance=~\"$node\"}[5m])",
"legendFormat": "Drops / Sec"
}
],
@@ -385,12 +385,33 @@
"schemaVersion": 39,
"tags": ["rippled", "statsd", "rpc", "pathfinding", "telemetry"],
"templating": {
"list": []
"list": [
{
"name": "node",
"label": "Node",
"description": "Filter by xrpld node (service.instance.id \u2014 e.g. Node-1)",
"type": "query",
"query": "label_values(exported_instance)",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"includeAll": true,
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"multi": true,
"refresh": 2,
"sort": 1
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"title": "RPC & Pathfinding (StatsD)",
"uid": "rippled-statsd-rpc"
"uid": "xrpld-statsd-rpc"
}