Mirror of https://github.com/XRPLF/rippled.git, synced 2026-06-03 00:36:48 +00:00

fix(telemetry): restore StatsD receiver, fix metric prefix and doc errors

The StatsD receiver config was lost during a branch rebase (--ours
conflict resolution dropped it). Re-add the statsd receiver to the OTel
Collector config and wire it into the metrics pipeline so beast::insight
UDP metrics flow to Prometheus.

Also fixes:
- Metric prefix mismatch: docs used xrpld_ but dashboards/tests use
  rippled_; align all documentation to match the runnable stack
- Remove phantom Peer_Disconnects_Charges from docs (plain atomic, not a
  beast::insight gauge)
- Remove premature .codecov.yml exclusions for Phase 7 OTelCollector
  files that don't exist on this branch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
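
As a rough illustration of the metric path this commit restores, the sketch below builds a StatsD datagram under a configured prefix and derives the name it would plausibly carry in Prometheus. The `name:value|type` wire format is the standard StatsD line protocol. The dot-to-underscore mapping, the helper names, and the default host and port are assumptions for illustration; they are not taken from beast::insight or from the Collector config.

```python
import socket


def statsd_line(prefix: str, name: str, value: int, kind: str) -> str:
    """Build one StatsD payload of the form <prefix>.<name>:<value>|<type>."""
    return f"{prefix}.{name}:{value}|{kind}"


def prometheus_name(statsd_name: str) -> str:
    """Assumed mapping: dots in the StatsD name become underscores."""
    return statsd_name.replace(".", "_")


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; host and port are illustrative defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# A gauge sample as it might look with prefix=rippled ("g" marks a gauge).
line = statsd_line("rippled", "job_count", 5, "g")
metric = prometheus_name(line.split(":", 1)[0])
```

Under these assumptions, a node configured with `prefix=xrpld` would produce `xrpld_job_count`, which a dashboard query for `rippled_job_count` would never match. That is the mismatch the prefix fix addresses.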
@@ -207,7 +207,7 @@ Add to `xrpld.cfg`:
 [insight]
 server=statsd
 address=127.0.0.1:8125
-prefix=xrpld
+prefix=rippled
 ```
 
 The OTel Collector receives these via a `statsd` receiver on UDP port 8125 and exports them to Prometheus alongside spanmetrics.
@@ -216,38 +216,38 @@ The OTel Collector receives these via a `statsd` receiver on UDP port 8125 and e
 
 #### Gauges
 
-| Prometheus Metric | Source | Description |
-| ------------------------------------------- | ------------------------- | -------------------------------------------------------------------------- |
-| `xrpld_LedgerMaster_Validated_Ledger_Age` | LedgerMaster.h:373 | Age of validated ledger (seconds) |
-| `xrpld_LedgerMaster_Published_Ledger_Age` | LedgerMaster.h:374 | Age of published ledger (seconds) |
-| `xrpld_State_Accounting_{Mode}_duration` | NetworkOPs.cpp:774 | Time in each operating mode (Disconnected/Connected/Syncing/Tracking/Full) |
-| `xrpld_State_Accounting_{Mode}_transitions` | NetworkOPs.cpp:780 | Transition count per mode |
-| `xrpld_Peer_Finder_Active_Inbound_Peers` | PeerfinderManager.cpp:214 | Active inbound peer connections |
-| `xrpld_Peer_Finder_Active_Outbound_Peers` | PeerfinderManager.cpp:215 | Active outbound peer connections |
-| `xrpld_Overlay_Peer_Disconnects` | OverlayImpl.h:557 | Peer disconnect count |
-| `xrpld_job_count` | JobQueue.cpp:26 | Current job queue depth |
-| `xrpld_{category}_Bytes_In/Out` | OverlayImpl.h:535 | Overlay traffic bytes per category (57 categories) |
-| `xrpld_{category}_Messages_In/Out` | OverlayImpl.h:535 | Overlay traffic messages per category |
+| Prometheus Metric | Source | Description |
+| --------------------------------------------- | ------------------------- | -------------------------------------------------------------------------- |
+| `rippled_LedgerMaster_Validated_Ledger_Age` | LedgerMaster.h:373 | Age of validated ledger (seconds) |
+| `rippled_LedgerMaster_Published_Ledger_Age` | LedgerMaster.h:374 | Age of published ledger (seconds) |
+| `rippled_State_Accounting_{Mode}_duration` | NetworkOPs.cpp:774 | Time in each operating mode (Disconnected/Connected/Syncing/Tracking/Full) |
+| `rippled_State_Accounting_{Mode}_transitions` | NetworkOPs.cpp:780 | Transition count per mode |
+| `rippled_Peer_Finder_Active_Inbound_Peers` | PeerfinderManager.cpp:214 | Active inbound peer connections |
+| `rippled_Peer_Finder_Active_Outbound_Peers` | PeerfinderManager.cpp:215 | Active outbound peer connections |
+| `rippled_Overlay_Peer_Disconnects` | OverlayImpl.h:557 | Peer disconnect count |
+| `rippled_job_count` | JobQueue.cpp:26 | Current job queue depth |
+| `rippled_{category}_Bytes_In/Out` | OverlayImpl.h:535 | Overlay traffic bytes per category (57 categories) |
+| `rippled_{category}_Messages_In/Out` | OverlayImpl.h:535 | Overlay traffic messages per category |
 
 #### Counters
 
-| Prometheus Metric | Source | Description |
-| ------------------------------- | --------------------- | ------------------------------ |
-| `xrpld_rpc_requests` | ServerHandler.cpp:108 | Total RPC request count |
-| `xrpld_ledger_fetches` | InboundLedgers.cpp:44 | Ledger fetch request count |
-| `xrpld_ledger_history_mismatch` | LedgerHistory.cpp:16 | Ledger hash mismatch count |
-| `xrpld_warn` | Logic.h:33 | Resource manager warning count |
-| `xrpld_drop` | Logic.h:34 | Resource manager drop count |
+| Prometheus Metric | Source | Description |
+| --------------------------------- | --------------------- | ------------------------------ |
+| `rippled_rpc_requests` | ServerHandler.cpp:108 | Total RPC request count |
+| `rippled_ledger_fetches` | InboundLedgers.cpp:44 | Ledger fetch request count |
+| `rippled_ledger_history_mismatch` | LedgerHistory.cpp:16 | Ledger hash mismatch count |
+| `rippled_warn` | Logic.h:33 | Resource manager warning count |
+| `rippled_drop` | Logic.h:34 | Resource manager drop count |
 
 #### Histograms (from StatsD timers)
 
-| Prometheus Metric | Source | Description |
-| --------------------- | --------------------- | ------------------------------ |
-| `xrpld_rpc_time` | ServerHandler.cpp:110 | RPC response time (ms) |
-| `xrpld_rpc_size` | ServerHandler.cpp:109 | RPC response size (bytes) |
-| `xrpld_ios_latency` | Application.cpp:438 | I/O service loop latency (ms) |
-| `xrpld_pathfind_fast` | PathRequests.h:23 | Fast pathfinding duration (ms) |
-| `xrpld_pathfind_full` | PathRequests.h:24 | Full pathfinding duration (ms) |
+| Prometheus Metric | Source | Description |
+| ----------------------- | --------------------- | ------------------------------ |
+| `rippled_rpc_time` | ServerHandler.cpp:110 | RPC response time (ms) |
+| `rippled_rpc_size` | ServerHandler.cpp:109 | RPC response size (bytes) |
+| `rippled_ios_latency` | Application.cpp:438 | I/O service loop latency (ms) |
+| `rippled_pathfind_fast` | PathRequests.h:23 | Fast pathfinding duration (ms) |
+| `rippled_pathfind_full` | PathRequests.h:24 | Full pathfinding duration (ms) |
 
 ## Grafana Dashboards
 
@@ -320,42 +320,42 @@ Requires `trace_peer=1` in the `[telemetry]` config section.
 
 ### Node Health — StatsD (`xrpld-statsd-node-health`)
 
-| Panel | Type | PromQL | Labels Used |
-| -------------------------- | ---------- | ---------------------------------------------------- | ----------- |
-| Validated Ledger Age | stat | `xrpld_LedgerMaster_Validated_Ledger_Age` | — |
-| Published Ledger Age | stat | `xrpld_LedgerMaster_Published_Ledger_Age` | — |
-| Operating Mode Duration | timeseries | `xrpld_State_Accounting_*_duration` | — |
-| Operating Mode Transitions | timeseries | `xrpld_State_Accounting_*_transitions` | — |
-| I/O Latency | timeseries | `histogram_quantile(0.95, xrpld_ios_latency_bucket)` | — |
-| Job Queue Depth | timeseries | `xrpld_job_count` | — |
-| Ledger Fetch Rate | stat | `rate(xrpld_ledger_fetches[5m])` | — |
-| Ledger History Mismatches | stat | `rate(xrpld_ledger_history_mismatch[5m])` | — |
+| Panel | Type | PromQL | Labels Used |
+| -------------------------- | ---------- | ------------------------------------------------------ | ----------- |
+| Validated Ledger Age | stat | `rippled_LedgerMaster_Validated_Ledger_Age` | — |
+| Published Ledger Age | stat | `rippled_LedgerMaster_Published_Ledger_Age` | — |
+| Operating Mode Duration | timeseries | `rippled_State_Accounting_*_duration` | — |
+| Operating Mode Transitions | timeseries | `rippled_State_Accounting_*_transitions` | — |
+| I/O Latency | timeseries | `histogram_quantile(0.95, rippled_ios_latency_bucket)` | — |
+| Job Queue Depth | timeseries | `rippled_job_count` | — |
+| Ledger Fetch Rate | stat | `rate(rippled_ledger_fetches[5m])` | — |
+| Ledger History Mismatches | stat | `rate(rippled_ledger_history_mismatch[5m])` | — |
 
 ### Network Traffic — StatsD (`xrpld-statsd-network`)
 
-| Panel | Type | PromQL | Labels Used |
-| ---------------------- | ---------- | ------------------------------------ | ----------- |
-| Active Peers | timeseries | `xrpld_Peer_Finder_Active_*_Peers` | — |
-| Peer Disconnects | timeseries | `xrpld_Overlay_Peer_Disconnects` | — |
-| Total Network Bytes | timeseries | `xrpld_total_Bytes_In/Out` | — |
-| Total Network Messages | timeseries | `xrpld_total_Messages_In/Out` | — |
-| Transaction Traffic | timeseries | `xrpld_transactions_Messages_In/Out` | — |
-| Proposal Traffic | timeseries | `xrpld_proposals_Messages_In/Out` | — |
-| Validation Traffic | timeseries | `xrpld_validations_Messages_In/Out` | — |
-| Traffic by Category | bargauge | `topk(10, xrpld_*_Bytes_In)` | — |
+| Panel | Type | PromQL | Labels Used |
+| ---------------------- | ---------- | -------------------------------------- | ----------- |
+| Active Peers | timeseries | `rippled_Peer_Finder_Active_*_Peers` | — |
+| Peer Disconnects | timeseries | `rippled_Overlay_Peer_Disconnects` | — |
+| Total Network Bytes | timeseries | `rippled_total_Bytes_In/Out` | — |
+| Total Network Messages | timeseries | `rippled_total_Messages_In/Out` | — |
+| Transaction Traffic | timeseries | `rippled_transactions_Messages_In/Out` | — |
+| Proposal Traffic | timeseries | `rippled_proposals_Messages_In/Out` | — |
+| Validation Traffic | timeseries | `rippled_validations_Messages_In/Out` | — |
+| Traffic by Category | bargauge | `topk(10, rippled_*_Bytes_In)` | — |
 
 ### RPC & Pathfinding — StatsD (`xrpld-statsd-rpc`)
 
-| Panel | Type | PromQL | Labels Used |
-| ------------------------- | ---------- | ------------------------------------------------------ | ----------- |
-| RPC Request Rate | stat | `rate(xrpld_rpc_requests[5m])` | — |
-| RPC Response Time | timeseries | `histogram_quantile(0.95, xrpld_rpc_time_bucket)` | — |
-| RPC Response Size | timeseries | `histogram_quantile(0.95, xrpld_rpc_size_bucket)` | — |
-| RPC Response Time Heatmap | heatmap | `xrpld_rpc_time_bucket` | — |
-| Pathfinding Fast Duration | timeseries | `histogram_quantile(0.95, xrpld_pathfind_fast_bucket)` | — |
-| Pathfinding Full Duration | timeseries | `histogram_quantile(0.95, xrpld_pathfind_full_bucket)` | — |
-| Resource Warnings Rate | stat | `rate(xrpld_warn[5m])` | — |
-| Resource Drops Rate | stat | `rate(xrpld_drop[5m])` | — |
+| Panel | Type | PromQL | Labels Used |
+| ------------------------- | ---------- | -------------------------------------------------------- | ----------- |
+| RPC Request Rate | stat | `rate(rippled_rpc_requests[5m])` | — |
+| RPC Response Time | timeseries | `histogram_quantile(0.95, rippled_rpc_time_bucket)` | — |
+| RPC Response Size | timeseries | `histogram_quantile(0.95, rippled_rpc_size_bucket)` | — |
+| RPC Response Time Heatmap | heatmap | `rippled_rpc_time_bucket` | — |
+| Pathfinding Fast Duration | timeseries | `histogram_quantile(0.95, rippled_pathfind_fast_bucket)` | — |
+| Pathfinding Full Duration | timeseries | `histogram_quantile(0.95, rippled_pathfind_full_bucket)` | — |
+| Resource Warnings Rate | stat | `rate(rippled_warn[5m])` | — |
+| Resource Drops Rate | stat | `rate(rippled_drop[5m])` | — |
 
 ### Span → Metric → Dashboard Summary
 
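
The rename in the hunks above touches only metric names, so anyone maintaining custom panels outside this repository might apply the same substitution to their own PromQL. The following is a minimal sketch of that rewrite; `migrate_promql` and `OLD_PREFIX` are hypothetical names introduced here, not part of the commit.

```python
import re

# Match the old prefix only where it starts an identifier and is followed by
# an underscore. Dashboard UIDs such as "xrpld-statsd-rpc" use hyphens and
# appear as unchanged context in the diff, so they must not be rewritten.
OLD_PREFIX = re.compile(r"\bxrpld_")


def migrate_promql(expr: str) -> str:
    """Rewrite xrpld_-prefixed metric names in a PromQL string to rippled_."""
    return OLD_PREFIX.sub("rippled_", expr)


queries = [
    "rate(xrpld_rpc_requests[5m])",
    "histogram_quantile(0.95, xrpld_ios_latency_bucket)",
    "xrpld_job_count",
]
migrated = [migrate_promql(q) for q in queries]
```

The word-boundary anchor keeps the substitution from firing inside a longer identifier, and expressions that already use `rippled_` pass through unchanged, so the rewrite is safe to run more than once.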