Pratik Mankawde
45ffe8e2ec
fix(telemetry): add missing counters, fix dashboard metric name, clean dead code
...
- Add rippled_validation_agreements_total and rippled_validation_missed_total
  counter declarations and creation (wiring to ValidationTracker pending rebase)
- Fix peer-quality dashboard: query rippled_server_info{metric="peer_disconnects_resources"}
  instead of non-existent rippled_Overlay_Peer_Disconnects_Charges
- Remove dead getCountsJson() call in storageDetail callback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
b92354715d
feat(telemetry): add validator health, peer quality dashboards and ledger economy panels (Tasks 9.11-9.13)
...
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
936c73982d
docs: update Phase 9 docs and dashboard for push_metrics.py parity gauges
...
- Add Task 9.7a to Phase9_taskList.md documenting new gauges
- Add metric tables to 09-data-collection-reference.md (server_info,
  build_info, complete_ledgers, db_metrics, extended cache/nodestore)
- Update metric counts from ~50 to ~68 in 06-implementation-phases.md
- Add OTel MetricsRegistry gauge reference to telemetry-runbook.md
- Add 11 new panels to system-node-health.json Grafana dashboard
  (server state, uptime, peers, validated seq, last close info,
  build version, complete ledgers, db sizes, historical fetch rate,
  peer disconnects)
- Fix leftover merge conflict marker in 08-appendix.md
- Add ripplex/mseconds to cspell dictionary
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
892fee638a
Phase 9: Metric gap fill - nodestore, cache, TxQ, load factor dashboards
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
facc111c22
Phase 8: Log-trace correlation with Loki and filelog receiver
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
5ec9f3f30a
Phase 7: Native OTel metrics migration
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
8f364ed6f4
Phase 6: StatsD metrics integration into telemetry pipeline
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
fdec3ce5c4
Phase 8: Log-trace correlation with Loki and filelog receiver
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:37 +01:00
Pratik Mankawde
aa062ecdbe
Phase 7: Native OTel metrics migration
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:37 +01:00
Pratik Mankawde
0e15f95543
Phase 6: StatsD metrics integration into telemetry pipeline
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:37 +01:00
Pratik Mankawde
2f7064ace6
Phase 7: Native OTel metrics migration
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:24 +01:00
Pratik Mankawde
21192e9b3f
Phase 6: StatsD metrics integration into telemetry pipeline
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:07 +01:00
Pratik Mankawde
7e5591318f
Phase 5b: Ledger, peer, and tx spans with expanded Grafana dashboards
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:30:59 +01:00
Pratik Mankawde
87ed778efe
refactor(telemetry): migrate integration test and docs from Jaeger to Tempo API
...
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:29:30 +01:00
Pratik Mankawde
d0ff82801c
fix: use docker/telemetry/data/ for runtime data and add .gitignore
...
Move xrpld data paths from ./data/ to docker/telemetry/data/ so runtime
files stay within the docker telemetry directory. Add .gitignore to
exclude the data directory from version control.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:29:30 +01:00
Pratik Mankawde
f940290866
Phase 5: Documentation, deployment configs, integration test infrastructure
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:29:30 +01:00
Pratik Mankawde
a127711b86
Phase 4: Consensus tracing - round lifecycle, proposals, validations, close time
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:28:33 +01:00
Pratik Mankawde
88d17e4c04
Phase 3: Transaction tracing - protobuf context propagation, PeerImp, NetworkOPs
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:28:27 +01:00
Pratik Mankawde
945faac770
Phase 2: RPC tracing - span macros, attributes, WebSocket, command spans
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:28:22 +01:00
Pratik Mankawde
8421134420
refactor(telemetry): remove Jaeger service, exporter, and datasource
...
Tempo is now the sole trace backend. Remove the Jaeger all-in-one service
from docker-compose, the otlp/jaeger exporter from the OTel Collector config,
and the Jaeger Grafana datasource provisioning file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:28:12 +01:00
Pratik Mankawde
a7470615be
Phase 1b: Telemetry core infrastructure - CMake, Conan, SpanGuard, config
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:28:12 +01:00