Pratik Mankawde
c4bafa3c93
fix(telemetry): add missing counters, fix dashboard metric name, clean dead code
- Add rippled_validation_agreements_total and rippled_validation_missed_total
counter declarations and creation (wiring to ValidationTracker pending rebase)
- Fix peer-quality dashboard: query rippled_server_info{metric="peer_disconnects_resources"}
instead of non-existent rippled_Overlay_Peer_Disconnects_Charges
- Remove dead getCountsJson() call in storageDetail callback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
dfda2e4185
feat(telemetry): add validator health, peer quality dashboards and ledger economy panels (Tasks 9.11-9.13)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
6738f8b9ab
docs: update Phase 9 docs and dashboard for push_metrics.py parity gauges
- Add Task 9.7a to Phase9_taskList.md documenting new gauges
- Add metric tables to 09-data-collection-reference.md (server_info,
build_info, complete_ledgers, db_metrics, extended cache/nodestore)
- Update metric counts from ~50 to ~68 in 06-implementation-phases.md
- Add OTel MetricsRegistry gauge reference to telemetry-runbook.md
- Add 11 new panels to system-node-health.json Grafana dashboard
(server state, uptime, peers, validated seq, last close info,
build version, complete ledgers, db sizes, historical fetch rate,
peer disconnects)
- Fix leftover merge conflict marker in 08-appendix.md
- Add ripplex/mseconds to cspell dictionary
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
43d36ff4f0
Phase 9: Metric gap fill - nodestore, cache, TxQ, load factor dashboards
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
6916734eae
Phase 8: Log-trace correlation with Loki and filelog receiver
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
4137495282
Phase 7: Native OTel metrics migration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
7ad43d4c21
Phase 6: StatsD metrics integration into telemetry pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
753e7721e0
Phase 5b: Ledger, peer, and tx spans with expanded Grafana dashboards
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
58aa308052
fix: use docker/telemetry/data/ for runtime data and add .gitignore
Move xrpld data paths from ./data/ to docker/telemetry/data/ so runtime
files stay within the docker telemetry directory. Add .gitignore to
exclude the data directory from version control.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:20 +01:00
Pratik Mankawde
c07dd573fe
Phase 5: Documentation, deployment configs, integration test infrastructure
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:50:38 +01:00
Pratik Mankawde
69d4b77abf
Phase 4: Consensus tracing - round lifecycle, proposals, validations, close time
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:09:49 +01:00
Pratik Mankawde
3436f93870
Phase 3: Transaction tracing - protobuf context propagation, PeerImp, NetworkOPs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:09:40 +01:00
Pratik Mankawde
ab6946319c
Phase 2: RPC tracing - span macros, attributes, WebSocket, command spans
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 16:07:10 +01:00
Pratik Mankawde
0f0c188111
Phase 1b: Telemetry core infrastructure - CMake, Conan, SpanGuard, config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 15:58:38 +01:00