Commit Graph

14 Commits

Author SHA1 Message Date
Pratik Mankawde
c4bafa3c93 fix(telemetry): add missing counters, fix dashboard metric name, clean dead code
- Add rippled_validation_agreements_total and rippled_validation_missed_total
  counter declarations and creation (wiring to ValidationTracker pending rebase)
- Fix peer-quality dashboard: query rippled_server_info{metric="peer_disconnects_resources"}
  instead of non-existent rippled_Overlay_Peer_Disconnects_Charges
- Remove dead getCountsJson() call in storageDetail callback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
dfda2e4185 feat(telemetry): add validator health, peer quality dashboards and ledger economy panels (Tasks 9.11-9.13)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
6738f8b9ab docs: update Phase 9 docs and dashboard for push_metrics.py parity gauges
- Add Task 9.7a to Phase9_taskList.md documenting new gauges
- Add metric tables to 09-data-collection-reference.md (server_info,
  build_info, complete_ledgers, db_metrics, extended cache/nodestore)
- Update metric counts from ~50 to ~68 in 06-implementation-phases.md
- Add OTel MetricsRegistry gauge reference to telemetry-runbook.md
- Add 11 new panels to system-node-health.json Grafana dashboard
  (server state, uptime, peers, validated seq, last close info,
  build version, complete ledgers, db sizes, historical fetch rate,
  peer disconnects)
- Fix leftover merge conflict marker in 08-appendix.md
- Add ripplex/mseconds to cspell dictionary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
43d36ff4f0 Phase 9: Metric gap fill - nodestore, cache, TxQ, load factor dashboards
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
6916734eae Phase 8: Log-trace correlation with Loki and filelog receiver
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:40 +01:00
Pratik Mankawde
4137495282 Phase 7: Native OTel metrics migration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
7ad43d4c21 Phase 6: StatsD metrics integration into telemetry pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
753e7721e0 Phase 5b: Ledger, peer, and tx spans with expanded Grafana dashboards
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:39:39 +01:00
Pratik Mankawde
58aa308052 fix: use docker/telemetry/data/ for runtime data and add .gitignore
Move xrpld data paths from ./data/ to docker/telemetry/data/ so runtime
files stay within the docker telemetry directory. Add .gitignore to
exclude the data directory from version control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:39:20 +01:00
Pratik Mankawde
c07dd573fe Phase 5: Documentation, deployment configs, integration test infrastructure
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:50:38 +01:00
Pratik Mankawde
69d4b77abf Phase 4: Consensus tracing - round lifecycle, proposals, validations, close time
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:09:49 +01:00
Pratik Mankawde
3436f93870 Phase 3: Transaction tracing - protobuf context propagation, PeerImp, NetworkOPs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 13:09:40 +01:00
Pratik Mankawde
ab6946319c Phase 2: RPC tracing - span macros, attributes, WebSocket, command spans
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 16:07:10 +01:00
Pratik Mankawde
0f0c188111 Phase 1b: Telemetry core infrastructure - CMake, Conan, SpanGuard, config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 15:58:38 +01:00