Pratik Mankawde
7e149f7773
refactor(telemetry): remove residual Jaeger references across chain
...
Fix remaining Jaeger references that accumulated across intermediate
branches in the stacked PR chain. These were in files modified by
multiple phases where the per-branch fixes didn't cover all additions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:35:04 +01:00
Pratik Mankawde
e1f30c1a22
docs: update data-collection-reference and presentation for external dashboard parity
...
- Fix validations_checked_total recording site (NetworkOPs, not LedgerMaster)
- Add Slide 11 to presentation: External Dashboard Parity overview with
  Mermaid diagrams for new metric categories, ValidationTracker sequence,
  and new dashboard summary
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:32:02 +01:00
Pratik Mankawde
5de8c520d1
Phase 10: Workload validation - synthetic load generation and telemetry checks
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:32:02 +01:00
Pratik Mankawde
81298ceb9f
docs: add external dashboard parity tasks and metric reference for Phase 9
...
Add Tasks 9.11-9.13 (Validator Health, Peer Quality, Ledger Economy dashboards),
new metric tables in data-collection-reference, and monitoring sections in runbook
covering validation agreement, validator health, peer quality, and state tracking.
Source: external dashboard parity design spec (2026-03-30).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
936c73982d
docs: update Phase 9 docs and dashboard for push_metrics.py parity gauges
...
- Add Task 9.7a to Phase9_taskList.md documenting new gauges
- Add metric tables to 09-data-collection-reference.md (server_info,
  build_info, complete_ledgers, db_metrics, extended cache/nodestore)
- Update metric counts from ~50 to ~68 in 06-implementation-phases.md
- Add OTel MetricsRegistry gauge reference to telemetry-runbook.md
- Add 11 new panels to system-node-health.json Grafana dashboard
  (server state, uptime, peers, validated seq, last close info,
  build version, complete ledgers, db sizes, historical fetch rate,
  peer disconnects)
- Fix leftover merge conflict marker in 08-appendix.md
- Add ripplex/mseconds to cspell dictionary
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
892fee638a
Phase 9: Metric gap fill - nodestore, cache, TxQ, load factor dashboards
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:49 +01:00
Pratik Mankawde
fdec3ce5c4
Phase 8: Log-trace correlation with Loki and filelog receiver
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:37 +01:00
Pratik Mankawde
2f7064ace6
Phase 7: Native OTel metrics migration
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:24 +01:00
Pratik Mankawde
1ef234de9d
docs(telemetry): replace Jaeger with Tempo in data collection reference
...
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:31:07 +01:00
Pratik Mankawde
a37cf74868
docs: add peerDisconnectsCharges metric to data collection reference
...
Bridge the existing beast::insight gauge for resource-limit peer
disconnects (peerDisconnectsCharges_) into the StatsD metric inventory.
Part of the external dashboard parity initiative.
See docs/superpowers/specs/2026-03-30-external-dashboard-parity-design.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:07 +01:00
Pratik Mankawde
21192e9b3f
Phase 6: StatsD metrics integration into telemetry pipeline
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 22:31:07 +01:00