Phase-6 introduces ledger-operations, peer-network, and the five StatsD
dashboards. Align them with the rest of the chain:
- Rename dashboard UIDs from `rippled-*` to `xrpld-*` so the provisioned
UIDs match the post-rename-script documentation (`docs.sh` rewrites
.md but not .json, so the two drifted). The runbook references
`xrpld-rpc-perf`, `xrpld-transactions`, etc.; the JSON now matches.
- Add the `$node` template variable + `exported_instance=~"$node"` filter
to every target in the five `statsd-*` dashboards. Mirrors the pattern
already used by consensus-health, ledger-operations, and peer-network
per the project rule that every dashboard must support per-node
filtering.
- Strip `:<line>` (and `:NN-NN` range) suffixes from C++ file references
in every dashboard panel description and in docker/telemetry/TESTING.md.
Line numbers drift on every refactor; the filename alone is enough to
grep.
- Replace stale `rpc.request` entries with the real emitted span names
(`rpc.http_request`, `rpc.ws_upgrade`, `rpc.ws_message`, `rpc.process`)
in TESTING.md so operators can copy-paste the filters and hit real
traces.
- Also drop the `:706` line ref from the `StatsDCollector.cpp` callout
in `06-implementation-phases.md`.
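
The line-suffix cleanup described above can be sketched as a small regex pass. This is an illustration only: `stripLineRefs` is a hypothetical helper, not a script from the repository, and the set of file extensions it recognizes is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the repository): removes ":NN" and
// ":NN-NN" suffixes that directly follow a C++ source filename.
inline std::string
stripLineRefs(std::string const& text)
{
    // Group 1 keeps the path and filename; the numeric suffix is dropped.
    // The extension list is an assumption made for this sketch.
    static std::regex const lineRef(
        R"((\b[\w./-]+\.(?:cpp|h|hpp|ipp)):\d+(?:-\d+)?)");
    return std::regex_replace(text, lineRef, "$1");
}
```

Under these assumptions, `stripLineRefs("StatsDCollector.cpp:706")` yields `StatsDCollector.cpp`, and text without a filename-plus-line pattern passes through unchanged.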
The Phase-1a plan documents advertised OTLP/gRPC on port 4317 as the default
exporter, four unparsed [telemetry] config keys, and "Phase 4a Complete"
status with exit-criteria checkboxes marked done. Every downstream branch
through Phase 5 ships only OTLP/HTTP on port 4318 via OtlpHttpExporterFactory
and never parses the advertised keys, and the Phase 4 work is not yet delivered.
Fixes:
- 02-design-decisions.md: flip §2.1.1 SDK dependency recommendations to
OTLP/HTTP (shipped) with OTLP/gRPC marked Future. Update §2.2 architecture
diagram and text from OTLP/gRPC:4317 to OTLP/HTTP:4318. Rewrite §2.2.1 as
"OTLP/HTTP (Shipped)" and §2.2.2 as "OTLP/gRPC (Future Work — Planned
Upgrade)" with a concrete checklist (Conan dep, config parsing, factory
branch, runbook/dashboard updates) for landing the gRPC transport later.
- 05-configuration-reference.md: drop the fabricated exporter/otlp_grpc key
and the :4317 default from the sample config block and the options-summary
table. Move trace_pathfind, trace_txq, trace_validator, trace_amendment
into a new "Planned (not yet implemented)" table citing the phase that will
add each one. Keep the example config minimal so copy-paste does not produce
a silently-ignored stanza.
- 06-implementation-phases.md: reset Phase 4 Exit Criteria checkboxes from
[x] to [ ] (Phase 4 is not shipped at Phase-1a time). Rename "Phase 4a
Complete" to "Phase 4a Plan" and describe the work as future. Replace the
broken forward link to Phase4_taskList.md (introduced in the Phase 2 PR)
with a sentence pointing readers to where that spec will land. Renumber
the final section 6.12 to 6.11 so it sits directly after 6.10; section 6.11
("Effort Summary") was intentionally removed in earlier edits.
- Phase2_taskList: update attr refs to bare names, note node-health
attrs moved to resource level.
- 02-design-decisions: strip xrpl.pathfind.* prefix from planned attrs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update OpenTelemetryPlan docs and Telemetry.h doc example to reflect
the renamed per-span attributes: xrpl.rpc.command -> command,
xrpl.rpc.status -> rpc_status, xrpl.grpc.method -> method, etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build fixes in PeerImp.cpp:
- Rename duplicate `span` variable to `consSpan` in proposal and
validation handlers to avoid a redefinition error
- Fix `->` on non-pointer SpanGuard (now correctly on shared_ptr)
- Fix move-only type copy in lambda capture
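
As an illustration of the last two build fixes, the sketch below shows why a by-value capture of a move-only type fails and two common ways around it. `MoveOnlyGuard` is a stand-in written for this example; it is not the real SpanGuard, and the actual PeerImp.cpp change may look different.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Stand-in for a move-only guard type; the real SpanGuard is not
// reproduced here.
struct MoveOnlyGuard
{
    explicit MoveOnlyGuard(int v) : value(v) {}
    MoveOnlyGuard(MoveOnlyGuard&&) = default;
    MoveOnlyGuard& operator=(MoveOnlyGuard&&) = default;
    MoveOnlyGuard(MoveOnlyGuard const&) = delete;
    MoveOnlyGuard& operator=(MoveOnlyGuard const&) = delete;
    int value;
};

// A plain [guard] capture would not compile because the copy constructor
// is deleted. An init-capture moves the object into the closure instead.
inline int
runWithMovedGuard()
{
    MoveOnlyGuard guard{42};
    auto task = [g = std::move(guard)]() { return g.value; };
    return task();
}

// When the closure itself must be copyable (std::function requires this),
// holding the guard in a shared_ptr is one option; members are then
// reached with `->` on the pointer, not on the guard object.
inline int
runWithSharedGuard()
{
    auto guard = std::make_shared<MoveOnlyGuard>(7);
    std::function<int()> task = [guard]() { return guard->value; };
    return task();
}
```

The init-capture keeps single ownership and is the cheaper choice when the closure is only moved; the `shared_ptr` form trades an allocation for a copyable closure.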
Clang-tidy fixes:
- Concatenate nested namespaces in LedgerSpanNames.h and PeerSpanNames.h
- Add missing SpanNames.h includes in BuildLedger.cpp, LedgerMaster.cpp,
PeerImp.cpp for direct seg:: symbol usage
- Add missing <chrono> and <cstdint> includes in BuildLedger.cpp
- Remove unused Feature.h include from BuildLedger.cpp
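
For reference, the namespace change follows the C++17 form that clang-tidy's `modernize-concat-nested-namespaces` check asks for. The namespace and constant below are hypothetical and exist only to show the shape of the edit.

```cpp
#include <string_view>

// Pre-C++17 form that the check flags:
//   namespace demo { namespace seg { /* ... */ } }
// C++17 concatenated form:
namespace demo::seg {

// Hypothetical constant; the real span names live in the *SpanNames.h
// headers and are not reproduced here.
inline constexpr std::string_view exampleSpan = "example.span";

}  // namespace demo::seg

// A file that uses demo::seg:: symbols directly should include the header
// that declares them instead of relying on a transitive include.
static_assert(demo::seg::exampleSpan == "example.span");
```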
Rename check fix:
- Run docs.sh to rename rippled_ metric prefixes to xrpld_ in
09-data-collection-reference.md and telemetry-runbook.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix quorum attribute to use actual validator quorum instead of proposer
count, add missing ConsensusState::Expired handling in haveConsensus()
span, move ConsensusSpanNames.h to xrpld/consensus/ to resolve
levelization cycle, remove unused constants, enrich proposal receive
span with sequence, and correct stale documentation references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update Phase4_taskList.md and 06-implementation-phases.md to reflect
completed implementation of all remaining Phase 4/4a tasks (4.2-4.6,
4a.5, 4a.6, 4a.8). Update exit criteria and summary tables.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Record the close time voting threshold and consensus state on
consensus.update_positions and consensus.check spans:
- xrpl.consensus.close_time_threshold: the avCT_CONSENSUS_PCT (75%)
threshold required for close time agreement
- xrpl.consensus.have_close_time_consensus: whether validators
reached close time consensus in this iteration
These attributes enable dashboards to show how the close time
voting process converges (or stalls) across consensus iterations.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the consensus subsystem with OpenTelemetry spans covering
the full round lifecycle: round start, establish phase, proposal send,
ledger close, position updates, consensus check, accept, validation
send, and mode changes.
Key design choices adapted from the original Phase 4 implementation
to the new SpanGuard factory pattern introduced in Phase 3:
- Add SpanGuard::hashSpan() for category-gated hash-derived trace IDs
(consensus round spans share trace_id across validators via ledger hash)
- Add SpanGuard::addEvent() overload with key-value attribute pairs
(used for dispute.resolve events during position updates)
- Add ConsensusSpanNames.h with compile-time span name constants
following the colocated *SpanNames.h pattern from Phase 3
- Add consensusTraceStrategy config option ("deterministic"/"attribute")
for cross-node trace correlation strategy selection
- Use SpanGuard::linkedSpan() for follows-from relationships between
consecutive rounds and cross-thread validation spans
- Use SpanGuard::captureContext() for thread-safe context propagation
from consensus thread to jtACCEPT worker thread
Spans produced: consensus.round, consensus.proposal.send,
consensus.ledger_close, consensus.establish, consensus.update_positions,
consensus.check, consensus.accept, consensus.accept.apply,
consensus.validation.send, consensus.mode_change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire trace context into P2P message flow so distributed traces
link across nodes. TX relay injects SpanGuard context via
PropagationHelpers.h; consensus propose/validate injects via
TraceContextPropagator.h. Receive-side extraction in PeerImp
creates child spans for proposals and validations.
- Add TraceBytes struct and SpanGuard::getTraceBytes() for
extracting raw trace context without OTel type dependencies
- Add PropagationHelpers.h: injectSpanContext(SpanGuard, proto)
- Add ConsensusReceiveTracing.h: proposalReceiveSpan(),
validationReceiveSpan() with parent context extraction
- NetworkOPs::apply(): inject tx.process context before relay
- RCLConsensus::propose()/validate(): inject active span context
- PeerImp: create receive spans for proposals and validations
with sender's trace context as parent
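
A minimal sketch of the inject/extract round trip described above, assuming the W3C Trace Context sizes (16-byte trace ID, 8-byte span ID, 1-byte flags). Every type and function name here is hypothetical: the real TraceBytes, injectSpanContext(), and protobuf messages may have a different layout.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical layout using W3C Trace Context sizes; the real TraceBytes
// struct may differ.
struct TraceBytesSketch
{
    std::array<std::uint8_t, 16> traceId{};
    std::array<std::uint8_t, 8> spanId{};
    std::uint8_t flags = 0;
};

// Stand-in for a protobuf message with `bytes` fields, which the protobuf
// C++ API exposes as std::string.
struct WireContextSketch
{
    std::string traceId;
    std::string spanId;
    std::uint32_t flags = 0;
};

// Sender side: copy the raw bytes into the outgoing message.
inline void
injectContextSketch(TraceBytesSketch const& ctx, WireContextSketch& msg)
{
    msg.traceId.assign(ctx.traceId.begin(), ctx.traceId.end());
    msg.spanId.assign(ctx.spanId.begin(), ctx.spanId.end());
    msg.flags = ctx.flags;
}

// Receiver side: reject malformed sizes instead of trusting the peer.
inline std::optional<TraceBytesSketch>
extractContextSketch(WireContextSketch const& msg)
{
    if (msg.traceId.size() != 16 || msg.spanId.size() != 8)
        return std::nullopt;
    TraceBytesSketch ctx;
    std::copy(msg.traceId.begin(), msg.traceId.end(), ctx.traceId.begin());
    std::copy(msg.spanId.begin(), msg.spanId.end(), ctx.spanId.begin());
    ctx.flags = static_cast<std::uint8_t>(msg.flags);
    return ctx;
}
```

Keeping the wire struct free of OTel types mirrors the stated goal of extracting raw trace context without OTel type dependencies; the size check is an extra precaution assumed for this sketch.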
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move TxQSpanNames.h include to correct alphabetical position, update
levelization results for new xrpld.telemetry module dependencies,
and apply rename script to docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add trace_id = txHash[0:16] strategy so all nodes handling the same
transaction independently produce spans under the same trace_id,
combined with protobuf span_id propagation for parent-child ordering.
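
The derivation can be sketched as follows, assuming `txHash[0:16]` means the first 16 bytes of a 32-byte transaction hash. The type aliases and function name are hypothetical and chosen for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical aliases: a 32-byte transaction hash and a 16-byte trace ID.
using TxHashSketch = std::array<std::uint8_t, 32>;
using TraceIdSketch = std::array<std::uint8_t, 16>;

// Every node that sees the same transaction computes the same trace ID
// without exchanging any state, because the ID depends only on the hash.
inline TraceIdSketch
traceIdFromTxHash(TxHashSketch const& hash)
{
    TraceIdSketch id{};
    std::copy_n(hash.begin(), id.size(), id.begin());
    return id;
}
```

Only the trace ID is derived this way; parent-child ordering still relies on the propagated span_id, as the message states. An all-zero trace ID is invalid under W3C Trace Context, so a real implementation would need to handle that case, however unlikely it is for a hash prefix.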
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace references to old XRPL_TRACE_TX/CONSENSUS macros with
SpanGuard::span(TraceCategory, ...) factory calls introduced in Phase 1c.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add xrpl.peer.version attribute to tx.receive spans for version-mismatch
correlation during network upgrades.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The StatsD receiver config was lost during a branch rebase (--ours
conflict resolution dropped it). Re-add the statsd receiver to the
OTel Collector config and wire it into the metrics pipeline so
beast::insight UDP metrics flow to Prometheus.
Also fixes:
- Metric prefix mismatch: docs used xrpld_ but dashboards/tests use
rippled_; align all documentation to match the runnable stack
- Remove phantom Peer_Disconnects_Charges from docs (plain atomic,
not a beast::insight gauge)
- Remove premature .codecov.yml exclusions for Phase 7 OTelCollector
files that don't exist on this branch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace remaining rippled/Ripple references with xrpld/XRPL in
data collection reference, implementation phases, and runbook docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve merge conflicts taking phase 4 consensus span improvements,
fix bashate indentation in integration test script, and apply rename
script to Phase5 integration test docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add unknownCommand and wsUpgrade span name constants to RpcSpanNames.h,
fix SpanGuardFactory tests to use the 3-argument SpanGuard::span() API,
update levelization results, and apply rename script to docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>