Update OpenTelemetryPlan docs and Telemetry.h doc example to reflect
the renamed per-span attributes: xrpl.rpc.command -> command,
xrpl.rpc.status -> rpc_status, xrpl.grpc.method -> method, etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Drop xrpl.pathfind.* prefix from per-span attrs (source_account,
dest_account, fast, search_level, num_complete_paths, num_paths,
num_requests).
- Keep xrpl.pathfind.ledger_index qualified (rule 5: distinct from
xrpl.ledger.seq).
- Remove per-span nodeAmendmentBlocked/nodeServerState calls from
RPCHandler — promoted to resource-level attrs.
- Mark node-health attrs in SpanNames.h as RESOURCE-ONLY with doc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clang-tidy fixes:
- Concatenate nested namespaces (modernize-concat-nested-namespaces)
in OTelCollector.h, OTelCollector.cpp, ValidationTracker.h/.cpp
- Add missing direct includes (misc-include-cleaner) in
ValidationTracker.cpp, test, CollectorManager.cpp, OTelCollector.cpp
- Make lock_guard variables const (misc-const-correctness)
- Add braces around single-line if/else (readability-braces-around-statements)
- Use designated initializer for WindowEvent (modernize-use-designated-initializers)
- Initialize LedgerEvent::seq field (cppcoreguidelines-pro-type-member-init)
Linker fix:
- Add ValidationTracker.cpp as source to xrpl.test.telemetry target
(it lives in src/xrpld/ but the test links against libxrpl only)
Levelization fix:
- Remove stale dependency edges from ordering.txt that were introduced
by the erroneous develop-merge commit
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix use-after-free: extract gauge callback to static function and call
RemoveCallback in ~OTelGaugeImpl() before unregistering from collector
- Use memory_order_acq_rel on callHooks() debounce CAS for proper
happens-before relationship between hook invocations
- Add explicit 2s timeout to ForceFlush() in destructor to prevent
blocking indefinitely when OTLP endpoint is unreachable at shutdown
- Add OTLP receiver to metrics pipeline so native OTel metrics from
xrpld are actually received by the collector
- Remove stale health check port from docker-compose (extension was
removed from collector config)
- Clarify fallback docs: StatsD path requires re-enabling receiver/port
- Fix comments: Counter uses uint64_t not int64_t, gauge clamps to
[0, INT64_MAX] not [0, UINT64_MAX]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The blanket revert of f4555c80fe also un-reverted some files that had
been correctly matched to phase-6 (nodestore Backend API refactor,
Vault_test changes). Restore those to the base branch state so the
phase-7 PR only contains telemetry changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- protocol/README.md: restore historical GitHub URL path (src/ripple/)
- Config.cpp: restore configLegacyName as "rippled.cfg" (legacy name
must remain as-is for backward compatibility)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverts 259 files that carried unrelated upstream changes through the
phase-6 merge: enum class removals (cppcoreguidelines-use-enum-class),
scoped_lock→lock_guard conversions (modernize-use-scoped-lock),
nodestore Backend API changes (void const* key), .clang-tidy config,
test infrastructure deletions, and miscellaneous develop changes.
These changes belong on develop, not in the telemetry PR chain.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert reusable-build-test-config.yml to develop (action SHA update
and "Show test failure summary" step removal don't belong here)
- Revert upload-conan-deps.yml to develop (action SHA update)
- Revert features.macro: BatchInnerSigs and Batch back to Supported::no
(these feature flag changes are unrelated to telemetry)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Prettier formatting for markdown docs and OTelCollector header
- docs.sh rippled→xrpld renames in OTelCollector.cpp comments/strings
- Updated levelization ordering with new dependency edges
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add [[nodiscard]] to getConsensusTraceStrategy, getYays, getNays
- Add missing <string>, SpanGuard.h, SpanNames.h includes
- Fix widening cast placement (cast before arithmetic, not after)
- Replace nested ternary with lambda for const dir variable
- Add braces to if/else-if chains in Consensus.h
- Concatenate nested namespaces in ConsensusSpanNames.h
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Concatenate nested namespaces in SpanNames.h, RpcSpanNames.h, GrpcSpanNames.h
- Remove unused InfoSub.h and NetworkOPs.h includes from RPCHandler.cpp
- Add missing <string_view> includes in RPCHandler.cpp and GRPCServer.cpp
- Replace nested ternary with if/else-if in RPCHandler.cpp
- Add IWYU pragma keep for json_body.h in ServerHandler.cpp
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instrument the consensus subsystem with OpenTelemetry spans covering
the full round lifecycle: round start, establish phase, proposal send,
ledger close, position updates, consensus check, accept, validation
send, and mode changes.
Key design choices adapted from the original Phase 4 implementation
to the new SpanGuard factory pattern introduced in Phase 3:
- Add SpanGuard::hashSpan() for category-gated hash-derived trace IDs
(consensus round spans share trace_id across validators via ledger hash)
- Add SpanGuard::addEvent() overload with key-value attribute pairs
(used for dispute.resolve events during position updates)
- Add ConsensusSpanNames.h with compile-time span name constants
following the colocated *SpanNames.h pattern from Phase 3
- Add consensusTraceStrategy config option ("deterministic"/"attribute")
for cross-node trace correlation strategy selection
- Use SpanGuard::linkedSpan() for follows-from relationships between
consecutive rounds and cross-thread validation spans
- Use SpanGuard::captureContext() for thread-safe context propagation
from consensus thread to jtACCEPT worker thread
Spans produced: consensus.round, consensus.proposal.send,
consensus.ledger_close, consensus.establish, consensus.update_positions,
consensus.check, consensus.accept, consensus.accept.apply,
consensus.validation.send, consensus.mode_change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire trace context into P2P message flow so distributed traces
link across nodes. TX relay injects SpanGuard context via
PropagationHelpers.h; consensus propose/validate injects via
TraceContextPropagator.h. Receive-side extraction in PeerImp
creates child spans for proposals and validations.
- Add TraceBytes struct and SpanGuard::getTraceBytes() for
extracting raw trace context without OTel type dependencies
- Add PropagationHelpers.h: injectSpanContext(SpanGuard, proto)
- Add ConsensusReceiveTracing.h: proposalReceiveSpan(),
validationReceiveSpan() with parent context extraction
- NetworkOPs::apply(): inject tx.process context before relay
- RCLConsensus::propose()/validate(): inject active span context
- PeerImp: create receive spans for proposals and validations
with sender's trace context as parent
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace SpanGuard::txSpan(prefix, name, hash) with the generic
SpanGuard::hashSpan(TraceCategory, name, hash) that accepts a
TraceCategory parameter instead of hardcoding Transactions. This
enables reuse for consensus round spans (Phase 4) and any future
subsystem needing deterministic cross-node trace correlation via
hash-derived trace IDs.
Both overloads are replaced:
- hashSpan(cat, name, hash, size) — standalone with random span_id
- hashSpan(cat, name, hash, size, parentSpanId, parentSize, flags)
— with remote parent from protobuf context propagation
Add full span name constants (tx_span::receive, tx_span::process)
to TxSpanNames.h following the ConsensusSpanNames.h pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mark local variables in extractFromProtobuf() and injectToProtobuf()
as const since they are not modified after initialization: traceId,
spanId, flags, spanCtx, and span.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-call std::random_device with thread_local std::mt19937 in
txSpan() for span ID generation. random_device is ~423x slower due to
/dev/urandom syscalls on each construction; mt19937 is seeded once per
thread and reused for all subsequent span IDs.
Update the SpanGuard class ASCII diagram to include txSpan factory
methods that were added in the hash-derived trace ID commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Derive trace_id from txHash[0:16] so all nodes handling the same
transaction produce spans under the same trace. Protobuf span_id
propagation provides parent-child relay ordering when available.
- Add SpanGuard::txSpan() factory methods (hash-derived trace ID)
- Add TxTracing.h helpers: txReceiveSpan(), txProcessSpan()
- Update PeerImp and NetworkOPs to use the new helpers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the consensus subsystem with OpenTelemetry spans covering
the full round lifecycle: round start, establish phase, proposal send,
ledger close, position updates, consensus check, accept, validation
send, and mode changes.
Key design choices adapted from the original Phase 4 implementation
to the new SpanGuard factory pattern introduced in Phase 3:
- Add SpanGuard::hashSpan() for category-gated hash-derived trace IDs
(consensus round spans share trace_id across validators via ledger hash)
- Add SpanGuard::addEvent() overload with key-value attribute pairs
(used for dispute.resolve events during position updates)
- Add ConsensusSpanNames.h with compile-time span name constants
following the colocated *SpanNames.h pattern from Phase 3
- Add consensusTraceStrategy config option ("deterministic"/"attribute")
for cross-node trace correlation strategy selection
- Use SpanGuard::linkedSpan() for follows-from relationships between
consecutive rounds and cross-thread validation spans
- Use SpanGuard::captureContext() for thread-safe context propagation
from consensus thread to jtACCEPT worker thread
Spans produced: consensus.round, consensus.proposal.send,
consensus.ledger_close, consensus.establish, consensus.update_positions,
consensus.check, consensus.accept, consensus.accept.apply,
consensus.validation.send, consensus.mode_change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire trace context into P2P message flow so distributed traces
link across nodes. TX relay injects SpanGuard context via
PropagationHelpers.h; consensus propose/validate injects via
TraceContextPropagator.h. Receive-side extraction in PeerImp
creates child spans for proposals and validations.
- Add TraceBytes struct and SpanGuard::getTraceBytes() for
extracting raw trace context without OTel type dependencies
- Add PropagationHelpers.h: injectSpanContext(SpanGuard, proto)
- Add ConsensusReceiveTracing.h: proposalReceiveSpan(),
validationReceiveSpan() with parent context extraction
- NetworkOPs::apply(): inject tx.process context before relay
- RCLConsensus::propose()/validate(): inject active span context
- PeerImp: create receive spans for proposals and validations
with sender's trace context as parent
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace SpanGuard::txSpan(prefix, name, hash) with the generic
SpanGuard::hashSpan(TraceCategory, name, hash) that accepts a
TraceCategory parameter instead of hardcoding Transactions. This
enables reuse for consensus round spans (Phase 4) and any future
subsystem needing deterministic cross-node trace correlation via
hash-derived trace IDs.
Both overloads are replaced:
- hashSpan(cat, name, hash, size) — standalone with random span_id
- hashSpan(cat, name, hash, size, parentSpanId, parentSize, flags)
— with remote parent from protobuf context propagation
Add full span name constants (tx_span::receive, tx_span::process)
to TxSpanNames.h following the ConsensusSpanNames.h pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mark local variables in extractFromProtobuf() and injectToProtobuf()
as const since they are not modified after initialization: traceId,
spanId, flags, spanCtx, and span.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-call std::random_device with thread_local std::mt19937 in
txSpan() for span ID generation. random_device is ~423x slower due to
/dev/urandom syscalls on each construction; mt19937 is seeded once per
thread and reused for all subsequent span IDs.
Update the SpanGuard class ASCII diagram to include txSpan factory
methods that were added in the hash-derived trace ID commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>