Phase-3 (PR #6425) is scoped to transaction tracing only; consensus
tracing belongs to phase-4 (PR #6426). The previous commit on this
branch removed the namespace/attribute scaffolding c6c019ed8b leaked
into phase-3, but phase-3 still carried the consensus span construction
and trace-context propagation introduced in earlier commits
(61cb1faf8f, 93bed03d8d). Move that out too so phase-3 creates and
propagates no consensus spans of any kind.
Removed:
- src/xrpld/telemetry/ConsensusReceiveTracing.h (deleted; phase-4
owns it).
- PeerImp.cpp: remove the std::make_shared<SpanGuard>(
proposalReceiveSpan(...))/validationReceiveSpan(...) constructions
in onMessage(TMProposeSet)/onMessage(TMValidation), drop the
sp = std::move(span) lambda captures, and drop the
#include <xrpld/telemetry/ConsensusReceiveTracing.h>.
- RCLConsensus.cpp: drop the two telemetry::injectToProtobuf() blocks
that injected the active trace context into TMProposeSet (in
Adaptor::propose, after addSuppression) and TMValidation (in
Adaptor::validate, around the broadcast call). Drop the now-unused
#include of TraceContextPropagator.h and the
XRPL_ENABLE_TELEMETRY-gated include of
opentelemetry/context/runtime_context.h.
- TraceContextPropagator.h: update file-level @see comment to drop
the ConsensusReceiveTracing.h reference and to scope the
"wired into the P2P message flow via PropagationHelpers.h"
sentence to TMTransaction only.
- PropagationHelpers.h: replace the
@see ConsensusReceiveTracing.h cross-reference with
@see TxTracing.h.
Inert consensus metadata (TraceCategory::Consensus enum value,
seg::consensus constant, isCategoryEnabled/categoryToSpanKind switch
arms, the SpanGuard.h doc-comment example) is intentionally preserved
on phase-3: nothing references it after this commit, but phase-4
needs it and removing it would widen the phase-3 -> phase-4 merge
surface for no benefit.
Verified via git grep: no remaining phase-3 references to
proposalReceiveSpan, validationReceiveSpan, ConsensusReceiveTracing,
consensus_span::, consensus.proposal, or consensus.validation.
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
Commit c6c019ed8b ("addressed code review comments") bundled tx-tracing
review fixes with consensus-tracing scaffolding that belongs on
pratik/otel-phase4-consensus-tracing (PR #6426). This commit lifts the
consensus-only parts back out of phase-3 so PR #6425 stays scoped to
transaction tracing. Phase-4 already carries the same content (via prior
phase-3 -> phase-4 merges) plus its own evolution on top, so nothing is
moved across — only removed here.
Removed:
- src/xrpld/app/consensus/ConsensusSpanNames.h (deleted; phase-4 owns
it).
- PeerImp.cpp: revert onMessage(TMProposeSet)/onMessage(TMValidation)
consensus-attr setAttribute calls to the hardcoded
"xrpl.consensus.{trusted,round,ledger.seq}" strings used before
c6c019ed8b. Drop the now-unused
#include <xrpld/app/consensus/ConsensusSpanNames.h>.
- RCLConsensus::Adaptor::validate(): remove the
TODO(observability/secure-OTel) block on validation trace_context.
- TraceContextPropagator.h: remove the TODO(observability/secure-OTel)
block on injectToProtobuf().
Tx-tracing parts of c6c019ed8b are intentionally untouched.
No phase-3 caller of telemetry::consensus_span:: remains; verified via
git grep. No test on phase-3 references the removed header.
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
- Drop xrpl.node.amendment_blocked / xrpl.node.server_state from telemetry
surface (constants in SpanNames.h, two filters in tempo.yaml). Operators
read the same data via server_info / server_state RPC; OTel SDK 1.18.0
cannot refresh resource attrs at runtime so resource-level emission was
not viable either.
- Namespace all pathfind span attributes under pathfind_* (underscore form
per Phase 1c rule 5). Renames in PathFindSpanNames.h and call sites in
PathRequest.cpp, PathRequestManager.cpp, plus the rule-5 retention
xrpl.pathfind.ledger_index -> pathfind_ledger_index.
- Wire pathfind_source_account / pathfind_dest_account on pathfind.request
in doPathFind / doRipplePathFind handlers (only when present + string).
- Collapse per-asset pathfind.discover / pathfind.rank spans into one
pathfind.discover hoisted around the per-source-asset loop in
PathRequest::findPaths. Span count goes from 2N to 1 per RPC call;
per-asset breakdown traded for bounded storage and cardinality. Trade-off
documented inline.
- Fix pathfind_num_paths semantics: now sums getBestPaths().size() across
the loop (paths actually returned) instead of the maxPaths input cap.
- PathRequestManager::updateAll: move span creation after the locked
requests_ snapshot, early-return when no active subscriptions exist
(avoids empty span on every ledger close), set pathfind_num_requests
= requests.size().
- Update Phase2_taskList.md and 02-design-decisions.md to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update OpenTelemetryPlan docs and Telemetry.h doc example to reflect
the renamed per-span attributes: xrpl.rpc.command -> command,
xrpl.rpc.status -> rpc_status, xrpl.grpc.method -> method, etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Drop xrpl.pathfind.* prefix from per-span attrs (source_account,
dest_account, fast, search_level, num_complete_paths, num_paths,
num_requests).
- Keep xrpl.pathfind.ledger_index qualified (rule 5: distinct from
xrpl.ledger.seq).
- Remove per-span nodeAmendmentBlocked/nodeServerState calls from
RPCHandler — promoted to resource-level attrs.
- Mark node-health attrs in SpanNames.h as RESOURCE-ONLY with doc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Concatenate nested namespaces in SpanNames.h, RpcSpanNames.h, GrpcSpanNames.h
- Remove unused InfoSub.h and NetworkOPs.h includes from RPCHandler.cpp
- Add missing <string_view> includes in RPCHandler.cpp and GRPCServer.cpp
- Replace nested ternary with if/else-if in RPCHandler.cpp
- Add IWYU pragma keep for json_body.h in ServerHandler.cpp
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire trace context into P2P message flow so distributed traces
link across nodes. TX relay injects SpanGuard context via
PropagationHelpers.h; consensus propose/validate injects via
TraceContextPropagator.h. Receive-side extraction in PeerImp
creates child spans for proposals and validations.
- Add TraceBytes struct and SpanGuard::getTraceBytes() for
extracting raw trace context without OTel type dependencies
- Add PropagationHelpers.h: injectSpanContext(SpanGuard, proto)
- Add ConsensusReceiveTracing.h: proposalReceiveSpan(),
validationReceiveSpan() with parent context extraction
- NetworkOPs::apply(): inject tx.process context before relay
- RCLConsensus::propose()/validate(): inject active span context
- PeerImp: create receive spans for proposals and validations
with sender's trace context as parent
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace SpanGuard::txSpan(prefix, name, hash) with the generic
SpanGuard::hashSpan(TraceCategory, name, hash) that accepts a
TraceCategory parameter instead of hardcoding Transactions. This
enables reuse for consensus round spans (Phase 4) and any future
subsystem needing deterministic cross-node trace correlation via
hash-derived trace IDs.
Both overloads are replaced:
- hashSpan(cat, name, hash, size) — standalone with random span_id
- hashSpan(cat, name, hash, size, parentSpanId, parentSize, flags)
— with remote parent from protobuf context propagation
Add full span name constants (tx_span::receive, tx_span::process)
to TxSpanNames.h following the ConsensusSpanNames.h pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mark local variables in extractFromProtobuf() and injectToProtobuf()
as const since they are not modified after initialization: traceId,
spanId, flags, spanCtx, and span.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-call std::random_device with thread_local std::mt19937 in
txSpan() for span ID generation. random_device is ~423x slower due to
/dev/urandom syscalls on each construction; mt19937 is seeded once per
thread and reused for all subsequent span IDs.
Update the SpanGuard class ASCII diagram to include txSpan factory
methods that were added in the hash-derived trace ID commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Derive trace_id from txHash[0:16] so all nodes handling the same
transaction produce spans under the same trace. Protobuf span_id
propagation provides parent-child relay ordering when available.
- Add SpanGuard::txSpan() factory methods (hash-derived trace ID)
- Add TxTracing.h helpers: txReceiveSpan(), txProcessSpan()
- Update PeerImp and NetworkOPs to use the new helpers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add [[maybe_unused]] to RAII span variables in PathFind/RipplePathFind
handlers (Clang -Wunused-variable with -Werror)
- Restore over-renamed values: rippledb, rippled.cfg, historical GitHub URL
- Concatenate nested namespaces in SpanNames.h and PathFindSpanNames.h
(modernize-concat-nested-namespaces)
- Add missing includes and const qualifiers in test files
- Suppress intentional use-after-move in SpanGuardFactory move test
- Remove unused NetworkOPs.h include from PathRequest.cpp
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use C++17 concatenated namespaces, add [[nodiscard]] to query methods,
add missing direct includes, and use pass-by-value + std::move in
NullTelemetry constructor.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move node health attribute strings to compile-time constants in
SpanNames.h (attr::nodeAmendmentBlocked, attr::nodeServerState)
- Add Tempo search filters for node health attributes
- Remove unnecessary .c_str() on strOperatingMode() return
- Add samplingRatio clamping test (values > 1.0 and < 0.0)
- Fix Task 2.3 status: delivered in Phase 1c, not Phase 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Centralise scattered string literals into compile-time constants using
StaticStr<N> and join() for dot-separated composition. Shared primitives
live in SpanNames.h; RPC-specific names in RpcSpanNames.h. Future modules
(consensus, peer, ledger) add their own *SpanNames.h without bloating
the central header.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Concatenate nested namespaces (modernize-concat-nested-namespaces)
- Add [[nodiscard]] to factory and accessor methods
- NOLINT no-op stub instance methods that must stay non-static for API
parity with the real implementation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the single-arg span(name) factory that creates unconditional
spans without category gating. All call sites use the 3-arg
span(TraceCategory, prefix, name) variant which checks whether the
category is enabled in config before creating a span. The 1-arg form
was dead code with no callers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace rpcSpan(), txSpan(), consensusSpan(), peerSpan(), ledgerSpan()
with a single span(TraceCategory, prefix, name) factory method. Adding
a new traceable subsystem now requires only a new enum value and one
switch case — no new methods or header changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SpanContext::isValid(): add inline no-op when XRPL_ENABLE_TELEMETRY
is not defined, preventing a linker error if called in that path
- linkedSpan(): set kIsRootSpanKey on the StartSpanOptions parent
context so linked spans start a genuinely independent sub-tree
instead of silently becoming children of the current active span
- Telemetry::instance_: use std::atomic with acquire/release ordering
to avoid a data race between start()/stop() and factory methods
called from worker threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Redesign SpanGuard with pimpl idiom to hide all OpenTelemetry types
from public headers. Add global Telemetry accessor so SpanGuard factory
methods work without explicit Telemetry references. Add child/linked
span creation and cross-thread context propagation. Update plan docs
to reflect macro removal in favor of SpanGuard factory pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add span discard mechanism that drops unwanted spans before they enter
the batch export queue, saving both network bandwidth and storage.
FilteringSpanProcessor is a custom SpanProcessor decorator that wraps
BatchSpanProcessor. SpanGuard::discard() sets a thread-local flag
(tl_discardCurrentSpan) before calling Span::End(). The OTel SDK calls
OnEnd() synchronously on the same thread, where the flag is checked and
cleared to drop the span.
New file: DiscardFlag.h — zero-dependency header for the thread-local
flag, avoiding transitive include bloat from Telemetry.h.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>