From c585d9b66cd02f6a2aa2eaad80fd77c40f3d94d3 Mon Sep 17 00:00:00 2001
From: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
Date: Tue, 21 Apr 2026 15:19:58 +0100
Subject: [PATCH] docs(telemetry): add deterministic TX trace ID design (Task 3.9)

Add trace_id = txHash[0:16] strategy so all nodes handling the same
transaction independently produce spans under the same trace_id,
combined with protobuf span_id propagation for parent-child ordering.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 OpenTelemetryPlan/02-design-decisions.md      |  79 ++++++++++
 .../05-configuration-reference.md             |  54 ++++---
 OpenTelemetryPlan/06-implementation-phases.md |  55 ++++---
 OpenTelemetryPlan/Phase3_taskList.md          | 148 +++++++++++++++++-
 4 files changed, 293 insertions(+), 43 deletions(-)

diff --git a/OpenTelemetryPlan/02-design-decisions.md b/OpenTelemetryPlan/02-design-decisions.md
index fe87fc78db..c0c5d2f5d7 100644
--- a/OpenTelemetryPlan/02-design-decisions.md
+++ b/OpenTelemetryPlan/02-design-decisions.md
@@ -417,6 +417,85 @@ redact_peer_address=1 # Remove peer IP addresses

 > **WS** = WebSocket

+### 2.5.0 Deterministic Trace ID Strategy
+
+Both transaction and consensus tracing use **deterministic trace IDs** derived from
+a globally known hash, so all nodes handling the same workflow independently produce
+spans under the same `trace_id`. This is combined with protobuf `span_id` propagation
+for parent-child relay ordering when available.
+
+#### Transactions — `trace_id = txHash[0:16]`
+
+Every node that handles a transaction knows its `txID` (the `uint256` transaction
+hash). The first 16 bytes of this hash are used as the OTel `trace_id`:
+
+```
+uint256 txHash: A1B2C3D4 E5F6A7B8 C9D0E1F2 A3B4C5D6 E7F8A9B0 C1D2E3F4 A5B6C7D8 E9F0A1B2
+                |------ trace_id (16 bytes) ------| (remaining 16 bytes unused)
+```
+
+Each node generates a **random 8-byte `span_id`** so its span is unique within the
+shared trace. When protobuf `TraceContext` is present in the incoming `TMTransaction`,
+the sender's `span_id` is extracted and used as the parent — preserving the relay
+chain as a parent-child tree. When absent (older peers, first hop from client), the
+span appears as a root in the same trace — correlation is preserved, only the tree
+structure degrades.
+
+```
+Node A (submitter)        Node B (relay)            Node C (relay)
+trace_id: A1B2...         trace_id: A1B2...         trace_id: A1B2...
+span_id: 1234 (random)    span_id: 5678 (random)    span_id: 9ABC (random)
+parent: (none)            parent: 1234 (proto)      parent: 5678 (proto)
+                               ↑                         ↑
+                          protobuf propagation      protobuf propagation
+```
+
+If protobuf propagation fails at Node B (old peer):
+
+```
+Node A                    Node B (old peer)         Node C
+trace_id: A1B2...         trace_id: A1B2...         trace_id: A1B2...
+span_id: 1234             span_id: 5678             span_id: 9ABC
+parent: (none)            parent: (none)            parent: 5678 (proto)
+                          ↑ no parent, but same trace_id — still grouped
+```
+
+#### Consensus — `trace_id = prevLedgerHash[0:16]`
+
+All validators in the same consensus round share the same `previousLedger.id()`.
+The first 16 bytes are used as trace_id. See [Phase 4a implementation status](./06-implementation-phases.md)
+and `createDeterministicContext()` in `RCLConsensus.cpp` for the implementation.
+
+Switchable via `consensus_trace_strategy` config:
+`"deterministic"` (default) or `"attribute"` (random trace_id, correlation via attribute queries).
+
+#### Why Not Random IDs with Propagation Only?
+
+Random trace IDs require **unbroken context propagation** across every hop. In a
+mixed-version network (common during upgrades), older peers silently drop the
+`trace_context` protobuf field. The trace splits and downstream spans become
+impossible to find. Deterministic IDs make correlation **propagation-resilient** — the trace
+backend groups all spans for the same transaction/round regardless of whether
+propagation succeeded.
+
+#### Why Keep Protobuf Propagation?
+
+Deterministic trace IDs alone provide correlation (all spans grouped) but not
+**causality** (which node relayed to which). Protobuf `span_id` propagation adds
+parent-child ordering that shows the exact relay path. The two mechanisms complement
+each other:
+
+| Mechanism                    | Provides                    | Fails when                             |
+| ---------------------------- | --------------------------- | -------------------------------------- |
+| Deterministic trace_id       | Cross-node correlation      | Never (hash is always known)           |
+| Protobuf span_id propagation | Parent-child relay ordering | Older peer drops `trace_context` field |
+
+#### Implementation Reference
+
+The utility function `createDeterministicTxContext(uint256 const& txHash)` follows
+the same pattern as `createDeterministicContext(uint256 const& ledgerId)` in
+`RCLConsensus.cpp`. See [Phase 3 Task 3.9](./Phase3_taskList.md) for the full spec.
+
 ### 2.5.1 Propagation Boundaries

 ```mermaid
diff --git a/OpenTelemetryPlan/05-configuration-reference.md b/OpenTelemetryPlan/05-configuration-reference.md
index 1f56a7abf0..bdb0b0bb22 100644
--- a/OpenTelemetryPlan/05-configuration-reference.md
+++ b/OpenTelemetryPlan/05-configuration-reference.md
@@ -61,6 +61,14 @@ Add to `cfg/xrpld-example.cfg`:
 # trace_validator=0 # Validator list and manifest updates (low volume)
 # trace_amendment=0 # Amendment voting (very low volume)
 #
+# # Trace ID strategies for cross-node correlation
+# # "deterministic" (default) derives trace_id from a workflow hash
+# # (txHash for transactions, prevLedgerHash for consensus) so all nodes
+# # produce spans under the same trace_id for the same workflow.
+# # "attribute" uses random trace_id; correlation via attribute queries.
+# tx_trace_strategy=deterministic
+# consensus_trace_strategy=deterministic
+#
 # # Service identification (automatically detected if not specified)
 # # service_name=xrpld
 # # service_instance_id=
@@ -71,28 +79,30 @@ enabled=0

 ### 5.1.2 Configuration Options Summary

-| Option                | Type   | Default          | Description                               |
-| --------------------- | ------ | ---------------- | ----------------------------------------- |
-| `enabled`             | bool   | `false`          | Enable/disable telemetry                  |
-| `exporter`            | string | `"otlp_grpc"`    | Exporter type: otlp_grpc, otlp_http, none |
-| `endpoint`            | string | `localhost:4317` | OTLP collector endpoint                   |
-| `use_tls`             | bool   | `false`          | Enable TLS for exporter connection        |
-| `tls_ca_cert`         | string | `""`             | Path to CA certificate file               |
-| `sampling_ratio`      | float  | `1.0`            | Sampling ratio (0.0-1.0)                  |
-| `batch_size`          | uint   | `512`            | Spans per export batch                    |
-| `batch_delay_ms`      | uint   | `5000`           | Max delay before sending batch (ms)       |
-| `max_queue_size`      | uint   | `2048`           | Maximum queued spans                      |
-| `trace_transactions`  | bool   | `true`           | Enable transaction tracing                |
-| `trace_consensus`     | bool   | `true`           | Enable consensus tracing                  |
-| `trace_rpc`           | bool   | `true`           | Enable RPC tracing                        |
-| `trace_peer`          | bool   | `false`          | Enable peer message tracing (high volume) |
-| `trace_ledger`        | bool   | `true`           | Enable ledger tracing                     |
-| `trace_pathfind`      | bool   | `true`           | Enable path computation tracing           |
-| `trace_txq`           | bool   | `true`           | Enable transaction queue tracing          |
-| `trace_validator`     | bool   | `false`          | Enable validator list/manifest tracing    |
-| `trace_amendment`     | bool   | `false`          | Enable amendment voting tracing           |
-| `service_name`        | string | `"xrpld"`        | Service name for traces                   |
-| `service_instance_id` | string | ``               | Instance identifier                       |
+| Option                     | Type   | Default           | Description                                                                                                |
+| -------------------------- | ------ | ----------------- | ---------------------------------------------------------------------------------------------------------- |
+| `enabled`                  | bool   | `false`           | Enable/disable telemetry                                                                                   |
+| `exporter`                 | string | `"otlp_grpc"`     | Exporter type: otlp_grpc, otlp_http, none                                                                  |
+| `endpoint`                 | string | `localhost:4317`  | OTLP collector endpoint                                                                                    |
+| `use_tls`                  | bool   | `false`           | Enable TLS for exporter connection                                                                         |
+| `tls_ca_cert`              | string | `""`              | Path to CA certificate file                                                                                |
+| `sampling_ratio`           | float  | `1.0`             | Sampling ratio (0.0-1.0)                                                                                   |
+| `batch_size`               | uint   | `512`             | Spans per export batch                                                                                     |
+| `batch_delay_ms`           | uint   | `5000`            | Max delay before sending batch (ms)                                                                        |
+| `max_queue_size`           | uint   | `2048`            | Maximum queued spans                                                                                       |
+| `trace_transactions`       | bool   | `true`            | Enable transaction tracing                                                                                 |
+| `trace_consensus`          | bool   | `true`            | Enable consensus tracing                                                                                   |
+| `trace_rpc`                | bool   | `true`            | Enable RPC tracing                                                                                         |
+| `trace_peer`               | bool   | `false`           | Enable peer message tracing (high volume)                                                                  |
+| `trace_ledger`             | bool   | `true`            | Enable ledger tracing                                                                                      |
+| `trace_pathfind`           | bool   | `true`            | Enable path computation tracing                                                                            |
+| `trace_txq`                | bool   | `true`            | Enable transaction queue tracing                                                                           |
+| `trace_validator`          | bool   | `false`           | Enable validator list/manifest tracing                                                                     |
+| `trace_amendment`          | bool   | `false`           | Enable amendment voting tracing                                                                            |
+| `tx_trace_strategy`        | string | `"deterministic"` | TX trace ID strategy: `"deterministic"` (trace_id = txHash[0:16]) or `"attribute"` (random)                |
+| `consensus_trace_strategy` | string | `"deterministic"` | Consensus trace ID strategy: `"deterministic"` (trace_id = prevLedgerHash[0:16]) or `"attribute"` (random) |
+| `service_name`             | string | `"xrpld"`         | Service name for traces                                                                                    |
+| `service_instance_id`      | string | ``                | Instance identifier                                                                                        |

 ---

diff --git a/OpenTelemetryPlan/06-implementation-phases.md b/OpenTelemetryPlan/06-implementation-phases.md
index ccf1fd54d4..c5c693d7a0 100644
--- a/OpenTelemetryPlan/06-implementation-phases.md
+++ b/OpenTelemetryPlan/06-implementation-phases.md
@@ -118,21 +118,31 @@ gantt

 ## 6.4 Phase 3: Transaction Tracing (Weeks 5-6)

-**Objective**: Trace transaction lifecycle across network
+**Objective**: Trace transaction lifecycle across network with deterministic cross-node correlation

 ### Tasks

-| Task | Description                                          |
-| ---- | ---------------------------------------------------- |
-| 3.1  | Define `TraceContext` Protocol Buffer message        |
-| 3.2  | Implement protobuf context serialization             |
-| 3.3  | Instrument `PeerImp::handleTransaction()`            |
-| 3.4  | Instrument `NetworkOPs::submitTransaction()`         |
-| 3.5  | Instrument HashRouter integration                    |
-| 3.6  | Fee escalation instrumentation (`fee.escalate` span) |
-| 3.7  | Implement relay context propagation                  |
-| 3.8  | Integration tests (multi-node)                       |
-| 3.9  | Performance benchmarks                               |
+| Task | Description                                                    |
+| ---- | -------------------------------------------------------------- |
+| 3.1  | Define `TraceContext` Protocol Buffer message                  |
+| 3.2  | Implement protobuf context serialization                       |
+| 3.3  | Instrument `PeerImp::handleTransaction()`                      |
+| 3.4  | Instrument `NetworkOPs::submitTransaction()`                   |
+| 3.5  | Instrument HashRouter integration                              |
+| 3.6  | Fee escalation instrumentation (`fee.escalate` span)           |
+| 3.7  | Implement relay context propagation                            |
+| 3.8  | Integration tests (multi-node)                                 |
+| 3.9  | Deterministic transaction trace ID (`trace_id = txHash[0:16]`) |
+| 3.10 | Performance benchmarks                                         |
+
+### Deterministic Trace ID (Task 3.9)
+
+Transaction spans use **deterministic trace IDs** derived from the transaction hash:
+`trace_id = txHash[0:16]`. All nodes handling the same transaction independently
+produce spans under the same trace_id. Protobuf `span_id` propagation (Task 3.7)
+additionally provides parent-child relay ordering when available. See
+[02-design-decisions.md §2.5.0](./02-design-decisions.md) for the design rationale
+and [Phase3_taskList.md Task 3.9](./Phase3_taskList.md) for the full implementation spec.

 ### Exit Criteria

@@ -141,6 +151,8 @@ gantt
 - [ ] HashRouter deduplication visible in traces
 - [ ] Multi-node integration tests passing
 - [ ] <5% overhead on transaction throughput
+- [ ] Deterministic trace_id: all nodes produce same trace_id for same transaction
+- [ ] Protobuf span_id propagation preserves parent-child ordering when available

 ---

@@ -443,15 +455,18 @@ Clear, measurable criteria for each phase.

 ### 6.10.3 Phase 3: Transaction Tracing

-| Criterion        | Measurement                     | Target                             |
-| ---------------- | ------------------------------- | ---------------------------------- |
-| Local Trace      | Submit → validate → TxQ traced  | Single-node test passes            |
-| Cross-Node       | Context propagates via protobuf | Multi-node test passes             |
-| Relay Visibility | relay_count attribute correct   | Spot check 100 txs                 |
-| HashRouter       | Deduplication visible in trace  | Duplicate txs show suppressed=true |
-| Performance      | TX throughput overhead          | <5% degradation                    |
+| Criterion             | Measurement                                       | Target                                                   |
+| --------------------- | ------------------------------------------------- | -------------------------------------------------------- |
+| Local Trace           | Submit → validate → TxQ traced                    | Single-node test passes                                  |
+| Cross-Node            | Context propagates via protobuf                   | Multi-node test passes                                   |
+| Deterministic TraceID | Same trace_id on all nodes for same tx            | Multi-node test: query by txHash[0:16] returns all spans |
+| Relay Ordering        | Protobuf span_id propagation creates parent-child | Tempo trace tree shows relay chain                       |
+| Graceful Degradation  | Old peer drops trace_context                      | Spans still grouped by deterministic trace_id            |
+| Relay Visibility      | relay_count attribute correct                     | Spot check 100 txs                                       |
+| HashRouter            | Deduplication visible in trace                    | Duplicate txs show suppressed=true                       |
+| Performance           | TX throughput overhead                            | <5% degradation                                          |

-**Definition of Done**: Transaction traces span 3+ nodes in test network, performance within bounds.
+**Definition of Done**: Transaction traces span 3+ nodes in test network with deterministic trace_id correlation, parent-child ordering via protobuf propagation, and performance within bounds.

 ### 6.10.4 Phase 4: Consensus Tracing

diff --git a/OpenTelemetryPlan/Phase3_taskList.md b/OpenTelemetryPlan/Phase3_taskList.md
index a0a27c3434..e5eb90cb3d 100644
--- a/OpenTelemetryPlan/Phase3_taskList.md
+++ b/OpenTelemetryPlan/Phase3_taskList.md
@@ -253,6 +253,149 @@

 ---

+## Task 3.9: Deterministic Transaction Trace ID
+
+> **Upstream**: Task 3.2 (protobuf serialization), Task 3.3 (PeerImp span exists).
+> **Downstream**: Phase 10 (workload validation can query by tx hash directly).
+> **Pattern**: Mirrors the consensus deterministic trace ID in Phase 4a
+> (`createDeterministicContext` in `RCLConsensus.cpp`), adapted for transactions.
+
+**Objective**: Derive the trace_id for transaction spans deterministically from the
+transaction hash so that all nodes handling the same transaction independently produce
+spans under the same trace_id — regardless of whether protobuf context propagation
+succeeds.
+
+**Why**: The current approach creates spans with random trace_ids and relies entirely
+on protobuf `TraceContext` propagation to link them. If any hop in the relay chain
+drops the context (older peers, message corruption, mixed-version networks), the trace
+splits and downstream spans become impossible to find. With deterministic trace_ids,
+correlation is guaranteed because every node derives the same trace_id from the same
+`txID`.
+
+**Approach — deterministic trace_id + protobuf span_id propagation**:
+
+1. Derive `trace_id = txHash[0:16]` (first 16 bytes of the 32-byte transaction hash).
+2. Generate a random 8-byte `span_id` per node (each node's span is unique within
+   the shared trace).
+3. Create the span under this deterministic context as parent.
+4. **Additionally**, if protobuf `TraceContext` is present in the incoming
+   `TMTransaction` message, extract the sender's `span_id` and use it as the span's
+   parent — this preserves parent-child ordering in the trace tree.
+5. If protobuf context is absent (older peer, first hop), the span still has the
+   correct deterministic `trace_id` — it appears as a sibling root in the same trace
+   rather than being lost.
+
+This gives the best of both worlds: guaranteed cross-node correlation via deterministic
+`trace_id`, plus parent-child relay ordering via protobuf `span_id` when available.
+
+**What to do**:
+
+- Create `createDeterministicTxContext(uint256 const& txHash)` utility function:
+  - Location: shared header or file-local in `PeerImp.cpp` and `NetworkOPs.cpp`
+    (or a shared telemetry utility if both need it).
+  - Pattern: identical to `createDeterministicContext(uint256 const& ledgerId)` in
+    `RCLConsensus.cpp` — take `txHash[0:16]` as trace_id, random span_id via
+    `crypto_prng()`, sampled flag set, `remote=false`.
+  - Guard behind `#ifdef XRPL_ENABLE_TELEMETRY`.
+
+  ```cpp
+  opentelemetry::context::Context
+  createDeterministicTxContext(uint256 const& txHash)
+  {
+      namespace trace = opentelemetry::trace;
+
+      // First 16 bytes of the 32-byte tx hash as trace ID.
+      trace::TraceId traceId(
+          opentelemetry::nostd::span<uint8_t const, 16>(txHash.data(), 16));
+
+      // Random span_id so each node's span is unique within the trace.
+      uint8_t spanIdBytes[8];
+      crypto_prng()(spanIdBytes, sizeof(spanIdBytes));
+      trace::SpanId spanId(
+          opentelemetry::nostd::span<uint8_t const, 8>(spanIdBytes, 8));
+
+      trace::SpanContext syntheticCtx(
+          traceId, spanId, trace::TraceFlags(1), /* remote = */ false);
+
+      return opentelemetry::context::Context{}.SetValue(
+          trace::kSpanKey,
+          opentelemetry::nostd::shared_ptr<trace::Span>(
+              new trace::DefaultSpan(syntheticCtx)));
+  }
+  ```
+
+- Edit `src/xrpld/overlay/detail/PeerImp.cpp` — restructure `handleTransaction()`:
+  - **Move span creation after deserialization** (txID must be known first):
+    1. Deserialize `STTx` and get `txID` (existing code at line ~1382).
+    2. Create deterministic parent context: `auto detCtx = createDeterministicTxContext(txID)`.
+    3. If `m->has_trace_context()`: extract protobuf context via `extractFromProtobuf()`,
+       **combine** with deterministic trace_id — use the protobuf span_id as parent
+       to preserve relay ordering, but override trace_id with the deterministic one.
+    4. If no protobuf context: create span under `detCtx` directly.
+    5. Set all existing attributes (`hash`, `peerId`, `peerVersion`, `suppressed`, etc.).
+
+  - **Combining deterministic trace_id with protobuf parent span_id**:
+    When both are available, construct a synthetic `SpanContext` with:
+    - `trace_id` = `txHash[0:16]` (deterministic)
+    - `span_id` = extracted from protobuf (sender's span_id → becomes parent)
+    - `trace_flags` = from protobuf
+    - `remote` = true (came from another node)
+
+    ```cpp
+    // Pseudo-code for the combined context:
+    auto detTraceId = trace::TraceId(opentelemetry::nostd::span<uint8_t const, 16>(txHash.data(), 16));
+    auto remoteSpanId = /* from extractFromProtobuf */;
+    auto remoteFlags = /* from extractFromProtobuf */;
+
+    trace::SpanContext combinedCtx(
+        detTraceId, remoteSpanId, remoteFlags, /* remote = */ true);
+    // Use as parent context for the new span.
+    ```
+
+- Edit `src/xrpld/app/misc/NetworkOPs.cpp` — update `processTransaction()`:
+  - `transaction->getID()` is already available at the top of the function.
+  - Create deterministic parent context from `txID`.
+  - Create `tx.process` span under this context.
+  - No protobuf context to extract here (NetworkOPs is intra-node), so
+    deterministic context alone is sufficient.
+
+- Add `tx_trace_strategy` attribute to spans:
+  - Add `inline constexpr auto traceStrategy = join(xrplTx, makeStr("trace_strategy"));`
+    to `TxSpanNames.h`.
+  - Set on each tx span: `span.setAttribute(tx_span::attr::traceStrategy, "deterministic")`.
+
+**Key new/modified files**:
+
+- `src/xrpld/overlay/detail/PeerImp.cpp` — restructured span creation
+- `src/xrpld/app/misc/NetworkOPs.cpp` — deterministic context for tx.process
+- `src/xrpld/telemetry/TxSpanNames.h` — new `traceStrategy` attribute constant
+- New or shared utility for `createDeterministicTxContext()` (location TBD: could be
+  a shared header like `include/xrpl/telemetry/DeterministicContext.h`, or file-local
+  if only used in two places)
+
+**Interaction with existing tasks**:
+
+- **Task 3.3 (PeerImp instrumentation)**: The span creation in `handleTransaction()`
+  must be restructured — the span currently starts before `txID` is known. This task
+  moves it after deserialization.
+- **Task 3.6 (Relay context propagation)**: Protobuf injection at the relay site
+  remains the same — `injectToProtobuf()` serializes the current span's `span_id`.
+  The receiver extracts it and combines with the deterministic `trace_id`.
+- **Phase 4a (Consensus deterministic trace ID)**: This task follows the same pattern.
+  Consider extracting a shared utility (e.g., `createDeterministicContext(uint256)`)
+  that both consensus and transaction tracing use.
+
+**Exit Criteria**:
+
+- [ ] `tx.receive` and `tx.process` spans have deterministic trace_id = `txHash[0:16]`
+- [ ] All nodes handling the same transaction produce spans under the same trace_id
+- [ ] Protobuf `span_id` propagation still works when available (parent-child ordering)
+- [ ] Missing protobuf context (old peer) degrades gracefully to sibling spans, not lost traces
+- [ ] `xrpl.tx.trace_strategy` attribute set to `"deterministic"` on all tx spans
+- [ ] Trace queryable by tx hash (truncate hash → trace_id → direct lookup in Tempo)
+
+---
+
 ## Summary

 | Task | Description | New Files | Modified Files | Depends On |
@@ -265,8 +408,9 @@
 | 3.6 | Relay context propagation | 0 | 1-2 | 3.3, 3.5 |
 | 3.7 | Build verification and testing | 0 | 0 | 3.1-3.6 |
 | 3.8 | TX span peer version attribute | 0 | 1 | 3.3 |
+| 3.9 | Deterministic transaction trace ID | 0-1 | 3 | 3.2, 3.3 |

-**Parallel work**: Tasks 3.1 and 3.4 can start in parallel. Task 3.2 depends on 3.1. Tasks 3.3 and 3.5 depend on 3.2. Task 3.6 depends on 3.3 and 3.5. Task 3.8 depends on 3.3 (span must exist).
+**Parallel work**: Tasks 3.1 and 3.4 can start in parallel. Task 3.2 depends on 3.1. Tasks 3.3 and 3.5 depend on 3.2. Task 3.6 depends on 3.3 and 3.5. Task 3.8 depends on 3.3 (span must exist). Task 3.9 depends on 3.2 and 3.3.

 **Exit Criteria** (from [06-implementation-phases.md §6.11.3](./06-implementation-phases.md)):

@@ -274,3 +418,5 @@
 - [ ] Trace context in Protocol Buffer messages
 - [ ] HashRouter deduplication visible in traces
 - [ ] <5% overhead on transaction throughput
+- [ ] Deterministic trace_id: same trace_id for same tx across all nodes
+- [ ] Protobuf span_id propagation preserves parent-child ordering when available
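
Note (illustrative, not part of the patch): the Task 3.9 exit criterion "truncate hash → trace_id → direct lookup" comes down to a small string transformation on the operator side. The sketch below assumes the transaction hash is available as 64 hex characters in the same byte order as `txHash.data()`. The helper name `traceIdFromTxHashHex` is hypothetical, and the lowercasing follows the W3C Trace Context hex convention rather than anything this patch specifies.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not defined anywhere in this patch): returns the
// hex form of txHash[0:16], i.e. the trace_id the deterministic strategy
// would assign to every span for this transaction.
std::string
traceIdFromTxHashHex(std::string const& txHashHex)
{
    if (txHashHex.size() != 64)
        throw std::invalid_argument("expected 64 hex characters");

    std::string traceId;
    traceId.reserve(32);

    // 16 bytes correspond to the first 32 hex characters; only that
    // prefix is validated and copied.
    for (std::size_t i = 0; i < 32; ++i)
    {
        auto const c = static_cast<unsigned char>(txHashHex[i]);
        if (!std::isxdigit(c))
            throw std::invalid_argument("non-hex character in hash prefix");
        // Lowercase, assuming the backend expects W3C-style trace IDs.
        traceId.push_back(static_cast<char>(std::tolower(c)));
    }
    return traceId;
}
```

With the example hash from §2.5.0 (`A1B2C3D4 E5F6A7B8 ...`), the helper yields the 32-character prefix in lowercase, which could then be pasted into a trace-by-ID query in the tracing backend.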