mirror of
https://github.com/XRPLF/rippled.git
synced 2026-06-02 16:26:48 +00:00
docs: correct OTel overhead estimates against SDK benchmarks
Verified CPU, memory, and network overhead calculations against official OTel C++ SDK benchmarks (969 CI runs) and source code analysis. Key corrections: - Span creation: 200-500ns → 500-1000ns (SDK BM_SpanCreation median ~1000ns; original estimate matched API no-op, not SDK path) - Per-TX overhead: 2.4μs → 4.0μs (2.0% vs 1.2%; still within 1-3%) - Active span memory: ~200 bytes → ~500-800 bytes (Span wrapper + SpanData + std::map attribute storage) - Static memory: ~456KB → ~8.3MB (BatchSpanProcessor worker thread stack ~8MB was omitted) - Total memory ceiling: ~2.3MB → ~10MB - Memory success metric target: <5MB → <10MB - AddEvent: 50-80ns → 100-200ns Added Section 3.5.4 with links to all benchmark sources. Updated presentation.md with matching corrections. High-level conclusions unchanged (1-3% CPU, negligible consensus). Also includes: review fixes, cross-document consistency improvements, additional component tracing docs (PathFinding, TxQ, Validator, etc.), context size corrections (32 → 25 bytes). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -15,6 +15,33 @@ Distributed tracing is a method for tracking data objects as they flow through d
|
||||
|
||||
---
|
||||
|
||||
## Actors and Actions at a Glance
|
||||
|
||||
### Actors
|
||||
|
||||
| Who (Plain English) | Technical Term |
|
||||
| ---------------------------------------------- | --------------- |
|
||||
| A single unit of work being tracked | Span |
|
||||
| The complete journey of a request | Trace |
|
||||
| Data that links spans across services | Trace Context |
|
||||
| Code that creates spans and propagates context | Instrumentation |
|
||||
| Service that receives and processes traces | Collector |
|
||||
| Storage and visualization system | Backend (Tempo) |
|
||||
| Decision logic for which traces to keep | Sampler |
|
||||
|
||||
### Actions
|
||||
|
||||
| What Happens (Plain English) | Technical Term |
|
||||
| --------------------------------------- | ----------------------- |
|
||||
| Start tracking a new operation | Create a Span |
|
||||
| Connect a child operation to its parent | Set `parent_span_id` |
|
||||
| Group all related operations together | Share a `trace_id` |
|
||||
| Pass tracking data between services | Context Propagation |
|
||||
| Decide whether to record a trace | Sampling (Head or Tail) |
|
||||
| Send completed traces to storage | Export (OTLP) |
|
||||
|
||||
---
|
||||
|
||||
## Core Concepts
|
||||
|
||||
### 1. Trace
|
||||
@@ -33,16 +60,16 @@ Trace ID: abc123
|
||||
|
||||
A **span** represents a single unit of work within a trace. Each span has:
|
||||
|
||||
| Attribute | Description | Example |
|
||||
| ---------------- | --------------------- | -------------------------- |
|
||||
| `trace_id` | Links to parent trace | `abc123` |
|
||||
| `span_id` | Unique identifier | `span456` |
|
||||
| `parent_span_id` | Parent span (if any) | `p_span123` |
|
||||
| `name` | Operation name | `rpc.submit` |
|
||||
| `start_time` | When work began | `2024-01-15T10:30:00Z` |
|
||||
| `end_time` | When work completed | `2024-01-15T10:30:00.050Z` |
|
||||
| `attributes` | Key-value metadata | `tx.hash=ABC...` |
|
||||
| `status` | OK, ERROR MSG | `OK` |
|
||||
| Attribute | Description | Example |
|
||||
| ---------------- | -------------------------------- | -------------------------- |
|
||||
| `trace_id` | Identifies the trace | `event123` |
|
||||
| `span_id` | Unique identifier | `span456` |
|
||||
| `parent_span_id` | Parent span (if any) | `p_span123` |
|
||||
| `name` | Operation name | `rpc.submit` |
|
||||
| `start_time` | When work began (local time) | `2024-01-15T10:30:00Z` |
|
||||
| `end_time` | When work completed (local time) | `2024-01-15T10:30:00.050Z` |
|
||||
| `attributes` | Key-value metadata | `tx.hash=ABC...` |
|
||||
| `status` | OK, ERROR MSG | `OK` |
|
||||
|
||||
### 3. Trace Context
|
||||
|
||||
@@ -74,6 +101,13 @@ flowchart TB
|
||||
style E fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **tx.submit (blue, root)**: The top-level span representing the entire transaction submission; all other spans are its descendants.
|
||||
- **tx.validate, tx.relay, tx.apply (green)**: Direct children of tx.submit, representing the three main stages -- validation, relay to peers, and application to the ledger.
|
||||
- **ledger.update (red)**: A grandchild span nested under tx.apply, representing the actual ledger state mutation triggered by applying the transaction.
|
||||
- **Arrows (parent to child)**: Each arrow indicates a parent-child span relationship where the parent's completion depends on the child finishing.
|
||||
|
||||
The same trace visualized as a **timeline (Gantt chart)**:
|
||||
|
||||
```
|
||||
@@ -92,6 +126,284 @@ ledger │ │▓▓▓▓▓▓▓▓▓▓▓▓▓
|
||||
|
||||
---
|
||||
|
||||
## Span Relationships
|
||||
|
||||
Spans don't always form simple parent-child trees. Distributed tracing defines several relationship types to capture different causal patterns:
|
||||
|
||||
### 1. Parent-Child (ChildOf)
|
||||
|
||||
The default relationship. The parent span **depends on** or **contains** the child span. The child runs within the scope of the parent.
|
||||
|
||||
```
|
||||
tx.submit (parent)
|
||||
├── tx.validate (child) ← parent waits for this
|
||||
├── tx.relay (child) ← parent waits for this
|
||||
└── tx.apply (child) ← parent waits for this
|
||||
```
|
||||
|
||||
**When to use:** Synchronous calls, nested operations, any case where the parent's completion depends on the child.
|
||||
|
||||
### 2. Follows-From
|
||||
|
||||
A causal relationship where the first span **triggers** the second, but does **not wait** for it. The originator fires and moves on.
|
||||
|
||||
```
|
||||
Time →
|
||||
|
||||
tx.receive [=======]
|
||||
↓ triggers (follows-from)
|
||||
tx.relay [===========] ← runs independently
|
||||
```
|
||||
|
||||
**When to use:** Asynchronous jobs, queued work, fire-and-forget patterns. For example, a node receives a transaction and queues it for relay — the relay span _follows from_ the receive span but the receiver doesn't wait for relaying to complete.
|
||||
|
||||
> **OpenTracing** defined `FollowsFrom` as a first-class reference type alongside `ChildOf`.
|
||||
> **OpenTelemetry** represents this using **Span Links** with descriptive attributes instead (see below).
|
||||
|
||||
### 3. Span Links (Cross-Trace and Non-Hierarchical)
|
||||
|
||||
Links connect spans that are **causally related but not in a parent-child hierarchy**. Unlike parent-child, links can cross trace boundaries.
|
||||
|
||||
```
|
||||
Trace A Trace B
|
||||
────── ──────
|
||||
batch.schedule batch.execute
|
||||
├─ item.enqueue (span X) ┌──► process.item
|
||||
├─ item.enqueue (span Y) ───┤ (links to X, Y, Z)
|
||||
├─ item.enqueue (span Z) └──►
|
||||
```
|
||||
|
||||
**Use cases:**
|
||||
|
||||
| Pattern | Description |
|
||||
| -------------------- | --------------------------------------------------------------------------- |
|
||||
| **Batch processing** | A batch span links back to all individual spans that contributed to it |
|
||||
| **Fan-in** | An aggregation span links to the multiple producer spans it merges |
|
||||
| **Fan-out** | Multiple downstream spans link back to the single span that triggered them |
|
||||
| **Async handoff** | A deferred job links back to the request that queued it (follows-from) |
|
||||
| **Cross-trace** | Correlating spans across independent traces (e.g., retries, related events) |
|
||||
|
||||
**Link structure:** Each link carries the target span's context plus optional attributes:
|
||||
|
||||
```
|
||||
Link {
|
||||
trace_id: <target trace>
|
||||
span_id: <target span>
|
||||
attributes: { "link.description": "triggered by batch scheduler" }
|
||||
}
|
||||
```
|
||||
|
||||
### Relationship Summary
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph parent_child["Parent-Child"]
|
||||
direction TB
|
||||
P["Parent"] --> C["Child"]
|
||||
end
|
||||
|
||||
subgraph follows_from["Follows-From"]
|
||||
direction TB
|
||||
A["Span A"] -.->|triggers| B["Span B"]
|
||||
end
|
||||
|
||||
subgraph links["Span Links"]
|
||||
direction TB
|
||||
X["Span X\n(Trace 1)"] -.-|link| Y["Span Y\n(Trace 2)"]
|
||||
end
|
||||
|
||||
parent_child ~~~ follows_from ~~~ links
|
||||
|
||||
style P fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style C fill:#1b5e20,stroke:#0d3d14,color:#ffffff
|
||||
style A fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style B fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
style X fill:#4a148c,stroke:#38006b,color:#ffffff
|
||||
style Y fill:#4a148c,stroke:#38006b,color:#ffffff
|
||||
```
|
||||
|
||||
| Relationship | Same Trace? | Dependency? | OTel Mechanism |
|
||||
| ---------------- | ----------- | -------------------------- | ----------------- |
|
||||
| **Parent-Child** | Yes | Parent depends on child | `parent_span_id` |
|
||||
| **Follows-From** | Usually | Causal but no dependency | Link + attributes |
|
||||
| **Span Link** | Either | Correlation, no dependency | Link + attributes |
|
||||
|
||||
---
|
||||
|
||||
## Trace ID Generation
|
||||
|
||||
A `trace_id` is a 128-bit (16-byte) identifier that groups all spans belonging to one logical operation. How it's generated determines how easily you can find and correlate traces later.
|
||||
|
||||
### General Approaches
|
||||
|
||||
#### 1. Random (W3C Default)
|
||||
|
||||
Generate a random 128-bit ID when a trace starts. Standard approach for most services.
|
||||
|
||||
```
|
||||
trace_id = random_128_bits()
|
||||
```
|
||||
|
||||
| Pros | Cons |
|
||||
| --------------------------- | --------------------------------------------- |
|
||||
| Simple, standard | No natural correlation to domain events |
|
||||
| Guaranteed unique per trace | If propagation is lost, trace is broken |
|
||||
| Works with all OTel tooling | "Find trace for TX abc" requires index lookup |
|
||||
|
||||
#### 2. Deterministic (Derived from Domain Data)
|
||||
|
||||
Compute the trace_id from a hash of a natural identifier. Every node independently derives the **same** trace_id for the same event.
|
||||
|
||||
```
|
||||
trace_id = SHA-256(domain_identifier)[0:16] // truncate to 128 bits
|
||||
```
|
||||
|
||||
| Pros | Cons |
|
||||
| --------------------------------------------------- | ---------------------------------------------------------- |
|
||||
| Propagation-resilient — same ID computed everywhere | Same event processed twice (retry) shares trace_id |
|
||||
| Natural search — domain ID maps directly to trace | Non-standard (tooling assumes random) |
|
||||
| No coordination needed between nodes | 256→128 bit truncation (collision risk negligible at ~2⁶⁴) |
|
||||
|
||||
#### 3. Hybrid (Deterministic Prefix + Random Suffix)
|
||||
|
||||
First 8 bytes derived from domain data, last 8 bytes random.
|
||||
|
||||
```
|
||||
trace_id = SHA-256(domain_identifier)[0:8] || random_64_bits()
|
||||
```
|
||||
|
||||
| Pros | Cons |
|
||||
| ------------------------------------------- | ---------------------------------------- |
|
||||
| Prefix search: "find all traces for TX abc" | Must propagate to maintain full trace_id |
|
||||
| Unique per processing instance | More complex generation logic |
|
||||
| Retries get distinct trace_ids | Partial correlation only (prefix match) |
|
||||
|
||||
### XRPL Workflow Analysis
|
||||
|
||||
XRPL has a unique advantage: its core workflows produce **globally unique 256-bit hashes** that are known on every node. This makes deterministic trace_id generation practical in ways most systems can't achieve.
|
||||
|
||||
#### Natural Identifiers by Workflow
|
||||
|
||||
| Workflow | Natural Identifier | Size | Known at Start? | Same on All Nodes? |
|
||||
| ------------------- | --------------------------------- | ---------- | ----------------------------- | -------------------------------- |
|
||||
| **Transaction** | Transaction hash (`tid_`) | 256-bit | Yes — computed before signing | Yes — hash of canonical tx data |
|
||||
| **Consensus round** | Previous ledger hash + ledger seq | 256+32 bit | Yes — known when round opens | Yes — all validators agree |
|
||||
| **Validation** | Ledger hash being validated | 256-bit | Yes — from consensus result | Yes — same closed ledger |
|
||||
| **Ledger catch-up** | Target ledger hash | 256-bit | Yes — we know what to fetch | Yes — identifies ledger globally |
|
||||
|
||||
#### Where These Identifiers Live in Code
|
||||
|
||||
```
|
||||
Transaction: STTx::getTransactionID() → uint256 tid_
|
||||
TMTransaction::rawTransaction → recompute hash from bytes
|
||||
|
||||
Consensus: ConsensusProposal::prevLedger_ → uint256 (previous ledger hash)
|
||||
ConsensusProposal::position_ → uint256 (TxSet hash)
|
||||
LedgerHeader::seq → uint32_t (ledger sequence)
|
||||
|
||||
Validation: STValidation::getLedgerHash() → uint256
|
||||
STValidation::getNodeID() → NodeID (160-bit)
|
||||
|
||||
Ledger fetch: InboundLedger constructor → uint256 hash, uint32_t seq
|
||||
TMGetLedger::ledgerHash → bytes (uint256)
|
||||
```
|
||||
|
||||
### Recommended Strategy: Workflow-Scoped Deterministic
|
||||
|
||||
Each workflow type derives its trace_id from its natural domain identifier:
|
||||
|
||||
```
|
||||
Transaction trace: trace_id = SHA-256("tx" || tx_hash)[0:16]
|
||||
Consensus trace: trace_id = SHA-256("cons" || prev_ledger_hash || ledger_seq)[0:16]
|
||||
Ledger catch-up: trace_id = SHA-256("fetch" || target_ledger_hash)[0:16]
|
||||
```
|
||||
|
||||
The string prefix (`"tx"`, `"cons"`, `"fetch"`) prevents collisions between workflows that might share underlying hashes.
|
||||
|
||||
**Why this works for XRPL:**
|
||||
|
||||
1. **Propagation-resilient** — Even if a P2P message drops trace context, every node independently computes the same trace_id from the same tx_hash or ledger_hash. Spans still correlate.
|
||||
|
||||
2. **Zero-cost search** — "Show me the trace for transaction ABC" becomes a direct lookup: compute `SHA-256("tx" || ABC)[0:16]` and query. No secondary index needed.
|
||||
|
||||
3. **Cross-workflow linking via Span Links** — A consensus trace links to individual transaction traces. A validation span links to the consensus trace. This connects the full picture without forcing everything into one giant trace.
|
||||
|
||||
### Cross-Workflow Correlation
|
||||
|
||||
Each workflow gets its own trace. Span Links tie them together:
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph tx_trace["Transaction Trace"]
|
||||
direction LR
|
||||
Tn["trace_id = f(tx_hash)"]:::note --> T1["tx.receive"] --> T2["tx.validate"] --> T3["tx.relay"]
|
||||
end
|
||||
|
||||
subgraph cons_trace["Consensus Trace"]
|
||||
direction LR
|
||||
Cn["trace_id = f(prev_ledger, seq)"]:::note --> C1["cons.open"] --> C2["cons.propose"] --> C3["cons.accept"]
|
||||
end
|
||||
|
||||
subgraph val_trace["Validation"]
|
||||
direction LR
|
||||
Vn["spans within consensus trace"]:::note --> V1["val.create"] --> V2["val.broadcast"]
|
||||
end
|
||||
|
||||
subgraph fetch_trace["Catch-Up Trace"]
|
||||
direction LR
|
||||
Fn["trace_id = f(ledger_hash)"]:::note --> F1["fetch.request"] --> F2["fetch.receive"] --> F3["fetch.apply"]
|
||||
end
|
||||
|
||||
C1 -.-|"span link\n(tx traces)"| T3
|
||||
C3 --> V1
|
||||
F1 -.-|"span link\n(target ledger)"| C3
|
||||
|
||||
classDef note fill:none,stroke:#888,stroke-dasharray:5 5,color:#333,font-style:italic
|
||||
style T1 fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style T2 fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style T3 fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style C1 fill:#1b5e20,stroke:#0d3d14,color:#ffffff
|
||||
style C2 fill:#1b5e20,stroke:#0d3d14,color:#ffffff
|
||||
style C3 fill:#1b5e20,stroke:#0d3d14,color:#ffffff
|
||||
style V1 fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
style V2 fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
style F1 fill:#4a148c,stroke:#38006b,color:#ffffff
|
||||
style F2 fill:#4a148c,stroke:#38006b,color:#ffffff
|
||||
style F3 fill:#4a148c,stroke:#38006b,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Transaction Trace (blue)**: An independent trace whose `trace_id` is deterministically derived from the transaction hash. Contains receive, validate, and relay spans.
|
||||
- **Consensus Trace (green)**: An independent trace whose `trace_id` is derived from the previous ledger hash and sequence number. Covers the open, propose, and accept phases.
|
||||
- **Validation (red)**: Validation spans live within the consensus trace (not a separate trace). They are created after the accept phase completes.
|
||||
- **Catch-Up Trace (purple)**: An independent trace for ledger acquisition, derived from the target ledger hash. Used when a node is behind and fetching missing ledgers.
|
||||
- **Dotted arrows (span links)**: Cross-trace correlations. Consensus links to transaction traces it included; catch-up links to the consensus trace that produced the target ledger.
|
||||
- **Solid arrow (C3 to V1)**: A parent-child relationship -- validation spans are direct children of the consensus accept span within the same trace.
|
||||
|
||||
**How a query flows:**
|
||||
|
||||
```
|
||||
"Why was TX abc slow?"
|
||||
1. Compute trace_id = SHA-256("tx" || abc)[0:16]
|
||||
2. Find transaction trace → see it was included in consensus round N
|
||||
3. Follow span link → consensus trace for round N
|
||||
4. See which phase was slow (propose? accept?)
|
||||
5. If a node was catching up, follow link → catch-up trace
|
||||
```
|
||||
|
||||
### Trade-offs to Consider
|
||||
|
||||
| Concern | Mitigation |
|
||||
| ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------- |
|
||||
| **Retries get same trace_id** | Add `attempt` attribute to root span; spans have unique span_ids and timestamps |
|
||||
| **256→128 bit truncation** | Birthday-bound collision at ~2⁶⁴ operations — negligible for XRPL's throughput |
|
||||
| **Non-standard generation** | OTel spec allows any 16-byte non-zero value; tooling works on the hex string |
|
||||
| **Hash computation cost** | SHA-256 is ~0.3μs per call; XRPL already computes these hashes for other purposes |
|
||||
| **Late-binding identifiers** | Ledger hash isn't known until after consensus — validation spans use ledger_seq as fallback, then link to the consensus trace |
|
||||
|
||||
---
|
||||
|
||||
## Distributed Traces Across Nodes
|
||||
|
||||
In distributed systems like rippled, traces span **multiple independent nodes**. The trace context must be propagated in network messages:
|
||||
@@ -118,20 +430,27 @@ sequenceDiagram
|
||||
Note over NodeA,NodeC: All spans share trace_id: abc123<br/>enabling correlation across nodes
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Client**: The external entity that submits a transaction. It does not carry trace context -- the trace originates at the first node.
|
||||
- **Node A**: The entry point that creates a new trace (trace_id: abc123) and the root span `tx.receive`. It relays the transaction to peers with trace context attached.
|
||||
- **Node B and Node C**: Peer nodes that receive the relayed transaction along with the propagated trace context. Each creates a child span under Node A's span, preserving the same `trace_id`.
|
||||
- **Arrows with trace context**: The relay messages carry `trace_id` and `parent_span_id`, allowing each downstream node to link its spans back to the originating span on Node A.
|
||||
|
||||
---
|
||||
|
||||
## Context Propagation
|
||||
|
||||
For traces to work across nodes, **trace context must be propagated** in messages.
|
||||
|
||||
### What's in the Context (32 bytes)
|
||||
### What's in the Context (~26 bytes)
|
||||
|
||||
| Field | Size | Description |
|
||||
| ------------- | ---------- | ------------------------------------------------------- |
|
||||
| `trace_id` | 16 bytes | Identifies the entire trace (constant across all nodes) |
|
||||
| `span_id` | 8 bytes | The sender's current span (becomes parent on receiver) |
|
||||
| `trace_flags` | 4 bytes | Sampling decision flags |
|
||||
| `trace_state` | ~0-4 bytes | Optional vendor-specific data |
|
||||
| Field | Size | Description |
|
||||
| ------------- | -------- | ------------------------------------------------------- |
|
||||
| `trace_id` | 16 bytes | Identifies the entire trace (constant across all nodes) |
|
||||
| `span_id` | 8 bytes | The sender's current span (becomes parent on receiver) |
|
||||
| `trace_flags` | 1 byte | Sampling decision (bit 0 = sampled; bits 1-7 reserved) |
|
||||
| `trace_state` | variable | Optional vendor-specific data (typically omitted) |
|
||||
|
||||
### How span_id Changes at Each Hop
|
||||
|
||||
@@ -165,11 +484,11 @@ There are two patterns:
|
||||
### HTTP/RPC Headers (W3C Trace Context)
|
||||
|
||||
```
|
||||
traceparent: 00-abc123def456-span789-01
|
||||
│ │ │ │
|
||||
│ │ │ └── Flags (sampled)
|
||||
│ │ └── Parent span ID
|
||||
│ └── Trace ID
|
||||
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
|
||||
│ │ │ │
|
||||
│ │ │ └── Flags (sampled)
|
||||
│ │ └── Parent span ID (16 hex)
|
||||
│ └── Trace ID (32 hex)
|
||||
└── Version
|
||||
```
|
||||
|
||||
@@ -228,16 +547,20 @@ Trace completes → Collector evaluates:
|
||||
|
||||
## Glossary
|
||||
|
||||
| Term | Definition |
|
||||
| ------------------- | --------------------------------------------------------------- |
|
||||
| **Trace** | Complete journey of a request, identified by `trace_id` |
|
||||
| **Span** | Single operation within a trace |
|
||||
| **Context** | Data propagated between services (`trace_id`, `span_id`, flags) |
|
||||
| **Instrumentation** | Code that creates spans and propagates context |
|
||||
| **Collector** | Service that receives, processes, and exports traces |
|
||||
| **Backend** | Storage/visualization system (Jaeger, Tempo, etc.) |
|
||||
| **Head Sampling** | Sampling decision at trace start |
|
||||
| **Tail Sampling** | Sampling decision after trace completes |
|
||||
| Term | Definition |
|
||||
| -------------------- | ------------------------------------------------------------------- |
|
||||
| **Trace** | Complete journey of a request, identified by `trace_id` |
|
||||
| **Span** | Single operation within a trace |
|
||||
| **Parent-Child** | Span relationship where the parent depends on the child |
|
||||
| **Follows-From** | Causal relationship where originator doesn't wait for the result |
|
||||
| **Span Link** | Non-hierarchical connection between spans, possibly across traces |
|
||||
| **Deterministic ID** | Trace ID derived from domain data (e.g., tx_hash) instead of random |
|
||||
| **Context** | Data propagated between services (`trace_id`, `span_id`, flags) |
|
||||
| **Instrumentation** | Code that creates spans and propagates context |
|
||||
| **Collector** | Service that receives, processes, and exports traces |
|
||||
| **Backend** | Storage/visualization system (Tempo) |
|
||||
| **Head Sampling** | Sampling decision at trace start |
|
||||
| **Tail Sampling** | Sampling decision after trace completes |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 1.1 Current rippled Architecture Overview
|
||||
|
||||
> **WS** = WebSocket | **UNL** = Unique Node List | **TxQ** = Transaction Queue | **StatsD** = Statistics Daemon
|
||||
|
||||
The rippled node software consists of several interconnected components that need instrumentation for distributed tracing:
|
||||
|
||||
```mermaid
|
||||
@@ -16,6 +18,7 @@ flowchart TB
|
||||
RPC["RPC Server<br/>(HTTP/WS/gRPC)"]
|
||||
Overlay["Overlay<br/>(P2P Network)"]
|
||||
Consensus["Consensus<br/>(RCLConsensus)"]
|
||||
ValidatorList["ValidatorList<br/>(UNL Mgmt)"]
|
||||
end
|
||||
|
||||
JobQueue["JobQueue<br/>(Thread Pool)"]
|
||||
@@ -24,6 +27,13 @@ flowchart TB
|
||||
NetworkOPs["NetworkOPs<br/>(Tx Processing)"]
|
||||
LedgerMaster["LedgerMaster<br/>(Ledger Mgmt)"]
|
||||
NodeStore["NodeStore<br/>(Database)"]
|
||||
InboundLedgers["InboundLedgers<br/>(Ledger Sync)"]
|
||||
end
|
||||
|
||||
subgraph appservices["Application Services"]
|
||||
PathFind["PathFinding<br/>(Payment Paths)"]
|
||||
TxQ["TxQ<br/>(Fee Escalation)"]
|
||||
LoadMgr["LoadManager<br/>(Fee/Load)"]
|
||||
end
|
||||
|
||||
subgraph observability["Existing Observability"]
|
||||
@@ -34,27 +44,92 @@ flowchart TB
|
||||
|
||||
services --> JobQueue
|
||||
JobQueue --> processing
|
||||
JobQueue --> appservices
|
||||
end
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#ffffff
|
||||
style services fill:#1565c0,stroke:#0d47a1,color:#ffffff
|
||||
style processing fill:#2e7d32,stroke:#1b5e20,color:#ffffff
|
||||
style appservices fill:#6a1b9a,stroke:#4a148c,color:#ffffff
|
||||
style observability fill:#e65100,stroke:#bf360c,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Core Services (blue)**: The entry points into rippled -- RPC Server handles client requests, Overlay manages peer-to-peer networking, Consensus drives agreement, and ValidatorList manages trusted validators.
|
||||
- **JobQueue (center)**: The asynchronous thread pool that decouples Core Services from the Processing and Application layers. All work flows through it.
|
||||
- **Processing Layer (green)**: Core business logic -- NetworkOPs processes transactions, LedgerMaster manages ledger state, NodeStore handles persistence, and InboundLedgers synchronizes missing data.
|
||||
- **Application Services (purple)**: Higher-level features -- PathFinding computes payment routes, TxQ manages fee-based queuing, and LoadManager tracks server load.
|
||||
- **Existing Observability (orange)**: The current monitoring stack (PerfLog, Insight, Journal logging) that OpenTelemetry will complement, not replace.
|
||||
- **Arrows (Services to JobQueue to layers)**: Work originates at Core Services, is enqueued onto the JobQueue, and dispatched to Processing or Application layers for execution.
|
||||
|
||||
---
|
||||
|
||||
## 1.1.1 Actors and Actions
|
||||
|
||||
### Actors
|
||||
|
||||
| Who (Plain English) | Technical Term |
|
||||
| ----------------------------------------- | -------------------------- |
|
||||
| Network node running XRPL software | rippled node |
|
||||
| External client submitting requests | RPC Client |
|
||||
| Network neighbor sharing data | Peer (PeerImp) |
|
||||
| Request handler for client queries | RPC Server (ServerHandler) |
|
||||
| Command executor for specific RPC methods | RPCHandler |
|
||||
| Agreement process between nodes | Consensus (RCLConsensus) |
|
||||
| Transaction processing coordinator | NetworkOPs |
|
||||
| Background task scheduler | JobQueue |
|
||||
| Ledger state manager | LedgerMaster |
|
||||
| Payment route calculator | PathFinding (Pathfinder) |
|
||||
| Transaction waiting room | TxQ (Transaction Queue) |
|
||||
| Fee adjustment system | LoadManager |
|
||||
| Trusted validator list manager | ValidatorList |
|
||||
| Protocol upgrade tracker | AmendmentTable |
|
||||
| Ledger state hash tree | SHAMap |
|
||||
| Persistent key-value storage | NodeStore |
|
||||
|
||||
### Actions
|
||||
|
||||
| What Happens (Plain English) | Technical Term |
|
||||
| ---------------------------------------------- | ---------------------- |
|
||||
| Client sends a request to a node | `rpc.request` |
|
||||
| Node executes a specific RPC command | `rpc.command.*` |
|
||||
| Node receives a transaction from a peer | `tx.receive` |
|
||||
| Node checks if a transaction is valid | `tx.validate` |
|
||||
| Node forwards a transaction to neighbors | `tx.relay` |
|
||||
| Nodes agree on which transactions to include | `consensus.round` |
|
||||
| Consensus progresses through phases | `consensus.phase.*` |
|
||||
| Node builds a new confirmed ledger | `ledger.build` |
|
||||
| Node fetches missing ledger data from peers | `ledger.acquire` |
|
||||
| Node computes payment routes | `pathfind.compute` |
|
||||
| Node queues a transaction for later processing | `txq.enqueue` |
|
||||
| Node increases fees due to high load | `fee.escalate` |
|
||||
| Node fetches the latest trusted validator list | `validator.list.fetch` |
|
||||
| Node votes on a protocol amendment | `amendment.vote` |
|
||||
| Node synchronizes state tree data | `shamap.sync` |
|
||||
|
||||
---
|
||||
|
||||
## 1.2 Key Components for Instrumentation
|
||||
|
||||
| Component | Location | Purpose | Trace Value |
|
||||
| ----------------- | ------------------------------------------ | ------------------------ | ---------------------------- |
|
||||
| **Overlay** | `src/xrpld/overlay/` | P2P communication | Message propagation timing |
|
||||
| **PeerImp** | `src/xrpld/overlay/detail/PeerImp.cpp` | Individual peer handling | Per-peer latency |
|
||||
| **RCLConsensus** | `src/xrpld/app/consensus/RCLConsensus.cpp` | Consensus algorithm | Round timing, phase analysis |
|
||||
| **NetworkOPs** | `src/xrpld/app/misc/NetworkOPs.cpp` | Transaction processing | Tx lifecycle tracking |
|
||||
| **ServerHandler** | `src/xrpld/rpc/detail/ServerHandler.cpp` | RPC entry point | Request latency |
|
||||
| **RPCHandler** | `src/xrpld/rpc/detail/RPCHandler.cpp` | Command execution | Per-command timing |
|
||||
| **JobQueue** | `src/xrpl/core/JobQueue.h` | Async task execution | Queue wait times |
|
||||
> **TxQ** = Transaction Queue | **UNL** = Unique Node List
|
||||
|
||||
| Component | Location | Purpose | Trace Value |
|
||||
| ------------------ | ------------------------------------------ | ------------------------ | -------------------------------- |
|
||||
| **Overlay** | `src/xrpld/overlay/` | P2P communication | Message propagation timing |
|
||||
| **PeerImp** | `src/xrpld/overlay/detail/PeerImp.cpp` | Individual peer handling | Per-peer latency |
|
||||
| **RCLConsensus** | `src/xrpld/app/consensus/RCLConsensus.cpp` | Consensus algorithm | Round timing, phase analysis |
|
||||
| **NetworkOPs** | `src/xrpld/app/misc/NetworkOPs.cpp` | Transaction processing | Tx lifecycle tracking |
|
||||
| **ServerHandler** | `src/xrpld/rpc/detail/ServerHandler.cpp` | RPC entry point | Request latency |
|
||||
| **RPCHandler** | `src/xrpld/rpc/detail/RPCHandler.cpp` | Command execution | Per-command timing |
|
||||
| **JobQueue** | `src/xrpl/core/JobQueue.h` | Async task execution | Queue wait times |
|
||||
| **PathFinding** | `src/xrpld/app/paths/` | Payment path computation | Path latency, cache hits |
|
||||
| **TxQ** | `src/xrpld/app/misc/TxQ.cpp` | Transaction queue/fees | Queue depth, eviction rates |
|
||||
| **LoadManager** | `src/xrpld/app/main/LoadManager.cpp` | Fee escalation/load | Fee levels, load factors |
|
||||
| **InboundLedgers** | `src/xrpld/app/ledger/InboundLedgers.cpp` | Ledger acquisition | Sync time, peer reliability |
|
||||
| **ValidatorList** | `src/xrpld/app/misc/ValidatorList.cpp` | UNL management | List freshness, fetch failures |
|
||||
| **AmendmentTable** | `src/xrpld/app/misc/AmendmentTable.cpp` | Protocol amendments | Voting status, activation events |
|
||||
| **SHAMap** | `src/xrpld/shamap/` | State hash tree | Sync speed, missing nodes |
|
||||
|
||||
---
|
||||
|
||||
@@ -93,6 +168,15 @@ sequenceDiagram
|
||||
Note over Client,PeerC: DISTRIBUTED TRACE (same trace_id: abc123)
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Client**: The external entity that submits a transaction to Peer A. It has no trace context -- the trace starts at the first node.
|
||||
- **Peer A (Receive)**: The entry node that creates the root span `tx.receive`, runs HashRouter deduplication to avoid processing duplicates, and creates a child `tx.validate` span.
|
||||
- **Peer A to Peer B arrow**: The relay message carries trace context (trace_id + parent span_id), enabling Peer B to create a linked span under the same trace.
|
||||
- **Peer B (Relay)**: Receives the transaction and trace context, creates a `tx.receive` span linked to Peer A's trace, then relays onward.
|
||||
- **Peer C (Validate)**: Final hop in this example. Creates a linked `tx.receive` span and runs `tx.process` to fully process the transaction.
|
||||
- **Blue rectangles**: Highlight the span boundaries on each node, showing where instrumentation creates and closes spans.
|
||||
|
||||
### Trace Structure
|
||||
|
||||
```
|
||||
@@ -142,16 +226,26 @@ flowchart TB
|
||||
style accept fill:#c2185b,stroke:#880e4f,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **consensus.round (orange, root span)**: The top-level span encompassing the entire consensus round, with attributes like ledger sequence, mode, and proposer count.
|
||||
- **consensus.phase.open (blue)**: The first phase where the node waits (~3s) to collect incoming transactions before proposing.
|
||||
- **consensus.phase.establish (green)**: The negotiation phase where validators exchange proposals, resolve disputes, and converge on a transaction set. Child spans track each proposal received/sent and each dispute resolved.
|
||||
- **consensus.phase.accept (pink)**: The final phase where the agreed transaction set is applied, a new ledger is built, and the ledger is validated. Child spans cover `ledger.build` and `ledger.validate`.
|
||||
- **Arrows (open to establish to accept)**: The sequential flow through the three consensus phases. Each phase must complete before the next begins.
|
||||
|
||||
---
|
||||
|
||||
## 1.5 RPC Request Flow
|
||||
|
||||
> **WS** = WebSocket
|
||||
|
||||
RPC requests support W3C Trace Context headers for distributed tracing across services:
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph request["rpc.request (root span)"]
|
||||
http["HTTP Request<br/>POST /<br/>traceparent: 00-abc123...-def456...-01"]
|
||||
http["HTTP Request — POST /<br/>traceparent:<br/>00-abc123...-def456...-01"]
|
||||
|
||||
attrs["Attributes:<br/>http.method = POST<br/>net.peer.ip = 192.168.1.100<br/>xrpl.rpc.command = submit"]
|
||||
|
||||
@@ -177,32 +271,56 @@ flowchart TB
|
||||
style command fill:#e65100,stroke:#bf360c,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **rpc.request (green, root span)**: The outermost span representing the full RPC request lifecycle, from HTTP receipt to response. Carries the W3C `traceparent` header for distributed tracing.
|
||||
- **HTTP Request node**: Shows the incoming POST request with its `traceparent` header and extracted attributes (method, peer IP, command name).
|
||||
- **jobqueue.enqueue (blue)**: The span covering the asynchronous handoff from the RPC thread to the JobQueue worker thread. The trace context is preserved across this async boundary.
|
||||
- **rpc.command.submit (orange)**: The span for the actual command execution, with child spans for deserialization, local validation, and network submission.
|
||||
- **Response node**: The final output with HTTP status and total duration, marking the end of the root span.
|
||||
- **Arrows (top to bottom)**: The sequential processing pipeline -- receive request, extract attributes, enqueue job, execute command, return response.
|
||||
|
||||
---
|
||||
|
||||
## 1.6 Key Trace Points
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
The following table identifies priority instrumentation points across the codebase:
|
||||
|
||||
| Category | Span Name | File | Method | Priority |
|
||||
| --------------- | ---------------------- | -------------------- | ---------------------- | -------- |
|
||||
| **Transaction** | `tx.receive` | `PeerImp.cpp` | `handleTransaction()` | High |
|
||||
| **Transaction** | `tx.validate` | `NetworkOPs.cpp` | `processTransaction()` | High |
|
||||
| **Transaction** | `tx.process` | `NetworkOPs.cpp` | `doTransactionSync()` | High |
|
||||
| **Transaction** | `tx.relay` | `OverlayImpl.cpp` | `relay()` | Medium |
|
||||
| **Consensus** | `consensus.round` | `RCLConsensus.cpp` | `startRound()` | High |
|
||||
| **Consensus** | `consensus.phase.*` | `Consensus.h` | `timerEntry()` | High |
|
||||
| **Consensus** | `consensus.proposal.*` | `RCLConsensus.cpp` | `peerProposal()` | Medium |
|
||||
| **RPC** | `rpc.request` | `ServerHandler.cpp` | `onRequest()` | High |
|
||||
| **RPC** | `rpc.command.*` | `RPCHandler.cpp` | `doCommand()` | High |
|
||||
| **Peer** | `peer.connect` | `OverlayImpl.cpp` | `onHandoff()` | Low |
|
||||
| **Peer** | `peer.message.*` | `PeerImp.cpp` | `onMessage()` | Low |
|
||||
| **Ledger** | `ledger.acquire` | `InboundLedgers.cpp` | `acquire()` | Medium |
|
||||
| **Ledger** | `ledger.build` | `RCLConsensus.cpp` | `buildLCL()` | High |
|
||||
| Category | Span Name | File | Method | Priority |
|
||||
| --------------- | ---------------------- | ---------------------- | ----------------------- | -------- |
|
||||
| **Transaction** | `tx.receive` | `PeerImp.cpp` | `handleTransaction()` | High |
|
||||
| **Transaction** | `tx.validate` | `NetworkOPs.cpp` | `processTransaction()` | High |
|
||||
| **Transaction** | `tx.process` | `NetworkOPs.cpp` | `doTransactionSync()` | High |
|
||||
| **Transaction** | `tx.relay` | `OverlayImpl.cpp` | `relay()` | Medium |
|
||||
| **Consensus** | `consensus.round` | `RCLConsensus.cpp` | `startRound()` | High |
|
||||
| **Consensus** | `consensus.phase.*` | `Consensus.h` | `timerEntry()` | High |
|
||||
| **Consensus** | `consensus.proposal.*` | `RCLConsensus.cpp` | `peerProposal()` | Medium |
|
||||
| **RPC** | `rpc.request` | `ServerHandler.cpp` | `onRequest()` | High |
|
||||
| **RPC** | `rpc.command.*` | `RPCHandler.cpp` | `doCommand()` | High |
|
||||
| **Peer** | `peer.connect` | `OverlayImpl.cpp` | `onHandoff()` | Low |
|
||||
| **Peer** | `peer.message.*` | `PeerImp.cpp` | `onMessage()` | Low |
|
||||
| **Ledger** | `ledger.acquire` | `InboundLedgers.cpp` | `acquire()` | Medium |
|
||||
| **Ledger** | `ledger.build` | `RCLConsensus.cpp` | `buildLCL()` | High |
|
||||
| **PathFinding** | `pathfind.request` | `PathRequest.cpp` | `doUpdate()` | High |
|
||||
| **PathFinding** | `pathfind.compute` | `Pathfinder.cpp` | `findPaths()` | High |
|
||||
| **TxQ** | `txq.enqueue` | `TxQ.cpp` | `apply()` | High |
|
||||
| **TxQ** | `txq.apply` | `TxQ.cpp` | `processClosedLedger()` | High |
|
||||
| **Fee** | `fee.escalate` | `LoadManager.cpp` | `raiseLocalFee()` | Medium |
|
||||
| **Ledger** | `ledger.replay` | `LedgerReplayer.h` | `replay()` | Medium |
|
||||
| **Ledger** | `ledger.delta` | `LedgerDeltaAcquire.h` | `processData()` | Medium |
|
||||
| **Validator** | `validator.list.fetch` | `ValidatorList.cpp` | `verify()` | Medium |
|
||||
| **Validator** | `validator.manifest` | `Manifest.cpp` | `applyManifest()` | Low |
|
||||
| **Amendment** | `amendment.vote` | `AmendmentTable.cpp` | `doVoting()` | Low |
|
||||
| **SHAMap** | `shamap.sync` | `SHAMap.cpp` | `fetchRoot()` | Medium |
|
||||
|
||||
---
|
||||
|
||||
## 1.7 Instrumentation Priority
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
```mermaid
|
||||
quadrantChart
|
||||
title Instrumentation Priority Matrix
|
||||
@@ -213,18 +331,25 @@ quadrantChart
|
||||
quadrant-3 Quick Wins
|
||||
quadrant-4 Consider Later
|
||||
|
||||
RPC Tracing: [0.3, 0.85]
|
||||
Transaction Tracing: [0.65, 0.92]
|
||||
Consensus Tracing: [0.75, 0.87]
|
||||
Peer Message Tracing: [0.4, 0.3]
|
||||
Ledger Acquisition: [0.5, 0.6]
|
||||
JobQueue Tracing: [0.35, 0.5]
|
||||
RPC Tracing: [0.2, 0.92]
|
||||
Transaction Tracing: [0.55, 0.88]
|
||||
Consensus Tracing: [0.78, 0.82]
|
||||
PathFinding: [0.38, 0.75]
|
||||
TxQ and Fees: [0.25, 0.65]
|
||||
Ledger Sync: [0.62, 0.58]
|
||||
Peer Message Tracing: [0.35, 0.25]
|
||||
JobQueue Tracing: [0.2, 0.48]
|
||||
Validator Mgmt: [0.48, 0.42]
|
||||
Amendment Tracking: [0.15, 0.32]
|
||||
SHAMap Operations: [0.72, 0.45]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 1.8 Observable Outcomes
|
||||
|
||||
> **TxQ** = Transaction Queue | **UNL** = Unique Node List
|
||||
|
||||
After implementing OpenTelemetry, operators and developers will gain visibility into the following:
|
||||
|
||||
### 1.8.1 What You Will See: Traces
|
||||
@@ -236,20 +361,28 @@ After implementing OpenTelemetry, operators and developers will gain visibility
|
||||
| **Consensus Rounds** | Complete round with all phases (open, establish, accept) | `{span.name=~"consensus.round.*"}` |
|
||||
| **RPC Request Processing** | Individual command execution with timing breakdown | `{xrpl.rpc.command="account_info"}` |
|
||||
| **Ledger Acquisition** | Peer-to-peer ledger data requests and responses | `{span.name="ledger.acquire"}` |
|
||||
| **PathFinding Latency** | Path computation time and cache effectiveness for payment RPCs | `{span.name="pathfind.compute"}` |
|
||||
| **TxQ Behavior** | Queue depth, eviction patterns, fee escalation during congestion | `{span.name=~"txq.*"}` |
|
||||
| **Ledger Sync** | Full acquisition timeline including delta and transaction fetches | `{span.name=~"ledger.acquire.*"}` |
|
||||
| **Validator Health** | UNL fetch success, manifest updates, stale list detection | `{span.name=~"validator.*"}` |
|
||||
|
||||
### 1.8.2 What You Will See: Metrics (Derived from Traces)
|
||||
|
||||
| Metric | Description | Dashboard Panel |
|
||||
| ----------------------------- | -------------------------------------- | --------------------------- |
|
||||
| **RPC Latency (p50/p95/p99)** | Response time distribution per command | Heatmap by command |
|
||||
| **Transaction Throughput** | Transactions processed per second | Time series graph |
|
||||
| **Consensus Round Duration** | Time to complete consensus phases | Histogram |
|
||||
| **Cross-Node Latency** | Time for transaction to reach N nodes | Line chart with percentiles |
|
||||
| **Error Rate** | Failed transactions/RPC calls by type | Stacked bar chart |
|
||||
| Metric | Description | Dashboard Panel |
|
||||
| ----------------------------- | --------------------------------------- | --------------------------- |
|
||||
| **RPC Latency (p50/p95/p99)** | Response time distribution per command | Heatmap by command |
|
||||
| **Transaction Throughput** | Transactions processed per second | Time series graph |
|
||||
| **Consensus Round Duration** | Time to complete consensus phases | Histogram |
|
||||
| **Cross-Node Latency** | Time for transaction to reach N nodes | Line chart with percentiles |
|
||||
| **Error Rate** | Failed transactions/RPC calls by type | Stacked bar chart |
|
||||
| **PathFinding Latency** | Path computation time per currency pair | Heatmap by currency |
|
||||
| **TxQ Depth** | Queued transactions over time | Time series with thresholds |
|
||||
| **Fee Escalation Level** | Current fee multiplier | Gauge with alert thresholds |
|
||||
| **Ledger Sync Duration** | Time to acquire missing ledgers | Histogram |
|
||||
|
||||
### 1.8.3 Concrete Dashboard Examples
|
||||
|
||||
**Transaction Trace View (Jaeger/Tempo):**
|
||||
**Transaction Trace View (Tempo):**
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
@@ -304,18 +437,22 @@ xychart-beta
|
||||
title "Consensus Round Duration (Last 24 Hours)"
|
||||
x-axis "Time of Day (Hours)" [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24]
|
||||
y-axis "Duration (seconds)" 1 --> 5
|
||||
line [2.1, 2.3, 2.5, 2.4, 2.8, 1.6, 3.2, 3.0, 3.5, 1.3, 3.8, 3.6, 4.0, 3.2, 4.3, 4.1, 4.5, 4.3, 4.2, 2.4, 4.8, 4.6, 4.9, 4.7, 5.0, 4.9, 4.8, 2.6, 4.7, 4.5, 4.2, 4.0, 2.5, 3.7, 3.2, 3.4, 2.9, 3.1, 2.6, 2.8, 2.3, 1.5, 2.7, 2.4, 2.5, 2.3, 2.2, 2.1, 2.0]
|
||||
line [2.1, 2.4, 2.8, 3.2, 3.8, 4.3, 4.5, 5.0, 4.7, 4.0, 3.2, 2.6, 2.0]
|
||||
```
|
||||
|
||||
### 1.8.4 Operator Actionable Insights
|
||||
|
||||
| Scenario | What You'll See | Action |
|
||||
| --------------------- | ---------------------------------------------------------------------------- | -------------------------------- |
|
||||
| **Slow RPC** | Span showing which phase is slow (parsing, execution, serialization) | Optimize specific code path |
|
||||
| **Transaction Stuck** | Trace stops at validation; error attribute shows reason | Fix transaction parameters |
|
||||
| **Consensus Delay** | Phase.establish taking too long; proposer attribute shows missing validators | Investigate network connectivity |
|
||||
| **Memory Spike** | Large batch of spans correlating with memory increase | Tune batch_size or sampling |
|
||||
| **Network Partition** | Traces missing cross-node links for specific peer | Check peer connectivity |
|
||||
| Scenario | What You'll See | Action |
|
||||
| ------------------------- | ---------------------------------------------------------------------------- | ------------------------------------------------ |
|
||||
| **Slow RPC** | Span showing which phase is slow (parsing, execution, serialization) | Optimize specific code path |
|
||||
| **Transaction Stuck** | Trace stops at validation; error attribute shows reason | Fix transaction parameters |
|
||||
| **Consensus Delay** | Phase.establish taking too long; proposer attribute shows missing validators | Investigate network connectivity |
|
||||
| **Memory Spike** | Large batch of spans correlating with memory increase | Tune batch_size or sampling |
|
||||
| **Network Partition** | Traces missing cross-node links for specific peer | Check peer connectivity |
|
||||
| **Path Computation Slow** | pathfind.compute span shows high latency; cache miss rate in attributes | Warm the RippleLineCache, check order book depth |
|
||||
| **TxQ Full** | txq.enqueue spans show evictions; fee.escalate spans increasing | Monitor fee levels, alert operators |
|
||||
| **Ledger Sync Stalled** | ledger.acquire spans timing out; peer reliability attributes show issues | Check peer connectivity, add trusted peers |
|
||||
| **UNL Stale** | validator.list.fetch spans failing; last_update attribute aging | Verify validator site URLs, check DNS |
|
||||
|
||||
### 1.8.5 Developer Debugging Workflow
|
||||
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 2.1 OpenTelemetry Components
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### 2.1.1 SDK Selection
|
||||
|
||||
**Primary Choice**: OpenTelemetry C++ SDK (`opentelemetry-cpp`)
|
||||
@@ -32,6 +34,8 @@
|
||||
|
||||
## 2.2 Exporter Configuration
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph nodes["rippled Nodes"]
|
||||
@@ -43,8 +47,7 @@ flowchart TB
|
||||
collector["OpenTelemetry<br/>Collector<br/>(sidecar or standalone)"]
|
||||
|
||||
subgraph backends["Observability Backends"]
|
||||
jaeger["Jaeger<br/>(Dev)"]
|
||||
tempo["Tempo<br/>(Prod)"]
|
||||
tempo["Tempo"]
|
||||
elastic["Elastic<br/>APM"]
|
||||
end
|
||||
|
||||
@@ -52,7 +55,6 @@ flowchart TB
|
||||
node2 -->|"OTLP/gRPC<br/>:4317"| collector
|
||||
node3 -->|"OTLP/gRPC<br/>:4317"| collector
|
||||
|
||||
collector --> jaeger
|
||||
collector --> tempo
|
||||
collector --> elastic
|
||||
|
||||
@@ -61,6 +63,13 @@ flowchart TB
|
||||
style collector fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **rippled Nodes (blue)**: The source of telemetry data. Each rippled node exports spans via OTLP/gRPC on port 4317.
|
||||
- **OpenTelemetry Collector (red)**: The central aggregation point that receives spans from all nodes. Can run as a sidecar (per-node) or standalone (shared). Handles batching, filtering, and routing.
|
||||
- **Observability Backends (green)**: The storage and visualization destinations. Tempo is the recommended backend for both development and production, and Elastic APM is an alternative. The Collector routes to one or more backends.
|
||||
- **Arrows (nodes to collector to backends)**: The data pipeline -- spans flow from nodes to the Collector over gRPC, then the Collector fans out to the configured backends.
|
||||
|
||||
### 2.2.1 OTLP/gRPC (Recommended)
|
||||
|
||||
```cpp
|
||||
@@ -69,8 +78,8 @@ namespace otlp = opentelemetry::exporter::otlp;
|
||||
|
||||
otlp::OtlpGrpcExporterOptions opts;
|
||||
opts.endpoint = "localhost:4317";
|
||||
opts.use_ssl_credentials = true;
|
||||
opts.ssl_credentials_cacert_path = "/path/to/ca.crt";
|
||||
opts.useTls = true;
|
||||
opts.sslCaCertPath = "/path/to/ca.crt";
|
||||
```
|
||||
|
||||
### 2.2.2 OTLP/HTTP (Alternative)
|
||||
@@ -88,6 +97,8 @@ opts.content_type = otlp::HttpRequestContentType::kJson; // or kBinary
|
||||
|
||||
## 2.3 Span Naming Conventions
|
||||
|
||||
> **TxQ** = Transaction Queue | **UNL** = Unique Node List | **WS** = WebSocket
|
||||
|
||||
### 2.3.1 Naming Schema
|
||||
|
||||
```
|
||||
@@ -145,6 +156,36 @@ ledger:
|
||||
build: "Build new ledger"
|
||||
validate: "Ledger validation"
|
||||
close: "Close ledger"
|
||||
replay: "Ledger replay executed"
|
||||
delta: "Delta-based ledger acquired"
|
||||
|
||||
# PathFinding Spans
|
||||
pathfind:
|
||||
request: "Path request initiated"
|
||||
compute: "Path computation executed"
|
||||
|
||||
# TxQ Spans
|
||||
txq:
|
||||
enqueue: "Transaction queued"
|
||||
apply: "Queued transaction applied"
|
||||
|
||||
# Fee/Load Spans
|
||||
fee:
|
||||
escalate: "Fee escalation triggered"
|
||||
|
||||
# Validator Spans
|
||||
validator:
|
||||
list:
|
||||
fetch: "UNL list fetched"
|
||||
manifest: "Manifest update processed"
|
||||
|
||||
# Amendment Spans
|
||||
amendment:
|
||||
vote: "Amendment voting executed"
|
||||
|
||||
# SHAMap Spans
|
||||
shamap:
|
||||
sync: "State tree synchronization"
|
||||
|
||||
# Job Spans
|
||||
job:
|
||||
@@ -156,6 +197,8 @@ job:
|
||||
|
||||
## 2.4 Attribute Schema
|
||||
|
||||
> **TxQ** = Transaction Queue | **UNL** = Unique Node List | **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### 2.4.1 Resource Attributes (Set Once at Startup)
|
||||
|
||||
```cpp
|
||||
@@ -231,21 +274,75 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = <node_public_key_base58>
|
||||
"xrpl.job.worker" = int64 // Worker thread ID
|
||||
```
|
||||
|
||||
#### PathFinding Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.pathfind.source_currency" = string // Source currency code
|
||||
"xrpl.pathfind.dest_currency" = string // Destination currency code
|
||||
"xrpl.pathfind.path_count" = int64 // Number of paths found
|
||||
"xrpl.pathfind.cache_hit" = bool // RippleLineCache hit
|
||||
```
|
||||
|
||||
#### TxQ Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.txq.queue_depth" = int64 // Current queue depth
|
||||
"xrpl.txq.fee_level" = int64 // Fee level of transaction
|
||||
"xrpl.txq.eviction_reason" = string // Why transaction was evicted
|
||||
```
|
||||
|
||||
#### Fee Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.fee.load_factor" = int64 // Current load factor
|
||||
"xrpl.fee.escalation_level" = int64 // Fee escalation multiplier
|
||||
```
|
||||
|
||||
#### Validator Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.validator.list_size" = int64 // UNL size
|
||||
"xrpl.validator.list_age_sec" = int64 // Seconds since last update
|
||||
```
|
||||
|
||||
#### Amendment Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.amendment.name" = string // Amendment name
|
||||
"xrpl.amendment.status" = string // "enabled", "vetoed", "supported"
|
||||
```
|
||||
|
||||
#### SHAMap Attributes
|
||||
|
||||
```cpp
|
||||
"xrpl.shamap.type" = string // "transaction", "state", "account_state"
|
||||
"xrpl.shamap.missing_nodes" = int64 // Number of missing nodes during sync
|
||||
"xrpl.shamap.duration_ms" = float64 // Sync duration
|
||||
```
|
||||
|
||||
### 2.4.3 Data Collection Summary
|
||||
|
||||
The following table summarizes what data is collected by category:
|
||||
|
||||
| Category | Attributes Collected | Purpose |
|
||||
| --------------- | -------------------------------------------------------------------- | --------------------------- |
|
||||
| **Transaction** | `tx.hash`, `tx.type`, `tx.result`, `tx.fee`, `ledger_index` | Trace transaction lifecycle |
|
||||
| **Consensus** | `round`, `phase`, `mode`, `proposers` (public keys), `duration_ms` | Analyze consensus timing |
|
||||
| **RPC** | `command`, `version`, `status`, `duration_ms` | Monitor RPC performance |
|
||||
| **Peer** | `peer.id` (public key), `latency_ms`, `message.type`, `message.size` | Network topology analysis |
|
||||
| **Ledger** | `ledger.hash`, `ledger.index`, `close_time`, `tx_count` | Ledger progression tracking |
|
||||
| **Job** | `job.type`, `queue_ms`, `worker` | JobQueue performance |
|
||||
| Category | Attributes Collected | Purpose |
|
||||
| --------------- | ---------------------------------------------------------------------- | ---------------------------- |
|
||||
| **Transaction** | `tx.hash`, `tx.type`, `tx.result`, `tx.fee`, `ledger_index` | Trace transaction lifecycle |
|
||||
| **Consensus** | `round`, `phase`, `mode`, `proposers` (public keys), `duration_ms` | Analyze consensus timing |
|
||||
| **RPC** | `command`, `version`, `status`, `duration_ms` | Monitor RPC performance |
|
||||
| **Peer** | `peer.id` (public key), `latency_ms`, `message.type`, `message.size` | Network topology analysis |
|
||||
| **Ledger** | `ledger.hash`, `ledger.index`, `close_time`, `tx_count` | Ledger progression tracking |
|
||||
| **Job** | `job.type`, `queue_ms`, `worker` | JobQueue performance |
|
||||
| **PathFinding** | `pathfind.source_currency`, `dest_currency`, `path_count`, `cache_hit` | Payment path analysis |
|
||||
| **TxQ** | `txq.queue_depth`, `fee_level`, `eviction_reason` | Queue depth and fee tracking |
|
||||
| **Fee** | `fee.load_factor`, `escalation_level` | Fee escalation monitoring |
|
||||
| **Validator** | `validator.list_size`, `list_age_sec` | UNL health monitoring |
|
||||
| **Amendment** | `amendment.name`, `status` | Protocol upgrade tracking |
|
||||
| **SHAMap** | `shamap.type`, `missing_nodes`, `duration_ms` | State tree sync performance |
|
||||
|
||||
### 2.4.4 Privacy & Sensitive Data Policy
|
||||
|
||||
> **PII** = Personally Identifiable Information
|
||||
|
||||
OpenTelemetry instrumentation is designed to collect **operational metadata only**, never sensitive content.
|
||||
|
||||
#### Data NOT Collected
|
||||
@@ -310,18 +407,22 @@ redact_account=1 # Hash account addresses before export
|
||||
redact_peer_address=1 # Remove peer IP addresses
|
||||
```
|
||||
|
||||
> **Note**: The `redact_account` configuration in `rippled.cfg` controls SDK-level redaction before export, while collector-level filtering (see [Collector-Level Data Protection](#collector-level-data-protection) above) provides an additional defense-in-depth layer. Both can operate independently.
|
||||
|
||||
> **Key Principle**: Telemetry collects **operational metadata** (timing, counts, hashes) — never **sensitive content** (keys, balances, amounts, raw payloads).
|
||||
|
||||
---
|
||||
|
||||
## 2.5 Context Propagation Design
|
||||
|
||||
> **WS** = WebSocket
|
||||
|
||||
### 2.5.1 Propagation Boundaries
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph http["HTTP/WebSocket (RPC)"]
|
||||
w3c["W3C Trace Context Headers:<br/>traceparent: 00-{trace_id}-{span_id}-{flags}<br/>tracestate: rippled=<state>"]
|
||||
w3c["W3C Trace Context Headers:<br/>traceparent:<br/>00-trace_id-span_id-flags<br/>tracestate: rippled=..."]
|
||||
end
|
||||
|
||||
subgraph protobuf["Protocol Buffers (P2P)"]
|
||||
@@ -329,7 +430,7 @@ flowchart TB
|
||||
end
|
||||
|
||||
subgraph jobqueue["JobQueue (Internal Async)"]
|
||||
job["Context captured at job creation,<br/>restored at execution<br/><br/>class Job {<br/> opentelemetry::context::Context traceContext_;<br/>};"]
|
||||
job["Context captured at job creation,<br/>restored at execution<br/><br/>class Job {<br/> otel::context::Context<br/> traceContext_;<br/>};"]
|
||||
end
|
||||
|
||||
style http fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
@@ -337,10 +438,18 @@ flowchart TB
|
||||
style jobqueue fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **HTTP/WebSocket - RPC (blue)**: For client-facing RPC requests, trace context is propagated using the W3C `traceparent` header. This is the standard approach and works with any OTel-compatible client.
|
||||
- **Protocol Buffers - P2P (green)**: For peer-to-peer messages between rippled nodes, trace context is embedded as a protobuf `TraceContext` message carrying trace_id, span_id, flags, and optional trace_state.
|
||||
- **JobQueue - Internal Async (red)**: For asynchronous work within a single node, the OTel context is captured when a job is created and restored when the job executes on a worker thread. This bridges the async gap so spans remain linked.
|
||||
|
||||
---
|
||||
|
||||
## 2.6 Integration with Existing Observability
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **WS** = WebSocket
|
||||
|
||||
### 2.6.1 Existing Frameworks Comparison
|
||||
|
||||
rippled already has two observability mechanisms. OpenTelemetry complements (not replaces) them:
|
||||
@@ -422,7 +531,7 @@ span->SetAttribute("peer.id", peerId);
|
||||
|
||||
| Scenario | PerfLog | StatsD | OpenTelemetry |
|
||||
| --------------------------------------- | ---------- | ------ | ------------- |
|
||||
| "How many TXs per second?" | ❌ | ✅ | ❌ |
|
||||
| "How many TXs per second?" | ❌ | ✅ | ✅ |
|
||||
| "What's the p99 RPC latency?" | ❌ | ✅ | ✅ |
|
||||
| "Why was this specific TX slow?" | ⚠️ partial | ❌ | ✅ |
|
||||
| "Which node delayed consensus?" | ❌ | ❌ | ✅ |
|
||||
@@ -451,6 +560,14 @@ flowchart TB
|
||||
style grafana fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **rippled Process (dark gray)**: The single rippled node running all three observability frameworks side by side. Each framework operates independently with no interference.
|
||||
- **PerfLog to perf.log**: PerfLog writes JSON-formatted event logs to a local file. Grafana can ingest these via Loki or a file-based datasource.
|
||||
- **Beast Insight to StatsD Server**: Insight sends aggregated metrics (counters, gauges) over UDP to a StatsD server. Grafana reads from StatsD-compatible backends like Graphite or Prometheus (via StatsD exporter).
|
||||
- **OpenTelemetry to OTLP Collector**: OTel exports spans over OTLP/gRPC to a Collector, which then forwards to a trace backend (Tempo).
|
||||
- **Grafana (red, unified UI)**: All three data streams converge in Grafana, enabling operators to correlate logs, metrics, and traces in a single dashboard.
|
||||
|
||||
### 2.6.5 Correlation with PerfLog
|
||||
|
||||
Trace IDs can be correlated with existing PerfLog entries for comprehensive debugging:
|
||||
|
||||
@@ -81,12 +81,14 @@ flowchart TB
|
||||
|
||||
## 3.3 Performance Overhead Summary
|
||||
|
||||
| Metric | Overhead | Notes |
|
||||
| ------------- | ---------- | ----------------------------------- |
|
||||
| CPU | 1-3% | Span creation and attribute setting |
|
||||
| Memory | 2-5 MB | Batch buffer for pending spans |
|
||||
| Network | 10-50 KB/s | Compressed OTLP export to collector |
|
||||
| Latency (p99) | <2% | With proper sampling configuration |
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
| Metric | Overhead | Notes |
|
||||
| ------------- | ---------- | ------------------------------------------------ |
|
||||
| CPU | 1-3% | Of per-transaction CPU cost (~200μs baseline) |
|
||||
| Memory | ~10 MB | SDK statics + batch buffer + worker thread stack |
|
||||
| Network | 10-50 KB/s | Compressed OTLP export to collector |
|
||||
| Latency (p99) | <2% | With proper sampling configuration |
|
||||
|
||||
---
|
||||
|
||||
@@ -94,17 +96,26 @@ flowchart TB
|
||||
|
||||
### 3.4.1 Per-Operation Costs
|
||||
|
||||
> **Note on hardware assumptions**: The costs below are based on the official OTel C++ SDK CI benchmarks
|
||||
> (969 runs on GitHub Actions 2-core shared runners). On production server hardware (3+ GHz Xeon),
|
||||
> expect costs at the **lower end** of each range (~30-50% improvement over CI hardware).
|
||||
|
||||
| Operation | Time (ns) | Frequency | Impact |
|
||||
| --------------------- | --------- | ---------------------- | ---------- |
|
||||
| Span creation | 200-500 | Every traced operation | Low |
|
||||
| Span creation | 500-1000 | Every traced operation | Low |
|
||||
| Span end | 100-200 | Every traced operation | Low |
|
||||
| SetAttribute (string) | 80-120 | 3-5 per span | Low |
|
||||
| SetAttribute (int) | 40-60 | 2-3 per span | Negligible |
|
||||
| AddEvent | 50-80 | 0-2 per span | Negligible |
|
||||
| AddEvent | 100-200 | 0-2 per span | Low |
|
||||
| Context injection | 150-250 | Per outgoing message | Low |
|
||||
| Context extraction | 100-180 | Per incoming message | Low |
|
||||
| GetCurrent context | 10-20 | Thread-local access | Negligible |
|
||||
|
||||
**Source**: Span creation based on OTel C++ SDK `BM_SpanCreation` benchmark (AlwaysOnSampler +
|
||||
SimpleSpanProcessor + InMemoryExporter), median ~1,000 ns on CI hardware. AddEvent includes
|
||||
timestamp read + string copy + vector push + mutex acquisition. Context injection/extraction
|
||||
confirmed by `BM_SpanCreationWithScope` benchmark delta (~160 ns).
|
||||
|
||||
### 3.4.2 Transaction Processing Overhead
|
||||
|
||||
<div align="center">
|
||||
@@ -112,67 +123,91 @@ flowchart TB
|
||||
```mermaid
|
||||
%%{init: {'pie': {'textPosition': 0.75}}}%%
|
||||
pie showData
|
||||
"tx.receive (800ns)" : 800
|
||||
"tx.validate (500ns)" : 500
|
||||
"tx.relay (500ns)" : 500
|
||||
"Context inject (600ns)" : 600
|
||||
"tx.receive (1400ns)" : 1400
|
||||
"tx.validate (1200ns)" : 1200
|
||||
"tx.relay (1200ns)" : 1200
|
||||
"Context inject (200ns)" : 200
|
||||
```
|
||||
|
||||
**Transaction Tracing Overhead (~2.4μs total)**
|
||||
**Transaction Tracing Overhead (~4.0μs total)**
|
||||
|
||||
</div>
|
||||
|
||||
**Overhead percentage**: 2.4 μs / 200 μs (avg tx processing) = **~1.2%**
|
||||
**Overhead percentage**: 4.0 μs / 200 μs (avg tx processing) = **~2.0%**
|
||||
|
||||
> **Breakdown**: Each span (tx.receive, tx.validate, tx.relay) costs ~1,000 ns for creation plus
|
||||
> ~200-400 ns for 3-5 attribute sets. Context injection is ~200 ns (confirmed by benchmarks).
|
||||
> On production hardware, expect ~2.6 μs total (~1.3% overhead) due to faster span creation (~500-600 ns).
|
||||
|
||||
### 3.4.3 Consensus Round Overhead
|
||||
|
||||
| Operation | Count | Cost (ns) | Total |
|
||||
| ---------------------- | ----- | --------- | ---------- |
|
||||
| consensus.round span | 1 | ~1000 | ~1 μs |
|
||||
| consensus.phase spans | 3 | ~700 | ~2.1 μs |
|
||||
| proposal.receive spans | ~20 | ~600 | ~12 μs |
|
||||
| proposal.send spans | ~3 | ~600 | ~1.8 μs |
|
||||
| consensus.round span | 1 | ~1200 | ~1.2 μs |
|
||||
| consensus.phase spans | 3 | ~1100 | ~3.3 μs |
|
||||
| proposal.receive spans | ~20 | ~1100 | ~22 μs |
|
||||
| proposal.send spans | ~3 | ~1100 | ~3.3 μs |
|
||||
| Context operations | ~30 | ~200 | ~6 μs |
|
||||
| **TOTAL** | | | **~23 μs** |
|
||||
| **TOTAL** | | | **~36 μs** |
|
||||
|
||||
**Overhead percentage**: 23 μs / 3s (typical round) = **~0.0008%** (negligible)
|
||||
> **Why higher**: Each span costs ~1,000 ns creation + ~100-200 ns for 1-2 attributes, totaling ~1,100-1,200 ns.
|
||||
> Context operations remain ~200 ns (confirmed by benchmarks). On production hardware, expect ~24 μs total.
|
||||
|
||||
**Overhead percentage**: 36 μs / 3s (typical round) = **~0.001%** (negligible)
|
||||
|
||||
### 3.4.4 RPC Request Overhead
|
||||
|
||||
| Operation | Cost (ns) |
|
||||
| ---------------- | ------------ |
|
||||
| rpc.request span | ~700 |
|
||||
| rpc.command span | ~600 |
|
||||
| rpc.request span | ~1200 |
|
||||
| rpc.command span | ~1100 |
|
||||
| Context extract | ~250 |
|
||||
| Context inject | ~200 |
|
||||
| **TOTAL** | **~1.75 μs** |
|
||||
| **TOTAL** | **~2.75 μs** |
|
||||
|
||||
- Fast RPC (1ms): 1.75 μs / 1ms = **~0.175%**
|
||||
- Slow RPC (100ms): 1.75 μs / 100ms = **~0.002%**
|
||||
> **Why higher**: Each span costs ~1,000 ns creation + ~100-200 ns for attributes (command name,
|
||||
> version, role). Context extract/inject costs are confirmed by OTel C++ benchmarks.
|
||||
|
||||
- Fast RPC (1ms): 2.75 μs / 1ms = **~0.275%**
|
||||
- Slow RPC (100ms): 2.75 μs / 100ms = **~0.003%**
|
||||
|
||||
---
|
||||
|
||||
## 3.5 Memory Overhead Analysis
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### 3.5.1 Static Memory
|
||||
|
||||
| Component | Size | Allocated |
|
||||
| ------------------------ | ----------- | ---------- |
|
||||
| TracerProvider singleton | ~64 KB | At startup |
|
||||
| BatchSpanProcessor | ~128 KB | At startup |
|
||||
| OTLP exporter | ~256 KB | At startup |
|
||||
| Propagator registry | ~8 KB | At startup |
|
||||
| **Total static** | **~456 KB** | |
|
||||
| Component | Size | Allocated |
|
||||
| ------------------------------------ | ----------- | ---------- |
|
||||
| TracerProvider singleton | ~64 KB | At startup |
|
||||
| BatchSpanProcessor (circular buffer) | ~16 KB | At startup |
|
||||
| BatchSpanProcessor (worker thread) | ~8 MB | At startup |
|
||||
| OTLP exporter (gRPC channel init) | ~256 KB | At startup |
|
||||
| Propagator registry | ~8 KB | At startup |
|
||||
| **Total static** | **~8.3 MB** | |
|
||||
|
||||
> **Why higher than earlier estimate**: The BatchSpanProcessor's circular buffer itself is only ~16 KB
|
||||
> (2049 x 8-byte `AtomicUniquePtr` entries), but it spawns a dedicated worker thread whose default
|
||||
> stack size on Linux is ~8 MB. The OTLP gRPC exporter allocates memory for channel stubs and TLS
|
||||
> initialization. The worker thread stack dominates the static footprint.
|
||||
|
||||
### 3.5.2 Dynamic Memory
|
||||
|
||||
| Component | Size per unit | Max units | Peak |
|
||||
| -------------------- | ------------- | ---------- | ----------- |
|
||||
| Active span | ~200 bytes | 1000 | ~200 KB |
|
||||
| Queued span (export) | ~500 bytes | 2048 | ~1 MB |
|
||||
| Attribute storage | ~50 bytes | 5 per span | Included |
|
||||
| Context storage | ~64 bytes | Per thread | ~6.4 KB |
|
||||
| **Total dynamic** | | | **~1.2 MB** |
|
||||
| Component | Size per unit | Max units | Peak |
|
||||
| -------------------- | -------------- | ---------- | --------------- |
|
||||
| Active span | ~500-800 bytes | 1000 | ~500-800 KB |
|
||||
| Queued span (export) | ~500 bytes | 2048 | ~1 MB |
|
||||
| Attribute storage | ~80 bytes | 5 per span | Included |
|
||||
| Context storage | ~64 bytes | Per thread | ~6.4 KB |
|
||||
| **Total dynamic** | | | **~1.5-1.8 MB** |
|
||||
|
||||
> **Why active spans are larger**: An active `Span` object includes the wrapper (~88 bytes: shared_ptr,
|
||||
> mutex, unique_ptr to Recordable) plus `SpanData` (~250 bytes: SpanContext, timestamps, name, status,
|
||||
> empty containers) plus attribute storage (~200-500 bytes for 3-5 string attributes in a `std::map`).
|
||||
> Source: `sdk/src/trace/span.h` and `sdk/include/opentelemetry/sdk/trace/span_data.h`.
|
||||
> Queued spans release the wrapper, keeping only `SpanData` + attributes (~500 bytes).
|
||||
|
||||
### 3.5.3 Memory Growth Characteristics
|
||||
|
||||
@@ -184,18 +219,34 @@ config:
|
||||
height: 400
|
||||
---
|
||||
xychart-beta
|
||||
title "Memory Usage vs Span Rate"
|
||||
title "Memory Usage vs Span Rate (bounded by queue limit)"
|
||||
x-axis "Spans/second" [0, 200, 400, 600, 800, 1000]
|
||||
y-axis "Memory (MB)" 0 --> 6
|
||||
line [1, 1.8, 2.6, 3.4, 4.2, 5]
|
||||
y-axis "Memory (MB)" 0 --> 12
|
||||
line [8.5, 9.2, 9.6, 9.9, 10.0, 10.0]
|
||||
```
|
||||
|
||||
**Notes**:
|
||||
|
||||
- Memory increases linearly with span rate
|
||||
- Memory increases with span rate but **plateaus at queue capacity** (default 2048 spans)
|
||||
- Batch export prevents unbounded growth
|
||||
- Queue size is configurable (default 2048 spans)
|
||||
- At queue limit, oldest spans are dropped (not blocked)
|
||||
- Maximum memory is bounded: ~8.3 MB static (dominated by worker thread stack) + 2048 queued spans x ~500 bytes (~1 MB) + active spans (~0.8 MB) ≈ **~10 MB ceiling**
|
||||
- The worker thread stack (~8 MB) is virtual memory; actual RSS depends on stack usage (typically much less)
|
||||
|
||||
### 3.5.4 Performance Data Sources
|
||||
|
||||
The overhead estimates in Sections 3.3-3.5 are derived from the following sources:
|
||||
|
||||
| Source | What it covers | URL |
|
||||
| ------------------------------------------------ | ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| OTel C++ SDK CI benchmarks (969 runs) | Span creation, context activation, sampler overhead | [Benchmark Dashboard](https://open-telemetry.github.io/opentelemetry-cpp/benchmarks/) |
|
||||
| `api/test/trace/span_benchmark.cc` | API-level span creation (~22 ns no-op) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/api/test/trace/span_benchmark.cc) |
|
||||
| `sdk/test/trace/sampler_benchmark.cc` | SDK span creation with samplers (~1,000 ns AlwaysOn) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/test/trace/sampler_benchmark.cc) |
|
||||
| `sdk/include/.../span_data.h` | SpanData memory layout (~250 bytes base) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/include/opentelemetry/sdk/trace/span_data.h) |
|
||||
| `sdk/src/trace/span.h` | Span wrapper memory layout (~88 bytes) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/src/trace/span.h) |
|
||||
| `sdk/include/.../batch_span_processor_options.h` | Default queue size (2048), batch size (512) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/include/opentelemetry/sdk/trace/batch_span_processor_options.h) |
|
||||
| `sdk/include/.../circular_buffer.h` | CircularBuffer implementation (AtomicUniquePtr array) | [Source](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/include/opentelemetry/sdk/common/circular_buffer.h) |
|
||||
| OTLP proto definition | Serialized span size estimation | [Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto) |
|
||||
|
||||
---
|
||||
|
||||
@@ -203,6 +254,11 @@ xychart-beta
|
||||
|
||||
### 3.6.1 Export Bandwidth
|
||||
|
||||
> **Bytes per span**: Estimates use ~500 bytes/span (conservative upper bound). OTLP protobuf analysis
|
||||
> shows a typical span with 3-5 string attributes serializes to ~200-300 bytes raw; with gzip
|
||||
> compression (~60-70% of raw) and batching (amortized headers), ~350 bytes/span is more realistic.
|
||||
> The table uses the conservative estimate for capacity planning.
|
||||
|
||||
| Sampling Rate | Spans/sec | Bandwidth | Notes |
|
||||
| ------------- | --------- | --------- | ---------------- |
|
||||
| 100% | ~500 | ~250 KB/s | Development only |
|
||||
@@ -214,10 +270,10 @@ xychart-beta
|
||||
|
||||
| Message Type | Context Size | Messages/sec | Overhead |
|
||||
| ---------------------- | ------------ | ------------ | ----------- |
|
||||
| TMTransaction | 32 bytes | ~100 | ~3.2 KB/s |
|
||||
| TMProposeSet | 32 bytes | ~10 | ~320 B/s |
|
||||
| TMValidation | 32 bytes | ~50 | ~1.6 KB/s |
|
||||
| **Total P2P overhead** | | | **~5 KB/s** |
|
||||
| TMTransaction | 25 bytes | ~100 | ~2.5 KB/s |
|
||||
| TMProposeSet | 25 bytes | ~10 | ~250 B/s |
|
||||
| TMValidation | 25 bytes | ~50 | ~1.25 KB/s |
|
||||
| **Total P2P overhead** | | | **~4 KB/s** |
|
||||
|
||||
---
|
||||
|
||||
@@ -225,6 +281,8 @@ xychart-beta
|
||||
|
||||
### 3.7.1 Sampling Strategies
|
||||
|
||||
#### Tail Sampling
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
trace["New Trace"]
|
||||
@@ -284,6 +342,8 @@ if (telemetry.shouldTracePeer())
|
||||
|
||||
## 3.9 Code Intrusiveness Assessment
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
This section provides a detailed assessment of how intrusive the OpenTelemetry integration is to the existing rippled codebase.
|
||||
|
||||
### 3.9.1 Files Modified Summary
|
||||
@@ -297,7 +357,10 @@ This section provides a detailed assessment of how intrusive the OpenTelemetry i
|
||||
| **Consensus** | 3 files | ~100 | ~30 | Low-Medium |
|
||||
| **Protocol Buffers** | 1 file | ~25 | 0 | Low |
|
||||
| **CMake/Build** | 3 files | ~50 | ~10 | Minimal |
|
||||
| **Total** | **~21 files** | **~1,205** | **~105** | **Low** |
|
||||
| **PathFinding** | 2 | ~80 | ~5 | Minimal |
|
||||
| **TxQ/Fee** | 2 | ~60 | ~5 | Minimal |
|
||||
| **Validator/Amend** | 3 | ~40 | ~5 | Minimal |
|
||||
| **Total** | **~28 files** | **~1,490** | **~120** | **Low** |
|
||||
|
||||
### 3.9.2 Detailed File Impact
|
||||
|
||||
@@ -307,6 +370,9 @@ pie title Code Changes by Component
|
||||
"Transaction Relay" : 160
|
||||
"Consensus" : 130
|
||||
"RPC Layer" : 100
|
||||
"PathFinding" : 80
|
||||
"TxQ/Fee" : 60
|
||||
"Validator/Amendment" : 40
|
||||
"Application Init" : 35
|
||||
"Protocol Buffers" : 25
|
||||
"Build System" : 60
|
||||
@@ -337,6 +403,14 @@ pie title Code Changes by Component
|
||||
| `src/xrpld/app/consensus/RCLConsensus.cpp` | ~50 | ~15 | Medium |
|
||||
| `src/xrpld/app/consensus/RCLConsensusAdaptor.cpp` | ~40 | ~12 | Medium |
|
||||
| `src/xrpld/core/JobQueue.cpp` | ~20 | ~5 | Low |
|
||||
| `src/xrpld/app/paths/PathRequest.cpp` | ~40 | ~3 | Low |
|
||||
| `src/xrpld/app/paths/Pathfinder.cpp` | ~40 | ~2 | Low |
|
||||
| `src/xrpld/app/misc/TxQ.cpp` | ~40 | ~3 | Low |
|
||||
| `src/xrpld/app/main/LoadManager.cpp` | ~20 | ~2 | Low |
|
||||
| `src/xrpld/app/misc/ValidatorList.cpp` | ~20 | ~2 | Low |
|
||||
| `src/xrpld/app/misc/AmendmentTable.cpp` | ~10 | ~2 | Low |
|
||||
| `src/xrpld/app/misc/Manifest.cpp` | ~10 | ~1 | Low |
|
||||
| `src/xrpld/shamap/SHAMap.cpp` | ~20 | ~3 | Low |
|
||||
| `src/xrpld/overlay/detail/ripple.proto` | ~25 | 0 | Low |
|
||||
| `CMakeLists.txt` | ~40 | ~8 | Low |
|
||||
| `cmake/FindOpenTelemetry.cmake` | ~50 | 0 | None (new) |
|
||||
@@ -353,12 +427,15 @@ quadrantChart
|
||||
x-axis Low Risk --> High Risk
|
||||
y-axis Low Value --> High Value
|
||||
|
||||
RPC Tracing: [0.2, 0.8]
|
||||
Transaction Relay: [0.5, 0.9]
|
||||
Consensus Tracing: [0.7, 0.95]
|
||||
Peer Message Tracing: [0.8, 0.4]
|
||||
JobQueue Context: [0.4, 0.5]
|
||||
Ledger Acquisition: [0.5, 0.6]
|
||||
RPC Tracing: [0.2, 0.55]
|
||||
Transaction Relay: [0.55, 0.85]
|
||||
Consensus Tracing: [0.75, 0.92]
|
||||
Peer Message Tracing: [0.85, 0.35]
|
||||
JobQueue Context: [0.3, 0.42]
|
||||
Ledger Acquisition: [0.48, 0.65]
|
||||
PathFinding: [0.38, 0.72]
|
||||
TxQ and Fees: [0.25, 0.62]
|
||||
Validator Mgmt: [0.15, 0.35]
|
||||
```
|
||||
|
||||
**Optional** ↙ ↘ **Avoid**
|
||||
@@ -375,15 +452,15 @@ quadrantChart
|
||||
|
||||
### 3.9.4 Architectural Impact Assessment
|
||||
|
||||
| Aspect | Impact | Justification |
|
||||
| -------------------- | ------- | --------------------------------------------------------------------- |
|
||||
| **Data Flow** | None | Tracing is purely observational; no business logic changes |
|
||||
| **Threading Model** | Minimal | Context propagation uses thread-local storage (standard OTel pattern) |
|
||||
| **Memory Model** | Low | Bounded queues prevent unbounded growth; RAII ensures cleanup |
|
||||
| **Network Protocol** | Low | Optional fields in protobuf (high field numbers); backward compatible |
|
||||
| **Configuration** | None | New config section; existing configs unaffected |
|
||||
| **Build System** | Low | Optional CMake flag; builds work without OpenTelemetry |
|
||||
| **Dependencies** | Low | OpenTelemetry SDK is optional; null implementation when disabled |
|
||||
| Aspect | Impact | Justification |
|
||||
| -------------------- | ------- | -------------------------------------------------------------------------------- |
|
||||
| **Data Flow** | Minimal | Read-only instrumentation; no modification to consensus or transaction data flow |
|
||||
| **Threading Model** | Minimal | Context propagation uses thread-local storage (standard OTel pattern) |
|
||||
| **Memory Model** | Low | Bounded queues prevent unbounded growth; RAII ensures cleanup |
|
||||
| **Network Protocol** | Low | Optional fields in protobuf (high field numbers); backward compatible |
|
||||
| **Configuration** | None | New config section; existing configs unaffected |
|
||||
| **Build System** | Low | Optional CMake flag; builds work without OpenTelemetry |
|
||||
| **Dependencies** | Low | OpenTelemetry SDK is optional; null implementation when disabled |
|
||||
|
||||
### 3.9.5 Backward Compatibility
|
||||
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 4.1 Core Interfaces
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### 4.1.1 Main Telemetry Interface
|
||||
|
||||
```cpp
|
||||
@@ -69,6 +71,10 @@ public:
|
||||
bool traceRpc = true;
|
||||
bool tracePeer = false; // High volume, disabled by default
|
||||
bool traceLedger = true;
|
||||
bool tracePathfind = true;
|
||||
bool traceTxQ = true;
|
||||
bool traceValidator = false; // Low volume, disabled by default
|
||||
bool traceAmendment = false; // Very low volume, disabled by default
|
||||
};
|
||||
|
||||
virtual ~Telemetry() = default;
|
||||
@@ -140,6 +146,21 @@ public:
|
||||
|
||||
/** Check if peer message tracing is enabled */
|
||||
virtual bool shouldTracePeer() const = 0;
|
||||
|
||||
/** Check if ledger tracing is enabled */
|
||||
virtual bool shouldTraceLedger() const = 0;
|
||||
|
||||
/** Check if path finding tracing is enabled */
|
||||
virtual bool shouldTracePathfind() const = 0;
|
||||
|
||||
/** Check if transaction queue tracing is enabled */
|
||||
virtual bool shouldTraceTxQ() const = 0;
|
||||
|
||||
/** Check if validator list/manifest tracing is enabled */
|
||||
virtual bool shouldTraceValidator() const = 0;
|
||||
|
||||
/** Check if amendment voting tracing is enabled */
|
||||
virtual bool shouldTraceAmendment() const = 0;
|
||||
};
|
||||
|
||||
// Factory functions
|
||||
@@ -191,11 +212,17 @@ public:
|
||||
/**
|
||||
* Construct guard with span.
|
||||
* The span becomes the current span in thread-local context.
|
||||
*
|
||||
* @note If span is nullptr (e.g., telemetry disabled), the guard
|
||||
* becomes a no-op. All methods safely check for null before access.
|
||||
*/
|
||||
explicit SpanGuard(
|
||||
opentelemetry::nostd::shared_ptr<opentelemetry::trace::Span> span)
|
||||
: span_(std::move(span))
|
||||
, scope_(span_)
|
||||
: span_(span ? std::move(span) : nullptr)
|
||||
, scope_(span_ ? opentelemetry::trace::Scope(span_)
|
||||
: opentelemetry::trace::Scope(
|
||||
opentelemetry::nostd::shared_ptr<
|
||||
opentelemetry::trace::Span>(nullptr)))
|
||||
{
|
||||
}
|
||||
|
||||
@@ -277,6 +304,12 @@ public:
|
||||
|
||||
void addEvent(std::string_view) {}
|
||||
void recordException(std::exception const&) {}
|
||||
|
||||
/** Return a default empty context (matches SpanGuard interface) */
|
||||
opentelemetry::context::Context context() const
|
||||
{
|
||||
return opentelemetry::context::Context{};
|
||||
}
|
||||
};
|
||||
|
||||
} // namespace telemetry
|
||||
@@ -332,17 +365,66 @@ namespace telemetry {
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
// Set attribute on current span (if exists)
|
||||
#define XRPL_TRACE_SET_ATTR(key, value) \
|
||||
if (_xrpl_guard_.has_value()) { \
|
||||
_xrpl_guard_->setAttribute(key, value); \
|
||||
#define XRPL_TRACE_PEER(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTracePeer()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
#define XRPL_TRACE_LEDGER(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTraceLedger()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
#define XRPL_TRACE_PATHFIND(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTracePathfind()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
#define XRPL_TRACE_TXQ(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTraceTxQ()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
#define XRPL_TRACE_VALIDATOR(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTraceValidator()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
#define XRPL_TRACE_AMENDMENT(telemetry, name) \
|
||||
std::optional<::xrpl::telemetry::SpanGuard> _xrpl_guard_; \
|
||||
if ((telemetry).shouldTraceAmendment()) { \
|
||||
_xrpl_guard_.emplace((telemetry).startSpan(name)); \
|
||||
}
|
||||
|
||||
// Set attribute on current span (if exists).
|
||||
// Works with both std::optional<SpanGuard> (from conditional macros)
|
||||
// and bare SpanGuard (from XRPL_TRACE_SPAN). Uses 'if constexpr'-like
|
||||
// dispatch via a helper that checks for .has_value().
|
||||
#define XRPL_TRACE_SET_ATTR(key, value) \
|
||||
do { \
|
||||
if constexpr (requires { _xrpl_guard_.has_value(); }) { \
|
||||
if (_xrpl_guard_.has_value()) \
|
||||
_xrpl_guard_->setAttribute(key, value); \
|
||||
} else { \
|
||||
_xrpl_guard_.setAttribute(key, value); \
|
||||
} \
|
||||
} while(0)
|
||||
|
||||
// Record exception on current span
|
||||
#define XRPL_TRACE_EXCEPTION(e) \
|
||||
if (_xrpl_guard_.has_value()) { \
|
||||
_xrpl_guard_->recordException(e); \
|
||||
}
|
||||
do { \
|
||||
if constexpr (requires { _xrpl_guard_.has_value(); }) { \
|
||||
if (_xrpl_guard_.has_value()) \
|
||||
_xrpl_guard_->recordException(e); \
|
||||
} else { \
|
||||
_xrpl_guard_.recordException(e); \
|
||||
} \
|
||||
} while(0)
|
||||
|
||||
#else // XRPL_ENABLE_TELEMETRY not defined
|
||||
|
||||
@@ -351,6 +433,12 @@ namespace telemetry {
|
||||
#define XRPL_TRACE_TX(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_CONSENSUS(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_RPC(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_PEER(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_LEDGER(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_PATHFIND(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_TXQ(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_VALIDATOR(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_AMENDMENT(telemetry, name) ((void)0)
|
||||
#define XRPL_TRACE_SET_ATTR(key, value) ((void)0)
|
||||
#define XRPL_TRACE_EXCEPTION(e) ((void)0)
|
||||
|
||||
@@ -369,6 +457,9 @@ namespace telemetry {
|
||||
Add to `src/xrpld/overlay/detail/ripple.proto`:
|
||||
|
||||
```protobuf
|
||||
// Note: rippled uses proto2 syntax. The 'optional' keyword below is valid
|
||||
// in proto2 (it is the default field rule) and is included for clarity.
|
||||
|
||||
// Trace context for distributed tracing across nodes
|
||||
// Uses W3C Trace Context format internally
|
||||
message TraceContext {
|
||||
@@ -423,6 +514,8 @@ message TMLedgerData {
|
||||
#pragma once
|
||||
|
||||
#include <opentelemetry/context/context.h>
|
||||
#include <opentelemetry/trace/context.h>
|
||||
#include <opentelemetry/trace/default_span.h>
|
||||
#include <opentelemetry/trace/span_context.h>
|
||||
#include <protocol/messages.h> // Generated protobuf
|
||||
|
||||
@@ -480,7 +573,14 @@ TraceContextPropagator::extract(protocol::TraceContext const& proto)
|
||||
using namespace opentelemetry::trace;
|
||||
|
||||
if (proto.trace_id().size() != 16 || proto.span_id().size() != 8)
|
||||
return opentelemetry::context::Context{}; // Invalid, return empty
|
||||
{
|
||||
// Log malformed trace context for debugging. Silent failures in
|
||||
// context extraction make distributed tracing issues hard to diagnose.
|
||||
JLOG(j_.warn()) << "Malformed trace context: trace_id size="
|
||||
<< proto.trace_id().size()
|
||||
<< " span_id size=" << proto.span_id().size();
|
||||
return opentelemetry::context::Context{};
|
||||
}
|
||||
|
||||
// Construct TraceId and SpanId from bytes
|
||||
TraceId traceId(reinterpret_cast<uint8_t const*>(proto.trace_id().data()));
|
||||
@@ -490,11 +590,15 @@ TraceContextPropagator::extract(protocol::TraceContext const& proto)
|
||||
// Create SpanContext from extracted data
|
||||
SpanContext spanContext(traceId, spanId, flags, /* remote = */ true);
|
||||
|
||||
// Create context with extracted span as parent
|
||||
return opentelemetry::context::Context{}.SetValue(
|
||||
opentelemetry::trace::kSpanKey,
|
||||
// DefaultSpan wraps SpanContext for use as a non-recording parent.
|
||||
// This is the standard OTel C++ pattern for remote context propagation.
|
||||
// DefaultSpan carries the remote SpanContext without recording any data.
|
||||
auto parentCtx = opentelemetry::trace::SetSpan(
|
||||
opentelemetry::context::Context{},
|
||||
opentelemetry::nostd::shared_ptr<Span>(
|
||||
new DefaultSpan(spanContext)));
|
||||
|
||||
return parentCtx;
|
||||
}
|
||||
|
||||
inline void
|
||||
@@ -750,8 +854,8 @@ ServerHandler::onRequest(
|
||||
// Extract trace context from HTTP headers (W3C Trace Context)
|
||||
auto parentCtx = telemetry::TraceContextPropagator::extractFromHeaders(
|
||||
[&req](std::string_view name) -> std::optional<std::string> {
|
||||
auto it = req.find(boost::beast::http::field{
|
||||
std::string(name)});
|
||||
// Beast's find() accepts a string_view for custom header lookup
|
||||
auto it = req.find(name);
|
||||
if (it != req.end())
|
||||
return std::string(it->value());
|
||||
return std::nullopt;
|
||||
@@ -977,6 +1081,14 @@ flowchart TB
|
||||
|
||||
</div>
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Client / Submit TX**: An external client submits a transaction, creating the root span that initiates the trace.
|
||||
- **Node A (RPC layer)**: The receiving node processes the submission through `rpc.request` and `rpc.command.submit`, then hands off to the transaction pipeline (`tx.receive` → `tx.validate` → `tx.relay`).
|
||||
- **Dashed arrows (TraceContext)**: Cross-node boundaries where trace context is propagated via the protobuf protocol extension, linking spans across independent processes.
|
||||
- **Node B (relay hop)**: A peer node that receives, validates, and relays the transaction further, demonstrating multi-hop propagation.
|
||||
- **Node C (consensus)**: The final node where the transaction enters consensus (`consensus.round` → `consensus.phase.establish`), showing how a single client action produces an end-to-end distributed trace.
|
||||
|
||||
---
|
||||
|
||||
_Previous: [Implementation Strategy](./03-implementation-strategy.md)_ | _Next: [Configuration Reference](./05-configuration-reference.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 5.1 rippled Configuration
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **TxQ** = Transaction Queue
|
||||
|
||||
### 5.1.1 Configuration File Section
|
||||
|
||||
Add to `cfg/xrpld-example.cfg`:
|
||||
@@ -38,6 +40,9 @@ Add to `cfg/xrpld-example.cfg`:
|
||||
#
|
||||
# # Sampling ratio: 0.0-1.0 (default: 1.0 = 100% sampling)
|
||||
# # Use lower values in production to reduce overhead
|
||||
# # Default: 1.0 (all traces). For production deployments with high
|
||||
# # throughput, 0.1 (10%) is recommended to reduce overhead.
|
||||
# # See Section 7.4.2 for sampling strategy details.
|
||||
# sampling_ratio=0.1
|
||||
#
|
||||
# # Batch processor settings
|
||||
@@ -51,6 +56,10 @@ Add to `cfg/xrpld-example.cfg`:
|
||||
# trace_rpc=1 # RPC request handling
|
||||
# trace_peer=0 # Peer messages (high volume, disabled by default)
|
||||
# trace_ledger=1 # Ledger acquisition and building
|
||||
# trace_pathfind=1 # Path computation (can be expensive)
|
||||
# trace_txq=1 # Transaction queue and fee escalation
|
||||
# trace_validator=0 # Validator list and manifest updates (low volume)
|
||||
# trace_amendment=0 # Amendment voting (very low volume)
|
||||
#
|
||||
# # Service identification (automatically detected if not specified)
|
||||
# # service_name=rippled
|
||||
@@ -78,6 +87,10 @@ enabled=0
|
||||
| `trace_rpc` | bool | `true` | Enable RPC tracing |
|
||||
| `trace_peer` | bool | `false` | Enable peer message tracing (high volume) |
|
||||
| `trace_ledger` | bool | `true` | Enable ledger tracing |
|
||||
| `trace_pathfind` | bool | `true` | Enable path computation tracing |
|
||||
| `trace_txq` | bool | `true` | Enable transaction queue tracing |
|
||||
| `trace_validator` | bool | `false` | Enable validator list/manifest tracing |
|
||||
| `trace_amendment` | bool | `false` | Enable amendment voting tracing |
|
||||
| `service_name` | string | `"rippled"` | Service name for traces |
|
||||
| `service_instance_id` | string | `<node_pubkey>` | Instance identifier |
|
||||
|
||||
@@ -85,6 +98,8 @@ enabled=0
|
||||
|
||||
## 5.2 Configuration Parser
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
```cpp
|
||||
// src/libxrpl/telemetry/TelemetryConfig.cpp
|
||||
|
||||
@@ -140,6 +155,10 @@ setup_Telemetry(
|
||||
setup.traceRpc = section.value_or("trace_rpc", true);
|
||||
setup.tracePeer = section.value_or("trace_peer", false);
|
||||
setup.traceLedger = section.value_or("trace_ledger", true);
|
||||
setup.tracePathfind = section.value_or("trace_pathfind", true);
|
||||
setup.traceTxQ = section.value_or("trace_txq", true);
|
||||
setup.traceValidator = section.value_or("trace_validator", false);
|
||||
setup.traceAmendment = section.value_or("trace_amendment", false);
|
||||
|
||||
return setup;
|
||||
}
|
||||
@@ -239,6 +258,8 @@ public:
|
||||
|
||||
## 5.4 CMake Integration
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### 5.4.1 Find OpenTelemetry Module
|
||||
|
||||
```cmake
|
||||
@@ -354,6 +375,8 @@ endif()
|
||||
|
||||
## 5.5 OpenTelemetry Collector Configuration
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **APM** = Application Performance Monitoring
|
||||
|
||||
### 5.5.1 Development Configuration
|
||||
|
||||
```yaml
|
||||
@@ -380,9 +403,9 @@ exporters:
|
||||
sampling_initial: 5
|
||||
sampling_thereafter: 200
|
||||
|
||||
# Jaeger for trace visualization
|
||||
jaeger:
|
||||
endpoint: jaeger:14250
|
||||
# Tempo for trace visualization
|
||||
otlp/tempo:
|
||||
endpoint: tempo:4317
|
||||
tls:
|
||||
insecure: true
|
||||
|
||||
@@ -391,7 +414,7 @@ service:
|
||||
traces:
|
||||
receivers: [otlp]
|
||||
processors: [batch]
|
||||
exporters: [logging, jaeger]
|
||||
exporters: [logging, otlp/tempo]
|
||||
```
|
||||
|
||||
### 5.5.2 Production Configuration
|
||||
@@ -504,6 +527,8 @@ service:
|
||||
|
||||
## 5.6 Docker Compose Development Environment
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
```yaml
|
||||
# docker-compose-telemetry.yaml
|
||||
version: "3.8"
|
||||
@@ -521,17 +546,15 @@ services:
|
||||
- "4318:4318" # OTLP HTTP
|
||||
- "13133:13133" # Health check
|
||||
depends_on:
|
||||
- jaeger
|
||||
- tempo
|
||||
|
||||
# Jaeger for trace visualization
|
||||
jaeger:
|
||||
image: jaegertracing/all-in-one:1.53
|
||||
container_name: jaeger
|
||||
environment:
|
||||
- COLLECTOR_OTLP_ENABLED=true
|
||||
# Tempo for trace visualization
|
||||
tempo:
|
||||
image: grafana/tempo:2.6.1
|
||||
container_name: tempo
|
||||
ports:
|
||||
- "16686:16686" # UI
|
||||
- "14250:14250" # gRPC
|
||||
- "3200:3200" # Tempo HTTP API
|
||||
- "4317" # OTLP gRPC (internal)
|
||||
|
||||
# Grafana for dashboards
|
||||
grafana:
|
||||
@@ -546,7 +569,7 @@ services:
|
||||
ports:
|
||||
- "3000:3000"
|
||||
depends_on:
|
||||
- jaeger
|
||||
- tempo
|
||||
|
||||
# Prometheus for metrics (optional, for correlation)
|
||||
prometheus:
|
||||
@@ -566,6 +589,8 @@ networks:
|
||||
|
||||
## 5.7 Configuration Architecture
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph config["Configuration Sources"]
|
||||
@@ -605,10 +630,20 @@ flowchart TB
|
||||
style collector fill:#fff3e0,stroke:#ff9800
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Configuration Sources**: `xrpld.cfg` provides runtime settings (endpoint, sampling) while the CMake flag controls whether telemetry is compiled in at all.
|
||||
- **Initialization**: `setup_Telemetry()` parses config values, then `make_Telemetry()` constructs the provider, processor, and exporter objects.
|
||||
- **Runtime Components**: The `TracerProvider` creates spans, the `BatchProcessor` buffers them, and the `OTLP Exporter` serializes and sends them over the wire.
|
||||
- **OTLP arrow to Collector**: Trace data leaves the rippled process via OTLP (gRPC or HTTP) and enters the external Collector pipeline.
|
||||
- **Collector Pipeline**: `Receivers` ingest OTLP data, `Processors` apply sampling/filtering/enrichment, and `Exporters` forward traces to storage backends (Tempo, etc.).
|
||||
|
||||
---
|
||||
|
||||
## 5.8 Grafana Integration
|
||||
|
||||
> **APM** = Application Performance Monitoring
|
||||
|
||||
Step-by-step instructions for integrating rippled traces with Grafana.
|
||||
|
||||
### 5.8.1 Data Source Configuration
|
||||
@@ -642,23 +677,6 @@ datasources:
|
||||
datasourceUid: loki
|
||||
```
|
||||
|
||||
#### Jaeger
|
||||
|
||||
```yaml
|
||||
# grafana/provisioning/datasources/jaeger.yaml
|
||||
apiVersion: 1
|
||||
|
||||
datasources:
|
||||
- name: Jaeger
|
||||
type: jaeger
|
||||
access: proxy
|
||||
url: http://jaeger:16686
|
||||
jsonData:
|
||||
tracesToLogs:
|
||||
datasourceUid: loki
|
||||
tags: ["service.name"]
|
||||
```
|
||||
|
||||
#### Elastic APM
|
||||
|
||||
```yaml
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 6.1 Phase Overview
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
```mermaid
|
||||
gantt
|
||||
title OpenTelemetry Implementation Timeline
|
||||
@@ -19,26 +21,36 @@ gantt
|
||||
Telemetry Interface :p1b, after p1a, 3d
|
||||
Configuration & CMake :p1c, after p1b, 3d
|
||||
Unit Tests :p1d, after p1c, 2d
|
||||
Buffer & Integration :p1e, after p1d, 2d
|
||||
|
||||
section Phase 2
|
||||
RPC Tracing :p2, after p1, 2w
|
||||
HTTP Context Extraction :p2a, after p1, 2d
|
||||
RPC Handler Instrumentation :p2b, after p2a, 4d
|
||||
WebSocket Support :p2c, after p2b, 2d
|
||||
PathFinding Instrumentation :p2f, after p2b, 2d
|
||||
TxQ Instrumentation :p2g, after p2f, 2d
|
||||
WebSocket Support :p2c, after p2g, 2d
|
||||
Integration Tests :p2d, after p2c, 2d
|
||||
Buffer & Review :p2e, after p2d, 4d
|
||||
|
||||
section Phase 3
|
||||
Transaction Tracing :p3, after p2, 2w
|
||||
Protocol Buffer Extension :p3a, after p2, 2d
|
||||
PeerImp Instrumentation :p3b, after p3a, 3d
|
||||
Relay Context Propagation :p3c, after p3b, 3d
|
||||
Fee Escalation Instrumentation :p3f, after p3b, 2d
|
||||
Relay Context Propagation :p3c, after p3f, 3d
|
||||
Multi-node Tests :p3d, after p3c, 2d
|
||||
Buffer & Review :p3e, after p3d, 4d
|
||||
|
||||
section Phase 4
|
||||
Consensus Tracing :p4, after p3, 2w
|
||||
Consensus Round Spans :p4a, after p3, 3d
|
||||
Proposal Handling :p4b, after p4a, 3d
|
||||
Validation Tests :p4c, after p4b, 4d
|
||||
Validator List & Manifest Tracing :p4f, after p4b, 2d
|
||||
Amendment Voting Tracing :p4g, after p4f, 2d
|
||||
SHAMap Sync Tracing :p4h, after p4g, 2d
|
||||
Validation Tests :p4c, after p4h, 4d
|
||||
Buffer & Review :p4e, after p4c, 4d
|
||||
|
||||
section Phase 5
|
||||
Documentation & Deploy :p5, after p4, 1w
|
||||
@@ -75,20 +87,24 @@ gantt
|
||||
|
||||
## 6.3 Phase 2: RPC Tracing (Weeks 3-4)
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
**Objective**: Complete tracing for all RPC operations
|
||||
|
||||
### Tasks
|
||||
|
||||
| Task | Description |
|
||||
| ---- | -------------------------------------------------- |
|
||||
| 2.1 | Implement W3C Trace Context HTTP header extraction |
|
||||
| 2.2 | Instrument `ServerHandler::onRequest()` |
|
||||
| 2.3 | Instrument `RPCHandler::doCommand()` |
|
||||
| 2.4 | Add RPC-specific attributes |
|
||||
| 2.5 | Instrument WebSocket handler |
|
||||
| 2.6 | Integration tests for RPC tracing |
|
||||
| 2.7 | Performance benchmarks |
|
||||
| 2.8 | Documentation |
|
||||
| Task | Description |
|
||||
| ---- | -------------------------------------------------------------------------- |
|
||||
| 2.1 | Implement W3C Trace Context HTTP header extraction |
|
||||
| 2.2 | Instrument `ServerHandler::onRequest()` |
|
||||
| 2.3 | Instrument `RPCHandler::doCommand()` |
|
||||
| 2.4 | Add RPC-specific attributes |
|
||||
| 2.5 | Instrument WebSocket handler |
|
||||
| 2.6 | PathFinding instrumentation (`pathfind.request`, `pathfind.compute` spans) |
|
||||
| 2.7 | TxQ instrumentation (`txq.enqueue`, `txq.apply` spans) |
|
||||
| 2.8 | Integration tests for RPC tracing |
|
||||
| 2.9 | Performance benchmarks |
|
||||
| 2.10 | Documentation |
|
||||
|
||||
### Exit Criteria
|
||||
|
||||
@@ -106,16 +122,17 @@ gantt
|
||||
|
||||
### Tasks
|
||||
|
||||
| Task | Description |
|
||||
| ---- | --------------------------------------------- |
|
||||
| 3.1 | Define `TraceContext` Protocol Buffer message |
|
||||
| 3.2 | Implement protobuf context serialization |
|
||||
| 3.3 | Instrument `PeerImp::handleTransaction()` |
|
||||
| 3.4 | Instrument `NetworkOPs::submitTransaction()` |
|
||||
| 3.5 | Instrument HashRouter integration |
|
||||
| 3.6 | Implement relay context propagation |
|
||||
| 3.7 | Integration tests (multi-node) |
|
||||
| 3.8 | Performance benchmarks |
|
||||
| Task | Description |
|
||||
| ---- | ---------------------------------------------------- |
|
||||
| 3.1 | Define `TraceContext` Protocol Buffer message |
|
||||
| 3.2 | Implement protobuf context serialization |
|
||||
| 3.3 | Instrument `PeerImp::handleTransaction()` |
|
||||
| 3.4 | Instrument `NetworkOPs::submitTransaction()` |
|
||||
| 3.5 | Instrument HashRouter integration |
|
||||
| 3.6 | Fee escalation instrumentation (`fee.escalate` span) |
|
||||
| 3.7 | Implement relay context propagation |
|
||||
| 3.8 | Integration tests (multi-node) |
|
||||
| 3.9 | Performance benchmarks |
|
||||
|
||||
### Exit Criteria
|
||||
|
||||
@@ -141,8 +158,11 @@ gantt
|
||||
| 4.4 | Instrument validation handling |
|
||||
| 4.5 | Add consensus-specific attributes |
|
||||
| 4.6 | Correlate with transaction traces |
|
||||
| 4.7 | Multi-validator integration tests |
|
||||
| 4.8 | Performance validation |
|
||||
| 4.7 | Validator list and manifest tracing |
|
||||
| 4.8 | Amendment voting tracing |
|
||||
| 4.9 | SHAMap sync tracing |
|
||||
| 4.10 | Multi-validator integration tests |
|
||||
| 4.11 | Performance validation |
|
||||
|
||||
### Exit Criteria
|
||||
|
||||
@@ -159,6 +179,9 @@ Phase 4a (establish-phase gap fill & cross-node correlation) adds:
|
||||
- **Deterministic trace ID** derived from `previousLedger.id()` so all validators
|
||||
in the same round share the same `trace_id` (switchable via
|
||||
`consensus_trace_strategy` config: `"deterministic"` or `"attribute"`).
|
||||
See [Configuration Reference](./05-configuration-reference.md) for full
|
||||
configuration options. The `consensus_trace_strategy` option will be
|
||||
documented in the configuration reference as part of Phase 4a implementation.
|
||||
- **Round lifecycle spans**: `consensus.round` with round-to-round span links.
|
||||
- **Establish phase**: `consensus.establish`, `consensus.update_positions` (with
|
||||
`dispute.resolve` events), `consensus.check` (with threshold tracking).
|
||||
@@ -198,16 +221,16 @@ quadrantChart
|
||||
title Risk Assessment Matrix
|
||||
x-axis Low Impact --> High Impact
|
||||
y-axis Low Likelihood --> High Likelihood
|
||||
quadrant-1 Monitor Closely
|
||||
quadrant-2 Mitigate Immediately
|
||||
quadrant-1 Mitigate Immediately
|
||||
quadrant-2 Plan Mitigation
|
||||
quadrant-3 Accept Risk
|
||||
quadrant-4 Plan Mitigation
|
||||
quadrant-4 Monitor Closely
|
||||
|
||||
SDK Compatibility: [0.25, 0.2]
|
||||
Protocol Changes: [0.75, 0.65]
|
||||
Performance Overhead: [0.65, 0.45]
|
||||
Context Propagation: [0.5, 0.5]
|
||||
Memory Leaks: [0.8, 0.2]
|
||||
SDK Compat: [0.2, 0.18]
|
||||
Protocol Chg: [0.75, 0.72]
|
||||
Perf Overhead: [0.58, 0.42]
|
||||
Context Prop: [0.4, 0.55]
|
||||
Memory Leaks: [0.85, 0.25]
|
||||
```
|
||||
|
||||
### Risk Details
|
||||
@@ -224,19 +247,21 @@ quadrantChart
|
||||
|
||||
## 6.8 Success Metrics
|
||||
|
||||
| Metric | Target | Measurement |
|
||||
| ------------------------ | ------------------------------ | --------------------- |
|
||||
| Trace coverage | >95% of transactions | Sampling verification |
|
||||
| CPU overhead | <3% | Benchmark tests |
|
||||
| Memory overhead | <5 MB | Memory profiling |
|
||||
| Latency impact (p99) | <2% | Performance tests |
|
||||
| Trace completeness | >99% spans with required attrs | Validation script |
|
||||
| Cross-node trace linkage | >90% of multi-hop transactions | Integration tests |
|
||||
| Metric | Target | Measurement |
|
||||
| ------------------------ | -------------------------------------------------------------- | --------------------- |
|
||||
| Trace coverage | >95% of transaction code paths (independent of sampling ratio) | Sampling verification |
|
||||
| CPU overhead | <3% | Benchmark tests |
|
||||
| Memory overhead | <10 MB | Memory profiling |
|
||||
| Latency impact (p99) | <2% | Performance tests |
|
||||
| Trace completeness | >99% spans with required attrs | Validation script |
|
||||
| Cross-node trace linkage | >90% of multi-hop transactions | Integration tests |
|
||||
|
||||
---
|
||||
|
||||
## 6.9 Quick Wins and Crawl-Walk-Run Strategy
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
This section outlines a prioritized approach to maximize ROI with minimal initial investment.
|
||||
|
||||
### 6.9.1 Crawl-Walk-Run Overview
|
||||
@@ -247,17 +272,17 @@ This section outlines a prioritized approach to maximize ROI with minimal initia
|
||||
flowchart TB
|
||||
subgraph crawl["🐢 CRAWL (Week 1-2)"]
|
||||
direction LR
|
||||
c1[Core SDK Setup] ~~~ c2[RPC Tracing Only] ~~~ c3[Single Node]
|
||||
c1[Core SDK Setup] ~~~ c2[RPC Tracing Only] ~~~ c3[PathFinding + TxQ Tracing] ~~~ c4[Single Node]
|
||||
end
|
||||
|
||||
subgraph walk["🚶 WALK (Week 3-5)"]
|
||||
direction LR
|
||||
w1[Transaction Tracing] ~~~ w2[Cross-Node Context] ~~~ w3[Basic Dashboards]
|
||||
w1[Transaction Tracing] ~~~ w2[Fee Escalation Tracing] ~~~ w3[Cross-Node Context] ~~~ w4[Basic Dashboards]
|
||||
end
|
||||
|
||||
subgraph run["🏃 RUN (Week 6-9)"]
|
||||
direction LR
|
||||
r1[Consensus Tracing] ~~~ r2[Full Correlation] ~~~ r3[Production Deploy]
|
||||
r1[Consensus Tracing] ~~~ r2[Validator, Amendment,<br/>SHAMap Tracing] ~~~ r3[Full Correlation] ~~~ r4[Production Deploy]
|
||||
end
|
||||
|
||||
crawl --> walk --> run
|
||||
@@ -268,16 +293,26 @@ flowchart TB
|
||||
style c1 fill:#1b5e20,stroke:#0d3d14,color:#fff
|
||||
style c2 fill:#1b5e20,stroke:#0d3d14,color:#fff
|
||||
style c3 fill:#1b5e20,stroke:#0d3d14,color:#fff
|
||||
style c4 fill:#1b5e20,stroke:#0d3d14,color:#fff
|
||||
style w1 fill:#ffe0b2,stroke:#ffcc80,color:#1e293b
|
||||
style w2 fill:#ffe0b2,stroke:#ffcc80,color:#1e293b
|
||||
style w3 fill:#ffe0b2,stroke:#ffcc80,color:#1e293b
|
||||
style w4 fill:#ffe0b2,stroke:#ffcc80,color:#1e293b
|
||||
style r1 fill:#0d47a1,stroke:#082f6a,color:#fff
|
||||
style r2 fill:#0d47a1,stroke:#082f6a,color:#fff
|
||||
style r3 fill:#0d47a1,stroke:#082f6a,color:#fff
|
||||
style r4 fill:#0d47a1,stroke:#082f6a,color:#fff
|
||||
```
|
||||
|
||||
</div>
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **CRAWL (Weeks 1-2)**: Minimal investment -- set up the SDK, instrument RPC and PathFinding/TxQ handlers, and verify on a single node. Delivers immediate latency visibility.
|
||||
- **WALK (Weeks 3-5)**: Expand to transaction lifecycle tracing, fee escalation, cross-node context propagation, and basic Grafana dashboards. This is where distributed tracing starts working.
|
||||
- **RUN (Weeks 6-9)**: Full consensus instrumentation, validator/amendment/SHAMap tracing, end-to-end correlation, and production deployment with sampling and alerting.
|
||||
- **Arrows (crawl → walk → run)**: Each phase builds on the prior one; you cannot skip ahead because later phases depend on infrastructure established earlier.
|
||||
|
||||
### 6.9.2 Quick Wins (Immediate Value)
|
||||
|
||||
| Quick Win | Value | When to Deploy |
|
||||
@@ -296,6 +331,7 @@ flowchart TB
|
||||
|
||||
- RPC request/response traces for all commands
|
||||
- Latency breakdown per RPC command
|
||||
- PathFinding and TxQ tracing (directly impacts RPC latency)
|
||||
- Error visibility with stack traces
|
||||
- Basic Grafana dashboard
|
||||
|
||||
@@ -304,6 +340,7 @@ flowchart TB
|
||||
**Why Start Here**:
|
||||
|
||||
- RPC is the lowest-risk, highest-visibility component
|
||||
- PathFinding and TxQ are RPC-adjacent and directly affect latency
|
||||
- Immediate value for debugging client issues
|
||||
- No cross-node complexity
|
||||
- Single file modification to existing code
|
||||
@@ -315,6 +352,7 @@ flowchart TB
|
||||
**What You Get**:
|
||||
|
||||
- End-to-end transaction traces from submit to relay
|
||||
- Fee escalation tracing within the transaction pipeline
|
||||
- Cross-node correlation (see transaction path)
|
||||
- HashRouter deduplication visibility
|
||||
- Relay latency metrics
|
||||
@@ -324,6 +362,7 @@ flowchart TB
|
||||
**Why Do This Second**:
|
||||
|
||||
- Builds on RPC tracing (transactions submitted via RPC)
|
||||
- Fee escalation is integral to the transaction processing pipeline
|
||||
- Moderate complexity (requires context propagation)
|
||||
- High value for debugging transaction issues
|
||||
|
||||
@@ -336,13 +375,17 @@ flowchart TB
|
||||
- Complete consensus round visibility
|
||||
- Phase transition timing
|
||||
- Validator proposal tracking
|
||||
- Validator list and manifest tracing
|
||||
- Amendment voting tracing
|
||||
- SHAMap sync tracing
|
||||
- Full end-to-end traces (client → RPC → TX → consensus → ledger)
|
||||
|
||||
**Code Changes**: ~100 lines across 3 consensus files
|
||||
**Code Changes**: ~100 lines across 3 consensus files, plus validator/amendment/SHAMap modules
|
||||
|
||||
**Why Do This Last**:
|
||||
|
||||
- Highest complexity (consensus is critical path)
|
||||
- Validator, amendment, and SHAMap components are lower priority
|
||||
- Requires thorough testing
|
||||
- Lower relative value (consensus issues are rarer)
|
||||
|
||||
@@ -358,33 +401,35 @@ quadrantChart
|
||||
quadrant-3 Nice to Have - Optional
|
||||
quadrant-4 Time Sinks - Avoid
|
||||
|
||||
RPC Tracing: [0.15, 0.9]
|
||||
TX Submit Trace: [0.25, 0.85]
|
||||
TX Relay Trace: [0.5, 0.8]
|
||||
Consensus Trace: [0.7, 0.75]
|
||||
Peer Message Trace: [0.85, 0.3]
|
||||
Ledger Acquire: [0.55, 0.5]
|
||||
RPC Tracing: [0.15, 0.92]
|
||||
TX Submit Trace: [0.3, 0.78]
|
||||
TX Relay Trace: [0.5, 0.88]
|
||||
Consensus Trace: [0.72, 0.72]
|
||||
Peer Msg Trace: [0.85, 0.3]
|
||||
Ledger Acquire: [0.55, 0.52]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6.11 Definition of Done
|
||||
## 6.10 Definition of Done
|
||||
|
||||
> **TxQ** = Transaction Queue | **HA** = High Availability
|
||||
|
||||
Clear, measurable criteria for each phase.
|
||||
|
||||
### 6.11.1 Phase 1: Core Infrastructure
|
||||
### 6.10.1 Phase 1: Core Infrastructure
|
||||
|
||||
| Criterion | Measurement | Target |
|
||||
| --------------- | ---------------------------------------------------------- | ---------------------------- |
|
||||
| SDK Integration | `cmake --build` succeeds with `-DXRPL_ENABLE_TELEMETRY=ON` | ✅ Compiles |
|
||||
| Runtime Toggle | `enabled=0` produces zero overhead | <0.1% CPU difference |
|
||||
| Span Creation | Unit test creates and exports span | Span appears in Jaeger |
|
||||
| Span Creation | Unit test creates and exports span | Span appears in Tempo |
|
||||
| Configuration | All config options parsed correctly | Config validation tests pass |
|
||||
| Documentation | Developer guide exists | PR approved |
|
||||
|
||||
**Definition of Done**: All criteria met, PR merged, no regressions in CI.
|
||||
|
||||
### 6.11.2 Phase 2: RPC Tracing
|
||||
### 6.10.2 Phase 2: RPC Tracing
|
||||
|
||||
| Criterion | Measurement | Target |
|
||||
| ------------------ | ---------------------------------- | -------------------------- |
|
||||
@@ -394,9 +439,9 @@ Clear, measurable criteria for each phase.
|
||||
| Performance | RPC latency overhead | <1ms p99 |
|
||||
| Dashboard | Grafana dashboard deployed | Screenshot in docs |
|
||||
|
||||
**Definition of Done**: RPC traces visible in Jaeger/Tempo for all commands, dashboard shows latency distribution.
|
||||
**Definition of Done**: RPC traces visible in Tempo for all commands, dashboard shows latency distribution.
|
||||
|
||||
### 6.11.3 Phase 3: Transaction Tracing
|
||||
### 6.10.3 Phase 3: Transaction Tracing
|
||||
|
||||
| Criterion | Measurement | Target |
|
||||
| ---------------- | ------------------------------- | ---------------------------------- |
|
||||
@@ -408,7 +453,7 @@ Clear, measurable criteria for each phase.
|
||||
|
||||
**Definition of Done**: Transaction traces span 3+ nodes in test network, performance within bounds.
|
||||
|
||||
### 6.11.4 Phase 4: Consensus Tracing
|
||||
### 6.10.4 Phase 4: Consensus Tracing
|
||||
|
||||
| Criterion | Measurement | Target |
|
||||
| -------------------- | ----------------------------- | ------------------------- |
|
||||
@@ -420,7 +465,7 @@ Clear, measurable criteria for each phase.
|
||||
|
||||
**Definition of Done**: Consensus rounds fully traceable, no impact on consensus timing.
|
||||
|
||||
### 6.11.5 Phase 5: Production Deployment
|
||||
### 6.10.5 Phase 5: Production Deployment
|
||||
|
||||
| Criterion | Measurement | Target |
|
||||
| ------------ | ---------------------------- | -------------------------- |
|
||||
@@ -433,7 +478,7 @@ Clear, measurable criteria for each phase.
|
||||
|
||||
**Definition of Done**: Telemetry running in production, operators trained, alerts active.
|
||||
|
||||
### 6.11.6 Success Metrics Summary
|
||||
### 6.10.6 Success Metrics Summary
|
||||
|
||||
| Phase | Primary Metric | Secondary Metric | Deadline |
|
||||
| ------- | ---------------------- | --------------------------- | ------------- |
|
||||
@@ -458,7 +503,7 @@ flowchart TB
|
||||
|
||||
subgraph week2["Week 2"]
|
||||
t3[3. RPC ServerHandler<br/>instrumentation]
|
||||
t4[4. Basic Jaeger setup<br/>for testing]
|
||||
t4[4. Basic Tempo setup<br/>for testing]
|
||||
end
|
||||
|
||||
subgraph week3["Week 3"]
|
||||
@@ -516,6 +561,15 @@ flowchart TB
|
||||
style t14 fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Week 1 (tasks 1-2)**: Foundation work -- integrate the OpenTelemetry SDK via Conan/CMake and build the `Telemetry` interface with `SpanGuard` and config parsing.
|
||||
- **Week 2 (tasks 3-4)**: First observable output -- instrument `ServerHandler` for RPC tracing and stand up Tempo so developers can see traces immediately.
|
||||
- **Weeks 3-5 (tasks 5-10)**: Transaction lifecycle -- add submit tracing, build the first Grafana dashboard, extend protobuf for cross-node context, instrument `PeerImp` relay, then validate with multi-node integration tests and performance benchmarks.
|
||||
- **Weeks 6-8 (tasks 11-12)**: Consensus deep-dive -- instrument consensus rounds and phases, then run full integration testing across all instrumented paths.
|
||||
- **Week 9 (tasks 13-14)**: Go-live -- deploy to production with sampling/alerting configured, and deliver documentation and operator training.
|
||||
- **Arrow chain (t1 → ... → t14)**: Strict sequential dependency; each task's output is a prerequisite for the next.
|
||||
|
||||
---
|
||||
|
||||
_Previous: [Configuration Reference](./05-configuration-reference.md)_ | _Next: [Observability Backends](./07-observability-backends.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_
|
||||
|
||||
@@ -7,33 +7,36 @@
|
||||
|
||||
## 7.1 Development/Testing Backends
|
||||
|
||||
| Backend | Pros | Cons | Use Case |
|
||||
| ---------- | ------------------- | ----------------- | ----------------- |
|
||||
| **Jaeger** | Easy setup, good UI | Limited retention | Local dev, CI |
|
||||
| **Zipkin** | Simple, lightweight | Basic features | Quick prototyping |
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### Quick Start with Jaeger
|
||||
| Backend | Pros | Cons | Use Case |
|
||||
| ---------- | ----------------------------------- | ---------------------- | ------------------- |
|
||||
| **Tempo** | Cost-effective, Grafana integration | Requires Grafana stack | Local dev, CI, Prod |
|
||||
| **Zipkin** | Simple, lightweight | Basic features | Quick prototyping |
|
||||
|
||||
### Quick Start with Tempo
|
||||
|
||||
```bash
|
||||
# Start Jaeger with OTLP support
|
||||
docker run -d --name jaeger \
|
||||
-e COLLECTOR_OTLP_ENABLED=true \
|
||||
-p 16686:16686 \
|
||||
# Start Tempo with OTLP support
|
||||
docker run -d --name tempo \
|
||||
-p 3200:3200 \
|
||||
-p 4317:4317 \
|
||||
-p 4318:4318 \
|
||||
jaegertracing/all-in-one:latest
|
||||
grafana/tempo:2.6.1
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7.2 Production Backends
|
||||
|
||||
| Backend | Pros | Cons | Use Case |
|
||||
| ----------------- | ----------------------------------------- | ------------------ | --------------------------- |
|
||||
| **Grafana Tempo** | Cost-effective, Grafana integration | Newer project | Most production deployments |
|
||||
| **Elastic APM** | Full observability stack, log correlation | Resource intensive | Existing Elastic users |
|
||||
| **Honeycomb** | Excellent query, high cardinality | SaaS cost | Deep debugging needs |
|
||||
| **Datadog APM** | Full platform, easy setup | SaaS cost | Enterprise with budget |
|
||||
> **APM** = Application Performance Monitoring
|
||||
|
||||
| Backend | Pros | Cons | Use Case |
|
||||
| ----------------- | ----------------------------------------- | ---------------------- | --------------------------- |
|
||||
| **Grafana Tempo** | Cost-effective, Grafana integration | Requires Grafana stack | Most production deployments |
|
||||
| **Elastic APM** | Full observability stack, log correlation | Resource intensive | Existing Elastic users |
|
||||
| **Honeycomb** | Excellent query, high cardinality | SaaS cost | Deep debugging needs |
|
||||
| **Datadog APM** | Full platform, easy setup | SaaS cost | Enterprise with budget |
|
||||
|
||||
### Backend Selection Flowchart
|
||||
|
||||
@@ -73,10 +76,19 @@ flowchart TD
|
||||
style datadog fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Budget Constraints? (Yes)**: Leads to open-source options. If you already run Grafana or Elastic, pick the matching backend; otherwise default to Grafana Tempo.
|
||||
- **Budget Constraints? (No) → Prefer SaaS?**: If you want a managed service, choose between Datadog (enterprise support) and Honeycomb (developer-focused). If not, fall back to open-source.
|
||||
- **Terminal nodes (Tempo / Elastic / Honeycomb / Datadog)**: Each represents a concrete backend choice, all of which feed into the same final step.
|
||||
- **Configure Collector**: Regardless of backend, you always finish by configuring the OTel Collector to export to your chosen destination.
|
||||
|
||||
---
|
||||
|
||||
## 7.3 Recommended Production Architecture
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **APM** = Application Performance Monitoring | **HA** = High Availability
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph validators["Validator Nodes"]
|
||||
@@ -117,6 +129,8 @@ flowchart TB
|
||||
tempo --> grafana
|
||||
elastic --> grafana
|
||||
|
||||
%% Note: simplified single-collector-per-DC topology shown for clarity
|
||||
|
||||
style validators fill:#b71c1c,stroke:#7f1d1d,color:#ffffff
|
||||
style stock fill:#0d47a1,stroke:#082f6a,color:#ffffff
|
||||
style collector fill:#bf360c,stroke:#8c2809,color:#ffffff
|
||||
@@ -124,6 +138,16 @@ flowchart TB
|
||||
style ui fill:#4a148c,stroke:#2e0d57,color:#ffffff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Validator / Stock Nodes**: All rippled nodes emit trace data via OTLP. Validators and stock nodes are grouped separately because they may reside in different network zones.
|
||||
- **Collector Cluster (DC1, DC2)**: Regional collectors receive OTLP from nodes in their datacenter, apply processing (sampling, enrichment), and fan out to multiple backends.
|
||||
- **Storage Backends**: Tempo and Elastic provide queryable trace storage; S3/GCS Archive provides long-term cold storage for compliance or post-incident analysis.
|
||||
- **Grafana Dashboards**: The single visualization layer that queries both Tempo and Elastic, giving operators a unified view of all traces.
|
||||
- **Data flow direction**: Nodes → Collectors → Storage → Grafana. Each arrow represents a network hop; minimizing collector-to-backend hops reduces latency.
|
||||
|
||||
> **Note**: Production deployments should use multiple collector instances behind a load balancer for high availability. The diagram shows a simplified single-collector topology for clarity.
|
||||
|
||||
---
|
||||
|
||||
## 7.4 Architecture Considerations
|
||||
@@ -147,7 +171,7 @@ flowchart TB
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph head["Head Sampling (Node)"]
|
||||
hs[10% probabilistic]
|
||||
hs[Node-level head sampling<br/>configurable, default: 100%<br/>recommended production: 10%]
|
||||
end
|
||||
|
||||
subgraph tail["Tail Sampling (Collector)"]
|
||||
@@ -171,6 +195,13 @@ flowchart LR
|
||||
style final fill:#bf360c,stroke:#8c2809,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Head Sampling (Node)**: The first filter -- each rippled node decides whether to sample a trace at creation time (default 100%, recommended 10% in production). This controls the volume leaving the node.
|
||||
- **Tail Sampling (Collector)**: The second filter -- the collector inspects completed traces and applies rules: keep all errors, keep anything slower than 5 seconds, and keep 10% of the remainder.
|
||||
- **Arrow head → tail**: All head-sampled traces flow to the collector, where tail sampling further reduces volume while preserving the most valuable data.
|
||||
- **Final Traces**: The output after both sampling stages; this is what gets stored and queried. The two-stage approach balances cost with debuggability.
|
||||
|
||||
### 7.4.3 Data Retention
|
||||
|
||||
| Environment | Hot Storage | Warm Storage | Cold Archive |
|
||||
@@ -355,6 +386,9 @@ groups:
|
||||
model:
|
||||
queryType: traceql
|
||||
query: '{resource.service.name="rippled" && name="consensus.round"} | avg(duration) > 5s'
|
||||
# Note: Verify TraceQL aggregate queries are supported by your
|
||||
# Tempo version. Aggregate alerting (e.g., avg(duration)) requires
|
||||
# Tempo 2.3+ with TraceQL metrics enabled.
|
||||
for: 5m
|
||||
annotations:
|
||||
summary: Consensus rounds taking >5 seconds
|
||||
@@ -371,6 +405,9 @@ groups:
|
||||
model:
|
||||
queryType: traceql
|
||||
query: '{resource.service.name="rippled" && name=~"rpc.command.*" && status.code=error} | rate() > 0.05'
|
||||
# Note: Verify TraceQL aggregate queries are supported by your
|
||||
# Tempo version. Aggregate alerting (e.g., rate()) requires
|
||||
# Tempo 2.3+ with TraceQL metrics enabled.
|
||||
for: 2m
|
||||
annotations:
|
||||
summary: RPC error rate >5%
|
||||
@@ -397,6 +434,8 @@ groups:
|
||||
|
||||
## 7.7 PerfLog and Insight Correlation
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
How to correlate OpenTelemetry traces with existing rippled observability.
|
||||
|
||||
### 7.7.1 Correlation Architecture
|
||||
@@ -459,6 +498,13 @@ flowchart TB
|
||||
style corr fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **rippled Node (three sources)**: A single node emits three independent data streams -- OpenTelemetry spans, PerfLog JSON logs, and Beast Insight StatsD metrics.
|
||||
- **Data Collection layer**: Each stream has its own collector -- OTel Collector for spans, Promtail/Fluentd for logs, and a StatsD exporter for metrics. They operate independently.
|
||||
- **Storage layer (Tempo, Loki, Prometheus)**: Each data type lands in a purpose-built store optimized for its query patterns (trace search, log grep, metric aggregation).
|
||||
- **Grafana Correlation Panel**: The key integration point -- Grafana queries all three stores and links them via shared fields (`trace_id`, `xrpl.tx.hash`, `ledger_seq`), enabling a single-pane debugging experience.
|
||||
|
||||
### 7.7.2 Correlation Fields
|
||||
|
||||
| Source | Field | Link To | Purpose |
|
||||
|
||||
@@ -7,6 +7,8 @@
|
||||
|
||||
## 8.1 Glossary
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **TxQ** = Transaction Queue
|
||||
|
||||
| Term | Definition |
|
||||
| --------------------- | ---------------------------------------------------------- |
|
||||
| **Span** | A unit of work with start/end time, name, and attributes |
|
||||
@@ -26,25 +28,31 @@
|
||||
|
||||
### rippled-Specific Terms
|
||||
|
||||
| Term | Definition |
|
||||
| ----------------- | -------------------------------------------------- |
|
||||
| **Overlay** | P2P network layer managing peer connections |
|
||||
| **Consensus** | XRP Ledger consensus algorithm (RCL) |
|
||||
| **Proposal** | Validator's suggested transaction set for a ledger |
|
||||
| **Validation** | Validator's signature on a closed ledger |
|
||||
| **HashRouter** | Component for transaction deduplication |
|
||||
| **JobQueue** | Thread pool for asynchronous task execution |
|
||||
| **PerfLog** | Existing performance logging system in rippled |
|
||||
| **Beast Insight** | Existing metrics framework in rippled |
|
||||
| Term | Definition |
|
||||
| ----------------- | ------------------------------------------------------------- |
|
||||
| **Overlay** | P2P network layer managing peer connections |
|
||||
| **Consensus** | XRP Ledger consensus algorithm (RCL) |
|
||||
| **Proposal** | Validator's suggested transaction set for a ledger |
|
||||
| **Validation** | Validator's signature on a closed ledger |
|
||||
| **HashRouter** | Component for transaction deduplication |
|
||||
| **JobQueue** | Thread pool for asynchronous task execution |
|
||||
| **PerfLog** | Existing performance logging system in rippled |
|
||||
| **Beast Insight** | Existing metrics framework in rippled |
|
||||
| **PathFinding** | Payment path computation engine for cross-currency payments |
|
||||
| **TxQ** | Transaction queue managing fee-based prioritization |
|
||||
| **LoadManager** | Dynamic fee escalation based on network load |
|
||||
| **SHAMap** | SHA-256 hash-based map (Merkle trie variant) for ledger state |
|
||||
|
||||
---
|
||||
|
||||
## 8.2 Span Hierarchy Visualization
|
||||
|
||||
> **TxQ** = Transaction Queue
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph trace["Trace: Transaction Lifecycle"]
|
||||
rpc["rpc.submit<br/>(entry point)"]
|
||||
rpc["rpc.request<br/>(entry point)"]
|
||||
validate["tx.validate"]
|
||||
relay["tx.relay<br/>(parent span)"]
|
||||
|
||||
@@ -54,20 +62,45 @@ flowchart TB
|
||||
p3["peer.send<br/>Peer C"]
|
||||
end
|
||||
|
||||
subgraph pathfinding["PathFinding Spans"]
|
||||
pathfind["pathfind.request"]
|
||||
pathcomp["pathfind.compute"]
|
||||
end
|
||||
|
||||
consensus["consensus.round"]
|
||||
apply["tx.apply"]
|
||||
|
||||
subgraph txqueue["TxQ Spans"]
|
||||
txq["txq.enqueue"]
|
||||
txqApply["txq.apply"]
|
||||
end
|
||||
|
||||
feeCalc["fee.escalate"]
|
||||
end
|
||||
|
||||
subgraph validators["Validator Spans"]
|
||||
valFetch["validator.list.fetch"]
|
||||
valManifest["validator.manifest"]
|
||||
end
|
||||
|
||||
rpc --> validate
|
||||
rpc --> pathfind
|
||||
pathfind --> pathcomp
|
||||
validate --> relay
|
||||
relay --> p1
|
||||
relay --> p2
|
||||
relay --> p3
|
||||
p1 -.->|"context propagation"| consensus
|
||||
consensus --> apply
|
||||
apply --> txq
|
||||
txq --> txqApply
|
||||
txq --> feeCalc
|
||||
|
||||
style trace fill:#0f172a,stroke:#020617,color:#fff
|
||||
style peers fill:#1e3a8a,stroke:#172554,color:#fff
|
||||
style pathfinding fill:#134e4a,stroke:#0f766e,color:#fff
|
||||
style txqueue fill:#064e3b,stroke:#047857,color:#fff
|
||||
style validators fill:#4c1d95,stroke:#6d28d9,color:#fff
|
||||
style rpc fill:#1d4ed8,stroke:#1e40af,color:#fff
|
||||
style validate fill:#047857,stroke:#064e3b,color:#fff
|
||||
style relay fill:#047857,stroke:#064e3b,color:#fff
|
||||
@@ -76,12 +109,30 @@ flowchart TB
|
||||
style p3 fill:#0e7490,stroke:#155e75,color:#fff
|
||||
style consensus fill:#fef3c7,stroke:#fde68a,color:#1e293b
|
||||
style apply fill:#047857,stroke:#064e3b,color:#fff
|
||||
style pathfind fill:#0e7490,stroke:#155e75,color:#fff
|
||||
style pathcomp fill:#0e7490,stroke:#155e75,color:#fff
|
||||
style txq fill:#047857,stroke:#064e3b,color:#fff
|
||||
style txqApply fill:#047857,stroke:#064e3b,color:#fff
|
||||
style feeCalc fill:#047857,stroke:#064e3b,color:#fff
|
||||
style valFetch fill:#6d28d9,stroke:#4c1d95,color:#fff
|
||||
style valManifest fill:#6d28d9,stroke:#4c1d95,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **rpc.request (blue, top)**: The entry point — every traced transaction starts as an RPC call; this root span is the parent of all downstream work.
|
||||
- **tx.validate and pathfind.request (green/teal, first fork)**: The RPC request fans out into transaction validation and, for cross-currency payments, a PathFinding branch (`pathfind.request` -> `pathfind.compute`).
|
||||
- **tx.relay -> Peer Spans (teal, middle)**: After validation, the transaction is relayed to peers A, B, and C in parallel; each `peer.send` is a sibling child span showing fan-out across the network.
|
||||
- **context propagation (dashed arrow)**: The dotted line from `peer.send Peer A` to `consensus.round` represents the trace context crossing a node boundary — the receiving validator picks up the same `trace_id` and continues the trace.
|
||||
- **consensus.round -> tx.apply -> TxQ Spans (green, lower)**: Once consensus accepts the transaction, it is applied to the ledger; the TxQ spans (`txq.enqueue`, `txq.apply`, `fee.escalate`) capture queue depth and fee escalation behavior.
|
||||
- **Validator Spans (purple, detached)**: `validator.list.fetch` and `validator.manifest` are independent workflows for UNL management — they run on their own traces and are linked to consensus via Span Links, not parent-child relationships.
|
||||
|
||||
---
|
||||
|
||||
## 8.3 References
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### OpenTelemetry Resources
|
||||
|
||||
1. [OpenTelemetry C++ SDK](https://github.com/open-telemetry/opentelemetry-cpp)
|
||||
@@ -107,10 +158,11 @@ flowchart TB
|
||||
|
||||
## 8.4 Version History
|
||||
|
||||
| Version | Date | Author | Changes |
|
||||
| ------- | ---------- | ------ | --------------------------------- |
|
||||
| 1.0 | 2026-02-12 | - | Initial implementation plan |
|
||||
| 1.1 | 2026-02-13 | - | Refactored into modular documents |
|
||||
| Version | Date | Author | Changes |
|
||||
| ------- | ---------- | ------ | -------------------------------------------------------------- |
|
||||
| 1.0 | 2026-02-12 | - | Initial implementation plan |
|
||||
| 1.1 | 2026-02-13 | - | Refactored into modular documents |
|
||||
| 1.2 | 2026-03-24 | - | Review fixes: accuracy corrections, cross-document consistency |
|
||||
|
||||
---
|
||||
|
||||
@@ -133,9 +185,10 @@ flowchart TB
|
||||
|
||||
### Task Lists
|
||||
|
||||
| Document | Description |
|
||||
| ------------------------------------ | -------------------------------------- |
|
||||
| [POC_taskList.md](./POC_taskList.md) | Proof-of-concept telemetry integration |
|
||||
| Document | Description |
|
||||
| ------------------------------------ | --------------------------------------------------- |
|
||||
| [POC_taskList.md](./POC_taskList.md) | Proof-of-concept telemetry integration |
|
||||
| [presentation.md](./presentation.md) | Presentation slides for OpenTelemetry plan overview |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
## Executive Summary
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
This document provides a comprehensive implementation plan for integrating OpenTelemetry distributed tracing into the rippled XRP Ledger node software. The plan addresses the unique challenges of a decentralized peer-to-peer system where trace context must propagate across network boundaries between independent nodes.
|
||||
|
||||
### Key Benefits
|
||||
@@ -33,6 +35,10 @@ This implementation plan is organized into modular documents for easier navigati
|
||||
flowchart TB
|
||||
overview["📋 OpenTelemetryPlan.md<br/>(This Document)"]
|
||||
|
||||
subgraph fundamentals["Fundamentals"]
|
||||
fund["00-tracing-fundamentals.md"]
|
||||
end
|
||||
|
||||
subgraph analysis["Analysis & Design"]
|
||||
arch["01-architecture-analysis.md"]
|
||||
design["02-design-decisions.md"]
|
||||
@@ -48,12 +54,15 @@ flowchart TB
|
||||
phases["06-implementation-phases.md"]
|
||||
backends["07-observability-backends.md"]
|
||||
appendix["08-appendix.md"]
|
||||
poc["POC_taskList.md"]
|
||||
end
|
||||
|
||||
overview --> fundamentals
|
||||
overview --> analysis
|
||||
overview --> impl
|
||||
overview --> deploy
|
||||
|
||||
fund --> arch
|
||||
arch --> design
|
||||
design --> strategy
|
||||
strategy --> code
|
||||
@@ -61,8 +70,11 @@ flowchart TB
|
||||
config --> phases
|
||||
phases --> backends
|
||||
backends --> appendix
|
||||
phases --> poc
|
||||
|
||||
style overview fill:#1b5e20,stroke:#0d3d14,color:#fff,stroke-width:2px
|
||||
style fundamentals fill:#00695c,stroke:#004d40,color:#fff
|
||||
style fund fill:#00695c,stroke:#004d40,color:#fff
|
||||
style analysis fill:#0d47a1,stroke:#082f6a,color:#fff
|
||||
style impl fill:#bf360c,stroke:#8c2809,color:#fff
|
||||
style deploy fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
@@ -74,6 +86,7 @@ flowchart TB
|
||||
style phases fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
style backends fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
style appendix fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
style poc fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
</div>
|
||||
@@ -84,22 +97,34 @@ flowchart TB
|
||||
|
||||
| Section | Document | Description |
|
||||
| ------- | ---------------------------------------------------------- | ---------------------------------------------------------------------- |
|
||||
| **0** | [Tracing Fundamentals](./00-tracing-fundamentals.md) | Distributed tracing concepts, span relationships, context propagation |
|
||||
| **1** | [Architecture Analysis](./01-architecture-analysis.md) | rippled component analysis, trace points, instrumentation priorities |
|
||||
| **2** | [Design Decisions](./02-design-decisions.md) | SDK selection, exporters, span naming, attributes, context propagation |
|
||||
| **3** | [Implementation Strategy](./03-implementation-strategy.md) | Directory structure, key principles, performance optimization |
|
||||
| **4** | [Code Samples](./04-code-samples.md) | Complete C++ implementation examples for all components |
|
||||
| **4** | [Code Samples](./04-code-samples.md) | C++ implementation examples for core infrastructure and key modules |
|
||||
| **5** | [Configuration Reference](./05-configuration-reference.md) | rippled config, CMake integration, Collector configurations |
|
||||
| **6** | [Implementation Phases](./06-implementation-phases.md) | 5-phase timeline, tasks, risks, success metrics |
|
||||
| **7** | [Observability Backends](./07-observability-backends.md) | Backend selection guide and production architecture |
|
||||
| **8** | [Appendix](./08-appendix.md) | Glossary, references, version history |
|
||||
| **POC** | [POC Task List](./POC_taskList.md) | Proof of concept tasks for RPC tracing end-to-end demo |
|
||||
|
||||
---
|
||||
|
||||
## 0. Tracing Fundamentals
|
||||
|
||||
This document introduces distributed tracing concepts for readers unfamiliar with the domain. It covers what traces and spans are, how parent-child and follows-from relationships model causality, how context propagates across service boundaries, and how sampling controls data volume. It also maps these concepts to rippled-specific scenarios like transaction relay and consensus.
|
||||
|
||||
➡️ **[Read Tracing Fundamentals](./00-tracing-fundamentals.md)**
|
||||
|
||||
---
|
||||
|
||||
## 1. Architecture Analysis
|
||||
|
||||
The rippled node consists of several key components that require instrumentation for comprehensive distributed tracing. The main areas include the RPC server (HTTP/WebSocket), Overlay P2P network, Consensus mechanism (RCLConsensus), JobQueue for async task execution, and existing observability infrastructure (PerfLog, Insight/StatsD, Journal logging).
|
||||
> **WS** = WebSocket | **TxQ** = Transaction Queue
|
||||
|
||||
Key trace points span across transaction submission via RPC, peer-to-peer message propagation, consensus round execution, and ledger building. The implementation prioritizes high-value, low-risk components first: RPC handlers provide immediate value with minimal risk, while consensus tracing requires careful implementation to avoid timing impacts.
|
||||
The rippled node consists of several key components that require instrumentation for comprehensive distributed tracing. The main areas include the RPC server (HTTP/WebSocket), Overlay P2P network, Consensus mechanism (RCLConsensus), JobQueue for async task execution, PathFinding, Transaction Queue (TxQ), fee escalation (LoadManager), ledger acquisition, validator management, and existing observability infrastructure (PerfLog, Insight/StatsD, Journal logging).
|
||||
|
||||
Key trace points span across transaction submission via RPC, peer-to-peer message propagation, consensus round execution, ledger building, path computation, transaction queue behavior, fee escalation, and validator health. The implementation prioritizes high-value, low-risk components first: RPC handlers provide immediate value with minimal risk, while consensus tracing requires careful implementation to avoid timing impacts.
|
||||
|
||||
➡️ **[Read full Architecture Analysis](./01-architecture-analysis.md)**
|
||||
|
||||
@@ -107,11 +132,13 @@ Key trace points span across transaction submission via RPC, peer-to-peer messag
|
||||
|
||||
## 2. Design Decisions
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **CNCF** = Cloud Native Computing Foundation
|
||||
|
||||
The OpenTelemetry C++ SDK is selected for its CNCF backing, active development, and native performance characteristics. Traces are exported via OTLP/gRPC (primary) or OTLP/HTTP (fallback) to an OpenTelemetry Collector, which provides flexible routing and sampling.
|
||||
|
||||
Span naming follows a hierarchical `<component>.<operation>` convention (e.g., `rpc.submit`, `tx.relay`, `consensus.round`). Context propagation uses W3C Trace Context headers for HTTP and embedded Protocol Buffer fields for P2P messages. The implementation coexists with existing PerfLog and Insight observability systems through correlation IDs.
|
||||
|
||||
**Data Collection & Privacy**: Telemetry collects only operational metadata (timing, counts, hashes) — never sensitive content (private keys, balances, amounts, raw payloads). Privacy protection includes account hashing, configurable redaction, sampling, and collector-level filtering. Node operators retain full control(not penned down in this document yet) over what data is exported.
|
||||
**Data Collection & Privacy**: Telemetry collects only operational metadata (timing, counts, hashes) — never sensitive content (private keys, balances, amounts, raw payloads). Privacy protection includes account hashing, configurable redaction, sampling, and collector-level filtering. Node operators retain full control over telemetry configuration.
|
||||
|
||||
➡️ **[Read full Design Decisions](./02-design-decisions.md)**
|
||||
|
||||
@@ -129,13 +156,14 @@ Performance optimization strategies include probabilistic head sampling (10% def
|
||||
|
||||
## 4. Code Samples
|
||||
|
||||
Complete C++ implementation examples are provided for all telemetry components:
|
||||
C++ implementation examples are provided for the core telemetry infrastructure and key modules:
|
||||
|
||||
- `Telemetry.h` - Core interface for tracer access and span creation
|
||||
- `SpanGuard.h` - RAII wrapper for automatic span lifecycle management
|
||||
- `TracingInstrumentation.h` - Macros for conditional instrumentation
|
||||
- Protocol Buffer extensions for trace context propagation
|
||||
- Module-specific instrumentation (RPC, Consensus, P2P, JobQueue)
|
||||
- Remaining modules (PathFinding, TxQ, Validator, etc.) follow the same patterns
|
||||
|
||||
➡️ **[View all Code Samples](./04-code-samples.md)**
|
||||
|
||||
@@ -143,9 +171,11 @@ Complete C++ implementation examples are provided for all telemetry components:
|
||||
|
||||
## 5. Configuration Reference
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **APM** = Application Performance Monitoring
|
||||
|
||||
Configuration is handled through the `[telemetry]` section in `xrpld.cfg` with options for enabling/disabling, exporter selection, endpoint configuration, sampling ratios, and component-level filtering. CMake integration includes a `XRPL_ENABLE_TELEMETRY` option for compile-time control.
|
||||
|
||||
OpenTelemetry Collector configurations are provided for development (with Jaeger) and production (with tail-based sampling, Tempo, and Elastic APM). Docker Compose examples enable quick local development environment setup.
|
||||
OpenTelemetry Collector configurations are provided for development and production (with tail-based sampling, Tempo, and Elastic APM). Docker Compose examples enable quick local development environment setup.
|
||||
|
||||
➡️ **[View full Configuration Reference](./05-configuration-reference.md)**
|
||||
|
||||
@@ -163,7 +193,7 @@ The implementation spans 9 weeks across 5 phases:
|
||||
| 4 | Weeks 7-8 | Consensus Tracing | Round spans, Proposal/validation tracing |
|
||||
| 5 | Week 9 | Documentation | Runbook, Dashboards, Training |
|
||||
|
||||
**Total Effort**: 47 developer-days with 2 developers
|
||||
**Total Effort**: 47 person-days (2 developers working in parallel)
|
||||
|
||||
➡️ **[View full Implementation Phases](./06-implementation-phases.md)**
|
||||
|
||||
@@ -171,7 +201,9 @@ The implementation spans 9 weeks across 5 phases:
|
||||
|
||||
## 7. Observability Backends
|
||||
|
||||
For development and testing, Jaeger provides easy setup with a good UI. For production deployments, Grafana Tempo is recommended for its cost-effectiveness and Grafana integration, while Elastic APM is ideal for organizations with existing Elastic infrastructure.
|
||||
> **APM** = Application Performance Monitoring | **GCS** = Google Cloud Storage
|
||||
|
||||
Grafana Tempo is recommended for all environments due to its cost-effectiveness and Grafana integration, while Elastic APM is ideal for organizations with existing Elastic infrastructure.
|
||||
|
||||
The recommended production architecture uses a gateway collector pattern with regional collectors performing tail-based sampling, routing traces to multiple backends (Tempo for primary storage, Elastic for log correlation, S3/GCS for long-term archive).
|
||||
|
||||
@@ -187,4 +219,12 @@ The appendix contains a glossary of OpenTelemetry and rippled-specific terms, re
|
||||
|
||||
---
|
||||
|
||||
## POC Task List
|
||||
|
||||
A step-by-step task list for building a minimal end-to-end proof of concept that demonstrates distributed tracing in rippled. The POC scope is limited to RPC tracing — showing request traces flowing from rippled through an OpenTelemetry Collector into Tempo, viewable in Grafana.
|
||||
|
||||
➡️ **[View POC Task List](./POC_taskList.md)**
|
||||
|
||||
---
|
||||
|
||||
_This document provides a comprehensive implementation plan for integrating OpenTelemetry distributed tracing into the rippled XRP Ledger node software. For detailed information on any section, follow the links to the corresponding sub-documents._
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# OpenTelemetry POC Task List
|
||||
|
||||
> **Goal**: Build a minimal end-to-end proof of concept that demonstrates distributed tracing in rippled. A successful POC will show RPC request traces flowing from rippled through an OTel Collector into Jaeger, viewable in a browser UI.
|
||||
> **Goal**: Build a minimal end-to-end proof of concept that demonstrates distributed tracing in rippled. A successful POC will show RPC request traces flowing from rippled through an OTel Collector into Tempo, viewable in Grafana.
|
||||
>
|
||||
> **Scope**: RPC tracing only (highest value, lowest risk per the [CRAWL phase](./06-implementation-phases.md#6102-quick-wins-immediate-value) in the implementation phases). No cross-node P2P context propagation or consensus tracing in the POC.
|
||||
|
||||
@@ -15,28 +15,29 @@
|
||||
| [04-code-samples.md](./04-code-samples.md) | Telemetry interface (§4.1), SpanGuard (§4.2), macros (§4.3), RPC instrumentation (§4.5.3) |
|
||||
| [05-configuration-reference.md](./05-configuration-reference.md) | rippled config (§5.1), config parser (§5.2), Application integration (§5.3), CMake (§5.4), Collector config (§5.5), Docker Compose (§5.6), Grafana (§5.8) |
|
||||
| [06-implementation-phases.md](./06-implementation-phases.md) | Phase 1 core tasks (§6.2), Phase 2 RPC tasks (§6.3), quick wins (§6.10), definition of done (§6.11) |
|
||||
| [07-observability-backends.md](./07-observability-backends.md) | Jaeger dev setup (§7.1), Grafana dashboards (§7.6), alert rules (§7.6.3) |
|
||||
| [07-observability-backends.md](./07-observability-backends.md) | Tempo dev setup (§7.1), Grafana dashboards (§7.6), alert rules (§7.6.3) |
|
||||
|
||||
---
|
||||
|
||||
## Task 0: Docker Observability Stack Setup
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
**Objective**: Stand up the backend infrastructure to receive, store, and display traces.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- Create `docker/telemetry/docker-compose.yml` in the repo with three services:
|
||||
1. **OpenTelemetry Collector** (`otel/opentelemetry-collector-contrib:latest`)
|
||||
1. **OpenTelemetry Collector** (`otel/opentelemetry-collector-contrib:0.92.0`)
|
||||
- Expose ports `4317` (OTLP gRPC) and `4318` (OTLP HTTP)
|
||||
- Expose port `13133` (health check)
|
||||
- Mount a config file `docker/telemetry/otel-collector-config.yaml`
|
||||
2. **Jaeger** (`jaegertracing/all-in-one:latest`)
|
||||
- Expose port `16686` (UI) and `14250` (gRPC collector)
|
||||
- Set env `COLLECTOR_OTLP_ENABLED=true`
|
||||
2. **Tempo** (`grafana/tempo:2.6.1`)
|
||||
- Expose port `3200` (HTTP API) and `4317` (OTLP gRPC, internal)
|
||||
3. **Grafana** (`grafana/grafana:latest`) — optional but useful
|
||||
- Expose port `3000`
|
||||
- Enable anonymous admin access for local dev (`GF_AUTH_ANONYMOUS_ENABLED=true`, `GF_AUTH_ANONYMOUS_ORG_ROLE=Admin`)
|
||||
- Provision Jaeger as a data source via `docker/telemetry/grafana/provisioning/datasources/jaeger.yaml`
|
||||
- Provision Tempo as a data source via `docker/telemetry/grafana/provisioning/datasources/tempo.yaml`
|
||||
|
||||
- Create `docker/telemetry/otel-collector-config.yaml`:
|
||||
|
||||
@@ -57,8 +58,8 @@
|
||||
exporters:
|
||||
logging:
|
||||
verbosity: detailed
|
||||
otlp/jaeger:
|
||||
endpoint: jaeger:4317
|
||||
otlp/tempo:
|
||||
endpoint: tempo:4317
|
||||
tls:
|
||||
insecure: true
|
||||
|
||||
@@ -67,30 +68,29 @@
|
||||
traces:
|
||||
receivers: [otlp]
|
||||
processors: [batch]
|
||||
exporters: [logging, otlp/jaeger]
|
||||
exporters: [logging, otlp/tempo]
|
||||
```
|
||||
|
||||
- Create Grafana Jaeger datasource provisioning file at `docker/telemetry/grafana/provisioning/datasources/jaeger.yaml`:
|
||||
- Create Grafana Tempo datasource provisioning file at `docker/telemetry/grafana/provisioning/datasources/tempo.yaml`:
|
||||
```yaml
|
||||
apiVersion: 1
|
||||
datasources:
|
||||
- name: Jaeger
|
||||
type: jaeger
|
||||
- name: Tempo
|
||||
type: tempo
|
||||
access: proxy
|
||||
url: http://jaeger:16686
|
||||
url: http://tempo:3200
|
||||
```
|
||||
|
||||
**Verification**: Run `docker compose -f docker/telemetry/docker-compose.yml up -d`, then:
|
||||
|
||||
- `curl http://localhost:13133` returns healthy (Collector)
|
||||
- `http://localhost:16686` opens Jaeger UI (no traces yet)
|
||||
- `http://localhost:3000` opens Grafana (optional)
|
||||
- `http://localhost:3000` opens Grafana (Tempo datasource available, no traces yet)
|
||||
|
||||
**Reference**:
|
||||
|
||||
- [05-configuration-reference.md §5.5](./05-configuration-reference.md) — Collector config (dev YAML with Jaeger exporter)
|
||||
- [05-configuration-reference.md §5.5](./05-configuration-reference.md) — Collector config (dev YAML with Tempo exporter)
|
||||
- [05-configuration-reference.md §5.6](./05-configuration-reference.md) — Docker Compose development environment
|
||||
- [07-observability-backends.md §7.1](./07-observability-backends.md) — Jaeger quick start and backend selection
|
||||
- [07-observability-backends.md §7.1](./07-observability-backends.md) — Tempo quick start and backend selection
|
||||
- [05-configuration-reference.md §5.8](./05-configuration-reference.md) — Grafana datasource provisioning and dashboards
|
||||
|
||||
---
|
||||
@@ -175,6 +175,8 @@
|
||||
|
||||
## Task 3: Implement OTel-Backed Telemetry
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
**Objective**: Implement the real `Telemetry` class that initializes the OTel SDK, configures the OTLP exporter and batch processor, and creates tracers/spans.
|
||||
|
||||
**What to do**:
|
||||
@@ -183,7 +185,7 @@
|
||||
- `class TelemetryImpl : public Telemetry` that:
|
||||
- In `start()`: creates a `TracerProvider` with:
|
||||
- Resource attributes: `service.name`, `service.version`, `service.instance.id`
|
||||
- An `OtlpGrpcExporter` pointed at `setup.exporterEndpoint` (default `localhost:4317`)
|
||||
- An `OtlpHttpExporter` pointed at `setup.exporterEndpoint` (default `localhost:4318`)
|
||||
- A `BatchSpanProcessor` with configurable batch size and delay
|
||||
- A `TraceIdRatioBasedSampler` using `setup.samplingRatio`
|
||||
- Sets the global `TracerProvider`
|
||||
@@ -316,6 +318,8 @@
|
||||
|
||||
## Task 6: Instrument RPC ServerHandler
|
||||
|
||||
> **WS** = WebSocket
|
||||
|
||||
**Objective**: Add tracing to the HTTP RPC entry point so every incoming RPC request creates a span.
|
||||
|
||||
**What to do**:
|
||||
@@ -338,7 +342,7 @@
|
||||
rpc.request
|
||||
└── rpc.process
|
||||
```
|
||||
in Jaeger for every HTTP RPC call.
|
||||
in Tempo/Grafana for every HTTP RPC call.
|
||||
|
||||
**Key modified file**:
|
||||
|
||||
@@ -372,7 +376,7 @@
|
||||
- On success: `XRPL_TRACE_SET_ATTR("xrpl.rpc.status", "success");`
|
||||
- On error: `XRPL_TRACE_SET_ATTR("xrpl.rpc.status", "error");` and set the error message
|
||||
|
||||
- After this, traces in Jaeger should look like:
|
||||
- After this, traces in Tempo/Grafana should look like:
|
||||
```
|
||||
rpc.request (xrpl.rpc.command=account_info)
|
||||
└── rpc.process
|
||||
@@ -396,7 +400,9 @@
|
||||
|
||||
## Task 8: Build, Run, and Verify End-to-End
|
||||
|
||||
**Objective**: Prove the full pipeline works: rippled emits traces -> OTel Collector receives them -> Jaeger displays them.
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
**Objective**: Prove the full pipeline works: rippled emits traces -> OTel Collector receives them -> Tempo stores them for Grafana visualization.
|
||||
|
||||
**What to do**:
|
||||
|
||||
@@ -453,10 +459,10 @@
|
||||
-d '{"method":"account_info","params":[{"account":"rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"}]}'
|
||||
```
|
||||
|
||||
6. **Verify in Jaeger**:
|
||||
- Open `http://localhost:16686`
|
||||
- Select service `rippled` from the dropdown
|
||||
- Click "Find Traces"
|
||||
6. **Verify in Grafana (Tempo)**:
|
||||
- Open `http://localhost:3000`
|
||||
- Navigate to Explore → select Tempo datasource
|
||||
- Search for service `rippled`
|
||||
- Confirm you see traces with spans: `rpc.request` -> `rpc.process` -> `rpc.command.server_info`
|
||||
- Click into a trace and verify attributes: `xrpl.rpc.command`, `xrpl.rpc.status`, `xrpl.rpc.version`
|
||||
|
||||
@@ -470,7 +476,7 @@
|
||||
- [ ] Docker stack starts without errors
|
||||
- [ ] rippled builds with `-DXRPL_ENABLE_TELEMETRY=ON`
|
||||
- [ ] rippled starts and connects to OTel Collector (check rippled logs for telemetry messages)
|
||||
- [ ] Traces appear in Jaeger UI under service "rippled"
|
||||
- [ ] Traces appear in Grafana/Tempo under service "rippled"
|
||||
- [ ] Span hierarchy is correct (parent-child relationships)
|
||||
- [ ] Span attributes are populated (`xrpl.rpc.command`, `xrpl.rpc.status`, etc.)
|
||||
- [ ] Error spans show error status and message
|
||||
@@ -479,8 +485,8 @@
|
||||
|
||||
**Reference**:
|
||||
|
||||
- [06-implementation-phases.md §6.11.1](./06-implementation-phases.md) — Phase 1 definition of done: SDK compiles, runtime toggle works, span creation verified in Jaeger, config validation passes
|
||||
- [06-implementation-phases.md §6.11.2](./06-implementation-phases.md) — Phase 2 definition of done: 100% RPC coverage, traceparent propagation, <1ms p99 overhead, dashboard deployed
|
||||
- [06-implementation-phases.md §6.11.1](./06-implementation-phases.md) — Phase 1 definition of done: SDK compiles, runtime toggle works, span creation verified in Tempo, config validation passes
|
||||
- [06-implementation-phases.md §6.11.2](./06-implementation-phases.md#6112-phase-2-rpc-tracing) — Phase 2 definition of done: 100% RPC coverage, traceparent propagation, <1ms p99 overhead, dashboard deployed
|
||||
- [06-implementation-phases.md §6.8](./06-implementation-phases.md) — Success metrics: trace coverage >95%, CPU overhead <3%, memory <5 MB, latency impact <2%
|
||||
- [03-implementation-strategy.md §3.9.5](./03-implementation-strategy.md) — Backward compatibility: config optional, protocol unchanged, `XRPL_ENABLE_TELEMETRY=OFF` produces identical binary
|
||||
- [01-architecture-analysis.md §1.8](./01-architecture-analysis.md) — Observable outcomes: what traces, metrics, and dashboards to expect
|
||||
@@ -489,11 +495,13 @@
|
||||
|
||||
## Task 9: Document POC Results and Next Steps
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **WS** = WebSocket
|
||||
|
||||
**Objective**: Capture findings, screenshots, and remaining work for the team.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- Take screenshots of Jaeger showing:
|
||||
- Take screenshots of Grafana/Tempo showing:
|
||||
- The service list with "rippled"
|
||||
- A trace with the full span tree
|
||||
- Span detail view showing attributes
|
||||
@@ -541,9 +549,11 @@
|
||||
|
||||
## Next Steps (Post-POC)
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **WS** = WebSocket
|
||||
|
||||
### Metrics Pipeline for Grafana Dashboards
|
||||
|
||||
The current POC exports **traces only**. Grafana's Explore view can query Jaeger for individual traces, but time-series charts (latency histograms, request throughput, error rates) require a **metrics pipeline**. To enable this:
|
||||
The current POC exports **traces only**. Grafana's Explore view can query Tempo for individual traces, but time-series charts (latency histograms, request throughput, error rates) require a **metrics pipeline**. To enable this:
|
||||
|
||||
1. **Add a `spanmetrics` connector** to the OTel Collector config that derives RED metrics (Rate, Errors, Duration) from trace spans automatically:
|
||||
|
||||
@@ -566,7 +576,7 @@ The current POC exports **traces only**. Grafana's Explore view can query Jaeger
|
||||
traces:
|
||||
receivers: [otlp]
|
||||
processors: [batch]
|
||||
exporters: [debug, otlp/jaeger, spanmetrics]
|
||||
exporters: [debug, otlp/tempo, spanmetrics]
|
||||
metrics:
|
||||
receivers: [spanmetrics]
|
||||
exporters: [prometheus]
|
||||
|
||||
@@ -4,6 +4,8 @@
|
||||
|
||||
## Slide 1: Introduction
|
||||
|
||||
> **CNCF** = Cloud Native Computing Foundation
|
||||
|
||||
### What is OpenTelemetry?
|
||||
|
||||
OpenTelemetry is an open-source, CNCF-backed observability framework for distributed tracing, metrics, and logs.
|
||||
@@ -25,12 +27,21 @@ flowchart LR
|
||||
style D fill:#e65100,stroke:#bf360c,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Node A (blue, leftmost)**: The originating node that first receives the transaction and assigns a new `trace_id: abc123`; this ID becomes the correlation key for the entire distributed trace.
|
||||
- **Node B and Node C (green, middle)**: Relay and validation nodes — each creates its own span but carries the same `trace_id`, so their work is linked to the original submission without any central coordinator.
|
||||
- **Node D (orange, rightmost)**: The final node that applies the transaction to the ledger; the trace now spans the full lifecycle from submission to ledger inclusion.
|
||||
- **Left-to-right flow**: The horizontal progression shows the real-world message path — a transaction hops from node to node, and the shared `trace_id` stitches all hops into a single queryable trace.
|
||||
|
||||
> **Trace ID: abc123** — All nodes share the same trace, enabling cross-node correlation.
|
||||
|
||||
---
|
||||
|
||||
## Slide 2: OpenTelemetry vs Open Source Alternatives
|
||||
|
||||
> **CNCF** = Cloud Native Computing Foundation
|
||||
|
||||
| Feature | OpenTelemetry | Jaeger | Zipkin | SkyWalking | Pinpoint | Prometheus |
|
||||
| ------------------- | ---------------- | ---------------- | ------------------ | ---------- | ---------- | ---------- |
|
||||
| **Tracing** | YES | YES | YES | YES | YES | NO |
|
||||
@@ -42,11 +53,131 @@ flowchart LR
|
||||
| **Backend** | Any (exporters) | Self | Self | Self | Self | Self |
|
||||
| **CNCF Status** | Incubating | Graduated | NO | Incubating | NO | Graduated |
|
||||
|
||||
> **Why OpenTelemetry?** It's the only actively maintained, full-featured C++ option with vendor neutrality — allowing export to Jaeger, Prometheus, Grafana, or any commercial backend without changing instrumentation.
|
||||
> **Why OpenTelemetry?** It's the only actively maintained, full-featured C++ option with vendor neutrality — allowing export to Tempo, Prometheus, Grafana, or any commercial backend without changing instrumentation.
|
||||
|
||||
---
|
||||
|
||||
## Slide 3: Comparison with rippled's Existing Solutions
|
||||
## Slide 3: Adoption Scope — Traces Only (Current Plan)
|
||||
|
||||
OpenTelemetry supports three signal types: **Traces**, **Metrics**, and **Logs**. rippled already captures metrics (StatsD via Beast Insight) and logs (Journal/PerfLog). The question is: how much of OTel do we adopt?
|
||||
|
||||
> **Scenario A**: Add distributed tracing. Keep StatsD for metrics and Journal for logs.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph rippled["rippled Process"]
|
||||
direction TB
|
||||
OTel["OTel SDK<br/>(Traces)"]
|
||||
Insight["Beast Insight<br/>(StatsD Metrics)"]
|
||||
Journal["Journal + PerfLog<br/>(Logging)"]
|
||||
end
|
||||
|
||||
OTel -->|"OTLP"| Collector["OTel Collector"]
|
||||
Insight -->|"UDP"| StatsD["StatsD Server"]
|
||||
Journal -->|"File I/O"| LogFile["perf.log / debug.log"]
|
||||
|
||||
Collector --> Tempo["Tempo / Jaeger"]
|
||||
StatsD --> Graphite["Graphite / Grafana"]
|
||||
LogFile --> Loki["Loki (optional)"]
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#fff
|
||||
style OTel fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style Insight fill:#1565c0,stroke:#0d47a1,color:#fff
|
||||
style Journal fill:#e65100,stroke:#bf360c,color:#fff
|
||||
style Collector fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
```
|
||||
|
||||
| Aspect | Details |
|
||||
| ------------------------------ | --------------------------------------------------------------------------------------------------------------- |
|
||||
| **What changes for operators** | Deploy OTel Collector + trace backend. Existing StatsD and log pipelines stay as-is. |
|
||||
| **Codebase impact** | New `Telemetry` module (~1500 LOC). Beast Insight and Journal untouched. |
|
||||
| **New capabilities** | Cross-node trace correlation, span-based debugging, request lifecycle visibility. |
|
||||
| **What we still can't do** | Correlate metrics with specific traces natively. StatsD metrics remain fire-and-forget with no trace exemplars. |
|
||||
| **Maintenance burden** | Three separate observability systems to maintain (OTel + StatsD + Journal). |
|
||||
| **Risk** | Lowest — additive change, no existing systems disturbed. |
|
||||
|
||||
---
|
||||
|
||||
## Slide 4: Future Adoption — Metrics & Logs via OTel
|
||||
|
||||
### Scenario B: + OTel Metrics (Replace StatsD)
|
||||
|
||||
> Migrate StatsD to OTel Metrics API, exposing Prometheus-compatible metrics. Remove Beast Insight.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph rippled["rippled Process"]
|
||||
direction TB
|
||||
OTel["OTel SDK<br/>(Traces + Metrics)"]
|
||||
Journal["Journal + PerfLog<br/>(Logging)"]
|
||||
end
|
||||
|
||||
OTel -->|"OTLP"| Collector["OTel Collector"]
|
||||
Journal -->|"File I/O"| LogFile["perf.log / debug.log"]
|
||||
|
||||
Collector --> Tempo["Tempo<br/>(Traces)"]
|
||||
Collector --> Prom["Prometheus<br/>(Metrics)"]
|
||||
LogFile --> Loki["Loki (optional)"]
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#fff
|
||||
style OTel fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style Journal fill:#e65100,stroke:#bf360c,color:#fff
|
||||
style Collector fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
```
|
||||
|
||||
- **Better metrics?** Yes — Prometheus gives native histograms (p50/p95/p99), multi-dimensional labels, and exemplars linking metric spikes to traces.
|
||||
- **Codebase**: Remove `Beast::Insight` + `StatsDCollector` (~2000 LOC). Single SDK for traces and metrics.
|
||||
- **Operator effort**: Rewrite dashboards from StatsD/Graphite queries to PromQL. Run both in parallel during transition.
|
||||
- **Risk**: Medium — operators must migrate monitoring infrastructure.
|
||||
|
||||
### Scenario C: + OTel Logs (Full Stack)
|
||||
|
||||
> Also replace Journal logging with OTel Logs API. Single SDK for everything.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph rippled["rippled Process"]
|
||||
OTel["OTel SDK<br/>(Traces + Metrics + Logs)"]
|
||||
end
|
||||
|
||||
OTel -->|"OTLP"| Collector["OTel Collector"]
|
||||
|
||||
Collector --> Tempo["Tempo<br/>(Traces)"]
|
||||
Collector --> Prom["Prometheus<br/>(Metrics)"]
|
||||
Collector --> Loki["Loki / Elastic<br/>(Logs)"]
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#fff
|
||||
style OTel fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style Collector fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
```
|
||||
|
||||
- **Structured logging**: OTel Logs API outputs structured records with `trace_id`, `span_id`, severity, and attributes by design.
|
||||
- **Full correlation**: Every log line carries `trace_id`. Click trace → see logs. Click metric spike → see trace → see logs.
|
||||
- **Codebase**: Remove Beast Insight (~2000 LOC) + simplify Journal/PerfLog (~3000 LOC). One dependency instead of three.
|
||||
- **Risk**: Highest — `beast::Journal` is deeply embedded in every component. Large refactor. OTel C++ Logs API is newer (stable since v1.11, less battle-tested).
|
||||
|
||||
### Recommendation
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A["Phase 1<br/><b>Traces Only</b><br/>(Current Plan)"] --> B["Phase 2<br/><b>+ Metrics</b><br/>(Replace StatsD)"] --> C["Phase 3<br/><b>+ Logs</b><br/>(Full OTel)"]
|
||||
|
||||
style A fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style B fill:#1565c0,stroke:#0d47a1,color:#fff
|
||||
style C fill:#e65100,stroke:#bf360c,color:#fff
|
||||
```
|
||||
|
||||
| Phase | Signal | Strategy | Risk |
|
||||
| -------------------- | --------- | -------------------------------------------------------------- | ------ |
|
||||
| **Phase 1** (now) | Traces | Add OTel traces. Keep StatsD and Journal. Prove value. | Low |
|
||||
| **Phase 2** (future) | + Metrics | Migrate StatsD → Prometheus via OTel. Remove Beast Insight. | Medium |
|
||||
| **Phase 3** (future) | + Logs | Adopt OTel Logs API. Align with structured logging initiative. | High |
|
||||
|
||||
> **Key Takeaway**: Start with traces (unique value, lowest risk), then incrementally adopt metrics and logs as the OTel infrastructure proves itself.
|
||||
|
||||
---
|
||||
|
||||
## Slide 5: Comparison with rippled's Existing Solutions
|
||||
|
||||
### Current Observability Stack
|
||||
|
||||
@@ -68,11 +199,13 @@ flowchart LR
|
||||
| "Which node delayed consensus?" | ❌ | ❌ | ✅ |
|
||||
| "Show TX journey across 5 nodes" | ❌ | ❌ | ✅ |
|
||||
|
||||
> **Key Insight**: OpenTelemetry **complements** (not replaces) existing systems.
|
||||
> **Key Insight**: In the **traces-only** approach (Phase 1), OpenTelemetry **complements** existing systems. In future phases, OTel metrics and logs could **replace** StatsD and Journal respectively — see Slides 3-4 for the full adoption roadmap.
|
||||
|
||||
---
|
||||
|
||||
## Slide 4: Architecture
|
||||
## Slide 6: Architecture
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol | **WS** = WebSocket
|
||||
|
||||
### High-Level Integration Architecture
|
||||
|
||||
@@ -92,7 +225,6 @@ flowchart TB
|
||||
Telemetry -->|OTLP/gRPC| Collector["OTel Collector"]
|
||||
|
||||
Collector --> Tempo["Grafana Tempo"]
|
||||
Collector --> Jaeger["Jaeger"]
|
||||
Collector --> Elastic["Elastic APM"]
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#fff
|
||||
@@ -101,6 +233,14 @@ flowchart TB
|
||||
style Collector fill:#e65100,stroke:#bf360c,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Core Services (blue, top)**: RPC Server, Overlay, and Consensus are the three primary components that generate trace data — they represent the entry points for client requests, peer messages, and consensus rounds respectively.
|
||||
- **Telemetry Module (green, middle)**: The OpenTelemetry SDK sits below the core services and receives span data from all three; it acts as a single collection point within the rippled process.
|
||||
- **OTel Collector (orange, center)**: An external process that receives spans over OTLP/gRPC from the Telemetry Module; it decouples rippled from backend choices and handles batching, sampling, and routing.
|
||||
- **Backends (bottom row)**: Tempo and Elastic APM are interchangeable — the Collector fans out to any combination, so operators can switch backends without modifying rippled code.
|
||||
- **Top-to-bottom flow**: Data flows from instrumented code down through the SDK, out over the network to the Collector, and finally into storage/visualization backends.
|
||||
|
||||
### Context Propagation
|
||||
|
||||
```mermaid
|
||||
@@ -120,10 +260,12 @@ sequenceDiagram
|
||||
|
||||
---
|
||||
|
||||
## Slide 5: Implementation Plan
|
||||
## Slide 7: Implementation Plan
|
||||
|
||||
### 5-Phase Rollout (9 Weeks)
|
||||
|
||||
> **Note**: Dates shown are relative to project start, not calendar dates.
|
||||
|
||||
```mermaid
|
||||
gantt
|
||||
title Implementation Timeline
|
||||
@@ -158,18 +300,114 @@ gantt
|
||||
|
||||
**Total Effort**: ~47 developer-days (2 developers)
|
||||
|
||||
> **Future Phases** (not in current scope): After traces are stable, OTel metrics can replace StatsD (~3 weeks), and OTel logs can replace Journal (~4 weeks, aligned with structured logging initiative). See Slides 3-4 for the full adoption roadmap.
|
||||
|
||||
---
|
||||
|
||||
## Slide 6: Performance Overhead
|
||||
## Slide 8: Performance Overhead
|
||||
|
||||
> **OTLP** = OpenTelemetry Protocol
|
||||
|
||||
### Estimated System Impact
|
||||
|
||||
| Metric | Overhead | Notes |
|
||||
| ----------------- | ---------- | ----------------------------------- |
|
||||
| **CPU** | 1-3% | Span creation and attribute setting |
|
||||
| **Memory** | 2-5 MB | Batch buffer for pending spans |
|
||||
| **Network** | 10-50 KB/s | Compressed OTLP export to collector |
|
||||
| **Latency (p99)** | <2% | With proper sampling configuration |
|
||||
| Metric | Overhead | Notes |
|
||||
| ----------------- | ---------- | ------------------------------------------------ |
|
||||
| **CPU** | 1-3% | Span creation and attribute setting |
|
||||
| **Memory** | ~10 MB | SDK statics + batch buffer + worker thread stack |
|
||||
| **Network** | 10-50 KB/s | Compressed OTLP export to collector |
|
||||
| **Latency (p99)** | <2% | With proper sampling configuration |
|
||||
|
||||
#### How We Arrived at These Numbers
|
||||
|
||||
**Assumptions (XRPL mainnet baseline)**:
|
||||
|
||||
| Parameter | Value | Source |
|
||||
| ------------------------- | ---------------------- | --------------------------------------------------------------------------------------------------- |
|
||||
| Transaction throughput | ~25 TPS (peaks to ~50) | Mainnet average |
|
||||
| Default peers per node | 21 | `peerfinder/detail/Tuning.h` (`defaultMaxPeers`) |
|
||||
| Consensus round frequency | ~1 round / 3-4 seconds | `ConsensusParms.h` (`ledgerMIN_CONSENSUS=1950ms`) |
|
||||
| Proposers per round | ~20-35 | Mainnet UNL size |
|
||||
| P2P message rate | ~160 msgs/sec | See message breakdown below |
|
||||
| Avg TX processing time | ~200 μs | Profiled baseline |
|
||||
| Single span creation cost | 500-1000 ns | OTel C++ SDK benchmarks (see [3.5.4](./03-implementation-strategy.md#354-performance-data-sources)) |
|
||||
|
||||
**P2P message breakdown** (per node, mainnet):
|
||||
|
||||
| Message Type | Rate | Derivation |
|
||||
| ------------- | ------------ | --------------------------------------------------------------------- |
|
||||
| TMTransaction | ~100/sec | ~25 TPS × ~4 relay hops per TX, deduplicated by HashRouter |
|
||||
| TMValidation | ~50/sec | ~35 validators × ~1 validation/3s round ≈ ~12/sec, plus relay fan-out |
|
||||
| TMProposeSet | ~10/sec | ~35 proposers / 3s round ≈ ~12/round, clustered in establish phase |
|
||||
| **Total** | **~160/sec** | **Only traced message types counted** |
|
||||
|
||||
**CPU (1-3%) — Calculation**:
|
||||
|
||||
Per-transaction tracing cost breakdown:
|
||||
|
||||
| Operation | Cost | Notes |
|
||||
| ----------------------------------------------- | ----------- | ------------------------------------------ |
|
||||
| `tx.receive` span (create + end + 4 attributes) | ~1400 ns | ~1000ns create + ~200ns end + 4×50ns attrs |
|
||||
| `tx.validate` span | ~1200 ns | ~1000ns create + ~200ns for 2 attributes |
|
||||
| `tx.relay` span | ~1200 ns | ~1000ns create + ~200ns for 2 attributes |
|
||||
| Context injection into P2P message | ~200 ns | Serialize trace_id + span_id into protobuf |
|
||||
| **Total per TX** | **~4.0 μs** | |
|
||||
|
||||
> **CPU overhead**: 4.0 μs / 200 μs baseline = **~2.0% per transaction**. Under high load with consensus + RPC spans overlapping, reaches ~3%. Consensus itself adds only ~36 μs per 3-second round (~0.001%), so the TX path dominates. On production server hardware (3+ GHz Xeon), span creation drops to ~500-600 ns, bringing per-TX cost to ~2.6 μs (~1.3%). See [Section 3.5.4](./03-implementation-strategy.md#354-performance-data-sources) for benchmark sources.
|
||||
|
||||
**Memory (~10 MB) — Calculation**:
|
||||
|
||||
| Component | Size | Notes |
|
||||
| --------------------------------------------- | ------------------ | ------------------------------------- |
|
||||
| TracerProvider + Exporter (gRPC channel init) | ~320 KB | Allocated once at startup |
|
||||
| BatchSpanProcessor (circular buffer) | ~16 KB | 2049 × 8-byte AtomicUniquePtr entries |
|
||||
| BatchSpanProcessor (worker thread stack) | ~8 MB | Default Linux thread stack size |
|
||||
| Active spans (in-flight, max ~1000) | ~500-800 KB | ~500-800 bytes/span × 1000 concurrent |
|
||||
| Export queue (batch buffer, max 2048 spans) | ~1 MB | ~500 bytes/span × 2048 queue depth |
|
||||
| Thread-local context storage (~100 threads) | ~6.4 KB | ~64 bytes/thread |
|
||||
| **Total** | **~10 MB ceiling** | |
|
||||
|
||||
> Memory plateaus once the export queue fills — the `max_queue_size=2048` config bounds growth.
|
||||
> The worker thread stack (~8 MB) dominates the static footprint but is virtual memory; actual RSS
|
||||
> depends on stack usage (typically much less). Active spans are larger than originally estimated
|
||||
> (~500-800 bytes) because the OTel SDK `Span` object includes a mutex (~40 bytes), `SpanData`
|
||||
> recordable (~250 bytes base), and `std::map`-based attribute storage (~200-500 bytes for 3-5
|
||||
> string attributes). See [Section 3.5.4](./03-implementation-strategy.md#354-performance-data-sources) for source references.
|
||||
|
||||
**Network (10-50 KB/s) — Calculation**:
|
||||
|
||||
Two sources of network overhead:
|
||||
|
||||
**(A) OTLP span export to Collector:**
|
||||
|
||||
| Sampling Rate | Effective Spans/sec | Avg Span Size (compressed) | Bandwidth |
|
||||
| -------------------------- | ------------------- | -------------------------- | ------------ |
|
||||
| 100% (dev only) | ~500 | ~500 bytes | ~250 KB/s |
|
||||
| **10% (recommended prod)** | **~50** | **~500 bytes** | **~25 KB/s** |
|
||||
| 1% (minimal) | ~5 | ~500 bytes | ~2.5 KB/s |
|
||||
|
||||
> The ~500 spans/sec at 100% comes from: ~100 TX spans + ~160 P2P context spans + ~23 consensus spans/round + ~50 RPC spans = ~500/sec. OTLP protobuf with gzip compression yields ~500 bytes/span average.
|
||||
|
||||
**(B) P2P trace context overhead** (added to existing messages, always-on regardless of sampling):
|
||||
|
||||
| Message Type | Rate | Context Size | Bandwidth |
|
||||
| ------------- | -------- | ------------ | ------------- |
|
||||
| TMTransaction | ~100/sec | 29 bytes | ~2.9 KB/s |
|
||||
| TMValidation | ~50/sec | 29 bytes | ~1.5 KB/s |
|
||||
| TMProposeSet | ~10/sec | 29 bytes | ~0.3 KB/s |
|
||||
| **Total P2P** | | | **~4.7 KB/s** |
|
||||
|
||||
> **Combined**: 25 KB/s (OTLP export at 10%) + 5 KB/s (P2P context) ≈ **~30 KB/s typical**. The 10-50 KB/s range covers 10-20% sampling under normal to peak mainnet load.
|
||||
|
||||
**Latency (<2%) — Calculation**:
|
||||
|
||||
| Path | Tracing Cost | Baseline | Overhead |
|
||||
| ------------------------------ | ------------ | -------- | -------- |
|
||||
| Fast RPC (e.g., `server_info`) | 2.75 μs | ~1 ms | 0.275% |
|
||||
| Slow RPC (e.g., `path_find`) | 2.75 μs | ~100 ms | 0.003% |
|
||||
| Transaction processing | 4.0 μs | ~200 μs | 2.0% |
|
||||
| Consensus round | 36 μs | ~3 sec | 0.001% |
|
||||
|
||||
> At p99, even the worst case (TX processing at 2.0%) is within the 1-3% range. RPC and consensus overhead are negligible. On production hardware, TX overhead drops to ~1.3%.
|
||||
|
||||
### Per-Message Overhead (Context Propagation)
|
||||
|
||||
@@ -179,20 +417,20 @@ Each P2P message carries trace context with the following overhead:
|
||||
| ------------- | ------------- | ----------------------------------------- |
|
||||
| `trace_id` | 16 bytes | Unique identifier for the entire trace |
|
||||
| `span_id` | 8 bytes | Current span (becomes parent on receiver) |
|
||||
| `trace_flags` | 4 bytes | Sampling decision flags |
|
||||
| `trace_flags` | 1 byte | Sampling decision flags |
|
||||
| `trace_state` | 0-4 bytes | Optional vendor-specific data |
|
||||
| **Total** | **~32 bytes** | **Added per traced P2P message** |
|
||||
| **Total** | **~29 bytes** | **Added per traced P2P message** |
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph msg["P2P Message with Trace Context"]
|
||||
A["Original Message<br/>(variable size)"] --> B["+ TraceContext<br/>(~32 bytes)"]
|
||||
A["Original Message<br/>(variable size)"] --> B["+ TraceContext<br/>(~29 bytes)"]
|
||||
end
|
||||
|
||||
subgraph breakdown["Context Breakdown"]
|
||||
C["trace_id<br/>16 bytes"]
|
||||
D["span_id<br/>8 bytes"]
|
||||
E["flags<br/>4 bytes"]
|
||||
E["flags<br/>1 byte"]
|
||||
F["state<br/>0-4 bytes"]
|
||||
end
|
||||
|
||||
@@ -206,7 +444,14 @@ flowchart LR
|
||||
style F fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
> **Note**: 32 bytes is negligible compared to typical transaction messages (hundreds to thousands of bytes)
|
||||
**Reading the diagram:**
|
||||
|
||||
- **Original Message (gray, left)**: The existing P2P message payload of variable size — this is unchanged; trace context is appended, never modifying the original data.
|
||||
- **+ TraceContext (green, right of message)**: The additional 29-byte context block attached to each traced message; the arrow from the original message shows it is a pure addition.
|
||||
- **Context Breakdown (right subgraph)**: The four fields — `trace_id` (16 bytes), `span_id` (8 bytes), `flags` (1 byte), and `state` (0-4 bytes) — show exactly what is added and their individual sizes.
|
||||
- **Color coding**: Blue fields (`trace_id`, `span_id`) are the core identifiers required for trace correlation; orange (`flags`) controls sampling decisions; purple (`state`) is optional vendor data typically omitted.
|
||||
|
||||
> **Note**: 29 bytes represents ~1-6% overhead depending on message size (500B simple TX to 5KB proposal), which is acceptable for the observability benefits provided.
|
||||
|
||||
### Mitigation Strategies
|
||||
|
||||
@@ -220,6 +465,8 @@ flowchart LR
|
||||
style D fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
> For a detailed explanation of head vs. tail sampling, see Slide 9.
|
||||
|
||||
### Kill Switches (Rollback Options)
|
||||
|
||||
1. **Config Disable**: Set `enabled=0` in config → instant disable, no restart needed for sampling
|
||||
@@ -228,18 +475,157 @@ flowchart LR
|
||||
|
||||
---
|
||||
|
||||
## Slide 7: Data Collection & Privacy
|
||||
## Slide 9: Sampling Strategies — Head vs. Tail
|
||||
|
||||
> Sampling controls **which traces are recorded and exported**. Without sampling, every operation generates a trace — at 500+ spans/sec, this overwhelms storage and network. Sampling lets you keep the signal, discard the noise.
|
||||
|
||||
### Head Sampling (Decision at Start)
|
||||
|
||||
The sampling decision is made **when a trace begins**, before any work is done. A random number is generated; if it falls within the configured ratio, the entire trace is recorded. Otherwise, the trace is silently dropped.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A["New Request<br/>Arrives"] --> B{"Random < 10%?"}
|
||||
B -->|"Yes (1 in 10)"| C["Record Entire Trace<br/>(all spans)"]
|
||||
B -->|"No (9 in 10)"| D["Drop Entire Trace<br/>(zero overhead)"]
|
||||
|
||||
style C fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style D fill:#c62828,stroke:#8c2809,color:#fff
|
||||
style B fill:#1565c0,stroke:#0d47a1,color:#fff
|
||||
```
|
||||
|
||||
| Aspect | Details |
|
||||
| ----------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| **Where it runs** | Inside rippled (SDK-level). Configured via `sampling_ratio` in `rippled.cfg`. |
|
||||
| **When the decision happens** | At trace creation time — before the first span is even populated. |
|
||||
| **How it works** | `sampling_ratio=0.1` means each trace has a 10% probability of being recorded. Dropped traces incur near-zero overhead (no spans created, no attributes set, no export). |
|
||||
| **Propagation** | Once a trace is sampled, the `trace_flags` field (1 byte in the context header) tells downstream nodes to also sample it. Unsampled traces propagate `trace_flags=0`, so downstream nodes skip them too. |
|
||||
| **Pros** | Lowest overhead. Simple to configure. Predictable resource usage. |
|
||||
| **Cons** | **Blind** — it doesn't know if the trace will be interesting. A rare error or slow consensus round has only a 10% chance of being captured. |
|
||||
| **Best for** | High-volume, steady-state traffic where most traces look similar (e.g., routine RPC requests). |
|
||||
|
||||
**rippled configuration**:
|
||||
|
||||
```ini
|
||||
[telemetry]
|
||||
# Record 10% of traces (recommended for production)
|
||||
sampling_ratio=0.1
|
||||
```
|
||||
|
||||
### Tail Sampling (Decision at End)
|
||||
|
||||
The sampling decision is made **after the trace completes**, based on its actual content — was it slow? Did it error? Was it a consensus round? This requires buffering complete traces before deciding.
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
A["All Traces<br/>Buffered (100%)"] --> B["OTel Collector<br/>Evaluates Rules"]
|
||||
|
||||
B --> C{"Error?"}
|
||||
C -->|Yes| K["KEEP"]
|
||||
|
||||
C -->|No| D{"Slow?<br/>(>5s consensus,<br/>>1s RPC)"}
|
||||
D -->|Yes| K
|
||||
|
||||
D -->|No| E{"Random < 10%?"}
|
||||
E -->|Yes| K
|
||||
E -->|No| F["DROP"]
|
||||
|
||||
style K fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
style F fill:#c62828,stroke:#8c2809,color:#fff
|
||||
style B fill:#1565c0,stroke:#0d47a1,color:#fff
|
||||
style C fill:#e65100,stroke:#bf360c,color:#fff
|
||||
style D fill:#e65100,stroke:#bf360c,color:#fff
|
||||
style E fill:#4a148c,stroke:#2e0d57,color:#fff
|
||||
```
|
||||
|
||||
| Aspect | Details |
|
||||
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| **Where it runs** | In the **OTel Collector** (external process), not inside rippled. rippled exports 100% of traces; the Collector decides what to keep. |
|
||||
| **When the decision happens** | After the Collector has received all spans for a trace (waits `decision_wait=10s` for stragglers). |
|
||||
| **How it works** | Policy rules evaluate the completed trace: keep all errors, keep slow operations above a threshold, keep all consensus rounds, then probabilistically sample the rest at 10%. |
|
||||
| **Pros** | **Never misses important traces**. Errors, slow requests, and consensus anomalies are always captured regardless of probability. |
|
||||
| **Cons** | Higher resource usage — rippled must export 100% of spans to the Collector, which buffers them in memory before deciding. The Collector needs more RAM (configured via `num_traces` and `decision_wait`). |
|
||||
| **Best for** | Production troubleshooting where you can't afford to miss errors or anomalies. |
|
||||
|
||||
**Collector configuration** (tail sampling rules for rippled):
|
||||
|
||||
```yaml
|
||||
processors:
|
||||
tail_sampling:
|
||||
decision_wait: 10s # Wait for all spans in a trace
|
||||
num_traces: 100000 # Buffer up to 100K concurrent traces
|
||||
policies:
|
||||
- name: errors # Always keep error traces
|
||||
type: status_code
|
||||
status_code: { status_codes: [ERROR] }
|
||||
|
||||
- name: slow-consensus # Keep consensus rounds >5s
|
||||
type: latency
|
||||
latency: { threshold_ms: 5000 }
|
||||
|
||||
- name: slow-rpc # Keep slow RPC requests >1s
|
||||
type: latency
|
||||
latency: { threshold_ms: 1000 }
|
||||
|
||||
- name: probabilistic # Sample 10% of everything else
|
||||
type: probabilistic
|
||||
probabilistic: { sampling_percentage: 10 }
|
||||
```
|
||||
|
||||
### Head vs. Tail — Side-by-Side
|
||||
|
||||
| | Head Sampling | Tail Sampling |
|
||||
| ----------------------------- | ---------------------------------------- | ------------------------------------------------ |
|
||||
| **Decision point** | Trace start (inside rippled) | Trace end (in OTel Collector) |
|
||||
| **Knows trace content?** | No (random coin flip) | Yes (evaluates completed trace) |
|
||||
| **Overhead on rippled** | Lowest (dropped traces = no-op) | Higher (must export 100% to Collector) |
|
||||
| **Collector resource usage** | Low (receives only sampled traces) | Higher (buffers all traces before deciding) |
|
||||
| **Captures all errors?** | No (only if trace was randomly selected) | **Yes** (error policy catches them) |
|
||||
| **Captures slow operations?** | No (random) | **Yes** (latency policy catches them) |
|
||||
| **Configuration** | `rippled.cfg`: `sampling_ratio=0.1` | `otel-collector.yaml`: `tail_sampling` processor |
|
||||
| **Best for** | High-throughput steady-state | Troubleshooting & anomaly detection |
|
||||
|
||||
### Recommended Strategy for rippled
|
||||
|
||||
Use **both** in a layered approach:
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph rippled["rippled (Head Sampling)"]
|
||||
HS["sampling_ratio=1.0<br/>(export everything)"]
|
||||
end
|
||||
|
||||
subgraph collector["OTel Collector (Tail Sampling)"]
|
||||
TS["Keep: errors + slow + 10% random<br/>Drop: routine traces"]
|
||||
end
|
||||
|
||||
subgraph storage["Backend Storage"]
|
||||
ST["Only interesting traces<br/>stored long-term"]
|
||||
end
|
||||
|
||||
rippled -->|"100% of spans"| collector -->|"~15-20% kept"| storage
|
||||
|
||||
style rippled fill:#424242,stroke:#212121,color:#fff
|
||||
style collector fill:#1565c0,stroke:#0d47a1,color:#fff
|
||||
style storage fill:#2e7d32,stroke:#1b5e20,color:#fff
|
||||
```
|
||||
|
||||
> **Why this works**: rippled exports everything (no blind drops), the Collector applies intelligent filtering (keep errors/slow/anomalies, sample the rest), and only ~15-20% of traces reach storage. If Collector resource usage becomes a concern, add head sampling at `sampling_ratio=0.5` to halve the export volume while still giving the Collector enough data for good tail-sampling decisions.
|
||||
|
||||
---
|
||||
|
||||
## Slide 10: Data Collection & Privacy
|
||||
|
||||
### What Data is Collected
|
||||
|
||||
| Category | Attributes Collected | Purpose |
|
||||
| --------------- | ---------------------------------------------------------------------------------- | --------------------------- |
|
||||
| **Transaction** | `tx.hash`, `tx.type`, `tx.result`, `tx.fee`, `ledger_index` | Trace transaction lifecycle |
|
||||
| **Consensus** | `round`, `phase`, `mode`, `proposers`(public key or public node id), `duration_ms` | Analyze consensus timing |
|
||||
| **RPC** | `command`, `version`, `status`, `duration_ms` | Monitor RPC performance |
|
||||
| **Peer** | `peer.id`(public key), `latency_ms`, `message.type`, `message.size` | Network topology analysis |
|
||||
| **Ledger** | `ledger.hash`, `ledger.index`, `close_time`, `tx_count` | Ledger progression tracking |
|
||||
| **Job** | `job.type`, `queue_ms`, `worker` | JobQueue performance |
|
||||
| Category | Attributes Collected | Purpose |
|
||||
| --------------- | ------------------------------------------------------------------------------------ | --------------------------- |
|
||||
| **Transaction** | `tx.hash`, `tx.type`, `tx.result`, `tx.fee`, `ledger_index` | Trace transaction lifecycle |
|
||||
| **Consensus** | `round`, `phase`, `mode`, `proposers` (count of proposing validators), `duration_ms` | Analyze consensus timing |
|
||||
| **RPC** | `command`, `version`, `status`, `duration_ms` | Monitor RPC performance |
|
||||
| **Peer** | `peer.id`(public key), `latency_ms`, `message.type`, `message.size` | Network topology analysis |
|
||||
| **Ledger** | `ledger.hash`, `ledger.index`, `close_time`, `tx_count` | Ledger progression tracking |
|
||||
| **Job** | `job.type`, `queue_ms`, `worker` | JobQueue performance |
|
||||
|
||||
### What is NOT Collected (Privacy Guarantees)
|
||||
|
||||
@@ -263,6 +649,13 @@ flowchart LR
|
||||
style F fill:#c62828,stroke:#8c2809,color:#fff
|
||||
```
|
||||
|
||||
**Reading the diagram:**
|
||||
|
||||
- **NOT Collected (top row, red)**: Private Keys, Account Balances, and Transaction Amounts are explicitly excluded — these are financial/security-sensitive fields that telemetry never touches.
|
||||
- **Also Excluded (bottom row, red)**: IP Addresses (configurable per deployment), Personal Data, and Raw TX Payloads are also excluded — these protect operator and user privacy.
|
||||
- **All-red styling**: Every box is styled in red to visually reinforce that these are hard exclusions, not optional — the telemetry system has no code path to collect any of these fields.
|
||||
- **Two-row layout**: The split between "NOT Collected" and "Also Excluded" distinguishes between financial data (top) and operational/personal data (bottom), making the privacy boundaries clear to auditors.
|
||||
|
||||
### Privacy Protection Mechanisms
|
||||
|
||||
| Mechanism | Description |
|
||||
|
||||
@@ -276,6 +276,7 @@ words:
|
||||
- txjson
|
||||
- txn
|
||||
- txns
|
||||
- txqueue
|
||||
- txs
|
||||
- UBSAN
|
||||
- ubsan
|
||||
|
||||
Reference in New Issue
Block a user