mirror of https://github.com/XRPLF/rippled.git synced 2026-04-29 15:37:57 +00:00

Pratik Mankawde f135842071 docs: correct OTel overhead estimates against SDK benchmarks

Verified CPU, memory, and network overhead calculations against
official OTel C++ SDK benchmarks (969 CI runs) and source code
analysis. Key corrections:

- Span creation: 200-500ns → 500-1000ns (SDK BM_SpanCreation median
  ~1000ns; original estimate matched API no-op, not SDK path)
- Per-TX overhead: 2.4μs → 4.0μs (2.0% vs 1.2%; still within 1-3%)
- Active span memory: ~200 bytes → ~500-800 bytes (Span wrapper +
  SpanData + std::map attribute storage)
- Static memory: ~456KB → ~8.3MB (BatchSpanProcessor worker thread
  stack ~8MB was omitted)
- Total memory ceiling: ~2.3MB → ~10MB
- Memory success metric target: <5MB → <10MB
- AddEvent: 50-80ns → 100-200ns

Added Section 3.5.4 with links to all benchmark sources.
Updated presentation.md with matching corrections.
High-level conclusions unchanged (1-3% CPU, negligible consensus).

Also includes: review fixes, cross-document consistency improvements,
additional component tracing docs (PathFinding, TxQ, Validator, etc.),
context size corrections (32 → 25 bytes).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-30 15:55:26 +01:00

24 KiB

Implementation Strategy

Parent Document: OpenTelemetryPlan.md Related: Code Samples | Configuration Reference

3.1 Directory Structure

The telemetry implementation follows rippled's existing code organization pattern:

include/xrpl/
├── telemetry/
│   ├── Telemetry.h              # Main telemetry interface
│   ├── TelemetryConfig.h        # Configuration structures
│   ├── TraceContext.h           # Context propagation utilities
│   ├── SpanGuard.h              # RAII span management
│   └── SpanAttributes.h         # Attribute helper functions

src/libxrpl/
├── telemetry/
│   ├── Telemetry.cpp            # Implementation
│   ├── TelemetryConfig.cpp      # Config parsing
│   ├── TraceContext.cpp         # Context serialization
│   └── NullTelemetry.cpp        # No-op implementation

src/xrpld/
├── telemetry/
│   ├── TracingInstrumentation.h # Instrumentation macros
│   └── TracingInstrumentation.cpp

3.2 Implementation Approach

%%{init: {'flowchart': {'nodeSpacing': 20, 'rankSpacing': 30}}}%%
flowchart TB
    subgraph phase1["Phase 1: Core"]
        direction LR
        sdk["SDK Integration"] ~~~ interface["Telemetry Interface"] ~~~ config["Configuration"]
    end

    subgraph phase2["Phase 2: RPC"]
        direction LR
        http["HTTP Context"] ~~~ rpc["RPC Handlers"]
    end

    subgraph phase3["Phase 3: P2P"]
        direction LR
        proto["Protobuf Context"] ~~~ tx["Transaction Relay"]
    end

    subgraph phase4["Phase 4: Consensus"]
        direction LR
        consensus["Consensus Rounds"] ~~~ proposals["Proposals"]
    end

    phase1 --> phase2 --> phase3 --> phase4

    style phase1 fill:#1565c0,stroke:#0d47a1,color:#ffffff
    style phase2 fill:#2e7d32,stroke:#1b5e20,color:#ffffff
    style phase3 fill:#e65100,stroke:#bf360c,color:#ffffff
    style phase4 fill:#c2185b,stroke:#880e4f,color:#ffffff

Key Principles

Minimal Intrusion: Instrumentation should not alter existing control flow
Zero-Cost When Disabled: Use compile-time flags and no-op implementations
Backward Compatibility: Protocol Buffer extensions use high field numbers
Graceful Degradation: Tracing failures must not affect node operation
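
The "Zero-Cost When Disabled" and "Graceful Degradation" principles can be illustrated with a null-object fallback. The following is a minimal sketch, not the actual contents of Telemetry.h or NullTelemetry.cpp: the method names `isEnabled` and `recordSpan` and the helper `makeTelemetryOrNull` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, simplified interface; the real Telemetry.h is not shown here.
class Telemetry
{
public:
    virtual ~Telemetry() = default;
    virtual bool isEnabled() const = 0;
    virtual void recordSpan(std::string const& name) = 0;
};

// No-op implementation used when tracing is off or failed to start.
class NullTelemetry final : public Telemetry
{
public:
    bool isEnabled() const override { return false; }
    void recordSpan(std::string const&) override {}
};

// Builds the real backend via `makeReal`, falling back to NullTelemetry when
// tracing is disabled, the factory returns null, or the factory throws.
template <class Factory>
std::unique_ptr<Telemetry> makeTelemetryOrNull(bool enabled, Factory&& makeReal)
{
    std::unique_ptr<Telemetry> result;
    if (enabled)
    {
        try
        {
            result = makeReal();
        }
        catch (...)
        {
            result.reset();  // graceful degradation: swallow startup failures
        }
    }
    if (!result)
        result = std::make_unique<NullTelemetry>();
    assert(result != nullptr);  // callers never need a null check
    return result;
}
```

Because callers always receive a valid object, instrumentation sites need no null checks, and a failed exporter setup degrades to the no-op path instead of affecting node startup.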

3.3 Performance Overhead Summary

OTLP = OpenTelemetry Protocol

Metric	Overhead	Notes
CPU	1-3%	Of per-transaction CPU cost (~200μs baseline)
Memory	~10 MB	SDK statics + batch buffer + worker thread stack
Network	10-50 KB/s	Compressed OTLP export to collector
Latency (p99)	<2%	With proper sampling configuration

3.4 Detailed CPU Overhead Analysis

3.4.1 Per-Operation Costs

Note on hardware assumptions: The costs below are based on the official OTel C++ SDK CI benchmarks (969 runs on GitHub Actions 2-core shared runners). On production server hardware (3+ GHz Xeon), expect costs at the lower end of each range (~30-50% improvement over CI hardware).

Operation	Time (ns)	Frequency	Impact
Span creation	500-1000	Every traced operation	Low
Span end	100-200	Every traced operation	Low
SetAttribute (string)	80-120	3-5 per span	Low
SetAttribute (int)	40-60	2-3 per span	Negligible
AddEvent	100-200	0-2 per span	Low
Context injection	150-250	Per outgoing message	Low
Context extraction	100-180	Per incoming message	Low
GetCurrent context	10-20	Thread-local access	Negligible

Source: Span creation based on OTel C++ SDK BM_SpanCreation benchmark (AlwaysOnSampler + SimpleSpanProcessor + InMemoryExporter), median ~1,000 ns on CI hardware. AddEvent includes timestamp read + string copy + vector push + mutex acquisition. Context injection/extraction confirmed by BM_SpanCreationWithScope benchmark delta (~160 ns).

3.4.2 Transaction Processing Overhead

%%{init: {'pie': {'textPosition': 0.75}}}%%
pie showData
    "tx.receive (1400ns)" : 1400
    "tx.validate (1200ns)" : 1200
    "tx.relay (1200ns)" : 1200
    "Context inject (200ns)" : 200

Transaction Tracing Overhead (~4.0μs total)

Overhead percentage: 4.0 μs / 200 μs (avg tx processing) = ~2.0%

Breakdown: Each span (tx.receive, tx.validate, tx.relay) costs ~1,000 ns for creation plus ~200-400 ns for 3-5 attribute sets. Context injection is ~200 ns (confirmed by benchmarks). On production hardware, expect ~2.6 μs total (~1.3% overhead) due to faster span creation (~500-600 ns).
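
The arithmetic behind the ~2.0% figure can be reproduced with a few constants. This is a sketch using only the estimates quoted in this section; the constant and function names are illustrative, and basis points are used so the calculation stays in integer arithmetic.

```cpp
#include <cstdint>

// Estimates from this section, in nanoseconds.
constexpr std::int64_t kTxReceiveNs = 1400;     // span creation + attributes
constexpr std::int64_t kTxValidateNs = 1200;
constexpr std::int64_t kTxRelayNs = 1200;
constexpr std::int64_t kContextInjectNs = 200;
constexpr std::int64_t kBaselineTxNs = 200'000;  // ~200 us average tx processing

// Total tracing cost added to one transaction.
constexpr std::int64_t tracingOverheadNs()
{
    return kTxReceiveNs + kTxValidateNs + kTxRelayNs + kContextInjectNs;
}

// Overhead in basis points (1 bp = 0.01%), rounded down.
constexpr std::int64_t overheadBasisPoints(
    std::int64_t overheadNs,
    std::int64_t baselineNs)
{
    return overheadNs * 10'000 / baselineNs;
}
```

`overheadBasisPoints(tracingOverheadNs(), kBaselineTxNs)` evaluates to 200 bp (2.0%), and substituting the ~2.6 μs production-hardware estimate gives 130 bp (1.3%).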

3.4.3 Consensus Round Overhead

Operation	Count	Cost (ns)	Total
consensus.round span	1	~1200	~1.2 μs
consensus.phase spans	3	~1100	~3.3 μs
proposal.receive spans	~20	~1100	~22 μs
proposal.send spans	~3	~1100	~3.3 μs
Context operations	~30	~200	~6 μs
TOTAL			~36 μs

Why higher than the earlier estimate: Each span costs ~1,000 ns for creation plus ~100-200 ns for 1-2 attributes, totaling ~1,100-1,200 ns. Context operations remain ~200 ns (confirmed by benchmarks). On production hardware, expect ~24 μs total.

Overhead percentage: 36 μs / 3s (typical round) = ~0.001% (negligible)
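
The consensus table can be checked with a table-driven sum. The sketch below encodes the rows exactly as listed above; the struct and function names are illustrative, and the 3-second round length is the "typical round" figure quoted in this section.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct SpanCostRow
{
    const char* operation;
    std::int64_t count;   // occurrences per consensus round
    std::int64_t costNs;  // estimated cost per occurrence
};

constexpr std::array<SpanCostRow, 5> kConsensusRows{{
    {"consensus.round span", 1, 1200},
    {"consensus.phase spans", 3, 1100},
    {"proposal.receive spans", 20, 1100},
    {"proposal.send spans", 3, 1100},
    {"Context operations", 30, 200},
}};

// Sum of count * cost over every row of the table.
constexpr std::int64_t totalRoundOverheadNs()
{
    std::int64_t total = 0;
    for (std::size_t i = 0; i < kConsensusRows.size(); ++i)
        total += kConsensusRows[i].count * kConsensusRows[i].costNs;
    return total;
}

// Overhead in parts per million of the given period, rounded down.
constexpr std::int64_t overheadPpm(std::int64_t overheadNs, std::int64_t periodNs)
{
    return overheadNs * 1'000'000 / periodNs;
}
```

The rows sum to 35,800 ns, which the table rounds to ~36 μs; against a 3-second round that is 11 ppm, or roughly 0.001%.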

3.4.4 RPC Request Overhead

Operation	Cost (ns)
rpc.request span	~1200
rpc.command span	~1100
Context extract	~250
Context inject	~200
TOTAL	~2.75 μs

Why higher than the earlier estimate: Each span costs ~1,000 ns for creation plus ~100-200 ns for attributes (command name, version, role). Context extract/inject costs are confirmed by OTel C++ benchmarks.

Fast RPC (1ms): 2.75 μs / 1ms = ~0.275%
Slow RPC (100ms): 2.75 μs / 100ms = ~0.003%

3.5 Memory Overhead Analysis

3.5.1 Static Memory

Component	Size	Allocated
TracerProvider singleton	~64 KB	At startup
BatchSpanProcessor (circular buffer)	~16 KB	At startup
BatchSpanProcessor (worker thread)	~8 MB	At startup
OTLP exporter (gRPC channel init)	~256 KB	At startup
Propagator registry	~8 KB	At startup
Total static	~8.3 MB

Why higher than earlier estimate: The BatchSpanProcessor's circular buffer itself is only ~16 KB (2049 x 8-byte AtomicUniquePtr entries), but it spawns a dedicated worker thread whose default stack size on Linux is ~8 MB. The OTLP gRPC exporter allocates memory for channel stubs and TLS initialization. The worker thread stack dominates the static footprint.

3.5.2 Dynamic Memory

Component	Size per unit	Max units	Peak
Active span	~500-800 bytes	1000	~500-800 KB
Queued span (export)	~500 bytes	2048	~1 MB
Attribute storage	~80 bytes	5 per span	Included
Context storage	~64 bytes	~100 threads	~6.4 KB
Total dynamic			~1.5-1.8 MB

Why active spans are larger: An active Span object includes the wrapper (~88 bytes: shared_ptr, mutex, unique_ptr to Recordable) plus SpanData (~250 bytes: SpanContext, timestamps, name, status, empty containers) plus attribute storage (~200-500 bytes for 3-5 string attributes in a std::map). Source: sdk/src/trace/span.h and sdk/include/opentelemetry/sdk/trace/span_data.h. Queued spans release the wrapper, keeping only SpanData + attributes (~500 bytes).

3.5.3 Memory Growth Characteristics

---
config:
    xyChart:
        width: 700
        height: 400
---
xychart-beta
    title "Memory Usage vs Span Rate (bounded by queue limit)"
    x-axis "Spans/second" [0, 200, 400, 600, 800, 1000]
    y-axis "Memory (MB)" 0 --> 12
    line [8.5, 9.2, 9.6, 9.9, 10.0, 10.0]

Notes:

Memory increases with span rate but plateaus at queue capacity (default 2048 spans)
Batch export prevents unbounded growth
At the queue limit, spans are dropped rather than blocking the caller
Maximum memory is bounded: ~8.3 MB static (dominated by worker thread stack) + 2048 queued spans x ~500 bytes (~1 MB) + active spans (~0.8 MB) ≈ 10 MB ceiling
The worker thread stack (~8 MB) is virtual memory; actual RSS depends on stack usage (typically much less)
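
The ~10 MB ceiling follows directly from the static and dynamic tables. The sketch below recomputes it from those figures; it assumes 1 KB = 1024 bytes for the static components, and all names are illustrative.

```cpp
#include <cstdint>

constexpr std::int64_t kKiB = 1024;

// Static components from Section 3.5.1, in KiB.
constexpr std::int64_t kTracerProviderKiB = 64;
constexpr std::int64_t kCircularBufferKiB = 16;
constexpr std::int64_t kWorkerStackKiB = 8 * 1024;  // virtual, not resident
constexpr std::int64_t kOtlpExporterKiB = 256;
constexpr std::int64_t kPropagatorKiB = 8;

constexpr std::int64_t staticBytes()
{
    return (kTracerProviderKiB + kCircularBufferKiB + kWorkerStackKiB +
            kOtlpExporterKiB + kPropagatorKiB) * kKiB;
}

// Upper bound: static footprint + full export queue + all active spans.
constexpr std::int64_t ceilingBytes(
    std::int64_t maxQueue,
    std::int64_t queuedSpanBytes,
    std::int64_t activeSpans,
    std::int64_t activeSpanBytes)
{
    return staticBytes() + maxQueue * queuedSpanBytes +
        activeSpans * activeSpanBytes;
}
```

With the default queue of 2048 spans at ~500 bytes and 1000 active spans at the upper estimate of ~800 bytes, `ceilingBytes(2048, 500, 1000, 800)` is 10,564,864 bytes, just over 10 MiB.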

3.5.4 Performance Data Sources

The overhead estimates in Sections 3.3-3.5 are derived from the following sources:

Source	What it covers	URL
OTel C++ SDK CI benchmarks (969 runs)	Span creation, context activation, sampler overhead	Benchmark Dashboard
`api/test/trace/span_benchmark.cc`	API-level span creation (~22 ns no-op)	Source
`sdk/test/trace/sampler_benchmark.cc`	SDK span creation with samplers (~1,000 ns AlwaysOn)	Source
`sdk/include/.../span_data.h`	SpanData memory layout (~250 bytes base)	Source
`sdk/src/trace/span.h`	Span wrapper memory layout (~88 bytes)	Source
`sdk/include/.../batch_span_processor_options.h`	Default queue size (2048), batch size (512)	Source
`sdk/include/.../circular_buffer.h`	CircularBuffer implementation (AtomicUniquePtr array)	Source
OTLP proto definition	Serialized span size estimation	Proto

3.6 Network Overhead Analysis

3.6.1 Export Bandwidth

Bytes per span: The table below assumes ~500 bytes/span, a conservative upper bound intended for capacity planning. OTLP protobuf analysis shows that a typical span with 3-5 string attributes serializes to ~200-300 bytes raw; with gzip compression (~60-70% of raw) and batching (amortized headers), ~350 bytes/span is more realistic.

Sampling Rate	Spans/sec	Bandwidth	Notes
100%	~500	~250 KB/s	Development only
10%	~50	~25 KB/s	Staging
1%	~5	~2.5 KB/s	Production
Error-only	~1	~0.5 KB/s	Minimal overhead
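
Each bandwidth row is the unsampled span rate multiplied by the sampling rate and the per-span size. A minimal sketch of that formula, assuming the ~500 spans/sec unsampled rate and ~500 bytes/span used in the table (the function name is illustrative):

```cpp
#include <cstdint>

// Export bandwidth in bytes per second for a given sampling percentage.
constexpr std::int64_t exportBytesPerSec(
    std::int64_t unsampledSpansPerSec,
    std::int64_t samplingPercent,
    std::int64_t bytesPerSpan)
{
    return unsampledSpansPerSec * samplingPercent * bytesPerSpan / 100;
}
```

For example, `exportBytesPerSec(500, 10, 500)` yields 25,000 bytes/s, matching the ~25 KB/s staging row.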

3.6.2 Trace Context Propagation

Message Type	Context Size	Messages/sec	Overhead
TMTransaction	25 bytes	~100	~2.5 KB/s
TMProposeSet	25 bytes	~10	~250 B/s
TMValidation	25 bytes	~50	~1.25 KB/s
Total P2P overhead			~4 KB/s
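
One layout that yields exactly 25 bytes is a 16-byte trace ID, an 8-byte span ID, and a 1-byte flags field, which are the field sizes used by W3C Trace Context. The sketch below assumes that layout; the source does not spell out the wire format, so the field order and all type and function names here are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTraceIdSize = 16;
constexpr std::size_t kSpanIdSize = 8;
constexpr std::size_t kEncodedSize = kTraceIdSize + kSpanIdSize + 1;  // 25

struct TraceContext
{
    std::uint8_t traceId[kTraceIdSize]{};
    std::uint8_t spanId[kSpanIdSize]{};
    std::uint8_t flags = 0;  // bit 0 = sampled
};

struct EncodedContext
{
    std::uint8_t bytes[kEncodedSize]{};
};

// Packs the context as traceId | spanId | flags (assumed order).
constexpr EncodedContext encode(TraceContext const& ctx)
{
    EncodedContext out{};
    std::size_t pos = 0;
    for (std::size_t i = 0; i < kTraceIdSize; ++i)
        out.bytes[pos++] = ctx.traceId[i];
    for (std::size_t i = 0; i < kSpanIdSize; ++i)
        out.bytes[pos++] = ctx.spanId[i];
    out.bytes[pos] = ctx.flags;
    return out;
}

// Inverse of encode().
constexpr TraceContext decode(EncodedContext const& in)
{
    TraceContext ctx{};
    std::size_t pos = 0;
    for (std::size_t i = 0; i < kTraceIdSize; ++i)
        ctx.traceId[i] = in.bytes[pos++];
    for (std::size_t i = 0; i < kSpanIdSize; ++i)
        ctx.spanId[i] = in.bytes[pos++];
    ctx.flags = in.bytes[pos];
    return ctx;
}

// Added P2P bytes per second for a given message rate.
constexpr std::size_t propagationBytesPerSec(std::size_t messagesPerSec)
{
    return messagesPerSec * kEncodedSize;
}
```

Summing the three message rates in the table (100 + 10 + 50 per second) gives `propagationBytesPerSec(160)` = 4,000 bytes/s, the ~4 KB/s total.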

3.7 Optimization Strategies

3.7.1 Sampling Strategies

Tail Sampling

flowchart TD
    trace["New Trace"]

    trace --> errors{"Is Error?"}
    errors -->|Yes| sample["SAMPLE"]
    errors -->|No| consensus{"Is Consensus?"}

    consensus -->|Yes| sample
    consensus -->|No| slow{"Is Slow?"}

    slow -->|Yes| sample
    slow -->|No| prob{"Random < 10%?"}

    prob -->|Yes| sample
    prob -->|No| drop["DROP"]

    style sample fill:#4caf50,stroke:#388e3c,color:#fff
    style drop fill:#f44336,stroke:#c62828,color:#fff
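
The decision flow above reduces to a short predicate. The following sketch assumes the decision is made once a trace is complete (as tail sampling requires) and that the caller supplies a random value in [0, 100); the 1000 ms slow threshold is a placeholder, since the flowchart does not define "slow", and all names are hypothetical.

```cpp
#include <cstdint>

struct CompletedTrace
{
    bool hasError;
    bool isConsensus;
    std::int64_t durationMs;
};

struct TailSamplingPolicy
{
    std::int64_t slowThresholdMs;  // placeholder; not specified in the plan
    std::uint32_t keepPercent;     // probabilistic fallback (10 in the chart)
};

constexpr TailSamplingPolicy kExamplePolicy{1000, 10};

// Mirrors the flowchart: error -> consensus -> slow -> random fallback.
constexpr bool shouldSample(
    CompletedTrace const& trace,
    TailSamplingPolicy const& policy,
    std::uint32_t randomPercent)  // uniform in [0, 100)
{
    if (trace.hasError)
        return true;
    if (trace.isConsensus)
        return true;
    if (trace.durationMs >= policy.slowThresholdMs)
        return true;
    return randomPercent < policy.keepPercent;
}
```

Passing the random value in as a parameter keeps the predicate deterministic and easy to unit test; a production caller would draw it from its own random source.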

3.7.2 Batch Tuning Recommendations

Environment	Batch Size	Batch Delay	Max Queue
Low-latency	128	1000ms	512
High-throughput	1024	10000ms	8192
Memory-constrained	256	2000ms	512
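
The presets in the table can be captured as plain data with a sanity check. This is a sketch only: `BatchSettings` is a hypothetical struct (not the SDK's options type), and the worst-case queue size reuses the ~500 bytes per queued span estimate from Section 3.5.2.

```cpp
#include <cstddef>
#include <cstdint>

struct BatchSettings
{
    std::size_t maxExportBatchSize;
    std::int64_t scheduleDelayMs;
    std::size_t maxQueueSize;
};

constexpr BatchSettings kLowLatency{128, 1000, 512};
constexpr BatchSettings kHighThroughput{1024, 10000, 8192};
constexpr BatchSettings kMemoryConstrained{256, 2000, 512};

// A batch larger than the queue could never be filled.
constexpr bool isValid(BatchSettings const& s)
{
    return s.maxExportBatchSize > 0 && s.scheduleDelayMs > 0 &&
        s.maxExportBatchSize <= s.maxQueueSize;
}

// Memory held by a completely full export queue.
constexpr std::size_t worstCaseQueueBytes(
    BatchSettings const& s,
    std::size_t bytesPerQueuedSpan = 500)
{
    return s.maxQueueSize * bytesPerQueuedSpan;
}
```

Under these assumptions a full queue holds about 256,000 bytes for the 512-entry presets and about 4,096,000 bytes for the high-throughput preset, which is the main memory trade-off between the rows.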

3.7.3 Conditional Instrumentation

// Compile-time feature flag: when XRPL_ENABLE_TELEMETRY is not defined,
// the macro expands to a no-op expression, so disabled builds pay nothing.
#ifndef XRPL_ENABLE_TELEMETRY
#define XRPL_TRACE_SPAN(t, n) ((void)0)
#endif

// Runtime component filtering
if (telemetry.shouldTracePeer())
{
    XRPL_TRACE_SPAN(telemetry, "peer.message.receive");
    // ... instrumentation
}
// When peer tracing is disabled, only the shouldTracePeer() check runs
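
To make the pattern concrete, here is a self-contained sketch of how such a macro could expand to an RAII guard. `RecordingTelemetry`, this `SpanGuard`, and `openSpansDuringWork` are hypothetical stand-ins written for this example, not the classes from SpanGuard.h; only the macro name and the `_xrpl_guard_` variable name are taken from this document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recorder standing in for the real telemetry backend.
class RecordingTelemetry
{
public:
    void onStart(std::string name)
    {
        started_.push_back(std::move(name));
        ++open_;
    }
    void onEnd()
    {
        assert(open_ > 0);
        --open_;
    }
    std::size_t openSpans() const { return open_; }
    std::vector<std::string> const& started() const { return started_; }

private:
    std::vector<std::string> started_;
    std::size_t open_ = 0;
};

// RAII guard: starts a span on construction, ends it on scope exit.
class SpanGuard
{
public:
    SpanGuard(RecordingTelemetry& t, char const* name) : t_(t) { t_.onStart(name); }
    ~SpanGuard() { t_.onEnd(); }
    SpanGuard(SpanGuard const&) = delete;
    SpanGuard& operator=(SpanGuard const&) = delete;

private:
    RecordingTelemetry& t_;
};

// Normally set by the build system; defined here so the sketch traces.
#define XRPL_ENABLE_TELEMETRY 1

#ifdef XRPL_ENABLE_TELEMETRY
#define XRPL_TRACE_SPAN(t, n) SpanGuard _xrpl_guard_((t), (n))
#else
#define XRPL_TRACE_SPAN(t, n) ((void)0)
#endif

// Returns how many spans are open while the "work" runs.
inline std::size_t openSpansDuringWork(RecordingTelemetry& telemetry, bool tracePeer)
{
    if (tracePeer)
    {
        XRPL_TRACE_SPAN(telemetry, "peer.message.receive");
        return telemetry.openSpans();  // guard is still alive here
    }
    return telemetry.openSpans();  // runtime filter skipped the span
}
```

Because the guard is declared inside the `if` block, the span ends when that block exits, so any work meant to be covered by the span has to live inside the same scope.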

3.8 Links to Detailed Documentation

Code Samples: Complete implementation code for all components
Configuration Reference: Configuration options and collector setup
Implementation Phases: Detailed timeline and milestones

3.9 Code Intrusiveness Assessment

TxQ = Transaction Queue

This section provides a detailed assessment of how intrusive the OpenTelemetry integration is to the existing rippled codebase.

3.9.1 Files Modified Summary

Component	Files Modified	Lines Added	Lines Changed	Architectural Impact
Core Telemetry	5 new files	~800	0	None (new module)
Application Init	2 files	~30	~5	Minimal
RPC Layer	3 files	~80	~20	Minimal
Transaction Relay	4 files	~120	~40	Low
Consensus	3 files	~100	~30	Low-Medium
Protocol Buffers	1 file	~25	0	Low
CMake/Build	3 files	~50	~10	Minimal
PathFinding	2 files	~80	~5	Minimal
TxQ/Fee	2 files	~60	~5	Minimal
Validator/Amend	3 files	~40	~5	Minimal
Total	~28 files	~1,385	~120	Low

3.9.2 Detailed File Impact

pie title Code Changes by Component
    "New Telemetry Module" : 800
    "Transaction Relay" : 160
    "Consensus" : 130
    "RPC Layer" : 100
    "PathFinding" : 80
    "TxQ/Fee" : 60
    "Validator/Amendment" : 40
    "Application Init" : 35
    "Protocol Buffers" : 25
    "Build System" : 60

New Files (No Impact on Existing Code)

File	Lines	Purpose
`include/xrpl/telemetry/Telemetry.h`	~160	Main interface
`include/xrpl/telemetry/SpanGuard.h`	~120	RAII wrapper
`include/xrpl/telemetry/TraceContext.h`	~80	Context propagation
`src/xrpld/telemetry/TracingInstrumentation.h`	~60	Macros
`src/libxrpl/telemetry/Telemetry.cpp`	~200	Implementation
`src/libxrpl/telemetry/TelemetryConfig.cpp`	~60	Config parsing
`src/libxrpl/telemetry/NullTelemetry.cpp`	~40	No-op implementation

Modified Files (Existing Rippled Code)

File	Lines Added	Lines Changed	Risk Level
`src/xrpld/app/main/Application.cpp`	~15	~3	Low
`include/xrpl/app/main/Application.h`	~5	~2	Low
`src/xrpld/rpc/detail/ServerHandler.cpp`	~40	~10	Low
`src/xrpld/rpc/handlers/*.cpp`	~30	~8	Low
`src/xrpld/overlay/detail/PeerImp.cpp`	~60	~15	Medium
`src/xrpld/overlay/detail/OverlayImpl.cpp`	~30	~10	Medium
`src/xrpld/app/consensus/RCLConsensus.cpp`	~50	~15	Medium
`src/xrpld/app/consensus/RCLConsensusAdaptor.cpp`	~40	~12	Medium
`src/xrpld/core/JobQueue.cpp`	~20	~5	Low
`src/xrpld/app/paths/PathRequest.cpp`	~40	~3	Low
`src/xrpld/app/paths/Pathfinder.cpp`	~40	~2	Low
`src/xrpld/app/misc/TxQ.cpp`	~40	~3	Low
`src/xrpld/app/main/LoadManager.cpp`	~20	~2	Low
`src/xrpld/app/misc/ValidatorList.cpp`	~20	~2	Low
`src/xrpld/app/misc/AmendmentTable.cpp`	~10	~2	Low
`src/xrpld/app/misc/Manifest.cpp`	~10	~1	Low
`src/xrpld/shamap/SHAMap.cpp`	~20	~3	Low
`src/xrpld/overlay/detail/ripple.proto`	~25	0	Low
`CMakeLists.txt`	~40	~8	Low
`cmake/FindOpenTelemetry.cmake`	~50	0	None (new)

3.9.3 Risk Assessment by Component

Do First ↖ ↗ Plan Carefully

quadrantChart
    title Code Intrusiveness Risk Matrix
    x-axis Low Risk --> High Risk
    y-axis Low Value --> High Value

    RPC Tracing: [0.2, 0.55]
    Transaction Relay: [0.55, 0.85]
    Consensus Tracing: [0.75, 0.92]
    Peer Message Tracing: [0.85, 0.35]
    JobQueue Context: [0.3, 0.42]
    Ledger Acquisition: [0.48, 0.65]
    PathFinding: [0.38, 0.72]
    TxQ and Fees: [0.25, 0.62]
    Validator Mgmt: [0.15, 0.35]

Optional ↙ ↘ Avoid

Risk Level Definitions

Risk Level	Definition	Mitigation
Low	Additive changes only; no modification to existing logic	Standard code review
Medium	Minor modifications to existing functions; clear boundaries	Comprehensive unit tests
High	Changes to core logic or data structures; potential side effects	Integration tests + staged rollout

3.9.4 Architectural Impact Assessment

Aspect	Impact	Justification
Data Flow	Minimal	Read-only instrumentation; no modification to consensus or transaction data flow
Threading Model	Minimal	Context propagation uses thread-local storage (standard OTel pattern)
Memory Model	Low	Bounded queues prevent unbounded growth; RAII ensures cleanup
Network Protocol	Low	Optional fields in protobuf (high field numbers); backward compatible
Configuration	None	New config section; existing configs unaffected
Build System	Low	Optional CMake flag; builds work without OpenTelemetry
Dependencies	Low	OpenTelemetry SDK is optional; null implementation when disabled

3.9.5 Backward Compatibility

Compatibility	Status	Notes
Config File	✅ Full	New `[telemetry]` section is optional
Protocol	✅ Full	Optional protobuf fields with high field numbers
Build	✅ Full	`XRPL_ENABLE_TELEMETRY=OFF` produces identical binary
Runtime	✅ Full	`enabled=0` produces zero overhead
API	✅ Full	No changes to public RPC or P2P APIs

3.9.6 Rollback Strategy

If issues are discovered after deployment:

Immediate: Set enabled=0 in config and restart (zero code change)
Quick: Rebuild with XRPL_ENABLE_TELEMETRY=OFF
Complete: Revert telemetry commits (clean separation makes this easy)

3.9.7 Code Change Examples

Minimal RPC Instrumentation (Low Intrusiveness):

// Before
void ServerHandler::onRequest(...) {
    auto result = processRequest(req);
    send(result);
}

// After (3 lines added in this excerpt)
void ServerHandler::onRequest(...) {
    XRPL_TRACE_RPC(app_.getTelemetry(), "rpc.request");  // +1 line
    XRPL_TRACE_SET_ATTR("xrpl.rpc.command", command);     // +1 line

    auto result = processRequest(req);

    XRPL_TRACE_SET_ATTR("xrpl.rpc.status", status);       // +1 line
    send(result);
}

Consensus Instrumentation (Medium Intrusiveness):

// Before
void RCLConsensusAdaptor::startRound(...) {
    // ... existing logic
}

// After (context storage required)
void RCLConsensusAdaptor::startRound(...) {
    XRPL_TRACE_CONSENSUS(app_.getTelemetry(), "consensus.round");
    XRPL_TRACE_SET_ATTR("xrpl.consensus.ledger.seq", seq);

    // Store context for child spans in phase transitions
    currentRoundContext_ = _xrpl_guard_->context();  // New member variable

    // ... existing logic unchanged
}
