From 6e8f0624cee53d72654d8028ac2ff82695b2995b Mon Sep 17 00:00:00 2001 From: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> Date: Fri, 20 Feb 2026 15:41:01 +0000 Subject: [PATCH] compare with other open source vendors Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> --- OpenTelemetryPlan/00-tracing-fundamentals.md | 7 +- OpenTelemetryPlan/01-architecture-analysis.md | 4 +- OpenTelemetryPlan/02-design-decisions.md | 75 +++++++++++-------- .../03-implementation-strategy.md | 9 ++- OpenTelemetryPlan/04-code-samples.md | 2 +- .../05-configuration-reference.md | 26 +++---- OpenTelemetryPlan/06-implementation-phases.md | 10 ++- .../07-observability-backends.md | 7 +- OpenTelemetryPlan/08-appendix.md | 2 +- OpenTelemetryPlan/OpenTelemetryPlan.md | 3 +- cspell.config.yaml | 6 ++ presentation.md | 37 ++++++--- 12 files changed, 121 insertions(+), 67 deletions(-) diff --git a/OpenTelemetryPlan/00-tracing-fundamentals.md b/OpenTelemetryPlan/00-tracing-fundamentals.md index e623ea351c..1e61ed9584 100644 --- a/OpenTelemetryPlan/00-tracing-fundamentals.md +++ b/OpenTelemetryPlan/00-tracing-fundamentals.md @@ -136,6 +136,7 @@ For traces to work across nodes, **trace context must be propagated** in message ### How span_id Changes at Each Hop Only **one** `span_id` travels in the context - the sender's current span. Each node: + 1. Extracts the received `span_id` and uses it as the `parent_span_id` 2. Creates a **new** `span_id` for its own span 3. Sends its own `span_id` as the parent when forwarding @@ -192,19 +193,23 @@ message TMTransaction { Not every trace needs to be recorded. **Sampling** reduces overhead: ### Head Sampling (at trace start) + ``` Request arrives → Random 10% chance → Record or skip entire trace ``` + - ✅ Low overhead - ❌ May miss interesting traces ### Tail Sampling (after trace completes) + ``` Trace completes → Collector evaluates: - Error? → KEEP - Slow? → KEEP - Normal? 
→ Sample 10% ``` + - ✅ Never loses important traces - ❌ Higher memory usage at collector @@ -236,4 +241,4 @@ Trace completes → Collector evaluates: --- -*Next: [Architecture Analysis](./01-architecture-analysis.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Next: [Architecture Analysis](./01-architecture-analysis.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/01-architecture-analysis.md b/OpenTelemetryPlan/01-architecture-analysis.md index d29ebf21b3..9eb448d78c 100644 --- a/OpenTelemetryPlan/01-architecture-analysis.md +++ b/OpenTelemetryPlan/01-architecture-analysis.md @@ -250,6 +250,7 @@ After implementing OpenTelemetry, operators and developers will gain visibility ### 1.8.3 Concrete Dashboard Examples **Transaction Trace View (Jaeger/Tempo):** + ``` ┌────────────────────────────────────────────────────────────────────────────────┐ │ Trace: abc123... (Transaction Submission) Duration: 847ms │ @@ -270,6 +271,7 @@ After implementing OpenTelemetry, operators and developers will gain visibility ``` **RPC Performance Dashboard Panel:** + ``` ┌─────────────────────────────────────────────────────────────┐ │ RPC Command Latency (Last 1 Hour) │ @@ -325,4 +327,4 @@ xychart-beta --- -*Next: [Design Decisions](./02-design-decisions.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Next: [Design Decisions](./02-design-decisions.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/02-design-decisions.md b/OpenTelemetryPlan/02-design-decisions.md index a8deabe2c3..793dd6b5ac 100644 --- a/OpenTelemetryPlan/02-design-decisions.md +++ b/OpenTelemetryPlan/02-design-decisions.md @@ -95,6 +95,7 @@ opts.content_type = otlp::HttpRequestContentType::kJson; // or kBinary ``` **Examples**: + - `tx.receive` - Transaction received from peer - `consensus.phase.establish` - Consensus establish phase - `rpc.command.server_info` - server_info RPC command @@ -104,51 +105,51 @@ opts.content_type = 
otlp::HttpRequestContentType::kJson; // or kBinary ```yaml # Transaction Spans tx: - receive: "Transaction received from network" - validate: "Transaction signature/format validation" - process: "Full transaction processing" - relay: "Transaction relay to peers" - apply: "Apply transaction to ledger" + receive: "Transaction received from network" + validate: "Transaction signature/format validation" + process: "Full transaction processing" + relay: "Transaction relay to peers" + apply: "Apply transaction to ledger" # Consensus Spans consensus: - round: "Complete consensus round" + round: "Complete consensus round" phase: - open: "Open phase - collecting transactions" + open: "Open phase - collecting transactions" establish: "Establish phase - reaching agreement" - accept: "Accept phase - applying consensus" + accept: "Accept phase - applying consensus" proposal: - receive: "Receive peer proposal" - send: "Send our proposal" + receive: "Receive peer proposal" + send: "Send our proposal" validation: - receive: "Receive peer validation" - send: "Send our validation" + receive: "Receive peer validation" + send: "Send our validation" # RPC Spans rpc: - request: "HTTP/WebSocket request handling" + request: "HTTP/WebSocket request handling" command: - "*": "Specific RPC command (dynamic)" + "*": "Specific RPC command (dynamic)" # Peer Spans peer: - connect: "Peer connection establishment" - disconnect: "Peer disconnection" + connect: "Peer connection establishment" + disconnect: "Peer disconnection" message: - send: "Send protocol message" - receive: "Receive protocol message" + send: "Send protocol message" + receive: "Receive protocol message" # Ledger Spans ledger: - acquire: "Ledger acquisition from network" - build: "Build new ledger" - validate: "Ledger validation" - close: "Close ledger" + acquire: "Ledger acquisition from network" + build: "Build new ledger" + validate: "Ledger validation" + close: "Close ledger" # Job Spans job: - enqueue: "Job added to queue" - 
execute: "Job execution" + enqueue: "Job added to queue" + execute: "Job execution" ``` --- @@ -173,6 +174,7 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = ### 2.4.2 Span Attributes by Category #### Transaction Attributes + ```cpp "xrpl.tx.hash" = string // Transaction hash (hex) "xrpl.tx.type" = string // "Payment", "OfferCreate", etc. @@ -184,6 +186,7 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = ``` #### Consensus Attributes + ```cpp "xrpl.consensus.round" = int64 // Round number "xrpl.consensus.phase" = string // "open", "establish", "accept" @@ -196,6 +199,7 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = ``` #### RPC Attributes + ```cpp "xrpl.rpc.command" = string // Command name "xrpl.rpc.version" = int64 // API version @@ -204,6 +208,7 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = ``` #### Peer & Message Attributes + ```cpp "xrpl.peer.id" = string // Peer public key (base58) "xrpl.peer.address" = string // IP:port @@ -215,6 +220,7 @@ resource::SemanticConventions::SERVICE_INSTANCE_ID = ``` #### Ledger & Job Attributes + ```cpp "xrpl.ledger.hash" = string // Ledger hash "xrpl.ledger.index" = int64 // Ledger sequence/index @@ -352,6 +358,7 @@ rippled already has two observability mechanisms. OpenTelemetry complements (not ### 2.6.2 What Each Framework Does Best #### PerfLog + - **Purpose**: Detailed local event logging for RPC and job execution - **Strengths**: - Rich JSON output with timing data @@ -373,6 +380,7 @@ rippled already has two observability mechanisms. 
OpenTelemetry complements (not ``` #### Beast Insight (StatsD) + - **Purpose**: Real-time metrics for monitoring dashboards - **Strengths**: - Aggregated metrics (counters, gauges, histograms) @@ -391,6 +399,7 @@ insight.timing("consensus.round", duration); ``` #### OpenTelemetry (NEW) + - **Purpose**: Distributed request tracing across nodes - **Strengths**: - **Cross-node correlation** via `trace_id` @@ -411,14 +420,14 @@ span->SetAttribute("peer.id", peerId); ### 2.6.3 When to Use Each -| Scenario | PerfLog | StatsD | OpenTelemetry | -| --------------------------------------- | --------- | ------ | ------------- | -| "How many TXs per second?" | ❌ | ✅ | ❌ | -| "What's the p99 RPC latency?" | ❌ | ✅ | ✅ | -| "Why was this specific TX slow?" | ⚠️ partial | ❌ | ✅ | -| "Which node delayed consensus?" | ❌ | ❌ | ✅ | -| "What happened on node X at time T?" | ✅ | ❌ | ✅ | -| "Show me the TX journey across 5 nodes" | ❌ | ❌ | ✅ | +| Scenario | PerfLog | StatsD | OpenTelemetry | +| --------------------------------------- | ---------- | ------ | ------------- | +| "How many TXs per second?" | ❌ | ✅ | ❌ | +| "What's the p99 RPC latency?" | ❌ | ✅ | ✅ | +| "Why was this specific TX slow?" | ⚠️ partial | ❌ | ✅ | +| "Which node delayed consensus?" | ❌ | ❌ | ✅ | +| "What happened on node X at time T?" 
| ✅ | ❌ | ✅ | +| "Show me the TX journey across 5 nodes" | ❌ | ❌ | ✅ | ### 2.6.4 Coexistence Strategy @@ -482,4 +491,4 @@ Status doCommand(RPC::JsonContext& context, Json::Value& result) --- -*Previous: [Architecture Analysis](./01-architecture-analysis.md)* | *Next: [Implementation Strategy](./03-implementation-strategy.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Architecture Analysis](./01-architecture-analysis.md)_ | _Next: [Implementation Strategy](./03-implementation-strategy.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/03-implementation-strategy.md b/OpenTelemetryPlan/03-implementation-strategy.md index 05be0fce32..723fe4978a 100644 --- a/OpenTelemetryPlan/03-implementation-strategy.md +++ b/OpenTelemetryPlan/03-implementation-strategy.md @@ -191,6 +191,7 @@ xychart-beta ``` **Notes**: + - Memory increases linearly with span rate - Batch export prevents unbounded growth - Queue size is configurable (default 2048 spans) @@ -386,8 +387,8 @@ quadrantChart ### 3.9.5 Backward Compatibility -| Compatibility | Status | Notes | -| --------------- | ------ | ----------------------------------------------------- | +| Compatibility | Status | Notes | +| --------------- | ------- | ----------------------------------------------------- | | **Config File** | ✅ Full | New `[telemetry]` section is optional | | **Protocol** | ✅ Full | Optional protobuf fields with high field numbers | | **Build** | ✅ Full | `XRPL_ENABLE_TELEMETRY=OFF` produces identical binary | @@ -405,6 +406,7 @@ If issues are discovered after deployment: ### 3.9.7 Code Change Examples **Minimal RPC Instrumentation (Low Intrusiveness):** + ```cpp // Before void ServerHandler::onRequest(...) { @@ -425,6 +427,7 @@ void ServerHandler::onRequest(...) { ``` **Consensus Instrumentation (Medium Intrusiveness):** + ```cpp // Before void RCLConsensusAdaptor::startRound(...) { @@ -445,4 +448,4 @@ void RCLConsensusAdaptor::startRound(...) 
{ --- -*Previous: [Design Decisions](./02-design-decisions.md)* | *Next: [Code Samples](./04-code-samples.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Design Decisions](./02-design-decisions.md)_ | _Next: [Code Samples](./04-code-samples.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/04-code-samples.md b/OpenTelemetryPlan/04-code-samples.md index df192a33ac..3daf6adfbf 100644 --- a/OpenTelemetryPlan/04-code-samples.md +++ b/OpenTelemetryPlan/04-code-samples.md @@ -979,4 +979,4 @@ flowchart TB --- -*Previous: [Implementation Strategy](./03-implementation-strategy.md)* | *Next: [Configuration Reference](./05-configuration-reference.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Implementation Strategy](./03-implementation-strategy.md)_ | _Next: [Configuration Reference](./05-configuration-reference.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/05-configuration-reference.md b/OpenTelemetryPlan/05-configuration-reference.md index fcfbea4873..b13cc839ab 100644 --- a/OpenTelemetryPlan/05-configuration-reference.md +++ b/OpenTelemetryPlan/05-configuration-reference.md @@ -506,7 +506,7 @@ service: ```yaml # docker-compose-telemetry.yaml -version: '3.8' +version: "3.8" services: # OpenTelemetry Collector @@ -517,8 +517,8 @@ services: volumes: - ./otel-collector-dev.yaml:/etc/otel-collector-config.yaml:ro ports: - - "4317:4317" # OTLP gRPC - - "4318:4318" # OTLP HTTP + - "4317:4317" # OTLP gRPC + - "4318:4318" # OTLP HTTP - "13133:13133" # Health check depends_on: - jaeger @@ -628,8 +628,8 @@ datasources: httpMethod: GET tracesToLogs: datasourceUid: loki - tags: ['service.name', 'xrpl.tx.hash'] - mappedTags: [{ key: 'trace_id', value: 'traceID' }] + tags: ["service.name", "xrpl.tx.hash"] + mappedTags: [{ key: "trace_id", value: "traceID" }] mapTagNamesEnabled: true filterByTraceID: true serviceMap: @@ -656,7 +656,7 @@ datasources: jsonData: tracesToLogs: 
datasourceUid: loki - tags: ['service.name'] + tags: ["service.name"] ``` #### Elastic APM @@ -685,10 +685,10 @@ datasources: apiVersion: 1 providers: - - name: 'rippled-dashboards' + - name: "rippled-dashboards" orgId: 1 - folder: 'rippled' - folderUid: 'rippled' + folder: "rippled" + folderUid: "rippled" type: file disableDeletion: false updateIntervalSeconds: 30 @@ -880,7 +880,7 @@ In Tempo data source configuration, set up the derived field: jsonData: tracesToLogs: datasourceUid: loki - tags: ['trace_id', 'xrpl.tx.hash'] + tags: ["trace_id", "xrpl.tx.hash"] filterByTraceID: true filterBySpanID: false ``` @@ -894,9 +894,9 @@ To correlate traces with existing Beast Insight metrics: ```yaml # prometheus.yaml scrape_configs: - - job_name: 'rippled-statsd' + - job_name: "rippled-statsd" static_configs: - - targets: ['statsd-exporter:9102'] + - targets: ["statsd-exporter:9102"] ``` **Step 2: Add exemplars to metrics** @@ -933,4 +933,4 @@ This allows clicking on metric data points to jump directly to the related trace --- -*Previous: [Code Samples](./04-code-samples.md)* | *Next: [Implementation Phases](./06-implementation-phases.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Code Samples](./04-code-samples.md)_ | _Next: [Implementation Phases](./06-implementation-phases.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/06-implementation-phases.md b/OpenTelemetryPlan/06-implementation-phases.md index 71a73aabbd..10b97333ee 100644 --- a/OpenTelemetryPlan/06-implementation-phases.md +++ b/OpenTelemetryPlan/06-implementation-phases.md @@ -315,6 +315,7 @@ flowchart TB **Goal**: Get basic tracing working with minimal code changes. 
**What You Get**: + - RPC request/response traces for all commands - Latency breakdown per RPC command - Error visibility with stack traces @@ -323,6 +324,7 @@ flowchart TB **Code Changes**: ~15 lines in `ServerHandler.cpp`, ~40 lines in new telemetry module **Why Start Here**: + - RPC is the lowest-risk, highest-visibility component - Immediate value for debugging client issues - No cross-node complexity @@ -333,6 +335,7 @@ flowchart TB **Goal**: Add transaction lifecycle tracing across nodes. **What You Get**: + - End-to-end transaction traces from submit to relay - Cross-node correlation (see transaction path) - HashRouter deduplication visibility @@ -341,6 +344,7 @@ flowchart TB **Code Changes**: ~120 lines across 4 files, plus protobuf extension **Why Do This Second**: + - Builds on RPC tracing (transactions submitted via RPC) - Moderate complexity (requires context propagation) - High value for debugging transaction issues @@ -350,6 +354,7 @@ flowchart TB **Goal**: Full observability including consensus. **What You Get**: + - Complete consensus round visibility - Phase transition timing - Validator proposal tracking @@ -358,6 +363,7 @@ flowchart TB **Code Changes**: ~100 lines across 3 consensus files **Why Do This Last**: + - Highest complexity (consensus is critical path) - Requires thorough testing - Lower relative value (consensus issues are rarer) @@ -392,7 +398,7 @@ Clear, measurable criteria for each phase. 
| Criterion | Measurement | Target | | --------------- | ---------------------------------------------------------- | ---------------------------- | -| SDK Integration | `cmake --build` succeeds with `-DXRPL_ENABLE_TELEMETRY=ON` | ✅ Compiles | +| SDK Integration | `cmake --build` succeeds with `-DXRPL_ENABLE_TELEMETRY=ON` | ✅ Compiles | | Runtime Toggle | `enabled=0` produces zero overhead | <0.1% CPU difference | | Span Creation | Unit test creates and exports span | Span appears in Jaeger | | Configuration | All config options parsed correctly | Config validation tests pass | @@ -534,4 +540,4 @@ flowchart TB --- -*Previous: [Configuration Reference](./05-configuration-reference.md)* | *Next: [Observability Backends](./07-observability-backends.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Configuration Reference](./05-configuration-reference.md)_ | _Next: [Observability Backends](./07-observability-backends.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/07-observability-backends.md b/OpenTelemetryPlan/07-observability-backends.md index 73ca5dafd7..a90f41ae43 100644 --- a/OpenTelemetryPlan/07-observability-backends.md +++ b/OpenTelemetryPlan/07-observability-backends.md @@ -137,6 +137,7 @@ flowchart TB | **Gateway** | Central collector(s) | Centralized processing | Single point of failure | **Recommendation**: Use **Gateway** pattern with regional collectors for rippled networks: + - One collector cluster per datacenter/region - Tail-based sampling at collector level - Multiple export destinations for redundancy @@ -472,23 +473,27 @@ flowchart TB ### 7.7.3 Example: Debugging a Slow Transaction **Step 1: Find the trace** + ``` # In Grafana Explore with Tempo {resource.service.name="rippled" && span.xrpl.tx.hash="ABC123..."} ``` **Step 2: Get the trace_id from the trace view** + ``` Trace ID: 4bf92f3577b34da6a3ce929d0e0e4736 ``` **Step 3: Find related PerfLog entries** + ``` # In Grafana Explore with Loki 
{job="rippled"} |= "4bf92f3577b34da6a3ce929d0e0e4736" ``` **Step 4: Check Insight metrics for the time window** + ``` # In Grafana with Prometheus rate(rippled_tx_applied_total[1m]) @@ -587,4 +592,4 @@ rate(rippled_tx_applied_total[1m]) --- -*Previous: [Implementation Phases](./06-implementation-phases.md)* | *Next: [Appendix](./08-appendix.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Implementation Phases](./06-implementation-phases.md)_ | _Next: [Appendix](./08-appendix.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/08-appendix.md b/OpenTelemetryPlan/08-appendix.md index 30b2b68cb9..98470dd13c 100644 --- a/OpenTelemetryPlan/08-appendix.md +++ b/OpenTelemetryPlan/08-appendix.md @@ -130,4 +130,4 @@ flowchart TB --- -*Previous: [Observability Backends](./07-observability-backends.md)* | *Back to: [Overview](./OpenTelemetryPlan.md)* +_Previous: [Observability Backends](./07-observability-backends.md)_ | _Back to: [Overview](./OpenTelemetryPlan.md)_ diff --git a/OpenTelemetryPlan/OpenTelemetryPlan.md b/OpenTelemetryPlan/OpenTelemetryPlan.md index d02f9ca56a..96a1b697de 100644 --- a/OpenTelemetryPlan/OpenTelemetryPlan.md +++ b/OpenTelemetryPlan/OpenTelemetryPlan.md @@ -130,6 +130,7 @@ Performance optimization strategies include probabilistic head sampling (10% def ## 4. Code Samples Complete C++ implementation examples are provided for all telemetry components: + - `Telemetry.h` - Core interface for tracer access and span creation - `SpanGuard.h` - RAII wrapper for automatic span lifecycle management - `TracingInstrumentation.h` - Macros for conditional instrumentation @@ -186,4 +187,4 @@ The appendix contains a glossary of OpenTelemetry and rippled-specific terms, re --- -*This document provides a comprehensive implementation plan for integrating OpenTelemetry distributed tracing into the rippled XRP Ledger node software. 
For detailed information on any section, follow the links to the corresponding sub-documents.* +_This document provides a comprehensive implementation plan for integrating OpenTelemetry distributed tracing into the rippled XRP Ledger node software. For detailed information on any section, follow the links to the corresponding sub-documents._ diff --git a/cspell.config.yaml b/cspell.config.yaml index b2f4a33769..c90a51dd76 100644 --- a/cspell.config.yaml +++ b/cspell.config.yaml @@ -303,3 +303,9 @@ words: - xrplf - xxhash - xxhasher + - xychart + - otelc + - zpages + - traceql + - Gantt + - gantt diff --git a/presentation.md b/presentation.md index fb6e3c018c..7a443a635c 100644 --- a/presentation.md +++ b/presentation.md @@ -29,7 +29,24 @@ flowchart LR --- -## Slide 2: Comparison with Existing Solutions +## Slide 2: OpenTelemetry vs Open Source Alternatives + +| Feature | OpenTelemetry | Jaeger | Zipkin | SkyWalking | Pinpoint | Prometheus | +| ------------------- | ---------------- | ---------------- | ------------------ | ---------- | ---------- | ---------- | +| **Tracing** | YES | YES | YES | YES | YES | NO | +| **Metrics** | YES | NO | NO | YES | YES | YES | +| **Logs** | YES | NO | NO | YES | NO | NO | +| **C++ SDK** | YES Official | YES (Deprecated) | YES (Unmaintained) | NO | NO | YES | +| **Vendor Neutral** | YES Primary goal | NO | NO | NO | NO | NO | +| **Instrumentation** | Manual + Auto | Manual | Manual | Auto-first | Auto-first | Manual | +| **Backend** | Any (exporters) | Self | Self | Self | Self | Self | +| **CNCF Status** | Incubating | Graduated | NO | NO (Apache) | NO | Graduated | + +> **Why OpenTelemetry?** It's the only actively maintained, full-featured C++ option with vendor neutrality, allowing export to Jaeger, Prometheus, Grafana, or any commercial backend without changing instrumentation.
+ +--- + +## Slide 3: Comparison with rippled's Existing Solutions ### Current Observability Stack @@ -46,16 +63,16 @@ flowchart LR | Scenario | PerfLog | StatsD | OpenTelemetry | | -------------------------------- | ------- | ------ | ------------- | -| "How many TXs per second?" | ❌ | ✅ | ❌ | -| "Why was this specific TX slow?" | ⚠️ | ❌ | ✅ | -| "Which node delayed consensus?" | ❌ | ❌ | ✅ | -| "Show TX journey across 5 nodes" | ❌ | ❌ | ✅ | +| "How many TXs per second?" | ❌ | ✅ | ❌ | +| "Why was this specific TX slow?" | ⚠️ | ❌ | ✅ | +| "Which node delayed consensus?" | ❌ | ❌ | ✅ | +| "Show TX journey across 5 nodes" | ❌ | ❌ | ✅ | > **Key Insight**: OpenTelemetry **complements** (not replaces) existing systems. --- -## Slide 3: Architecture +## Slide 4: Architecture ### High-Level Integration Architecture @@ -103,7 +120,7 @@ sequenceDiagram --- -## Slide 4: Implementation Plan +## Slide 5: Implementation Plan ### 5-Phase Rollout (9 Weeks) @@ -143,7 +160,7 @@ gantt --- -## Slide 5: Performance Overhead +## Slide 6: Performance Overhead ### Estimated System Impact @@ -211,7 +228,7 @@ flowchart LR --- -## Slide 6: Data Collection & Privacy +## Slide 7: Data Collection & Privacy ### What Data is Collected @@ -260,4 +277,4 @@ flowchart LR --- -*End of Presentation* +_End of Presentation_