From 5598b0eac759bd072a0fa14ff2d090f0f629beaa Mon Sep 17 00:00:00 2001
From: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
Date: Tue, 9 Jun 2026 18:22:52 +0100
Subject: [PATCH] docs(telemetry): pin head sampling at 1.0, remove
 configurable ratio

Document that head sampling is intentionally fixed at 1.0 (100%) and that
the sampling_ratio config option is removed. A per-node ratio let nodes
make divergent keep/drop decisions for the same distributed trace,
producing partial traces; pinning at 1.0 with a ParentBased sampler keeps
decisions coherent across the network. Volume reduction is delegated to
collector-side tail sampling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 OpenTelemetryPlan/00-tracing-fundamentals.md  |  9 +++++++-
 OpenTelemetryPlan/04-code-samples.md          |  6 ++++--
 .../05-configuration-reference.md             | 21 +++++++------------
 .../07-observability-backends.md              |  4 ++--
 OpenTelemetryPlan/OpenTelemetryPlan.md        |  2 +-
 5 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/OpenTelemetryPlan/00-tracing-fundamentals.md b/OpenTelemetryPlan/00-tracing-fundamentals.md
index 24322bdd09..d7cc40c5ac 100644
--- a/OpenTelemetryPlan/00-tracing-fundamentals.md
+++ b/OpenTelemetryPlan/00-tracing-fundamentals.md
@@ -514,12 +514,19 @@ Not every trace needs to be recorded. **Sampling** reduces overhead:
 ### Head Sampling (at trace start)
 
 ```
-Request arrives → Random 10% chance → Record or skip entire trace
+Request arrives → Random N% chance → Record or skip entire trace
 ```
 
 - ✅ Low overhead
 - ❌ May miss interesting traces
 
+> **xrpld note**: xrpld intentionally fixes head sampling at 100% (sample
+> everything) and does not expose a configurable ratio. A per-node ratio
+> would let different nodes make divergent keep/drop decisions for the same
+> distributed trace, producing broken/partial traces. xrpld uses a
+> `ParentBased` sampler so spans with a remote parent honor the upstream
+> decision. Volume reduction is delegated to collector-side tail sampling.
+
 ### Tail Sampling (after trace completes)
 
 ```
diff --git a/OpenTelemetryPlan/04-code-samples.md b/OpenTelemetryPlan/04-code-samples.md
index 1452c30f5e..d70bcbc760 100644
--- a/OpenTelemetryPlan/04-code-samples.md
+++ b/OpenTelemetryPlan/04-code-samples.md
@@ -53,8 +53,10 @@ public:
         bool useTls = false;
         std::string tlsCertPath;
 
-        // Sampling configuration
-        double samplingRatio = 1.0;  // 1.0 = 100% sampling
+        // Head sampling: fixed at 1.0 (sample everything), not config-driven.
+        // Keeps trace keep/drop decisions coherent across nodes; volume
+        // reduction is delegated to the collector's tail sampling.
+        double samplingRatio = 1.0;
 
         // Batch processor settings
         std::uint32_t batchSize = 512;
diff --git a/OpenTelemetryPlan/05-configuration-reference.md b/OpenTelemetryPlan/05-configuration-reference.md
index d6f13e0d9d..0ea40c08a9 100644
--- a/OpenTelemetryPlan/05-configuration-reference.md
+++ b/OpenTelemetryPlan/05-configuration-reference.md
@@ -37,12 +37,11 @@ Add to `cfg/xrpld-example.cfg`:
 # # Path to CA certificate for TLS (optional)
 # # tls_ca_cert=/path/to/ca.crt
 #
-# # Sampling ratio: 0.0-1.0 (default: 1.0 = 100% sampling)
-# # Use lower values in production to reduce overhead
-# # Default: 1.0 (all traces). For production deployments with high
-# # throughput, 0.1 (10%) is recommended to reduce overhead.
-# # See Section 7.4.2 for sampling strategy details.
-# sampling_ratio=0.1
+# # Head sampling is intentionally fixed at 1.0 (sample everything) and is
+# # NOT configurable. A per-node head-sampling ratio would let nodes make
+# # divergent keep/drop decisions for the same distributed trace, producing
+# # broken/partial traces across the network. Volume reduction is delegated
+# # to the collector's tail sampling instead. See Section 7.4.2.
 #
 # # Batch processor settings
 # batch_size=512           # Spans per batch (default: 512)
@@ -78,7 +77,6 @@ enabled=0
 | `endpoint`            | string | `http://localhost:4318/v1/traces` | OTLP/HTTP collector endpoint              |
 | `use_tls`             | bool   | `false`                           | Enable TLS for exporter connection        |
 | `tls_ca_cert`         | string | `""`                              | Path to CA certificate file               |
-| `sampling_ratio`      | float  | `1.0`                             | Sampling ratio (0.0-1.0)                  |
 | `batch_size`          | uint   | `512`                             | Spans per export batch                    |
 | `batch_delay_ms`      | uint   | `5000`                            | Max delay before sending batch (ms)       |
 | `max_queue_size`      | uint   | `2048`                            | Maximum queued spans                      |
@@ -143,13 +141,8 @@ setup_Telemetry(
     setup.useTls = section.value_or("use_tls", false);
     setup.tlsCertPath = section.value_or("tls_ca_cert", "");
 
-    // Sampling
-    setup.samplingRatio = section.value_or("sampling_ratio", 1.0);
-    if (setup.samplingRatio < 0.0 || setup.samplingRatio > 1.0)
-    {
-        Throw<std::runtime_error>(
-            "telemetry.sampling_ratio must be between 0.0 and 1.0");
-    }
+    // Head sampling is fixed at 1.0 (sample everything) and is not read from
+    // config — see Section 7.4.2. setup.samplingRatio stays at its 1.0 default.
 
     // Batch processor
     setup.batchSize = section.value_or("batch_size", 512u);
diff --git a/OpenTelemetryPlan/07-observability-backends.md b/OpenTelemetryPlan/07-observability-backends.md
index a1c303b545..5d1638670a 100644
--- a/OpenTelemetryPlan/07-observability-backends.md
+++ b/OpenTelemetryPlan/07-observability-backends.md
@@ -171,7 +171,7 @@ flowchart TB
 ```mermaid
 flowchart LR
     subgraph head["Head Sampling (Node)"]
-        hs[Node-level head sampling<br/>configurable, default: 100%<br/>recommended production: 10%]
+        hs[Node-level head sampling<br/>fixed at 100%<br/>not configurable]
     end
 
     subgraph tail["Tail Sampling (Collector)"]
@@ -197,7 +197,7 @@ flowchart LR
 
 **Reading the diagram:**
 
-- **Head Sampling (Node)**: The first filter -- each xrpld node decides whether to sample a trace at creation time (default 100%, recommended 10% in production). This controls the volume leaving the node.
+- **Head Sampling (Node)**: xrpld pins head sampling at 100% (sample everything) and does not expose a configurable ratio. This is intentional: a per-node ratio would let different nodes make divergent keep/drop decisions for the same distributed trace, producing broken/partial traces. xrpld uses a `ParentBased` sampler so spans inheriting a remote parent honor the upstream decision. Volume reduction is delegated to the collector's tail sampling.
 - **Tail Sampling (Collector)**: The second filter -- the collector inspects completed traces and applies rules: keep all errors, keep anything slower than 5 seconds, and keep 10% of the remainder.
 - **Arrow head → tail**: All head-sampled traces flow to the collector, where tail sampling further reduces volume while preserving the most valuable data.
 - **Final Traces**: The output after both sampling stages; this is what gets stored and queried. The two-stage approach balances cost with debuggability.
diff --git a/OpenTelemetryPlan/OpenTelemetryPlan.md b/OpenTelemetryPlan/OpenTelemetryPlan.md
index 8f7476753b..3974d79481 100644
--- a/OpenTelemetryPlan/OpenTelemetryPlan.md
+++ b/OpenTelemetryPlan/OpenTelemetryPlan.md
@@ -148,7 +148,7 @@ Span naming follows a hierarchical `<component>.<operation>` convention (e.g., `
 
 The telemetry code is organized under `include/xrpl/telemetry/` for headers and `src/libxrpl/telemetry/` for implementation. Key principles include RAII-based span management via `SpanGuard`, conditional compilation with `XRPL_ENABLE_TELEMETRY`, and minimal runtime overhead through batch processing and efficient sampling.
 
-Performance optimization strategies include probabilistic head sampling (10% default), tail-based sampling at the collector for errors and slow traces, batch export to reduce network overhead, and conditional instrumentation that compiles to no-ops when disabled.
+Head sampling is fixed at 100% (intentionally not configurable, so trace keep/drop decisions stay coherent across nodes). Performance optimization strategies include tail-based sampling at the collector, which reduces volume while keeping errors and slow traces, batch export to reduce network overhead, and conditional instrumentation that compiles to no-ops when disabled.
 
 ➡️ **[Read full Implementation Strategy](./03-implementation-strategy.md)**