Two stray "rippled" tokens introduced by 43258e8d ("docs(telemetry):
add secure-OTel pipeline analysis…") were caught by check-rename in
CI. Re-run docs.sh to convert them to xrpld so the rename check
passes on PR #6425 (and downstream PR #6426 once merged up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document the threat model and chosen hardening approach for the OTel
pipeline: mTLS to the collector as primary defense (across-network
deployment), NetworkPolicy as defense-in-depth, and source-side
validation plus per-peer rate limiting for protocol::TraceContext on
peer messages. Skips Basic Auth (wrong shape for multi-operator
fleet) and HTTP-gateway header stripping (rippled is P2P).
Wires the new doc into the master plan ToC, mermaid diagram, and
body section, plus cross-refs from the privacy section in
02-design-decisions.md and the collector config in
05-configuration-reference.md so readers reach it from natural
in-context entry points. Adds a backlink at the top of secure-OTel.md
to the master plan.
Adds 'exfiltration' and 'htpasswd' to cspell dictionary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop xrpl.node.amendment_blocked / xrpl.node.server_state from telemetry
surface (constants in SpanNames.h, two filters in tempo.yaml). Operators
read the same data via server_info / server_state RPC; OTel SDK 1.18.0
cannot refresh resource attrs at runtime so resource-level emission was
not viable either.
- Namespace all pathfind span attributes under pathfind_* (underscore form
per Phase 1c rule 5). Renames in PathFindSpanNames.h and call sites in
PathRequest.cpp, PathRequestManager.cpp, plus the rule-5 retention
xrpl.pathfind.ledger_index -> pathfind_ledger_index.
- Wire pathfind_source_account / pathfind_dest_account on pathfind.request
in doPathFind / doRipplePathFind handlers (only when present + string).
- Collapse per-asset pathfind.discover / pathfind.rank spans into one
pathfind.discover hoisted around the per-source-asset loop in
PathRequest::findPaths. Span count goes from 2N to 1 per RPC call;
per-asset breakdown traded for bounded storage and cardinality. Trade-off
documented inline.
- Fix pathfind_num_paths semantics: now sums getBestPaths().size() across
the loop (paths actually returned) instead of the maxPaths input cap.
- PathRequestManager::updateAll: move span creation after the locked
requests_ snapshot, early-return when no active subscriptions exist
(avoids empty span on every ledger close), set pathfind_num_requests
= requests.size().
- Update Phase2_taskList.md and 02-design-decisions.md to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 4 added a span catalog in `06-implementation-phases.md` listing the
source location for each consensus span. Line numbers `Consensus.h:707`,
`RCLConsensus.cpp:232/341/492/541/900` drift on every refactor and would
become stale PR after PR. Filename alone is enough for operators to
grep — the RCLConsensus.cpp spans are already unambiguous from the span
name itself.
Phase-6 introduces ledger-operations, peer-network, and the five StatsD
dashboards. Align them with the rest of the chain:
- Rename dashboard UIDs from `rippled-*` to `xrpld-*` so the provisioned
UIDs match the post-rename-script documentation (`docs.sh` rewrites
.md but not .json, so the two drifted). Runbook references
`xrpld-rpc-perf`, `xrpld-transactions`, etc., now the JSON matches.
- Add the `$node` template variable + `exported_instance=~"$node"` filter
to every target in the five `statsd-*` dashboards. Mirrors the pattern
already used by consensus-health, ledger-operations, and peer-network
per the project rule that every dashboard must support per-node
filtering.
- Strip `:<line>` (and `:NN-NN` range) suffixes from C++ file references
in every dashboard panel description and in docker/telemetry/TESTING.md.
Line numbers drift on every refactor; the filename alone is enough to
grep.
- Replace stale `rpc.request` entries with the real emitted span names
(`rpc.http_request`, `rpc.ws_upgrade`, `rpc.ws_message`, `rpc.process`)
in TESTING.md so operators can copy-paste the filters and hit real
traces.
- Also drop the `:706` line ref from the `StatsDCollector.cpp` callout
in `06-implementation-phases.md`.
Phase-1a plan documents advertised OTLP/gRPC on port 4317 as the default
exporter, four unparsed [telemetry] config keys, and "Phase 4a Complete"
status with exit-criteria checkboxes marked done. Every downstream branch
through Phase 5 ships only OTLP/HTTP on port 4318 via OtlpHttpExporterFactory,
never parses the advertised keys, and the Phase 4 work is not yet delivered.
Fixes:
- 02-design-decisions.md: flip §2.1.1 SDK dependency recommendations to
OTLP/HTTP (shipped) with OTLP/gRPC marked Future. Update §2.2 architecture
diagram and text from OTLP/gRPC:4317 to OTLP/HTTP:4318. Rewrite §2.2.1 as
"OTLP/HTTP (Shipped)" and §2.2.2 as "OTLP/gRPC (Future Work — Planned
Upgrade)" with a concrete checklist (Conan dep, config parsing, factory
branch, runbook/dashboard updates) for landing the gRPC transport later.
- 05-configuration-reference.md: drop the fabricated exporter/otlp_grpc key
and the :4317 default from the sample config block and the options-summary
table. Move trace_pathfind, trace_txq, trace_validator, trace_amendment
into a new "Planned (not yet implemented)" table citing the phase that will
add each one. Keep the example config minimal so copy-paste does not produce
a silently-ignored stanza.
- 06-implementation-phases.md: reset Phase 4 Exit Criteria checkboxes from
[x] to [ ] (Phase 4 is not shipped at Phase-1a time). Rename "Phase 4a
Complete" to "Phase 4a Plan" and describe the work as future. Replace the
broken forward link to Phase4_taskList.md (introduced in the Phase 2 PR)
with a sentence pointing readers to where that spec will land. Renumber
the final section 6.12 to 6.11 so it sits directly after 6.10; section 6.11
("Effort Summary") was intentionally removed in earlier edits.
- otel-collector-config.yaml: spanmetrics dimensions use new bare names.
- tempo.yaml: TraceQL filter tags use new bare names.
- 02-design-decisions.md: strip xrpl.txq.* prefix from planned attrs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>