Commit Graph

6 Commits

Author SHA1 Message Date
Pratik Mankawde
f18ddd95c1 fix(telemetry): address clang-tidy errors on phase4 consensus tracing files
- Add [[nodiscard]] to getConsensusTraceStrategy, getYays, getNays
- Add missing <string>, SpanGuard.h, SpanNames.h includes
- Fix widening cast placement (cast before arithmetic, not after)
- Replace nested ternary with lambda for const dir variable
- Add braces to if/else-if chains in Consensus.h
- Concatenate nested namespaces in ConsensusSpanNames.h

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-29 19:54:23 +01:00
Pratik Mankawde
8fb33b0818 feat(telemetry): add Phase 4 consensus tracing with SpanGuard API
Instrument the consensus subsystem with OpenTelemetry spans covering
the full round lifecycle: round start, establish phase, proposal send,
ledger close, position updates, consensus check, accept, validation
send, and mode changes.

Key design choices adapted from the original Phase 4 implementation
to the new SpanGuard factory pattern introduced in Phase 3:

- Add SpanGuard::hashSpan() for category-gated hash-derived trace IDs
  (consensus round spans share trace_id across validators via ledger hash)
- Add SpanGuard::addEvent() overload with key-value attribute pairs
  (used for dispute.resolve events during position updates)
- Add ConsensusSpanNames.h with compile-time span name constants
  following the colocated *SpanNames.h pattern from Phase 3
- Add consensusTraceStrategy config option ("deterministic"/"attribute")
  for cross-node trace correlation strategy selection
- Use SpanGuard::linkedSpan() for follows-from relationships between
  consecutive rounds and cross-thread validation spans
- Use SpanGuard::captureContext() for thread-safe context propagation
  from consensus thread to jtACCEPT worker thread

Spans produced: consensus.round, consensus.proposal.send,
consensus.ledger_close, consensus.establish, consensus.update_positions,
consensus.check, consensus.accept, consensus.accept.apply,
consensus.validation.send, consensus.mode_change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 17:32:56 +01:00
Pratik Mankawde
d4e91b462e fix(telemetry): resolve clang-tidy warnings in Telemetry interfaces
Use C++17 concatenated namespaces, add [[nodiscard]] to query methods,
add missing direct includes, and use pass-by-value + std::move in
NullTelemetry constructor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-29 11:16:21 +01:00
Pratik Mankawde
e9c5c3520e fix(telemetry): address Phase 1b code review findings
Redesign SpanGuard with pimpl idiom to hide all OpenTelemetry types
from public headers. Add global Telemetry accessor so SpanGuard factory
methods work without explicit Telemetry references. Add child/linked
span creation and cross-thread context propagation. Update plan docs
to reflect macro removal in favor of SpanGuard factory pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 14:26:05 +01:00
Pratik Mankawde
3852b5ae4b fix(telemetry): address review findings and PR #6437 comments
Critical fixes:
- Restore accidentally removed mallocTrim call and MallocTrim.h include
- Add missing shouldTraceLedger() to interface and all implementations
- Derive networkId/networkType from config_->NETWORK_ID (0=mainnet,
  1=testnet, 2=devnet) instead of leaving defaults unpopulated
- Clamp sampling_ratio to [0.0, 1.0] in config parser

PR comment fixes:
- Rename rippled -> xrpld in service name defaults, getTracer() calls,
  Docker network, comments, and docs/build/telemetry.md
- Remove exporter config option (only otlp_http supported)
- Add trace_ledger and service_name to example config
- Clarify head-based sampling semantics in config comments
- Add filter descriptions for span intrinsic filters in Grafana datasource
- Add inline comments to Docker Compose services

Docker/config improvements:
- Remove deprecated version: "3.8" from docker-compose.yml
- Pin images: collector 0.121.0, grafana 11.5.2
- Add health_check extension to otel-collector-config.yaml
- Comment out Tempo metrics_generator remote_write (no Prometheus service)
- Add Prometheus datasource caveat in Grafana datasource config

Other:
- Revert unrelated formatting changes in ServiceRegistry.h
- Change Conan telemetry default to False (matches CMake OFF)
- Add CLAUDE.md-required docs (ASCII diagrams, usage examples,
  @note thread-safety) to Telemetry.h and SpanGuard.h

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 14:25:31 +01:00
Pratik Mankawde
88686af850 Phase 1b: Telemetry core infrastructure - CMake, Conan, SpanGuard, config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 14:25:31 +01:00