fix(telemetry): address code review issues in OTelCollector

- Fix use-after-free: extract gauge callback to static function and call
  RemoveCallback in ~OTelGaugeImpl() before unregistering from collector
- Use memory_order_acq_rel on callHooks() debounce CAS for proper
  happens-before relationship between hook invocations
- Add explicit 2s timeout to ForceFlush() in destructor to prevent
  blocking indefinitely when OTLP endpoint is unreachable at shutdown
- Add OTLP receiver to metrics pipeline so native OTel metrics from
  xrpld are actually received by the collector
- Remove stale health check port from docker-compose (extension was
  removed from collector config)
- Clarify fallback docs: StatsD path requires re-enabling receiver/port
- Fix comments: Counter uses uint64_t not int64_t, gauge clamps to
  [0, INT64_MAX] not [0, UINT64_MAX]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Pratik Mankawde
2026-05-06 14:24:52 +01:00
parent ed31bab500
commit 761688383d
5 changed files with 28 additions and 28 deletions

View File

@@ -27,7 +27,6 @@ services:
- "4317:4317" # OTLP gRPC
- "4318:4318" # OTLP HTTP (traces + native OTel metrics)
- "8889:8889" # Prometheus metrics (spanmetrics + OTLP)
- "13133:13133" # Health check
# StatsD UDP port removed — beast::insight now uses native OTLP.
# Uncomment if using server=statsd fallback:
# - "8125:8125/udp"

View File

@@ -2,7 +2,7 @@
#
# Pipelines:
# traces: OTLP receiver -> batch processor -> debug + Tempo + spanmetrics
# metrics: spanmetrics connector -> Prometheus exporter
# metrics: OTLP receiver + spanmetrics connector -> Prometheus exporter
#
# xrpld sends traces via OTLP/HTTP to port 4318. The collector batches
# them, forwards to Tempo, and derives RED metrics via the spanmetrics
@@ -55,5 +55,5 @@ service:
processors: [batch]
exporters: [debug, otlp/tempo, spanmetrics]
metrics:
receivers: [spanmetrics]
receivers: [otlp, spanmetrics]
exporters: [prometheus]