rippled

mirror of https://github.com/XRPLF/rippled.git synced 2026-07-24 07:30:30 +00:00

Author	SHA1	Message	Date
Pratik Mankawde	ac57a91b77	merge: phase-9 (dashboard UID + line-number cleanup, detach callbacks) into phase-10 # Conflicts: # docker/telemetry/TESTING.md	2026-05-14 17:23:55 +01:00
Pratik Mankawde	145b1469d6	fix(telemetry): rename phase-9 dashboard JSON files rippled-* -> xrpld-* File renames to match the post-docs.sh project-wide rename + the UID rename applied in the previous commit. Five phase-9 dashboards are affected: - rippled-fee-market.json -> xrpld-fee-market.json - rippled-job-queue.json -> xrpld-job-queue.json - rippled-peer-quality.json -> xrpld-peer-quality.json - rippled-rpc-perf.json -> xrpld-rpc-perf-otel.json - rippled-validator-health.json-> xrpld-validator-health.json `rippled-rpc-perf.json` is renamed to `xrpld-rpc-perf-otel.json` (rather than `xrpld-rpc-perf.json`) to avoid colliding with the phase-6 `rpc-performance.json` dashboard which also uses the `xrpld-rpc-perf` UID. The new filename matches its now-unique `xrpld-rpc-perf-otel` UID that was set in the merge commit.	2026-05-14 17:11:25 +01:00
Pratik Mankawde	a9f52458b3	merge: pratik/otel-phase8-log-correlation (dashboard UID + line-number cleanup) into pratik/otel-phase9-metric-gap-fill # Conflicts: # docker/telemetry/grafana/dashboards/consensus-health.json # docker/telemetry/grafana/dashboards/ledger-operations.json # docker/telemetry/grafana/dashboards/peer-network.json # docker/telemetry/grafana/dashboards/rpc-performance.json # docker/telemetry/grafana/dashboards/system-ledger-data-sync.json # docker/telemetry/grafana/dashboards/system-network-traffic.json # docker/telemetry/grafana/dashboards/system-node-health.json # docker/telemetry/grafana/dashboards/system-overlay-traffic-detail.json # docker/telemetry/grafana/dashboards/system-rpc-pathfinding.json # docker/telemetry/grafana/dashboards/transaction-overview.json	2026-05-14 17:10:12 +01:00
Pratik Mankawde	0e5e802e5e	merge: pratik/otel-phase7-native-metrics (dashboard UID + line-number cleanup) into pratik/otel-phase8-log-correlation	2026-05-14 17:07:34 +01:00
Pratik Mankawde	6985e1948b	merge: pratik/otel-phase6-statsd (line-number + docs cleanup) into pratik/otel-phase7-native-metrics # Conflicts: # OpenTelemetryPlan/06-implementation-phases.md # docker/telemetry/grafana/dashboards/system-ledger-data-sync.json # docker/telemetry/grafana/dashboards/system-network-traffic.json # docker/telemetry/grafana/dashboards/system-node-health.json # docker/telemetry/grafana/dashboards/system-overlay-traffic-detail.json # docker/telemetry/grafana/dashboards/system-rpc-pathfinding.json	2026-05-14 17:07:15 +01:00
Pratik Mankawde	1a36ef4b0f	fix(telemetry): rename remaining rippled-* dashboard UIDs + fix stale rpc.request span filter Follow-up to the phase-6 dashboard cleanup. The three dashboards introduced by commit `f6105ece98` (consensus-health, rpc-performance, transaction-overview) were missed in the initial UID rename and still carried `rippled-*` UIDs plus line-number refs in panel descriptions. - UIDs: `rippled-consensus` -> `xrpld-consensus`, `rippled-rpc-perf` -> `xrpld-rpc-perf`, `rippled-transactions` -> `xrpld-transactions`, matching the post-`docs.sh`-rename runbook and the other dashboards in this PR. - Strip `:<line>` suffixes from `ServerHandler.cpp`, `RCLConsensus.cpp`, `NetworkOPs.cpp`, etc. references in panel descriptions. Line numbers drift on every refactor; the filename is enough to grep. - Fix the Overall RPC Throughput panel: two targets filtered on `span_name="rpc.request"` (never emitted) instead of `span_name="rpc.http_request"` (the real emitted name). The panel would have shown zero data until this fix.	2026-05-14 16:58:47 +01:00
Pratik Mankawde	a789f6ccf5	docs(telemetry): fix stale rpc.request refs + drop unparsed exporter key in TESTING.md Follow-up to the dashboard cleanup on this branch. Caught additional sites in TESTING.md that still reference the never-emitted `rpc.request` span: - TraceQL query examples in Step 5 "Verify traces in Tempo" now filter on `name="rpc.http_request"` (the real emitted name). - Expected-spans table replaces `rpc.request` with `rpc.http_request`. - Query loop under the Prometheus verification section now iterates over the full set of emitted RPC entry-point names (`rpc.http_request`, `rpc.ws_upgrade`, `rpc.ws_message`, `rpc.process`). Also drop `exporter=otlp_http` from the sample telemetry config block. `TelemetryConfig.cpp` does not parse an `exporter` key in any phase through Phase 8; only OTLP/HTTP is wired up, so the line is either a silently ignored no-op or misleading documentation.	2026-05-14 16:53:40 +01:00
Pratik Mankawde	44cdc8133e	fix(telemetry): phase-6 dashboards — rename UIDs, add $node filter, drop line numbers Phase-6 introduces ledger-operations, peer-network, and the five StatsD dashboards. Align them with the rest of the chain: - Rename dashboard UIDs from `rippled-` to `xrpld-` so the provisioned UIDs match the post-rename-script documentation (`docs.sh` rewrites .md but not .json, so the two drifted). Runbook references `xrpld-rpc-perf`, `xrpld-transactions`, etc., now the JSON matches. - Add the `$node` template variable + `exported_instance=~"$node"` filter to every target in the five `statsd-*` dashboards. Mirrors the pattern already used by consensus-health, ledger-operations, and peer-network per the project rule that every dashboard must support per-node filtering. - Strip `:<line>` (and `:NN-NN` range) suffixes from C++ file references in every dashboard panel description and in docker/telemetry/TESTING.md. Line numbers drift on every refactor; the filename alone is enough to grep. - Replace stale `rpc.request` entries with the real emitted span names (`rpc.http_request`, `rpc.ws_upgrade`, `rpc.ws_message`, `rpc.process`) in TESTING.md so operators can copy-paste the filters and hit real traces. - Also drop the `:706` line ref from the `StatsDCollector.cpp` callout in `06-implementation-phases.md`.	2026-05-14 16:51:14 +01:00
Pratik Mankawde	34bf61ff77	merge: pratik/otel-phase9-metric-gap-fill fix(SpanKind) into pratik/otel-phase10-workload-validation # Conflicts: # docker/telemetry/otel-collector-config.yaml # docker/telemetry/xrpld-telemetry.cfg	2026-05-14 15:59:39 +01:00
Pratik Mankawde	53e1ff82d8	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-05-14 14:01:46 +01:00
Pratik Mankawde	8df3ea1bbe	Merge branch 'pratik/otel-phase7-native-metrics' into pratik/otel-phase8-log-correlation	2026-05-14 14:01:41 +01:00
Pratik Mankawde	5a6882f119	Merge branch 'pratik/otel-phase6-statsd' into pratik/otel-phase7-native-metrics # Conflicts: # docker/telemetry/otel-collector-config.yaml	2026-05-14 14:01:36 +01:00
Pratik Mankawde	b449db0434	fix(telemetry): align spanmetrics dimensions, Tempo tags, and dashboard queries with C++ attribute names Spanmetrics dimensions used xrpl.rpc.command etc. but C++ emits bare "command". Tempo tags for phase6-added consensus/tx/peer filters used qualified names but C++ uses bare names. Dashboard panel referenced xrpl_tx_suppressed (never populated) instead of suppressed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 14:01:12 +01:00
Pratik Mankawde	9babfff3c8	Merge branch 'pratik/otel-phase5-docs-deployment' into pratik/otel-phase6-statsd	2026-05-14 13:59:19 +01:00
Pratik Mankawde	61ab5c6fe3	fix(telemetry): align Tempo consensus search tags with C++ attribute names Consensus span attributes use bare names (close_time_correct, consensus_state, close_resolution_ms) and shared canonical attrs (xrpl.ledger.seq) per SpanNames.h. xrpl.consensus.mode and xrpl.consensus.round are correct (domain-qualified to avoid collision). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 13:59:08 +01:00
Pratik Mankawde	837f7e7b50	Merge branch 'pratik/otel-phase3-tx-tracing' into pratik/otel-phase4-consensus-tracing	2026-05-14 13:58:38 +01:00
Pratik Mankawde	b392035544	fix(telemetry): align Tempo TX search tags with C++ attribute names Transaction span attributes use bare names (local, tx_status) per SpanNames.h convention, not xrpl.tx.* qualified names. xrpl.tx.hash is correct (shared canonical attr defined in SpanNames.h). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 13:58:31 +01:00
Pratik Mankawde	450004ebd8	Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing	2026-05-14 13:58:19 +01:00
Pratik Mankawde	6f403fdd1b	fix(telemetry): align Tempo search tags with C++ span attribute names RPC span attributes use bare names (command, rpc_status, rpc_role) per the naming convention in SpanNames.h, not xrpl.rpc.* qualified names. Node health attributes (amendment_blocked, server_state) are resource attributes set at Tracer init, not span attributes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 13:58:13 +01:00
Pratik Mankawde	5dc4ae8fcc	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-05-14 13:49:59 +01:00
Pratik Mankawde	690841e934	Merge branch 'pratik/otel-phase7-native-metrics' into pratik/otel-phase8-log-correlation	2026-05-14 13:49:51 +01:00
Pratik Mankawde	7d61a4a0ef	feat(telemetry): add missing Phase 9 metric panels to dashboards 13 metrics from 09-data-collection-reference.md were not displayed on any Grafana dashboard. Adds panels for all of them: system-node-health.json (+7 panels): - NodeStore Bytes Read/Written (node_written_bytes, node_read_bytes) - NodeStore Read Threads & Duration (node_reads_duration_us, read_request_bundle, read_threads_running, read_threads_total) - AL_size added to Cache Sizes panel - Current Ledger Index (ledger_current_index) - NuDB Storage Size (storage_detail{metric="nudb_bytes"}) rippled-validator-health.json (+2 panels): - UNL Blocked (validator_health{metric="unl_blocked"}) - Agreement/Missed Counters Rate (validation_agreements_total, validation_missed_total) rippled-job-queue.json (+1 panel): - Transaction Overflow Rate (jq_trans_overflow_total) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 13:32:55 +01:00
Pratik Mankawde	93caaba5ca	fix(telemetry): recover Phase 6 dashboard panels lost during statsd→system rename Panels 8-15 from statsd-node-health.json and panels 8-9 from statsd-network-traffic.json were lost when Phase 7 renamed these files to system-*. The merge (`5cd71ed107`) took Phase 7's smaller version without the extra panels added by commit `b933e8ae00` on Phase 6. Recovered panels (system-node-health.json): - Key Jobs Execution Time (11 job types) - Key Jobs Dequeue Wait Time (11 job types) - FullBelowCache Size - FullBelowCache Hit Rate - Ledger Publish Gap (validated - published age delta) - State Duration Rate (Full vs Tracking) - All Jobs Execution Time Detail (34 job types) - All Jobs Dequeue Wait Detail (34 job types) Recovered panels (system-network-traffic.json): - Duplicate Traffic (Wasted Bandwidth) - All Traffic Categories Detail (topk 15 by byte rate) All recovered panels updated to include exported_instance=~"$node" filter per project dashboard guidelines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 12:33:18 +01:00
Pratik Mankawde	02fe838257	auto refresh at 5seconds Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>	2026-05-13 19:00:36 +01:00
Pratik Mankawde	20477e5494	validator path changes Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>	2026-05-13 18:49:21 +01:00
Pratik Mankawde	f0c6227c06	added config for devnet test run Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>	2026-05-13 18:42:57 +01:00
Pratik Mankawde	a04459f1f8	fix(telemetry): update collector config + tempo datasource + design doc for simplified attr names - otel-collector-config.yaml: spanmetrics dimensions use new bare names. - tempo.yaml: TraceQL filter tags use new bare names. - 02-design-decisions.md: strip xrpl.txq.* prefix from planned attrs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-13 16:47:36 +01:00
Pratik Mankawde	815e2b1f5d	refactor(telemetry): fix remaining old attr refs in tests, docs, workload - Update Telemetry.h doc example: xrpl.rpc.command -> command. - Update SpanGuardFactory.cpp test: use new bare attr names. - Update TESTING.md: rename attr refs in span table + PromQL example. - Update expected_spans.json: all attrs match simplified naming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-13 16:21:18 +01:00
Pratik Mankawde	ec8e3e2950	Merge branch 'pratik/otel-phase9-metric-gap-fill' into pratik/otel-phase10-workload-validation	2026-05-13 16:17:49 +01:00
Pratik Mankawde	495d5bd8a0	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-05-13 16:17:12 +01:00
Pratik Mankawde	6cd910f06f	Merge branch 'pratik/otel-phase7-native-metrics' into pratik/otel-phase8-log-correlation	2026-05-13 16:17:05 +01:00
Pratik Mankawde	5cd71ed107	Merge branch 'pratik/otel-phase6-statsd' into pratik/otel-phase7-native-metrics	2026-05-13 16:16:50 +01:00
Pratik Mankawde	9e27120a15	refactor(telemetry): simplify ledger/peer attr naming on phase-6, update dashboards - Add canonical ledgerHash (xrpl.ledger.hash) to SpanNames.h. - LedgerSpanNames: reuse shared canonicals (ledgerSeq, closeTime, closeTimeCorrect, closeResolutionMs, ledgerHash); bare names for tx_count, tx_failed, validations. - PeerSpanNames: reuse shared canonicals (peerId, ledgerHash); bare names for proposal_trusted, validation_full, validation_trusted. - Update call sites in BuildLedger.cpp, LedgerMaster.cpp, PeerImp.cpp. - Update 5 Grafana dashboards: strip xrpl.<domain>. prefix from per-span attr refs in PromQL/TraceQL queries. Keep rule-5 entries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-13 16:16:30 +01:00
Pratik Mankawde	592e546f82	fix(telemetry): align Phase 10 workload configs with xrpld_ metric prefix Phase 10's workload validation configs (expected_metrics.json, regression-metrics.json, validate_telemetry.py) queried the MetricsRegistry metrics under the rippled_ prefix, but MetricsRegistry emits them as xrpld_ (see MetricsRegistry.cpp). On a live run the workload validator reported every MetricsRegistry metric as missing, masking genuine regressions. Rename the following to xrpld_ across the workload validator, expected-metrics manifest, and regression-metrics template: - nodestore_state, cache_metrics, txq_metrics, load_factor_metrics, object_count - rpc_method_started_total / _finished_total / _errored_total / _duration_us - job_queued_total / _started_total / _finished_total / _queued_duration_us_bucket / _running_duration_us_bucket - peer_quality, server_info, validator_health, ledger_economy, db_metrics, complete_ledgers, build_info, state_tracking, storage_detail - ledgers_closed_total, validations_sent_total, validations_checked_total, state_changes_total - validation_agreement, validation_agreements_total, validation_missed_total Mirrors the phase-9 fix in commit `5601615952`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 15:01:13 +01:00
Pratik Mankawde	201da0e00d	Merge branch 'pratik/otel-phase9-metric-gap-fill' into pratik/otel-phase10-workload-validation	2026-05-13 14:59:45 +01:00
Pratik Mankawde	5601615952	fix(telemetry): align Phase 9 dashboards and integration-test with xrpld_ metric prefix MetricsRegistry emits OTel SDK metrics with the xrpld_ prefix (MetricsRegistry.cpp defines "xrpld_nodestore_state", "xrpld_cache_metrics", etc.), but the Phase 9 dashboards and the Step 10c integration-test assertions introduced in `892fee638a` queried the rippled_ prefix. Every Phase 9 panel and assertion therefore rendered "No data" or failed on a live run, even though the underlying series were being exported correctly. Rename the rippled_ prefix to xrpld_ for every MetricsRegistry metric in dashboards and the integration test: - nodestore_state, cache_metrics, txq_metrics, load_factor_metrics, object_count - rpc_method_started_total / _finished_total / _errored_total / _duration_us_bucket - job_queued_total / _started_total / _finished_total / _queued_duration_us_bucket / _running_duration_us_bucket - peer_quality, server_info, validator_health, ledger_economy, db_metrics, complete_ledgers, build_info, state_tracking - ledgers_closed_total, validations_sent_total, validations_checked_total, state_changes_total - validation_agreement (ValidationTracker 1h/24h/7d windows) Also add ValidationTracker window-gauge assertions to Step 10c of integration-test.sh so the 1h/24h/7d agreement and miss counts are checked alongside the other Phase 9 gauges. The rippled_ prefix is preserved for beast::insight metrics (rippled_LedgerMaster_, rippled_Peer_Finder_, rippled_total_, rippled_Overlay_, rippled_State_Accounting_, rippled_transactions_, rippled_proposals_, rippled_validations_Messages_) because those flow through the StatsD-style OTelCollector configured with `[insight] prefix=rippled` and remain on that prefix by design. Verified against a live 6-node consensus network: all 22 Phase 9 + ValidationTracker assertions now report 6+ series per metric. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 14:59:00 +01:00
Pratik Mankawde	8e9e852b74	Merge branch 'pratik/otel-phase9-metric-gap-fill' into pratik/otel-phase10-workload-validation	2026-05-13 12:24:15 +01:00
Pratik Mankawde	db04120f74	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-05-13 12:24:00 +01:00
Pratik Mankawde	fac3287912	fix(telemetry): use .batches for Tempo trace lookup in integration test Tempo /api/traces/{id} returns OTLP-shaped JSON with a top-level "batches" key, not "data". The cross-check in check_log_correlation was querying jq '.data | length' which always returned null, causing the Log-Tempo cross-check to fail even when the trace existed.	2026-05-13 12:16:41 +01:00
Pratik Mankawde	782d98d249	Merge branch 'pratik/otel-phase9-metric-gap-fill' into pratik/otel-phase10-workload-validation Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>	2026-05-13 11:40:15 +01:00
Pratik Mankawde	c096eeb239	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-05-13 11:30:22 +01:00
Pratik Mankawde	e49c5997b7	added loki config. Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>	2026-05-06 17:37:43 +01:00
Pratik Mankawde	85330920ac	feat(telemetry): add Loki service and filelog receiver for Phase 8 log ingestion Cherry-pick Loki infrastructure from phase-10 back to where it belongs (Phase 8, Tasks 8.2/8.3): - Add Loki 3.4.2 service to docker-compose.yml (port 3100) - Add filelog receiver to OTel Collector config (tails debug.log, regex_parser extracts trace_id/span_id/partition/severity) - Add otlphttp/loki exporter (uses Loki 3.x native OTLP ingestion) - Add logs pipeline: filelog -> batch -> otlphttp/loki - Add health_check extension - Mount xrpld log directory into collector container - Add prometheus-data and loki-data persistent volumes StatsD receiver intentionally excluded — Phase 7 migrated to native OTLP metrics, making the StatsD receiver unnecessary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 14:55:45 +01:00
Pratik Mankawde	fac6c3ac1d	Merge branch 'pratik/otel-phase7-native-metrics' into pratik/otel-phase8-log-correlation	2026-05-06 14:34:17 +01:00
Pratik Mankawde	a8549a7ab2	fix(telemetry): address code review findings for Phase 8 log-trace correlation - Replace GetSpan() with direct context value check in Logs::format() to avoid heap allocation (new DefaultSpan) on the no-span path - Restore Phase 7 documentation accidentally deleted during merge - Fix undefined $JAEGER variable → use $TEMPO in integration test - Remove useless LCOV_EXCL markers around #ifdef block - Fix indentation inconsistencies in Log.cpp injection block - Remove incorrect url field from loki.yaml derivedFields - Update stale code sample in Phase8_taskList.md to match implementation - Correct "<10ns" performance claims to accurate ~15-20ns (no-span) and ~50ns (active-span) measurements across all docs - Replace Jaeger references with Tempo in TESTING.md (port 16686→3200) - Improve error handling in check_log_correlation(): track files_scanned, detect missing log files, fix silent grep error masking Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 14:32:46 +01:00
Pratik Mankawde	761688383d	fix(telemetry): address code review issues in OTelCollector - Fix use-after-free: extract gauge callback to static function and call RemoveCallback in ~OTelGaugeImpl() before unregistering from collector - Use memory_order_acq_rel on callHooks() debounce CAS for proper happens-before relationship between hook invocations - Add explicit 2s timeout to ForceFlush() in destructor to prevent blocking indefinitely when OTLP endpoint is unreachable at shutdown - Add OTLP receiver to metrics pipeline so native OTel metrics from xrpld are actually received by the collector - Remove stale health check port from docker-compose (extension was removed from collector config) - Clarify fallback docs: StatsD path requires re-enabling receiver/port - Fix comments: Counter uses uint64_t not int64_t, gauge clamps to [0, INT64_MAX] not [0, UINT64_MAX] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-06 14:24:52 +01:00
Pratik Mankawde	a0477f9475	Merge branch 'pratik/otel-phase9-metric-gap-fill' into pratik/otel-phase10-workload-validation	2026-04-29 21:11:03 +01:00
Pratik Mankawde	1658d3dc40	Merge branch 'pratik/otel-phase8-log-correlation' into pratik/otel-phase9-metric-gap-fill	2026-04-29 21:09:47 +01:00
Pratik Mankawde	8e7a2d6c53	Merge branch 'pratik/otel-phase7-native-metrics' into pratik/otel-phase8-log-correlation # Conflicts: # OpenTelemetryPlan/06-implementation-phases.md # OpenTelemetryPlan/08-appendix.md # OpenTelemetryPlan/OpenTelemetryPlan.md	2026-04-29 21:07:32 +01:00
Pratik Mankawde	9adcc49171	fix: re-apply phase-7 doc/config changes lost during merge Re-applies phase-7 unique modifications to documentation and configuration files that were overwritten when taking phase-6's versions during the merge conflict resolution. Changes: - docker-compose.yml: comment out StatsD port 8125, add OTLP notes - otel-collector-config.yaml: remove StatsD receiver, update pipeline - integration-test.sh: server=otel, check_otel_metric, StatsD port check - telemetry-runbook.md: System Metrics section, server=otel config, troubleshooting for missing OTel metrics - 02-design-decisions.md: Phase 7 coexistence strategy notes - 05-configuration-reference.md: OTel System Metrics correlation - 06-implementation-phases.md: add Phase 7 section (~180 lines) - OpenTelemetryPlan.md: update phases table (7 phases, 60.6 days) - 08-appendix.md: add Phase7_taskList.md to document index - Delete 5 statsd-*.json dashboards (replaced by system-*.json) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 21:05:48 +01:00
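
Commit fac3287912 above describes a cross-check that failed because it looked for a top-level "data" key in the Tempo trace-lookup response, while the response carries a top-level "batches" key. The sketch below illustrates that kind of check in Python rather than jq. It is not the code from integration-test.sh: the function name `count_trace_batches` and the sample payloads are hypothetical, and the only assumption taken from the commit message is that a found trace has a non-empty "batches" list.

```python
import json


def count_trace_batches(response_body: str) -> int:
    """Return the number of entries under the top-level "batches" key.

    Hypothetical helper: a missing key, a non-list value, or a non-object
    response is treated as "trace not found" and yields 0.
    """
    payload = json.loads(response_body)
    if not isinstance(payload, dict):
        return 0
    batches = payload.get("batches")
    return len(batches) if isinstance(batches, list) else 0


if __name__ == "__main__":
    # Heavily trimmed, made-up response bodies for illustration only.
    found = '{"batches": [{"scopeSpans": []}]}'
    wrong_key = '{"data": []}'
    print(count_trace_batches(found))      # trace present
    print(count_trace_batches(wrong_key))  # "data" key is ignored
```

Returning 0 instead of null for an unexpected shape keeps a caller's "greater than zero" test meaningful, which is the failure mode the commit message describes.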

117 Commits