Commit Graph

131 Commits

Author SHA1 Message Date
Pratik Mankawde
d058e5ac3c Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
Forward-merge the Rule D __name__ builtin fix (and prior naming-check work).
2026-06-11 18:34:48 +01:00
Pratik Mankawde
ae80391da6 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Forward-merge the Rule D __name__ builtin fix (and prior naming-check work).
2026-06-11 18:34:34 +01:00
Pratik Mankawde
6ec60ff52c ci: Add __name__ to OTel naming check Rule D builtins
Rule D (dashboard PromQL labels must exist in L1) flagged `__name__` once the
phase-7 system-*.json dashboards started using `sum by (le, __name__)`.
`__name__` is the Prometheus reserved label for the metric name itself — a
builtin, not a span attribute. Add it to the builtin allowlist and cover it
with a test. (Earlier dashboards only used `__name__` inside `{__name__=~...}`
matchers, which the label regex did not extract, so this surfaced only now.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 18:34:19 +01:00
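
Note on 6ec60ff52c: a minimal sketch of the kind of check the message describes, not the actual check_otel_naming.py code. The names `BUILTIN_LABELS`, `grouping_labels`, and `unknown_labels` are hypothetical, and the regex is an assumption that only looks at `by (...)` / `without (...)` clauses, which is why a `{__name__=~...}` matcher would not be extracted.

```python
import re

# Hypothetical builtin allowlist: labels Prometheus provides itself,
# which are not span attributes defined in any *SpanNames.h header.
BUILTIN_LABELS = {"__name__", "le"}

# Assumed extraction rule: only grouping clauses are inspected.
BY_CLAUSE = re.compile(r"\b(?:by|without)\s*\(([^)]*)\)")


def grouping_labels(expr):
    """Return the labels named in by(...) / without(...) clauses."""
    labels = set()
    for match in BY_CLAUSE.finditer(expr):
        for label in match.group(1).split(","):
            label = label.strip()
            if label:
                labels.add(label)
    return labels


def unknown_labels(expr, known_keys):
    """Labels that are neither known span-attribute keys nor builtins."""
    return grouping_labels(expr) - set(known_keys) - BUILTIN_LABELS
```

With `__name__` in the allowlist, `sum by (le, __name__)` yields no unknown labels, while a label such as `txType` that is absent from the key set is still reported.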
Pratik Mankawde
8add336c1a Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
2026-06-11 18:23:02 +01:00
Pratik Mankawde
d27d67dfe4 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
# Conflicts:
#	.github/scripts/levelization/results/loops.txt
2026-06-11 18:22:54 +01:00
Pratik Mankawde
59030e5d61 Fix a rule in the OTel naming check and add tests for it.
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-11 18:21:42 +01:00
Pratik Mankawde
6c4c3e1049 layering
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-11 18:01:18 +01:00
Pratik Mankawde
7cb08307a7 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
Bring phase-2 forward into phase 3 (transaction tracing). Phase 3 introduces
TxSpanNames.h, TxQSpanNames.h, and TxApplySpanNames.h.

Conflict resolution:
- TxQ.cpp: kept phase-3's txq_span-based instrumentation (phase-2 had none).
  Dropped the orphaned `NumberSO{... fixUniversalNumber}` line — develop's
  #5962 (Retire fixUniversalNumber) removed that symbol repo-wide; the
  conflict block had carried one stale copy that would not compile.
- 05/08/OpenTelemetryPlan.md: dropped the deleted 04-code-samples / POC_taskList
  references (carried from phase-2), kept phase-3's new secure-OTel.md doc rows,
  section, and Mermaid node/edge/style. Config code block -> prose; merged the
  secure-OTel hardening pointer with the authoritative-config prose.
- Phase3_taskList.md: removed the "dotted keys for readability" note that came
  from phase-2 — phase 3 already uses the underscore keys.

Reviewed by code-review agents: telemetry instrumentation intact, naming check
green (47 keys across 7 *SpanNames.h headers), no conflict markers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 16:45:34 +01:00
Pratik Mankawde
b1e6d90af1 ordering changes
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-11 16:29:06 +01:00
Pratik Mankawde
4086ac9518 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Bring the hardened OTel naming check forward from phase-1c: unconditional
Rule F, test-file exemption, and the Rule H in-place-constant warning. The
check passes clean on phase 2 (24 keys across 4 *SpanNames.h headers including
PathFind; the SpanGuardFactory.cpp test is correctly exempt from Rule F).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 16:10:42 +01:00
Pratik Mankawde
4d044e6254 ci: Harden OTel naming check — unconditional Rule F, test exemption, Rule H
Three robustness fixes to check_otel_naming.py, all on phase-1c where the
script lives:

- Rule F now runs UNCONDITIONALLY. It is a purely syntactic check on the
  call-sites and does not need the L1 key set, so code that calls
  SpanGuard::span/setAttribute directly without ever defining a *SpanNames.h
  is still caught (previously it was silently skipped when no header existed).
- Exempt test files from Rule F (tests pass arbitrary literal keys to exercise
  the API). The call-site matcher now requires a SpanGuard/`.`/`->` receiver,
  so std::span and bare declarations no longer false-positive.
- Add Rule H (warning, non-fatal): a namespace-qualified constant used at a
  telemetry call-site but not defined in any *SpanNames.h is flagged, catching
  constants defined in-place instead of in the proper header. Bare locals and
  std:: names are not warned to avoid noise.

SpanGuard.h / Telemetry.h @code examples updated to reference constants that
exist on this branch. README documents the new behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 16:10:08 +01:00
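
Note on 4d044e6254: a minimal sketch of a call-site matcher that requires a `SpanGuard::`, `.`, or `->` receiver, as the message describes. This is an illustration under assumptions, not the script's real pattern: `CALL_SITE` and `literal_keys` are hypothetical names, and the sketch assumes the key or span name is the first argument.

```python
import re

# Assumed pattern: the call must be preceded by SpanGuard::, "." or "->",
# and only a string literal in first-argument position is captured.
CALL_SITE = re.compile(
    r'(?:SpanGuard::|\.|->)(?:span|setAttribute)\s*\(\s*"([^"]*)"'
)


def literal_keys(line):
    """Return string-literal first arguments found at telemetry call-sites."""
    return CALL_SITE.findall(line)
```

Because the receiver is required, `std::span<int> view(buf);` and constant-based calls such as `guard->setAttribute(rpc_span::attr::command, cmd);` produce no findings, while `guard.setAttribute("tx_type", value);` is reported.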
Pratik Mankawde
d8d6142fbe ci: Revert phase-2-local OTel naming-check edits
The script and its README live on phase-1c (where check_otel_naming.py was
introduced). The test-file Rule-F exemption was mistakenly applied here on
phase-2; revert to phase-1c's version verbatim. The exemption and further
script improvements will land on phase-1c and merge forward, keeping the
script's logic on the branch that owns it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 16:01:57 +01:00
Pratik Mankawde
afe0818c33 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Bring the naming convention, code-sample cleanup, and CI naming check into
phase 2 (RPC tracing). Phase 2 introduces PathFindSpanNames.h.

Conflict resolution:
- 04-code-samples.md, POC_taskList.md: deletion wins.
- 02-design-decisions.md: took the convention-applied tables, but kept phase-2's
  accurate PathFinding summary row (pathfind_fast/search_level/num_paths/...,
  matching the implemented PathFindSpanNames.h).
- 05/08: took the code-block-free prose; kept phase-2's Phase2-5_taskList.md
  index rows (dropping only the deleted POC row). Fixed stale setup_Telemetry/
  make_Telemetry doc references to the code-correct setupTelemetry/makeTelemetry.
- Telemetry.h auto-merged to the constant-based @code examples.

check_otel_naming.py change: exempt test files from Rule F (tests pass
arbitrary literal keys to exercise the API). The check passes clean on the
merged tree (24 keys across 4 *SpanNames.h headers, including PathFind).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 15:47:30 +01:00
Pratik Mankawde
ca7282479f ci: Enforce lower_snake_case attribute keys in OTel naming check
Add Rule G to check_otel_naming.py: every span-attribute key must be
lower_snake_case (^[a-z][a-z0-9_]*$ per dot-separated segment). This catches
camelCase, UPPERCASE, and spaces in keys, which the structural (dotted) and
source (literal) rules did not. Document it in the script README and
CONTRIBUTING.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 15:33:08 +01:00
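
Note on ca7282479f: the Rule G pattern quoted in the message can be sketched as follows. The function name `is_lower_snake_case` is hypothetical; only the regex and the per-segment application come from the commit message.

```python
import re

# Pattern from the commit message, applied to each dot-separated segment.
SEGMENT = re.compile(r"[a-z][a-z0-9_]*")


def is_lower_snake_case(key):
    """True if every dot-separated segment of key is lower_snake_case."""
    return all(SEGMENT.fullmatch(segment) for segment in key.split("."))
```

Under this sketch `tx_type` passes, while `txType`, `TX_TYPE`, `tx type`, and an empty segment (as in `rpc..command`) are rejected. Whether a dotted key is allowed at all is a separate question handled by Rule A.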
Pratik Mankawde
134a24d5bc ci: Add OpenTelemetry span-attribute naming check (phase 1c)
Add check_otel_naming.py and wire it into on-pr.yml so every PR validates
that span-attribute names stay consistent across the code, collector, Tempo,
dashboards, and docs.

- The valid key set is derived dynamically from the *SpanNames.h constants and
  the resource attributes the code registers in Telemetry.cpp — no hardcoded
  allowlist to drift.
- Each rule is presence-gated: it runs only when the file it needs is in the
  tree, so the check is correct whether telemetry changes land in one PR or
  several (the collector/Tempo/dashboard/runbook layers arrive in later phases).
- Rule A flags dotted span-attribute keys; Rule F flags string-literal
  attribute keys and span-name arguments (values may be runtime data).
- stdlib-only, mirroring the levelization check (bare `python`, no pip step).
- Telemetry.h / SpanGuard.h @code examples now use *SpanNames.h constants so
  the strict literal check passes.
- CONTRIBUTING.md documents the check and how to run it locally.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 15:26:38 +01:00
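
Note on 134a24d5bc: a minimal sketch of the two ideas in the message, deriving the key set from header constants and presence-gating each rule. Everything here is an assumption for illustration: the `constexpr ... = "...";` declaration shape, the names `keys_from_header` and `run_gated`, and the `(name, required_path, check)` rule tuple are hypothetical and not taken from check_otel_naming.py.

```python
import re
from pathlib import Path

# Assumed declaration shape: constexpr <type> <name> = "<value>";
CONSTANT = re.compile(r'constexpr[^=;]*=\s*"([^"]+)"\s*;')


def keys_from_header(text):
    """Collect string values of constexpr constants in a header's text."""
    return set(CONSTANT.findall(text))


def run_gated(rules, root):
    """Run each rule only if the file or directory it needs exists.

    rules is a list of (name, required_relative_path, check) tuples,
    where check receives the resolved path and returns a result.
    """
    results = {}
    for name, required, check in rules:
        path = Path(root) / required
        if not path.exists():
            results[name] = "skipped"  # layer not in the tree yet
            continue
        results[name] = check(path)
    return results
```

Gating on presence means a rule whose inputs arrive in a later phase reports "skipped" rather than failing, which matches the intent described above.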
Pratik Mankawde
480b6cab3c Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
Bring the phase-1a/1b naming-convention and code-sample cleanup into 1c.

Conflict resolution:
- 04-code-samples.md, POC_taskList.md: deletion wins.
- OpenTelemetryPlan docs (01/02/03/05): took the convention-applied,
  code-block-free versions; verified no attribute category, table row, or
  section header was lost (the differences were dotted->underscore renames).
- Telemetry.h: kept 1c's RpcSpanNames.h constant-based example
  (rpc_span::attr::command) over the string literal.
- 31 non-telemetry files are clean develop carry-forward (identical to 1b).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 14:58:51 +01:00
Pratik Mankawde
e9cb9421ef Merge branch 'pratik/otel-phase1a-plan-docs' into pratik/otel-phase1b-telemetry-infra
Bring the span attribute naming convention (phase 1a) into phase 1b.

Conflict resolution kept phase-1b's SpanGuard-based workflow and applied
the underscore naming convention to all non-code-sample text:
- Converted prose, tables, Mermaid labels, and TraceQL/PromQL query
  references across the plan docs to the underscore form.
- Converted the two @code attribute-key examples in Telemetry.h
  (command, tx_type).
- Left the code-sample files (04-code-samples.md, POC_taskList.md) and
  03-implementation-strategy.md code blocks at the phase-1b version; the
  code-sample docs are slated for removal on phase-1a.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 14:10:52 +01:00
Ayaz Salikhov
8000adfa79 ci: Make configurations launch on certain event types (#7447)
2026-06-10 18:08:34 +00:00
Pratik Mankawde
f37589b1f5 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
2026-06-10 16:05:29 +01:00
Pratik Mankawde
d126868a25 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-10 15:58:49 +01:00
Pratik Mankawde
8908036b11 Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
2026-06-10 14:55:29 +01:00
Pratik Mankawde
38fbab1d18 levelization
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-10 10:50:54 +01:00
Pratik Mankawde
331d9d55b1 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
2026-06-10 10:31:05 +01:00
Pratik Mankawde
4ce882965a Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
2026-06-10 10:29:51 +01:00
Pratik Mankawde
3cb9a2bf51 levelization
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-10 10:25:56 +01:00
Pratik Mankawde
b32db4ceeb Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
2026-06-10 10:18:24 +01:00
Pratik Mankawde
207864a98b Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
2026-06-10 10:18:05 +01:00
Pratik Mankawde
57d382ceda Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-10 10:16:35 +01:00
Pratik Mankawde
848cbcbfbe Merge branch 'pratik/otel-phase1a-plan-docs' into pratik/otel-phase1b-telemetry-infra
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-10 10:15:31 +01:00
Pratik Mankawde
8fe3f06999 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-09 19:05:40 +01:00
Pratik Mankawde
d3c09fd3f4 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-09 19:04:17 +01:00
Pratik Mankawde
142e8c5b36 Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-09 18:58:11 +01:00
Pratik Mankawde
e5f890e195 leveling changes
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-06-09 18:50:21 +01:00
Bart
c552eb333f refactor: Change config section and key string literals into constants (#7095)
Co-authored-by: Bart <11445373+bthomee@users.noreply.github.com>
2026-06-09 14:58:21 +00:00
Pratik Mankawde
be67ad25e7 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
# Conflicts:
#	OpenTelemetryPlan/05-configuration-reference.md
2026-06-09 14:52:12 +01:00
Pratik Mankawde
3ee8f900ec Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
# Conflicts:
#	src/tests/libxrpl/CMakeLists.txt
2026-06-09 14:50:20 +01:00
Pratik Mankawde
a119efc478 Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
2026-06-09 13:44:53 +01:00
Pratik Mankawde
57a54ad0fe Merge branch 'pratik/otel-phase1a-plan-docs' into pratik/otel-phase1b-telemetry-infra
2026-06-09 13:34:58 +01:00
Ayaz Salikhov
a389f922dd ci: Use new packaging images and don't cancel develop builds (#7417)
Co-authored-by: Bart <bthomee@users.noreply.github.com>
2026-06-08 13:41:08 +00:00
Ayaz Salikhov
949887feb9 build: Create single test binary xrpl_tests (#7327)
2026-06-05 19:24:32 +00:00
Ayaz Salikhov
63ffdc39dc ci: Refactor build-related nix / docker / workflows (#7408)
2026-06-05 17:05:19 +00:00
Pratik Mankawde
6428c9f13c feat(telemetry): add preflight/preclaim stage spans and stage attribute
The tx.transactor span covered only the apply stage; preflight and
preclaim had no telemetry, so a transaction that hard-failed those
stages produced no apply-pipeline span and per-stage latency/failure
was invisible.

Add tx.preflight and tx.preclaim spans in applySteps.cpp via a
makeStageSpan() helper using SpanGuard::hashSpan, so all three stages
share a deterministic trace_id derived from txID[0:16] even though they
run sequentially and often cross-thread. Each span carries stage,
tx_type, and ter_result; exceptions are recorded as tefEXCEPTION before
the public wrappers map them. The type lookup is guarded behind the
span-active check so it costs nothing when tracing is off.

Add a stage="apply" attribute to the tx.transactor span and move its
three hardcoded attribute strings to a new library-safe header
include/xrpl/tx/detail/TxApplySpanNames.h, which mirrors the daemon-side
TxSpanNames.h strings so the collector spanmetrics connector aggregates
both span sets under one dimension set.

A constants-contract test pins the span-name, attribute-key, and
stage-value strings; span content stays covered by the docker
integration test, as the rest of the telemetry suite is.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 11:11:55 +01:00
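
Note on 6428c9f13c: a minimal sketch of deriving a deterministic trace ID from the first 16 bytes of a transaction ID (txID[0:16]), so spans created independently for the preflight, preclaim, and apply stages land in the same trace. The function name `trace_id_from_tx_id` is hypothetical and this is not the SpanGuard::hashSpan implementation; it only illustrates the slicing idea, relying on the fact that an OpenTelemetry trace ID is 16 bytes.

```python
def trace_id_from_tx_id(tx_id: bytes) -> str:
    """Return a 32-hex-character trace ID taken from tx_id[0:16]."""
    if len(tx_id) < 16:
        raise ValueError("transaction ID must be at least 16 bytes")
    # Same input bytes always give the same trace ID, on any thread.
    return tx_id[:16].hex()
```

Because the ID depends only on the transaction ID, no context has to be passed between stages that run sequentially or on different threads.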
Ayaz Salikhov
8abe82eefa ci: Redesign matrix configuration based on Nix images (#7385)
Co-authored-by: semgrep-companion-app[bot] <218312740+semgrep-companion-app[bot]@users.noreply.github.com>
2026-06-04 20:02:59 +00:00
Pratik Mankawde
a13a858112 feat(telemetry): add tx.transactor span for per-transactor execution timing
Wraps Transactor::operator() with a span that captures tx_type,
ter_result, and applied. This is the universal dispatch point — every
transaction flows through it, giving per-type latency breakdown.

Adds libxrpl.tx > xrpl.telemetry levelization dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-03 16:40:10 +01:00
Pratik Mankawde
c187a62353 Merge branch 'pratik/otel-phase2-rpc-tracing' into pratik/otel-phase3-tx-tracing
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-05-29 16:47:15 +01:00
Pratik Mankawde
c848e51e13 Merge branch 'pratik/otel-phase1c-rpc-integration' into pratik/otel-phase2-rpc-tracing
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-05-29 16:44:07 +01:00
Pratik Mankawde
8f9057729c Merge branch 'pratik/otel-phase1b-telemetry-infra' into pratik/otel-phase1c-rpc-integration
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-05-29 16:14:21 +01:00
Pratik Mankawde
f031befc6e compilation fixes and levelization fixes
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-05-29 16:04:19 +01:00
Pratik Mankawde
3a1f22583f Merge branch 'pratik/otel-phase1a-plan-docs' into pratik/otel-phase1b-telemetry-infra
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-05-29 15:34:22 +01:00
Ayaz Salikhov
f9551ac5ca style: Run shfmt on workflows, actions and markdown bash code (#7333)
2026-05-27 19:24:18 +00:00