Commit Graph

252 Commits

Author SHA1 Message Date
Pratik Mankawde
7d10ccb58a Add back conan install step
Conan was previously installed by prepare-runner or a separate step.
Since we're not using prepare-runner on native runners, install it
via pip alongside other Python dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
345a8f80a5 Allow print-env to fail gracefully on native runners
The print-env action uses $CC which is not set on native ubuntu-latest
(only in container builds). Add continue-on-error so this diagnostic
step doesn't block the pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
e95781bde1 Fix prepare-runner /root permission error on ubuntu-latest
The XRPLF/actions/prepare-runner action hardcodes /root/.ccache and
/root/.conan2 for Linux, assuming container execution as root. This
workflow runs natively on ubuntu-latest as the runner user.

Replace prepare-runner with inline apt-get install of ccache + ninja,
and use CMake compiler launchers for ccache instead. Keep all other
main CI patterns: pinned actions, get-nproc, env-based secrets,
CCACHE_SLOPPINESS, print-env step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
dbb292497e Align telemetry workflow build with main CI pipeline
Match reusable-build-test-config.yml exactly:
- Use XRPLF/actions/prepare-runner for system-level ccache setup
- Use XRPLF/actions/get-nproc for dynamic parallelism
- Remove redundant Conan package cache (remote is the cache)
- Remove explicit CMake compiler launchers (prepare-runner handles it)
- Add CCACHE_SLOPPINESS and print-env step
- Pin action versions to commit SHAs
- Move secrets and github.event.inputs to env blocks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
4bba36ac0b Add workflow file to its own paths trigger
Without this, changes to the workflow file itself don't trigger the
telemetry validation pipeline on push.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
a3a3dec5c8 Fix broken YAML in telemetry-validation workflow
The previous suggestion commits introduced duplicate keys and missing
YAML structure. This commit fixes:
- Duplicate env: block → single block with ccache vars
- Missing jobs: key
- Duplicate apt-get install → single line with ccache
- Missing 'Install Python dependencies' step name
- Duplicate 'uses: setup-conan'
- Missing 'uses: actions/cache@v4' and 'with:' for cache step
- Duplicate 'Configure CMake' step name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
0917a7fd83 Use CCache and conan login to use pre-built artifacts
Co-authored-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
9adff73e6e Fix CI: align telemetry workflow build with main CI pipeline
Rewrite the build steps to mirror the main CI (reusable-build-test-config):
- Use build-deps action (conan install with --options:host='&:xrpld=True')
  so the generated toolchain sets xrpld=ON and telemetry=True automatically
- Separate configure and build steps matching main CI pattern
- Run cmake from within build/ dir pointing to .. (same as main CI)
- Remove manual -Dtelemetry=ON -Dxrpld=ON (come from toolchain)
- Remove Docker DinD service (ubuntu-latest has Docker pre-installed)
- Install ninja-build system package for the Ninja generator
- Increase timeout to 90 min for full build + validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
cbce327cad minor change
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
af648377fc Fix CI: align telemetry workflow build with main CI approach
The telemetry-validation workflow used --output-folder=build with conan
install, which overrides the conanfile.py layout() and breaks include
paths for Conan-managed protobuf (runtime_version.h not found). Aligned
the build steps with the main CI's reusable-build-test-config.yml: drop
--output-folder, add Ninja generator, and explicit build_type setting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
6afd2e35bc Fix CI: cspell EOJSON delimiter and telemetry workflow conan setup
- Rename EOJSON heredoc delimiter to EOF_JSON to avoid cspell unknown word
- Add conan installation step (pip3 install conan) to telemetry-validation workflow
- Use shared setup-conan action for proper Conan profile/remote configuration
- Align build commands with reusable-build-test-config.yml conventions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
787b496484 Phase 10: Synthetic workload generation and telemetry validation tools
Add comprehensive workload harness for end-to-end validation of the
Phases 1-9 telemetry stack:

Task 10.1 — Multi-node test harness:
  - docker-compose.workload.yaml with full OTel stack (Collector, Jaeger,
    Tempo, Prometheus, Loki, Grafana)
  - generate-validator-keys.sh for automated key generation
  - xrpld-validator.cfg.template for node configuration

Task 10.2 — RPC load generator:
  - rpc_load_generator.py with WebSocket client, configurable rates,
    realistic command distribution (40% health, 30% wallet, 15% explorer,
    10% tx lookups, 5% DEX), W3C traceparent injection

Task 10.3 — Transaction submitter:
  - tx_submitter.py with 10 transaction types (Payment, OfferCreate,
    OfferCancel, TrustSet, NFTokenMint, NFTokenCreateOffer, EscrowCreate,
    EscrowFinish, AMMCreate, AMMDeposit), auto-funded test accounts

Task 10.4 — Telemetry validation suite:
  - validate_telemetry.py checking spans (Jaeger), metrics (Prometheus),
    log-trace correlation (Loki), dashboards (Grafana)
  - expected_spans.json (17 span types, 22 attributes, 3 hierarchies)
  - expected_metrics.json (SpanMetrics, StatsD, Phase 9, dashboards)

Task 10.5 — Performance benchmark suite:
  - benchmark.sh for baseline vs telemetry comparison
  - collect_system_metrics.sh for CPU/memory/latency sampling
  - Thresholds: <3% CPU, <5MB memory, <2ms RPC p99, <5% TPS, <1% consensus

Task 10.6 — CI integration:
  - telemetry-validation.yml GitHub Actions workflow
  - run-full-validation.sh orchestrator script
  - Manual trigger + telemetry branch auto-trigger

Task 10.7 — Documentation:
  - workload/README.md with quick start and tool reference
  - Updated telemetry-runbook.md with validation and benchmark sections
  - Updated 09-data-collection-reference.md with validation inventory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00
Pratik Mankawde
840ab2a3ed Update levelization results for MetricsRegistry GTest migration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:56:00 +00:00
Pratik Mankawde
61809b9735 Update levelization results for xrpld.telemetry module
Regenerate loops.txt and ordering.txt to account for the bidirectional
dependency between xrpld.app and xrpld.telemetry introduced in Phase 9.
MetricsRegistry.cpp reads metrics from xrpld.app services (LedgerMaster,
TxQ, AcceptedLedger) while Application.cpp wires MetricsRegistry into
the app lifecycle — a pattern consistent with existing accepted loops
(overlay, peerfinder, rpc, shamap).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:56:00 +00:00
Pratik Mankawde
31d0ed178a Phase 4a: Establish-phase gap fill & cross-node correlation
Add full consensus tracing with deterministic trace ID correlation
and establish-phase instrumentation:

- Deterministic trace_id from previousLedger.id() for cross-node
  correlation (switchable via consensus_trace_strategy config)
- Round-to-round span links (follows-from) for causal chaining
- Establish phase spans with convergence tracking, dispute resolution
  events, and threshold escalation attributes
- Validation spans with links to round spans (thread-safe via
  roundSpanContext_ snapshot for jtACCEPT cross-thread access)
- Mode change spans for proposing/observing transitions
- New startSpan overload with span links in Telemetry interface
- XRPL_TRACE_ADD_EVENT macro with do-while(0) safety wrapper
- Config validation for consensus_trace_strategy
- Test adaptor (csf::Peer) updated with getTelemetry() stub

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:10:00 +00:00
Pratik Mankawde
e6b05d700c Phase 3: Transaction tracing — protobuf context, PeerImp, NetworkOPs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
ea00fb9bc8 Phase 2: Complete RPC tracing — interface, macros, attributes, tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
a707abd6f2 Phase 1c: RPC layer telemetry integration
Add tracing instrumentation to the RPC request handling layer:

TracingInstrumentation.h:
- Convenience macros (XRPL_TRACE_RPC, XRPL_TRACE_TX, etc.) that
  create RAII SpanGuard objects when telemetry is enabled
- XRPL_TRACE_SET_ATTR / XRPL_TRACE_EXCEPTION for span enrichment
- Zero-overhead no-ops when XRPL_ENABLE_TELEMETRY is not defined

RPCHandler.cpp:
- Trace each RPC command with span name "rpc.command.<method>"
- Record command name, API version, role, and status as attributes
- Capture exceptions on the span for error visibility

ServerHandler.cpp:
- Trace HTTP requests ("rpc.request"), WebSocket messages
  ("rpc.ws_message"), and processRequest ("rpc.process")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
d8d7a40fca Phase 1b: Telemetry core infrastructure
Add the OpenTelemetry telemetry library and supporting infrastructure:

Build system:
- Conan opentelemetry-cpp dependency with OTLP/gRPC exporter
- CMake integration for xrpl_telemetry library target
- Levelization ordering updates

Core library (libxrpl):
- Telemetry class: provider lifecycle, span creation, sampling config
- SpanGuard: RAII span management with attribute/exception helpers
- TelemetryConfig: parse [telemetry] config section
- NullTelemetry: no-op implementation when telemetry is disabled

Application integration:
- Telemetry member in ApplicationImp with start/stop lifecycle
- getTelemetry() interface on Application
- ServiceRegistry telemetry accessor

Docker observability stack:
- OTel Collector, Jaeger, Grafana docker-compose setup
- Collector config with OTLP gRPC receiver and Jaeger exporter

Config and docs:
- Example telemetry config section in xrpld-example.cfg
- Build documentation for telemetry setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Ayaz Salikhov
bee2d112c6 ci: Fix how clang-tidy is run when .clang-tidy is changed (#6521) 2026-03-11 14:18:18 +01:00
Bart
01c977bbfe ci: Fix rules used to determine when to upload Conan recipes (#6524)
The refs as previously used pointed to the source branch, not the target branch. However, determining the target branch is different depending on the GitHub event. The pull request logic was incorrect and needed to be fixed, and the logic inside the workflow could be simplified. Both modifications have been made in this commit.
2026-03-11 13:43:58 +01:00
Bart
3baf5454f2 ci: Only upload artifacts in the XRPLF/rippled repository (#6523)
This change will only attempt to upload artifacts for CI runs performed in the XRPLF/rippled repository.
2026-03-11 11:48:40 +01:00
Michael Legleux
24a5cbaa93 chore: Build voidstar on amd64 only (#6481)
* chore: Build voidstar on amd64 only

* fatal error if configuring voidstar on  non x86
2026-03-10 23:59:43 +00:00
Ayaz Salikhov
45b8c4d732 chore: Update XRPLF/actions (#6508)
This change mainly includes XRPLF/actions#51.
2026-03-09 21:47:22 +00:00
Ayaz Salikhov
7e2b137131 chore: Use check-pr-title from XRPLF/actions (#6506) 2026-03-09 17:53:52 +01:00
dependabot[bot]
9e0d350fca ci: [DEPENDABOT] bump tj-actions/changed-files from 47.0.4 to 47.0.5 (#6501) 2026-03-09 15:27:03 +01:00
Michael Legleux
08e734457f fix: Fix docs deployment for pull requests (#6482) 2026-03-05 00:12:41 -08:00
Michael Legleux
77518394e8 fix: Stop committing generated docs to prevent repo bloat (#6474) 2026-03-04 19:19:57 -08:00
Ayaz Salikhov
c69091bded chore: Add Git information compile-time info to only one file (#6464)
The existing code added the git commit info (`GIT_COMMIT_HASH` and `GIT_BRANCH`) to every file, which was a problem for leveraging `ccache` to cache build objects. This change adds a separate C++ file from where these compile-time variables are propagated to wherever they are needed. A new CMake file is added to set the commit info if the `git` binary is available.
2026-03-04 19:45:28 +00:00
dependabot[bot]
0abd762781 ci: [DEPENDABOT] bump actions/upload-artifact from 6.0.0 to 7.0.0 (#6450) 2026-03-03 17:17:08 +00:00
Sergey Kuznetsov
b58c681189 chore: Make nix hook optional (#6431)
This change makes the `nix` pre-commit hook optional in development environments, and enforced only inside Github Actions.
2026-02-27 13:36:10 -05:00
Mayukha Vadari
404f35d556 test: Grep for failures in CI (#6339)
This change adjusts the CI tests to make it easier to spot errors, without needing to sift through the thousands of lines of output.
2026-02-27 03:01:38 +00:00
Alex Kremer
2e595b6031 chore: Enable clang-tidy checks without issues (#6414)
This change enables all clang-tidy checks that are already passing. It also modifies the clang-tidy CI job, so it runs against all files if .clang-tidy changed.
2026-02-26 18:26:58 +00:00
Ayaz Salikhov
65e63ebef3 chore: Update cleanup-workspace to delete old .conan2 dir on macOS (#6412) 2026-02-25 01:12:16 +00:00
Sergey Kuznetsov
0fd237d707 chore: Add nix development environment (#6314)
---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-23 20:10:07 -05:00
dependabot[bot]
3542daa4cc ci: [DEPENDABOT] bump actions/upload-artifact from 4.6.2 to 6.0.0 (#6396)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.6.2 to 6.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](ea165f8d65...b7c566a772)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-20 22:48:01 +00:00
dependabot[bot]
fd9f57ec97 ci: [DEPENDABOT] bump actions/checkout from 4.3.0 to 6.0.2 (#6397)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.3.0 to 6.0.2.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4.3.0...de0fac2e4500dabe0009e67214ff5f5447ce83dd)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.2
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-20 22:09:48 +00:00
dependabot[bot]
625becff18 ci: [DEPENDABOT] bump actions/setup-python from 5.6.0 to 6.2.0 (#6395)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.6.0 to 6.2.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](a26af69be9...a309ff8b42)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: 6.2.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-20 21:29:05 +00:00
dependabot[bot]
4bcbc6e50f ci: [DEPENDABOT] bump tj-actions/changed-files from 46.0.5 to 47.0.4 (#6394)
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 46.0.5 to 47.0.4.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](ed68ef82c0...7dee1b0c15)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-version: 47.0.4
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-20 19:59:37 +00:00
dependabot[bot]
0bc4a0cfe8 ci: [DEPENDABOT] bump codecov/codecov-action from 5.4.3 to 5.5.2 (#6398)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5.4.3 to 5.5.2.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](18283e04ce...671740ac38)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-version: 5.5.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-20 19:11:26 +00:00
Ayaz Salikhov
cb54adefed ci: Build docs in PRs and in private repos (#6400) 2026-02-20 13:41:43 -05:00
Ayaz Salikhov
d03d72bfd5 ci: Add dependabot config (#6379) 2026-02-20 09:19:00 +00:00
Sergey Kuznetsov
31302877ab ci: Add clang tidy workflow to ci (#6369) 2026-02-19 14:06:44 -05:00
Jingchen
0976b2b68b refactor: Modularize app/tx (#6228) 2026-02-17 18:10:07 +00:00
Jingchen
36240116a5 refactor: Decouple app/tx from Application and Config (#6227)
This change decouples app/tx from `Application` and `Config` to clear the way to moving transactors to `libxrpl`.
2026-02-17 11:29:53 -05:00
Jingchen
ac0ad3627f refactor: Modularize HashRouter, Conditions, and OrderBookDB (#6226)
This change modularizes additional components by moving code to `libxrpl`.
2026-02-13 10:34:37 -05:00
Jingchen
5edd3566f7 refactor: Modularize the NetworkOPs interface (#6225)
This change moves the NetworkOPs interface into `libxrpl` and it leaves its implementation in `xrpld`.
2026-02-12 13:15:03 -05:00
Jingchen
9f17d10348 refactor: Modularize RelationalDB (#6224)
The rdb module was not properly designed, which is fixed in this change. The module had three classes:
1) The abstract class `RelationalDB`.
2) The abstract class `SQLiteDatabase`, which inherited from `RelationalDB` and added some pure virtual methods.
3) The concrete class `SQLiteDatabaseImp`, which inherited from `SQLiteDatabase` and implemented all methods.

The updated code simplifies this as follows:
* The `SQLiteDatabaseImp` has become `SQLiteDatabase`, and
* The former `SQLiteDatabase `has merged with `RelationalDatabase`.
2026-02-11 16:22:01 +00:00
Jingchen
ef284692db refactor: Modularize WalletDB and Manifest (#6223)
This change modularizes the `WalletDB` and `Manifest`. Note that the wallet db has nothing to do with account wallets and it stores node configuration, which is why it depends on the manifest code.
2026-02-11 13:42:31 +00:00
Ayaz Salikhov
2305bc98a4 chore: Remove CODEOWNERS (#6337) 2026-02-06 11:39:23 -05:00