Commit Graph

9787 Commits

Author SHA1 Message Date
Pratik Mankawde
70396debcb Phase 5: Observability stack — spanmetrics, dashboards, runbook
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 16:44:11 +00:00
Pratik Mankawde
9bb8f2e12a Use Title Case for consensus mode names in telemetry attributes
Add toDisplayString(ConsensusMode) helper that returns Title Case
names (Proposing, Observing, Wrong Ledger, Switched Ledger) for use
in OTel span attributes and Grafana dashboards. The existing
to_string() is preserved unchanged for log output stability.

Updated call sites:
- RCLConsensus.cpp: onClose, onModeChange, startRoundInternal
- TracingMacros.cpp: test attribute value

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 16:43:55 +00:00
Pratik Mankawde
31d0ed178a Phase 4a: Establish-phase gap fill & cross-node correlation
Add full consensus tracing with deterministic trace ID correlation
and establish-phase instrumentation:

- Deterministic trace_id from previousLedger.id() for cross-node
  correlation (switchable via consensus_trace_strategy config)
- Round-to-round span links (follows-from) for causal chaining
- Establish phase spans with convergence tracking, dispute resolution
  events, and threshold escalation attributes
- Validation spans with links to round spans (thread-safe via
  roundSpanContext_ snapshot for jtACCEPT cross-thread access)
- Mode change spans for proposing/observing transitions
- New startSpan overload with span links in Telemetry interface
- XRPL_TRACE_ADD_EVENT macro with do-while(0) safety wrapper
- Config validation for consensus_trace_strategy
- Test adaptor (csf::Peer) updated with getTelemetry() stub

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:10:00 +00:00
Pratik Mankawde
7675af41ec Add consensus.accept.apply span with ledger close time attributes
Add a new trace span in doAccept() capturing ledger close time details:
- xrpl.consensus.close_time: agreed-upon close time (epoch seconds)
- xrpl.consensus.close_time_correct: whether validators converged
  (per avCT_CONSENSUS_PCT = 75% threshold)
- xrpl.consensus.close_resolution_ms: time rounding granularity
- xrpl.consensus.state: "finished" or "moved_on" (consensus failure)
- xrpl.consensus.proposing: whether this node was proposing

Update Tempo datasource with close time filters, plan docs with
new span inventory, and add test coverage for the attribute pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:10:00 +00:00
Pratik Mankawde
0bbffaebc4 Phase 4: Consensus tracing — round lifecycle, proposals, validations
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:24 +00:00
Pratik Mankawde
e6b05d700c Phase 3: Transaction tracing — protobuf context, PeerImp, NetworkOPs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
f114ec5462 Fix gersemi cmake formatting in test CMakeLists
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
ea00fb9bc8 Phase 2: Complete RPC tracing — interface, macros, attributes, tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
a707abd6f2 Phase 1c: RPC layer telemetry integration
Add tracing instrumentation to the RPC request handling layer:

TracingInstrumentation.h:
- Convenience macros (XRPL_TRACE_RPC, XRPL_TRACE_TX, etc.) that
  create RAII SpanGuard objects when telemetry is enabled
- XRPL_TRACE_SET_ATTR / XRPL_TRACE_EXCEPTION for span enrichment
- Zero-overhead no-ops when XRPL_ENABLE_TELEMETRY is not defined

RPCHandler.cpp:
- Trace each RPC command with span name "rpc.command.<method>"
- Record command name, API version, role, and status as attributes
- Capture exceptions on the span for error visibility

ServerHandler.cpp:
- Trace HTTP requests ("rpc.request"), WebSocket messages
  ("rpc.ws_message"), and processRequest ("rpc.process")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
b3609fb345 Respect custom service_instance_id when set in [telemetry] config
Only override serviceInstanceId with the node public key when the user
hasn't explicitly set service_instance_id in the [telemetry] section.
This allows operators to assign human-friendly names (e.g. "validator-1")
while still defaulting to the node's base58 public key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
99a0b4094e Fix empty service.instance.id by deferring node identity injection
The Telemetry object is constructed in ApplicationImp's member initializer
list where nodeIdentity_ is not yet available, resulting in an empty
service.instance.id resource attribute. Add setServiceInstanceId() virtual
method that Application::setup() calls after nodeIdentity_ is known but
before telemetry_->start() creates the OTel resource.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Pratik Mankawde
d8d7a40fca Phase 1b: Telemetry core infrastructure
Add the OpenTelemetry telemetry library and supporting infrastructure:

Build system:
- Conan opentelemetry-cpp dependency with OTLP/gRPC exporter
- CMake integration for xrpl_telemetry library target
- Levelization ordering updates

Core library (libxrpl):
- Telemetry class: provider lifecycle, span creation, sampling config
- SpanGuard: RAII span management with attribute/exception helpers
- TelemetryConfig: parse [telemetry] config section
- NullTelemetry: no-op implementation when telemetry is disabled

Application integration:
- Telemetry member in ApplicationImp with start/stop lifecycle
- getTelemetry() interface on Application
- ServiceRegistry telemetry accessor

Docker observability stack:
- OTel Collector, Jaeger, Grafana docker-compose setup
- Collector config with OTLP gRPC receiver and Jaeger exporter

Config and docs:
- Example telemetry config section in xrpld-example.cfg
- Build documentation for telemetry setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:09:01 +00:00
Alex Kremer
7b3724b7a3 fix: Add missed clang-tidy bugprone-inc-dec-conditions check (#6526) 2026-03-11 14:04:26 +00:00
Alex Kremer
f27d8f3890 chore: Enable clang-tidy bugprone-inc-dec-in-conditions check (#6455) 2026-03-10 20:12:15 +00:00
Alex Kremer
8345cd77df chore: Enable clang-tidy bugprone-unused-raii check (#6505) 2026-03-10 19:48:56 +00:00
Alex Kremer
c38aabdaee chore: Enable clang-tidy bugprone-unhandled-self-assignment check (#6504) 2026-03-10 17:42:49 +00:00
Alex Kremer
a896ed3987 chore: Enable clang-tidy bugprone-optional-value-conversion check (#6470) 2026-03-10 15:56:24 +01:00
Alex Kremer
1a7d67c4db chore: Enable clang-tidy bugprone-reserved-identifier check (#6456) 2026-03-10 10:29:08 +01:00
Alex Kremer
92983d8040 chore: Enable clang-tidy bugprone-too-small-loop-variable check (#6473) 2026-03-10 08:56:44 +00:00
Alex Kremer
320a65f77c chore: Enable clang-tidy bugprone-suspicious-stringview-data-usage check (#6467) 2026-03-10 08:34:27 +00:00
Alex Kremer
e284969ae4 chore: Enable clang-tidy bugprone-pointer-arithmetic-on-polymorphic-object check (#6469) 2026-03-09 19:36:56 +01:00
Alex Kremer
0335076359 chore: Fix additional clang-tidy issues for unused-local-non-trivial-variable check (#6509) 2026-03-09 17:16:04 +00:00
Sergey Kuznetsov
e2290b1a0a feat: Add mutex wrapper from clio (#6447)
This change adds a mutex wrapper copied from clio. The wrapper attaches a mutex to the data it protects, which improves safety and readability.
2026-03-09 16:33:20 +00:00
Alex Kremer
1ee0567b14 chore: Enable clang-tidy bugprone-suspicious-missing-comma check (#6468) 2026-03-09 15:48:38 +00:00
Alex Kremer
6b301efc8c chore: Enable clang-tidy bugprone-unused-local-non-trivial-variable check (#6458) 2026-03-09 15:25:52 +00:00
Vito Tumas
5865bd017f refactor: Update transaction folder structure (#6483)
This change reorganizes the `tx/transactors` directory for consistency and discoverability. There are no behavioral changes, this is a pure refactor. Underscores were chosen as the way to separate multi-words as this is the more popular option in C++ projects.
 
Specific changes:
- Rename all subdirectories to lowercase/snake_case (`AMM` → `amm`, `Check` → `check`, `NFT` → `nft`, `PermissionedDomain` → `permissioned_domain`, etc.)
- Merge `AMM/` and `Offer/` into `dex/`, including `PermissionedDEXHelpers`
- Rename `MPT/` → `token/`, absorbing `SetTrust` and `Clawback`
- Move top-level transactors into named groups: `account/`, `bridge/`, `credentials/`, `did/`, `escrow/`, `oracle/`, `payment/`, `payment_channel/`, `system/`
- Update all include paths across the codebase and `transactions.macro`
2026-03-06 08:25:31 +00:00
Ayaz Salikhov
af0ec7defd chore: Apply gersemi changes (#6486) 2026-03-05 19:54:44 +00:00
Alex Kremer
dde450784d Add Formats and Flags to server_definitions (#6321)
This change implements https://github.com/XRPLF/XRPL-Standards/discussions/418: "System XLS: Add Formats and Flags to server_definitions".
2026-03-05 16:11:27 +00:00
Ayaz Salikhov
c69091bded chore: Add Git information compile-time info to only one file (#6464)
The existing code added the git commit info (`GIT_COMMIT_HASH` and `GIT_BRANCH`) to every file, which was a problem for leveraging `ccache` to cache build objects. This change adds a separate C++ file from where these compile-time variables are propagated to wherever they are needed. A new CMake file is added to set the commit info if the `git` binary is available.
2026-03-04 19:45:28 +00:00
Alex Kremer
b451d5e412 chore: Enable clang-tidy bugprone-return-const-ref-from-parameter check (#6459) 2026-03-04 18:10:10 +00:00
Alex Kremer
af97df5a63 chore: Enable clang-tidy bugprone-move-forwarding-reference check (#6457) 2026-03-04 17:03:27 +00:00
Peter Chen
e39954d128 fix: Gateway balance with MPT (#6143)
When `gateway_balances` gets called on an account that is involved in the `EscrowCreate` transaction (with MPT being escrowed), the method returns internal error. This change fixes this case by excluding the MPT type when totaling escrow amount.
2026-03-04 15:50:51 +00:00
tequ
3cd1e3d94e refactor: Update PermissionedDomainDelete to use keylet for sle access (#6063) 2026-03-04 04:11:58 +01:00
Ayaz Salikhov
fcec31ed20 chore: Update pre-commit hooks (#6460) 2026-03-03 20:23:22 +00:00
Sergey Kuznetsov
5300e65686 tests: Improve stability of Subscribe tests (#6420)
The `Subscribe` tests were flaky, because each test performs some operations (e.g. sends transactions) and waits for messages to appear in subscription with a 100ms timeout. If tests are slow (e.g. compiled in debug mode or a slow machine) then some of them could fail. This change adds an attempt to synchronize the background Env's thread and the test's thread by ensuring that all the scheduled operations are started before the test's thread starts to wait for a websocket message. This is done by limiting I/O threads of the app inside Env to 1 and adding a synchronization barrier after closing the ledger.
2026-03-03 08:46:55 -05:00
Alex Kremer
afc660a1b5 refactor: Fix clang-tidy bugprone-empty-catch check (#6419)
This change fixes or suppresses instances detected by the `bugprone-empty-catch` clang-tidy check.
2026-03-02 17:08:56 +00:00
Vito Tumas
1a7f824b89 refactor: Splits invariant checks into multiple classes (#6440)
The invariant check system had grown into a single monolithic file pair containing 24 invariant checker classes. The large `InvariantCheck.cpp` file was a frequent source of merge conflicts and difficult to navigate. This refactoring improves maintainability and readability with zero behavioral changes.

In particular, this change:
- Splits `InvariantCheck.h` and `InvariantCheck.cpp` into 10 focused header/source pairs organized by domain under a new `invariants/` subdirectory.
- Extracts the shared `Privilege` enum and `hasPrivilege()` function into a dedicated `InvariantCheckPrivilege.h` header, so domain-specific files can reference them independently.
2026-02-27 21:02:39 +00:00
Mayukha Vadari
404f35d556 test: Grep for failures in CI (#6339)
This change adjusts the CI tests to make it easier to spot errors, without needing to sift through the thousands of lines of output.
2026-02-27 03:01:38 +00:00
Bart
3a8a18c2ca refactor: Use uint256 directly as key instead of void pointer (#6313)
This change replaces `void const*` by `uint256 const&` for database fetches.

Object hashes are expressed using the `uint256` data type, and are converted to `void *` when calling the `fetch` or `fetchBatch` functions. However, in these fetch functions they are converted back to `uint256`, making the conversion process unnecessary. In a few cases the underlying pointer is needed, but that can then be easy obtained via `[hash variable].data()`.
2026-02-25 18:23:34 -05:00
Valentin Balaschenko
bdd106d992 Explicitly trim the heap after cache sweeps (#6022)
Limited to Linux/glibc builds.
2026-02-24 21:33:13 +00:00
Ed Hennis
6f35d94b2f Fix tautological assertion (#6393) 2026-02-20 01:58:47 +00:00
Ayaz Salikhov
2c1fad1023 chore: Apply clang-format width 100 (#6387) 2026-02-19 23:30:00 +00:00
Sergey Kuznetsov
31302877ab ci: Add clang tidy workflow to ci (#6369) 2026-02-19 14:06:44 -05:00
Jingchen
0976b2b68b refactor: Modularize app/tx (#6228) 2026-02-17 18:10:07 +00:00
Jingchen
36240116a5 refactor: Decouple app/tx from Application and Config (#6227)
This change decouples app/tx from `Application` and `Config` to clear the way to moving transactors to `libxrpl`.
2026-02-17 11:29:53 -05:00
Sergey Kuznetsov
958d8f3754 chore: Update clang-format to 21.1.8 (#6352) 2026-02-16 14:31:18 -05:00
Jingchen
ac0ad3627f refactor: Modularize HashRouter, Conditions, and OrderBookDB (#6226)
This change modularizes additional components by moving code to `libxrpl`.
2026-02-13 10:34:37 -05:00
nuxtreact
cd218346ff chore: Fix minor issues in comments (#6346) 2026-02-12 14:55:27 -05:00
Jingchen
5edd3566f7 refactor: Modularize the NetworkOPs interface (#6225)
This change moves the NetworkOPs interface into `libxrpl` and it leaves its implementation in `xrpld`.
2026-02-12 13:15:03 -05:00
Jingchen
9f17d10348 refactor: Modularize RelationalDB (#6224)
The rdb module was not properly designed, which is fixed in this change. The module had three classes:
1) The abstract class `RelationalDB`.
2) The abstract class `SQLiteDatabase`, which inherited from `RelationalDB` and added some pure virtual methods.
3) The concrete class `SQLiteDatabaseImp`, which inherited from `SQLiteDatabase` and implemented all methods.

The updated code simplifies this as follows:
* The `SQLiteDatabaseImp` has become `SQLiteDatabase`, and
* The former `SQLiteDatabase `has merged with `RelationalDatabase`.
2026-02-11 16:22:01 +00:00