Task 8.6: Add Log-Trace Correlation section to telemetry-runbook.md
with LogQL examples, verification steps, and troubleshooting guidance.
Update 09-data-collection-reference.md section 5a from "Future" to
actual implementation docs covering log format, ingestion pipeline,
Grafana correlation config, and Loki backend. Add Phase 8 log
correlation test section and troubleshooting to TESTING.md.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace StatsD UDP metric transport with native OpenTelemetry Metrics SDK
export via OTLP/HTTP behind the existing beast::insight::Collector interface.
- Task 7.1: Link opentelemetry-cpp to beast module in CMake when telemetry=ON
- Task 7.2: New OTelCollector class mapping beast::insight instruments to OTel
SDK (Counter, ObservableGauge, Histogram, Counter<uint64>) with OTLP/HTTP
export via PeriodicMetricReader at 1s intervals
- Task 7.3: Add server=otel branch to CollectorManager with endpoint config
- Task 7.4: Update otel-collector-config.yaml to use OTLP receiver for metrics
pipeline (StatsD receiver commented out for backward compat)
- Task 7.5: Metric names preserved via dot-to-underscore formatting matching
StatsD->Prometheus conventions
- Task 7.6: Rename Grafana dashboards from statsd-* to system-*, update titles
and UIDs from "StatsD" to "System Metrics"
- Task 7.7: Update integration test to use server=otel, verify OTLP metrics
- Task 7.8: Update runbook, TESTING.md, config reference, and data collection
reference docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Ledger Apply Duration and Close Time Agreement panels to
consensus-health dashboard
- Add consensus.accept.apply to telemetry runbook with TraceQL
queries for close time disagreements and consensus failures
- Add span to TESTING.md expected span catalog and verification loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This change adds support for sanitizer build options in CI builds workflow. Currently `asan+ubsan` is enabled, while `tsan+ubsan` is left disabled as more changes are required.
The .gitignore and .gitattributes files contain references to files and directories that the current build no longer produces, so this change removes obsolete entries in these files, and does some general reorganizing of the remaining entries.
This change updates BUILD.md for Conan 2, add fixes/workarounds for Apple Clang 17, Clang 20 and CMake 4. This also removes (from BUILD.md only) workarounds for compiler versions which we no longer support e.g. Clang 15 and adds compilation flag -Wno-deprecated-declarations to enable building with Clang 20 on Linux.
- Detects if the consensus process is "stalled". If it is, then we can declare a
consensus and end successfully even if we do not have 80% agreement on
our proposal.
- "Stalled" is defined as:
- We have a close time consensus
- Each disputed transaction is individually stalled:
- It has been in the final "stuck" 95% requirement for at least 2
(avMIN_ROUNDS) "inner rounds" of phaseEstablish,
- and either all of the other trusted proposers or this validator, if proposing,
have had the same vote(s) for at least 4 (avSTALLED_ROUNDS) "inner
rounds", and at least 80% of the validators (including this one, if
appropriate) agree about the vote (whether yes or no).
- If we have been in the establish phase for more than 10x the previous
consensus establish phase's time, then consensus is considered "expired",
and we will leave the round, which sends a partial validation (indicating
that the node is moving on without validating). Two restrictions avoid
prematurely exiting, or having an extended exit in extreme situations.
- The 10x time is clamped to be within a range of 15s
(ledgerMAX_CONSENSUS) to 120s (ledgerABANDON_CONSENSUS).
- If consensus has not had an opportunity to walk through all avalanche
states (defined as not going through 8 "inner rounds" of phaseEstablish),
then ConsensusState::Expired is treated as ConsensusState::No.
- When enough nodes leave the round, any remaining nodes will see they've
fallen behind, and move on, too, generally before hitting the timeout. Any
validations or partial validations sent during this time will help the
consensus process bring the nodes back together.
On macOS, if you have not installed something that depends on `xz`, then your
system may lack `lzma`, resulting in a build error similar to:
```
Downloading libarchive-3.6.0.tar.xz completed [6250.61k]
libarchive/3.6.0:
ERROR: libarchive/3.6.0: Error in source() method, line 120
get(self, **self.conan_data["sources"][self.version], strip_root=True)
ReadError: file could not be opened successfully:
- method gz: ReadError('not a gzip file')
- method bz2: ReadError('not a bzip2 file')
- method xz: CompressionError('lzma module is not available')
- method tar: ReadError('invalid header')
```
The solution is to ensure that `lzma` is installed by installing `xz`.
Add instructions for installing rippled using the package managers APT
and YUM. Some steps were adapted from xrpl.org.
---------
Co-authored-by: Michael Legleux <mlegleux@ripple.com>
Make it easy for projects to depend on libxrpl by adding an `ALIAS`
target named `xrpl::libxrpl` for projects to link.
The name was chosen because:
* The current library target is named `xrpl_core`. There is no other
"non-core" library target against which we need to distinguish the
"core" library. We only export one library target, and it should just
be named after the project to keep things simple and predictable.
* Underscores in target or library names are generally discouraged.
* Every target exported in CMake should be prefixed with the project
name.
By adding an `ALIAS` target, existing consumers who use the `xrpl_core`
target will not be affected.
* In the future, there can be a migration plan to make `xrpl_core` the
`ALIAS` target (and `libxrpl` the "real" target, which will affect the
filename of the compiled binary), and eventually remove it entirely.
Also:
* Fix the Conan recipe so that consumers using Conan import a target
named `xrpl::libxrpl`. This way, every consumer can use the same
instructions.
* Document the two easiest methods to depend on libxrpl. Both have been
tested.
* See #4443.
* Remove obsolete build instructions.
* By using Conan, builders can choose which dependencies specifically to
build and link as shared objects.
* Refactor the build instructions based on the plan in #4433.
This change can help improve the liveness of the network during periods of network
instability, by allowing the network to track which validators are presently not online
and to disregard them for the purposes of quorum calculations.
Switch to target-oriented dependencies. Use imported targets for
dependencies (openssl, boost). Localize FindBoost to remove cmake
version dependence for latest boost support. Logically separate
"ripple-libpp" core sources and add install targets.
Add ninja build for msvc. Add two clang sanitizer builds. Misc script
changes to work with latest modernized cmake.
Fixes: RIPD-1521
Switch to pure doxygen HTML for developer docs. Remove docca/boostbook
system. Convert consensus document to markdown. Add existing markdown
files to doxygen input set. Fix some image paths and scale images for
use with MD links. Rename/cleanup some files for consistency.
Add pipeline logic for windows slaves. Add ninja and parallel test run
option. Add make doc target build in build-and-test.sh. Cleanup README
files. Add nounity windows build. Add link to jenkins summary table.
Add rippled_classic build (win). Improve formatting of summary table.
These changes augment the Validations class with a LedgerTrie to better
track the history of support for validated ledgers. This improves the
selection of the preferred working ledger for consensus. The Validations
class now tracks both full and partial validations. Partial validations
are only used to determine the working ledger; full validations are
required for any quorum related function. Validators are also now
explicitly restricted to sending validations with increasing ledger
sequence number.
- Separate `Scheduler` from `BasicNetwork`.
- Add an event/collector framework for monitoring invariants and calculating statistics.
- Allow distinct network and trust connections between Peers.
- Add a simple routing strategy to support broadcasting arbitrary messages.
- Add a common directed graph (`Digraph`) class for representing network and trust topologies.
- Add a `PeerGroup` class for simpler specification of the trust and network topologies.
- Add a `LedgerOracle` class to ensure distinct ledger histories and simplify branch checking.
- Add a `Submitter` to send transactions in at fixed or random intervals to fixed or random peers.
Co-authored-by: Joseph McGee
Introduces a generic Validations class for storing and querying current and
recent validations. Aditionally migrates the validation related timing
constants from LedgerTiming to the new Validations code.
Introduces RCLValidations as the version of Validations adapted for use in the
RCL. This adds support for flushing/writing validations to the sqlite log and
also manages concurrent access to the Validations data.
RCLValidations::flush() no longer uses the JobQueue for its database
write at shutdown. It performs the write directly without
changing threads.