This updates Boost to 1.88, which is needed because Clio wants to move to 1.88 as that fixes several ASAN false positives around coroutine usage. In order for Clio to move to newer boost, libXRPL needs to move too. Hence the changes in this PR. A lot has changed between 1.83 and 1.88 so there are lots of changes in the diff, especially in regards to Boost.Asio and coroutines in particular.
This change includes `network_id` data in the validations and ledger subscription stream responses, as well as unit tests to validate the response fields. Fixes#4783
- Specification: [XRPLF/XRPL-Standards 56](https://github.com/XRPLF/XRPL-Standards/blob/master/XLS-0056d-batch/README.md)
- Amendment: `Batch`
- Implements execution of multiple transactions within a single batch transaction with four execution modes: `tfAllOrNothing`, `tfOnlyOne`, `tfUntilFailure`, and `tfIndependent`.
- Enables atomic multi-party transactions where multiple accounts can participate in a single batch, with up to 8 inner transactions and 8 batch signers per batch transaction.
- Inner transactions use `tfInnerBatchTxn` flag with zero fees, no signature, and empty signing public key.
- Inner transactions are applied after the outer batch succeeds via the `applyBatchTransactions` function in apply.cpp.
- Network layer prevents relay of transactions with `tfInnerBatchTxn` flag - each peer applies inner transactions locally from the batch.
- Batch transactions are excluded from AccountDelegate permissions but inner transactions retain full delegation support.
- Metadata includes `ParentBatchID` linking inner transactions to their containing batch for traceability and auditing.
- Extended STTx with batch-specific signature verification methods and added protocol structures (`sfRawTransactions`, `sfBatchSigners`).
- Specification: XRPLF/XRPL-Standards#239
- Amendment: `SingleAssetVault`
- Implements a vault feature used to store a fungible asset (XRP, IOU, or MPT, but not NFT) and to receive shares in the vault (an MPT) in exchange.
- A vault can be private or public.
- A private vault can use permissioned domains, subject to the `PermissionedDomains` amendment.
- Shares can be exchanged back into asset with `VaultWithdraw`.
- Permissions on the asset in the vault are transitively applied on shares in the vault.
- Issuer of the asset in the vault can clawback with `VaultClawback`.
- Extended `MPTokenIssuance` with `DomainID`, used by the permissioned domain on the vault shares.
Co-authored-by: John Freeman <jfreeman08@gmail.com>
Combines four related changes:
1. "Decrease `shouldRelay` limit to 30s." Pretty self-explanatory. Currently, the limit is 5 minutes, by which point the `HashRouter` entry could have expired, making this transaction look brand new (and thus causing it to be relayed back to peers which have sent it to us recently).
2. "Give a transaction more chances to be retried." Will put a transaction into `LedgerMaster`'s held transactions if the transaction gets a `ter`, `tel`, or `tef` result. Old behavior was just `ter`.
* Additionally, to prevent a transaction from being repeatedly held indefinitely, it must meet some extra conditions. (Documented in a comment in the code.)
3. "Pop all transactions with sequential sequences, or tickets." When a transaction is processed successfully, currently, one held transaction for the same account (if any) will be popped out of the held transactions list, and queued up for the next transaction batch. This change pops all transactions for the account, but only if they have sequential sequences (for non-ticket transactions) or use a ticket. This issue was identified from interactions with @mtrippled's #4504, which was merged, but unfortunately reverted later by #4852. When the batches were spaced out, it could potentially take a very long time for a large number of held transactions for an account to get processed through. However, whether batched or not, this change will help get held transactions cleared out, particularly if a missing earlier transaction is what held them up.
4. "Process held transactions through existing NetworkOPs batching." In the current processing, at the end of each consensus round, all held transactions are directly applied to the open ledger, then the held list is reset. This bypasses all of the logic in `NetworkOPs::apply` which, among other things, broadcasts successful transactions to peers. This means that the transaction may not get broadcast to peers for a really long time (5 minutes in the current implementation, or 30 seconds with this first commit). If the node is a bottleneck (either due to network configuration, or because the transaction was submitted locally), the transaction may not be seen by any other nodes or validators before it expires or causes other problems.
This change fixes a number of issues involved with CTID:
* CTID is not present on all RPC tx transactions.
* rpcWRONG_NETWORK is missing in the ErrorCodes.cpp
It’s possible for this to happen legitimately if a set of peers, including a validator, are connected in a cycle, and the latency and message processing time between those peers is significantly less than the latency between the validator and the last peer. It’s unlikely in the real world, but obviously easy to simulate with Antithesis.
- Detects if the consensus process is "stalled". If it is, then we can declare a
consensus and end successfully even if we do not have 80% agreement on
our proposal.
- "Stalled" is defined as:
- We have a close time consensus
- Each disputed transaction is individually stalled:
- It has been in the final "stuck" 95% requirement for at least 2
(avMIN_ROUNDS) "inner rounds" of phaseEstablish,
- and either all of the other trusted proposers or this validator, if proposing,
have had the same vote(s) for at least 4 (avSTALLED_ROUNDS) "inner
rounds", and at least 80% of the validators (including this one, if
appropriate) agree about the vote (whether yes or no).
- If we have been in the establish phase for more than 10x the previous
consensus establish phase's time, then consensus is considered "expired",
and we will leave the round, which sends a partial validation (indicating
that the node is moving on without validating). Two restrictions avoid
prematurely exiting, or having an extended exit in extreme situations.
- The 10x time is clamped to be within a range of 15s
(ledgerMAX_CONSENSUS) to 120s (ledgerABANDON_CONSENSUS).
- If consensus has not had an opportunity to walk through all avalanche
states (defined as not going through 8 "inner rounds" of phaseEstablish),
then ConsensusState::Expired is treated as ConsensusState::No.
- When enough nodes leave the round, any remaining nodes will see they've
fallen behind, and move on, too, generally before hitting the timeout. Any
validations or partial validations sent during this time will help the
consensus process bring the nodes back together.
What the LoadManager class does is stall detection, which is not the same as deadlock detection. In the condition of severe CPU starvation, LoadManager will currently intentionally crash rippled reporting `LogicError: Deadlock detected`. This error message is misleading as the condition being detected is not a deadlock. This change fixes and refactors the code in response.
The codebase is filled with includes that are unused, and which thus can be removed. At the same time, the files often do not include all headers that contain the definitions used in those files. This change uses clang-format and clang-tidy to clean up the includes, with minor manual intervention to ensure the code compiles on all platforms.
Combine multiple related debug log data points into a single
message. Allows quick correlation of events that
previously were either not logged or, if logged, strewn
across multiple lines, making correlation difficult.
The Heartbeat Timer and consensus ledger accept processing
each have this capability.
Also guarantees that log entries will be written if the
node is a validator, regardless of log severity level.
Otherwise, the level of these messages is at INFO severity.
- Drop duplicate outgoing TMGetLedger messages per peer
- Allow a retry after 30s in case of peer or network congestion.
- Addresses RIPD-1870
- (Changes levelization. That is not desirable, and will need to be fixed.)
- Drop duplicate incoming TMGetLedger messages per peer
- Allow a retry after 15s in case of peer or network congestion.
- The requestCookie is ignored when computing the hash, thus increasing
the chances of detecting duplicate messages.
- With duplicate messages, keep track of the different requestCookies
(or lack of cookie). When work is finally done for a given request,
send the response to all the peers that are waiting on the request,
sending one message per peer, including all the cookies and
a "directResponse" flag indicating the data is intended for the
sender, too.
- Addresses RIPD-1871
- Drop duplicate incoming TMLedgerData messages
- Addresses RIPD-1869
- Improve logging related to ledger acquisition
- Class "CanProcess" to keep track of processing of distinct items
---------
Co-authored-by: Valentin Balaschenko <13349202+vlntb@users.noreply.github.com>
- Also get the branch name.
- Use rev-parse instead of describe to get a clean hash.
- Return the git hash and branch name in server_info for admin
connections.
- Include git hash and branch name on separate lines in --version.
* Rename ASSERT to XRPL_ASSERT
* Upgrade to Anthithesis SDK 0.4.4, and use new 0.4.4 features
* automatic cast to bool, like assert
* Add instrumentation workflow to verify build with instrumentation enabled
* Copy Antithesis SDK version 0.4.0 to directory external/
* Add build option `voidstar` to enable instrumentation with Antithesis SDK
* Define instrumentation macros ASSERT and UNREACHABLE in terms of regular C assert
* Replace asserts with named ASSERT or UNREACHABLE
* Add UNREACHABLE to LogicError
* Document instrumentation macros in CONTRIBUTING.md
* 2.2.2 changed functions acquireAsync and NetworkOPsImp::recvValidation to add an item to a collection under lock, unlock, do some work, then lock again to do remove the item. It will deadlock if an exception is thrown while adding the item - before unlocking.
* Replace ScopedUnlock with scope_unlock.
* Retry some failed RPC connections / commands in unit tests
* Remove orphaned `getAccounts` function
Co-authored-by: John Freeman <jfreeman08@gmail.com>
* upstream/master:
Set version to 2.2.2
Allow only 1 job queue slot for each validation ledger check
Allow only 1 job queue slot for acquiring inbound ledger.
Track latencies of certain code blocks, and log if they take too long