Cherry-pick of ripple/rippled@d22a5057b9 / xahau dd085e5d8 ("Prevent
consensus from getting stuck in the establish phase (#5277)"), resolved
against our RNG pipeline and bootstrap fast-start changes.
Upstream adds three layered anti-stall mechanisms:
- Stateful per-dispute avalanche state machine (init→mid→late→stuck)
- Stall detection: declares consensus when all disputes individually settled
- Hard expiration: clamp(10× prev round, 15s, 120s) wall-clock safety net
Conflict resolution:
- ConsensusParms.h: kept both avalanche state machine (const members,
avMIN_ROUNDS, avSTALLED_ROUNDS, getNeededWeight) and our bootstrap
params (bootstrapRoundTimeSeed, bootstrapStableRoundsRequired).
ledgerMAX_CONSENSUS left non-const for bootstrap override.
- Consensus.h: pass both stalled flag and effectiveParms to checkConsensus.
Stall check uses original parms, bootstrap override only affects max
consensus timeout.
- Consensus_test.cpp: kept all 12 RNG tests and new testDisputes test.
Use an explicit 5s cap instead of dividing the default 15s.
5s is the sweet spot: long enough for peers to exchange proposals
and converge naturally, short enough to avoid wasted time.
Shorter values (e.g. 3.75s) cause nodes to hit reachedMax before
peers converge, cascading into slower subsequent rounds.
Seed prevRoundTime_ to 3s instead of 15s on first round, override
idle interval to bypass closeTimeResolution (10-30s on early ledgers),
and halve ledgerMAX_CONSENSUS during bootstrap. Auto-disables after 3
consecutive rounds with UNL quorum participation.
Cuts 5-node testnet cold-start from ~28s to ~13s.
Also adds projected-source markers to TxQ, NetworkOPs, and Submit for
the transaction-submission documentation template.
Consolidate the repeated entropy fallback condition
(entropyFailed || no reveals || sub-quorum reveals) into a single
method. Fixes EntropyCount field reporting non-zero when the digest
was correctly zeroed due to sub-quorum reveals.
Sub-quorum reveals (e.g. 3/4 threshold) were producing real entropy,
allowing a minority of validators to disproportionately influence the
output. Both injectEntropyPseudoTx and buildExplicitFinalProposalTxSet
now fall back to zero entropy when reveals < quorumThreshold().
When prevProposers < quorumThreshold, the network is still converging
and RNG can only produce zero entropy. Skip the commit/reveal pipeline
to avoid PIPELINE_TIMEOUT and conflict-wait delays that compound across
staggered startup rounds.
hasQuorum() and getExportsWithQuorum() were using raw signerMap.size()
which includes unverified signatures. TxQ could inject a ttEXPORT
pseudo-tx that then fails the stricter verified-signature check in
Change::applyExport(). Use verifiedSignatureCount() instead so TxQ
only injects when cryptographically verified quorum is actually met.
Also add cmake plumbing for enhanced logging: link date::date-tz when
available and enable BEAST_ENHANCED_LOGGING for Debug builds.
Quorum fix:
- Rename expectedProposers_ → likelyParticipants_ to clarify role
- Fix commit quorum to 80% of active UNL snapshot (not shrinkable by
recent proposer count, which was allowing 2/3 to pass as quorum)
- hasQuorumOfCommits() now uses simple threshold check only
- Add CSF test: persistent loss does not shrink quorum
Log level cleanup:
- Demote ~30 RNG/STALLDIAG per-peer/per-tick lines from info/debug to
debug/trace across Consensus.h and RCLConsensus.cpp
- Principle: per-peer/per-tick → trace; state transitions → debug;
milestones → info
- Reduces testnet log volume by ~93%
date::current_zone() can throw if the timezone database is unavailable
or misconfigured (e.g. minimal container images). Fall back to UTC
formatting so enhanced logging does not make startup fatal.
Clarify inline that seq=3 publish can carry unchanged txSetHash while still providing extra entropySetHash delivery/fetch opportunity under packet loss or reordering.
Keep explicit final proposal as an opt-in experimental path with implicit mode as default.
Add inline rationale/TBD notes, extend stall diagnostics, and cover runtime-config + CSF txn-path behavior with tests.
Add Builds/levelization/levelization.py for fast local iteration and semantic comparison against canonical shell output via --compare-to.
Keep Builds/levelization/levelization.sh as canonical path, and update levelization workflow to fail if python output diverges from shell-generated results.
Also harden interactive-shell detection in levelization.sh for portability and document local usage in README.
Decouple RCLConsensus.h from Consensus.h by forward-declaring Consensus and storing Consensus<Adaptor> behind std::unique_ptr, moving thin wrappers out-of-line into RCLConsensus.cpp.
Also remove direct RCLConsensus.h include from NetworkOPs.h (forward declare), and add explicit includes in DatagramMonitor.h and ServerDefinitions.cpp to replace transitive dependencies.
Keep RNG fast-path behavior unchanged in Consensus.h; build and ripple.consensus.Consensus remain green.
Keep entropy-set recovery path but elect a deterministic single broadcaster (lowest NodeID among tx-converged participants) instead of every proposer broadcasting entropySetHash.
This lowers steady-state proposal chatter while preserving liveness for lagging peers that need entropy-set fetch/merge.
Expose rng_claim_drop_pct in runtime config (RPC + env) as a clamped 0-100 percentage used by RNG claim-drop testing.
Include RuntimeConfig RPC tests for round-trip and clamping behavior.
Track active RNG round sequence for fetched set validation so lagging observers can merge current-round commit sets instead of rejecting them as closed+1 out-of-round.
Refresh/re-publish commitSetHash after fetch-merge conflicts and publish entropySetHash in ConvergingReveal so peers can recover reveal sets.
Add inline tradeoff notes: extra proposal traffic is accepted to preserve consensus safety/liveness under packet loss or drop injection.
Harden acquired RNG merge paths with strict entry typing, trusted key/node binding, round-sequence gating, reveal-to-commit linkage checks, and stale reveal/proof invalidation on commitment changes.
Adjust proposer expectation logic so non-proposing observers are not counted as expected committers, and add a CSF regression test covering observer self-commit exclusion.
Reject mixed commit/reveal maps, enforce per-entry type checks, bind node identity to trusted validator keys, and gate acquired entries to the active round.
Also verify acquired reveals against stored commitments and clear stale reveal/proof state when commitments change.
Move core xport_reserve and xport implementations from applyHook.cpp
DEFINE_HOOK_FUNCTION wrappers into the decoupled HookAPI class, following
the same pattern used for etxn_reserve and emit.
- Error on unknown message_types instead of silently widening scope
- Make messageCategories optional so per-peer can override global filter
to "all categories" (nullopt=inherit, empty set=explicitly all)
- Clamp send_drop_pct to 0-100% range
- Add STARTDIAG: logging for consensus startup diagnostics
- Add 3 test cases (11 total, 58 assertions)
- Fix convergence regression caused by 2.4.0 merge: replace
stringIsUint256Sized(currenttxhash) with size() < uint256::size()
to accept extended proposals (>32 bytes) containing RNG fields
- Add message_types filter to RuntimeConfig for targeting specific
protocol message categories (proposal, validation, transaction, etc.)
- Add appliesTo() method and messageCategories set to ConfigVals
- Add category name mapping helpers in RPC handler
- Add 2 test cases for message type filtering (8 total)
Add a generic RuntimeConfig service for runtime-configurable parameters,
initially supporting artificial send delays and packet drops for testing
consensus behavior on local testnets.
- RuntimeConfig class with atomic fast-path gate (zero cost when inactive)
- Per-peer targeting via "*" (global) and "ip:port" keys with inheritance
- Pre-merged caching at write time for single-lookup read path
- Admin RPC handler `runtime_config` (set/clear/clear_all/get)
- Env var support: XAHAU_RUNTIME_CONFIG (JSON) or XAHAU_SEND_* vars
- PeerImp::send() integration with delay timer and probabilistic drops
- RPC handler test covering all operations and merge behavior
Due to rounding, the LPTokenBalance of the last LP might not match the LP's trustline balance. This was fixed for `AMMWithdraw` in `fixAMMv1_1` by adjusting the LPTokenBalance to be the same as the trustline balance. Since `AMMClawback` is also performing a withdrawal, we need to adjust LPTokenBalance as well in `AMMClawback.`
This change includes:
1. Refactored `verifyAndAdjustLPTokenBalance` function in `AMMUtils`, which both`AMMWithdraw` and `AMMClawback` call to adjust LPTokenBalance.
2. Added the unit test `testLastHolderLPTokenBalance` to test the scenario.
3. Modify the existing unit tests for `fixAMMClawbackRounding`.
Add single-sign rejection check in Change::applyExport() matching
rippled's multi-sign validation: SigningPubKey must be present but
empty, TxnSignature must not be present.
Fix Export_test.cpp hook to encode an empty VL blob for SigningPubKey
instead of 33 zero bytes (AI slop from export-uvtxn branch).
Extract duplicated (n * 80 + 99) / 100 ceiling quorum formula into shared
calculateQuorumThreshold() in ConsensusParms.h, matching the standard
ValidatorList::calculateQuorum(). Used by ExportSignatureCollector,
Change.cpp, and RCLConsensus.cpp.
Revert Import.cpp quorum from ceiling back to original truncating formula
(totalValidatorCount * 0.8) since Import handles XPOP imports, not the
new Export feature. Added TODO for future upgrade.