1. Record SidecarKind in pendingRngFetches_ before calling
onAcquiredSidecarSet on the local-cache-hit path. Without this,
cached reveal/exportSig sets silently fell back to the commit kind
and were rejected by the sfSidecarType check.
2. Wrap export sig visitLeaves callback in try/catch (matching the
RNG path) and enforce sfSidecarType == sidecarExportSig before
processing — closes the shape-only acceptance gap.
Replace the content-sniffing heuristic in onAcquiredSidecarSet with
typed dispatch based on SidecarKind.
The type is already known at fetch time:
- commitSetHash → SidecarKind::commit
- entropySetHash → SidecarKind::reveal
- exportSigSetHash → SidecarKind::exportSig
pendingRngFetches_ changes from hash_set<uint256> to
hash_map<uint256, SidecarKind>. When the set arrives,
look up the kind by hash and dispatch — no leaf inspection.
This is the set-classification fix (Option E from the design doc):
no new SField, no STTx changes, no protocol additions, no RNG
proof-chain churn.
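
A minimal sketch of the typed dispatch described above, assuming a
std::string stand-in for uint256 and a hypothetical PendingFetches
wrapper (neither name is from the real code). It only illustrates
"record the kind at fetch time, look it up on arrival":

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

// Stand-in for uint256; the real code keys on the set hash.
using Hash = std::string;

enum class SidecarKind { commit, reveal, exportSig };

// Hypothetical wrapper around the pendingRngFetches_ idea.
class PendingFetches
{
    std::unordered_map<Hash, SidecarKind> pending_;

public:
    // Record the kind at fetch time, when it is already known.
    void
    recordFetch(Hash const& h, SidecarKind k)
    {
        assert(!h.empty());
        pending_.emplace(h, k);
    }

    // On arrival, look up the kind by hash and drop the entry.
    // Returns nullopt for a set that was never requested.
    std::optional<SidecarKind>
    takeKind(Hash const& h)
    {
        auto it = pending_.find(h);
        if (it == pending_.end())
            return std::nullopt;
        SidecarKind const k = it->second;
        pending_.erase(it);
        return k;
    }
};
```

The caller can then switch on the returned kind without inspecting
any leaf of the acquired set.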
- Add rng_poll_ms, no_export_sig, bootstrap_fast_start to the
runtime_config RPC handler (SET and GET paths) so all ConfigVals
fields are configurable live via admin RPC.
- Remove unused `added` counter in CSF fetchRngSetIfNeeded (was
causing compiler warnings after debug logging removal).
- Use Buffer::operator== instead of std::memcmp in upgradeSignature,
drop <cstring> include.
Both runtime_config and disconnect RPC handlers are already
Role::ADMIN. Add a TODO to consider gating the entire
RuntimeConfig system on a config flag or compile-time define
for production nodes.
Move XAHAU_RNG_POLL_MS and XAHAUD_NO_EXPORT_SIG into RuntimeConfig
as rngPollMs and noExportSig fields. Both are now configurable via
the XAHAU_RUNTIME_CONFIG JSON blob or individual env vars, and
controllable at runtime via the runtime_config RPC.
rngPollMs is clamped to a minimum of 50ms (prevents tight-loop polling).
Default remains 250ms.
This removes the last loose std::getenv calls from production code
outside of RuntimeConfig. All env-var-based configuration now flows
through a single system.
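
The clamp and default can be sketched as follows. The helper names
are hypothetical, and falling back to the default on a missing or
malformed value is an assumption, not something this change states:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

constexpr long kDefaultRngPollMs = 250;  // default from the text above
constexpr long kMinRngPollMs = 50;       // floor from the text above

// Clamp a requested poll interval to the 50ms floor.
long
clampRngPollMs(long requested)
{
    long const v = std::max(requested, kMinRngPollMs);
    assert(v >= kMinRngPollMs);
    return v;
}

// Parse a raw env-var style value. Assumption: missing or malformed
// input yields the default rather than an error.
long
parseRngPollMs(char const* raw)
{
    if (raw == nullptr || *raw == '\0')
        return kDefaultRngPollMs;
    char* end = nullptr;
    long const v = std::strtol(raw, &end, 10);
    if (end == raw || *end != '\0')
        return kDefaultRngPollMs;
    return clampRngPollMs(v);
}
```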
upgradeSignature now takes the verified buffer and compares it against
the currently stored buffer before promoting to verified. This guards
against concurrent overlay threads overwriting the buffer between the
caller's unverifiedSignatures() snapshot and the upgrade call.
If the stored buffer was overwritten (different size or content), the
upgrade is silently skipped — the new buffer will be verified on its
next encounter.
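
A rough sketch of the compare-before-promote guard, using
std::vector<std::uint8_t> as a stand-in for Buffer and a hypothetical
SigEntrySketch class keyed by a validator name. The real collector's
layout is not shown in this log, so treat the structure as an
assumption:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Buffer = std::vector<std::uint8_t>;  // stand-in for the real Buffer

class SigEntrySketch
{
    std::mutex mutex_;
    std::map<std::string, Buffer> sigs_;
    std::set<std::string> verified_;

public:
    // Store (or overwrite) an unverified signature.
    void
    store(std::string const& validator, Buffer sig)
    {
        assert(!sig.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        sigs_[validator] = std::move(sig);
        verified_.erase(validator);
    }

    // Promote only if the stored buffer still equals the buffer the
    // caller actually verified; otherwise skip silently.
    bool
    upgradeSignature(std::string const& validator, Buffer const& verifiedBuf)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = sigs_.find(validator);
        if (it == sigs_.end() || it->second != verifiedBuf)
            return false;
        verified_.insert(validator);
        return true;
    }

    bool
    isVerified(std::string const& validator)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return verified_.count(validator) > 0;
    }
};
```

Comparing whole buffers with operator!= covers both the "different
size" and "different content" cases in one check.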
Strip JLOG(j_.debug()) calls from buildEntropySet, fetchRngSetIfNeeded,
and finalizeRoundEntropy in CSF Peer.h. These were added for local
debugging and caused CI failures due to output size limits.
Running 3 rounds caused peer 0 to desync on round 2, dropping
prevProposers for the majority on round 3, triggering bootstrap
skip → zero entropy on the last round. The gate works correctly
(logs show aligned=3, peersSeen=3) but the test was checking the
LAST round's entropy, not the round where the gate was exercised.
Run 1 round after warmup — sufficient to exercise the gate.
The entropy convergence deadline was measured from revealPhaseStart_,
which is set when entering ConvergingReveal. By the time the entropy
set is published (after reveal timeout + observation tick), most of
the deadline budget was already spent — leaving insufficient time
for peer alignment.
Add entropyPublishStart_ timestamp set when the entropy set is first
published. All convergence gate deadlines now measure from this
point, giving the full 2x rngREVEAL_TIMEOUT window for peer
proposals to propagate and alignment to be observed.
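
The timing change can be pictured with a small sketch. GateTimer and
its members are hypothetical names, and the timeout value is supplied
by the caller rather than taken from the real rngREVEAL_TIMEOUT:

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;

struct GateTimer
{
    std::chrono::milliseconds revealTimeout;
    std::optional<Clock::time_point> publishStart;

    // Set once, when the entropy set is first published.
    void
    onPublished(Clock::time_point now)
    {
        if (!publishStart)
            publishStart = now;
    }

    // The deadline is 2x the reveal timeout, measured from the first
    // publish rather than from the start of the reveal phase.
    bool
    deadlineExpired(Clock::time_point now) const
    {
        assert(revealTimeout.count() > 0);
        if (!publishStart)
            return false;
        return now - *publishStart >= 2 * revealTimeout;
    }
};
```

Re-publishing after a merge does not reset the timer in this sketch,
which keeps the wait bounded.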
When peers have published entropySetHash but none match ours yet
(e.g. a subset peer is the only one seen so far), wait for the
bounded deadline instead of immediately falling back to zero.
Other aligned peers may not have published yet — give them time.
Only fall back to zero if no alignment is observed within the
deadline (2x rngREVEAL_TIMEOUT).
After fetch/merge, if our entropy set hash didn't change, the
conflicting peer had a subset of our data — not a real threat.
Clear the conflict flag so we don't fall back to zero when a peer
simply has fewer reveals than us.
If the hash DID change (merge added data), re-count alignment
with the updated hash before treating it as a real conflict.
This prevents the majority from falling back to zero just because
one peer (e.g. isolated) has a smaller reveal set.
The observation tick alone was insufficient — a node could pass the
gate without any peer confirming its entropySetHash. Now the gate
requires at least one tx-converged peer with a matching hash before
accepting non-zero entropy.
Four cases after the observation tick:
1. aligned > 0: peers confirm our hash → proceed with entropy
2. conflict: fetch/merge/rebuild → bounded wait → zero fallback
3. aligned=0, peersSeen=0: no peers published yet → bounded wait →
zero fallback if still no peers at deadline
4. aligned=0, peersSeen>0: peers published but none match → zero
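
The cases listed above can be written as a single decision function.
This is a sketch only: GateView, GateAction and decide are
hypothetical names, and evaluating the cases in the listed order is
an assumption. A newer entry in this log changes the "published but
none match" case to a bounded wait; the sketch follows this entry as
written.

```cpp
#include <cassert>

enum class GateAction { Proceed, Wait, ZeroEntropy };

struct GateView
{
    int aligned;           // tx-converged peers whose hash matches ours
    int peersSeen;         // tx-converged peers that published any hash
    bool conflict;         // a differing set is being fetched/merged
    bool deadlineExpired;  // bounded window has elapsed
};

GateAction
decide(GateView const& v)
{
    assert(v.aligned >= 0 && v.peersSeen >= 0);
    if (v.aligned > 0)
        return GateAction::Proceed;  // case 1
    if (v.conflict)                  // case 2: bounded wait, then zero
        return v.deadlineExpired ? GateAction::ZeroEntropy
                                 : GateAction::Wait;
    if (v.peersSeen == 0)            // case 3: nobody published yet
        return v.deadlineExpired ? GateAction::ZeroEntropy
                                 : GateAction::Wait;
    return GateAction::ZeroEntropy;  // case 4: published, none match
}
```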
Also:
- CSF finalizeRoundEntropy now uses shouldZeroEntropy() (quorum check)
- Two new TDD tests:
  - testRngNoEntropyWithoutPeerAlignment: healthy network must agree
  - testRngAlignmentRequiredForNonZeroEntropy: isolated peer must not
    produce non-zero entropy that differs from majority
CSF shouldZeroEntropy() now checks reveals < quorumThreshold (80% of
UNL), matching production. MajorRevealLoss test adjusted to verify
majority group agreement rather than requiring full synchronization
(peer 0 may desync when it misses most reveals).
All 15 ConsensusRng tests now pass.
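
A small sketch of the quorum check, assuming the 80% threshold is
rounded up and that an empty reveal set still zeroes entropy (both
are assumptions; this log only states "reveals < quorumThreshold
(80% of UNL)"):

```cpp
#include <cassert>
#include <cstddef>

// ceil(0.8 * unlSize) in integer arithmetic. Rounding up is assumed.
std::size_t
quorumThreshold(std::size_t unlSize)
{
    std::size_t const t = (unlSize * 80 + 99) / 100;
    assert(t <= unlSize);
    return t;
}

// Zero the entropy when the reveal count is below quorum. The
// explicit empty check mirrors the older empty()-only behaviour.
bool
shouldZeroEntropy(std::size_t reveals, std::size_t unlSize)
{
    return reveals == 0 || reveals < quorumThreshold(unlSize);
}
```

With a 6-validator UNL the threshold in this sketch is 5, so a peer
holding only 4 reveals zeroes its entropy.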
Two fixes addressing the asymmetric-view problem:
1. Convergence gate now forces one observation tick after first
publishing the entropySet before accepting. Previously a node
could publish + accept in the same tick, never seeing a peer's
different hash. The entropySetPublished_ flag ensures at least
one round-trip for proposal propagation.
2. CSF shouldZeroEntropy() now checks quorum threshold (80% of UNL),
matching production behavior. Previously it only checked empty().
Result: PartialReveals test now passes — all 6 peers converge on
the same entropy (count=6) via union merge after the observation tick.
14/15 ConsensusRng tests pass.
The CSF never self-seeded its own reveal into pendingReveals_ because
harvestRngData only processes peer proposals, not self. The real code
handles this in decorateMessage, but the CSF has no equivalent.
Add selfSeedReveal() called from the tick at reveal transition.
Both the real ConsensusExtensions and the CSF Extensions implement it.
The real code now has belt-and-suspenders: tick + decorateMessage.
This fixes CSF peers having N-1 reveals instead of N, which caused
every peer to compute entropy from a different subset.
Add a content-addressed SidecarStore to the CSF, simulating the
InboundTransactions SHAMap fetch pipeline. Tagged entries (commit
or reveal) are published by hash during buildCommitSet/buildEntropySet
and fetched by hash during fetchRngSetIfNeeded, with type-aware
union merge into the correct local pending set.
Also adds debug logging to CSF Extensions for entropy pipeline
troubleshooting.
Add a bounded pre-accept convergence check for entropySetHash,
closing the gap where two honest validators could accept with
different reveal subsets and compute different entropy (ledger fork).
After publishing the entropy set, the gate:
1. Inspects tx-converged peer positions for conflicting entropySetHash
2. Fetches differing sets via fetchRngSetIfNeeded (union merge)
3. Rebuilds and re-publishes the local entropy set after merge
4. Waits within a bounded window (2x rngREVEAL_TIMEOUT)
5. Falls back to zero entropy if conflict persists past deadline
This follows the same pattern as the existing commitSetHash conflict
handling and exportSigSetHash convergence gate. Union merge ensures
monotonic set growth — honest timing skew resolves quickly, and
hostile hash spam hits the hard deadline and falls back safely.
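
To illustrate why union merge gives monotonic growth, here is a
sketch using std::map as a stand-in for the SHAMap-backed reveal set
(RevealSet and unionMerge are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// validator id -> reveal payload; stand-in for the real SHAMap.
using RevealSet = std::map<std::string, std::string>;

// Merge fetched entries into the local set. Existing keys are kept,
// so the local set can only grow. Returns true if anything was added.
bool
unionMerge(RevealSet& local, RevealSet const& fetched)
{
    std::size_t const before = local.size();
    local.insert(fetched.begin(), fetched.end());
    assert(local.size() >= before);
    return local.size() != before;
}
```

Two honest peers that merge each other's sets end up with the same
union, so their set hashes converge after one exchange. A peer that
only holds a subset adds nothing, and the merge reports no change.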
The "one bad actor shouldn't deny entropy" optimization (supermajority
vote) is deferred to a follow-up patch per codex recommendation.
Three new CSF tests that document expected behavior for the
entropySetHash convergence gate (not yet implemented):
1. testRngEntropyConvergesWithPartialReveals: two groups each drop
one peer's reveal, creating different quorate subsets. Must not
fork — either converge via SHAMap merge or both fall back to zero.
2. testRngEntropyFallbackOnMajorRevealLoss: one peer drops most
reveals (below quorum locally). Network must still agree.
3. testRngSingleByzantineCannotDenyEntropy: one Byzantine peer
(future: forced garbage entropySetHash) should not prevent the
other 80% from producing valid entropy.
Also adds dropRevealFrom_ test knob to CSF Peer::Extensions for
simulating asymmetric reveal delivery.
- Skip addVerifiedSignature in decorateMessage when sigBuf is empty
(sign() threw — don't mark a failed sign as "verified")
- Add XRPL_ASSERT in addVerifiedSignature and addUnverifiedSignature
requiring non-empty signature buffers
- Add XRPL_ASSERT in checkQuorumAndSnapshot verifying that every
entry in the verified set exists in the signatures map with a
non-empty buffer
Enforce the contract: source chain finalizes an export only when it
has a quorum of cryptographically verified multisignatures.
ExportSigCollector changes:
- signatureCount() now counts verified entries only
- checkQuorumAndSnapshot() returns verified-only snapshot
- snapshot() and snapshotWithSigs() return verified-only data
- buildExportSigSet (via snapshot) publishes verified-only entries
- unverifiedSignatures() returns sigs needing verification
- upgradeSignature() promotes unverified to verified
- addStandaloneSignature() marks as verified (no consensus to check)
- All add methods now set firstSeenSeq (fixes stale cleanup bug)
Export::doApply changes:
- Upgrade pass before quorum check: deserializes the inner tx (which
is always available as ctx_.tx), verifies any unverified sigs via
buildMultiSigningData + verify(), upgrades them in the collector
- Then checks quorum on verified-only count
- Assembles blob from verified-only snapshot
This means:
- Unverified sigs (relay ordering) are local cache only
- They don't count toward quorum until upgraded
- SHAMap convergence operates on verified sigs only
- Destination chain verification remains defense-in-depth
SHAMap has no leafCount() method — it was a local variable in
SHAMap.cpp, not a public API. Use std::distance(begin(), end())
on the SHAMap's ForwardRange iterators instead. Cost is O(n) but
the set is bounded by UNL size (~20-35 entries).
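
The counting approach works for any forward range. A generic sketch,
with std::forward_list standing in for the SHAMap (leafCount here is
a hypothetical free function, not a SHAMap member):

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Count elements of a forward range by walking it once: O(n).
template <class ForwardRange>
std::size_t
leafCount(ForwardRange const& r)
{
    auto const n = std::distance(r.begin(), r.end());
    assert(n >= 0);
    return static_cast<std::size_t>(n);
}

// Example: a forward-only container with three "leaves".
inline std::size_t
demoLeafCount()
{
    std::forward_list<int> const leaves{1, 2, 3};
    return leafCount(leaves);
}
```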
shouldZeroEntropy() and sfEntropyCount no longer fall back to
pendingReveals_. If entropySetMap_ is null, entropy failed — the
pipeline didn't complete, and the map is the only canonical source.
pendingReveals_ is now strictly an internal staging area for the
commit/reveal pipeline. All final entropy decisions flow through
entropySetMap_, which is the consensus-agreed set.
The H2 entropy fix switched the digest computation to entropySetMap_
but shouldZeroEntropy() and sfEntropyCount still used pendingReveals_.
Since pendingReveals_ can diverge from the published entropySetMap_
(late reveals mutate it after the map hash is published), two nodes
agreeing on the same entropySetHash could still build different
ttCONSENSUS_ENTROPY pseudo-transactions.
Now shouldZeroEntropy() checks entropySetMap_ leaf count when the map
is available, and sfEntropyCount uses the map's leaf count. Both
fall back to pendingReveals_ only during pipeline stages before the
map is built.
Replace the ambiguous addSignature/hasSignature API with clearly
named methods that make verification state explicit:
addVerifiedSignature() — sig passed buildMultiSigningData + verify()
addUnverifiedSignature() — trusted source but no multisign check yet
addStandaloneSignature() — pubkey-only for standalone/test mode
hasVerifiedSignature() — only returns true for verified sigs
Unverified sigs (relay ordering fallback) are no longer treated as
verified by the cache. When the same sig is encountered again via a
path that CAN verify (e.g. SHAMap merge after the tx arrives), the
verification runs and upgrades it to verified.
addUnverifiedSignature() won't overwrite a verified sig, preventing
downgrade. SigEntry tracks verified validators in a separate set.
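
A simplified sketch of the no-downgrade rule, with strings standing
in for keys and signature buffers. SigCacheSketch is a hypothetical
class; only the method names are taken from the list above:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

class SigCacheSketch
{
    std::map<std::string, std::string> sigs_;
    std::set<std::string> verified_;  // validators with verified sigs

public:
    void
    addVerifiedSignature(std::string const& validator, std::string const& sig)
    {
        assert(!sig.empty());
        sigs_[validator] = sig;
        verified_.insert(validator);
    }

    // Never overwrites a verified signature. Returns false if skipped.
    bool
    addUnverifiedSignature(std::string const& validator, std::string const& sig)
    {
        assert(!sig.empty());
        if (verified_.count(validator) > 0)
            return false;
        sigs_[validator] = sig;
        return true;
    }

    bool
    hasVerifiedSignature(std::string const& validator) const
    {
        return verified_.count(validator) > 0;
    }

    std::string
    signatureOf(std::string const& validator) const
    {
        auto it = sigs_.find(validator);
        return it == sigs_.end() ? std::string{} : it->second;
    }
};
```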
Revert the hard reject when ttEXPORT is not in the open ledger.
Under relay ordering, a node can receive a proposal with export sigs
before the ttEXPORT tx itself arrives. Dropping these sigs loses a
valid validator contribution for the entire round with no recovery
path until terRETRY_EXPORT on the next round.
Post C1+C2, the proposal-level authentication is sufficient trust:
checkSign() verified the sender holds the private key, and sender
binding verified the embedded pubkey matches. Store the sig and
let the multisign content be verified on the destination chain.
The collector's stale cleanup (256 ledgers) bounds retention.
When the tx IS in the open ledger (common case), the multisign sig
is still fully verified via buildMultiSigningData + verify().
H2: Compute final entropy from the agreed-upon entropySetMap_ SHAMap
rather than from the local pendingReveals_ in-memory map.
Previously, two nodes with different reveal subsets at timeout would
compute different entropy from their local pendingReveals_ maps,
despite both passing haveConsensus() (which only checks txSetHash).
This could cause a ledger fork.
Now the entropy computation reads directly from the entropySetMap_
whose hash was published in proposals and converged via SHAMap
fetch/merge. Nodes that agree on entropySetHash deterministically
produce the same entropy regardless of local pendingReveals_ state.
If entropySetMap_ is null (bootstrap skip, pipeline failure), the
existing shouldZeroEntropy() fallback handles it.
Reject proposals with more than ExportLimits::maxPendingExports (8)
export sig entries. Honest validators attach at most one sig per
pending export, bounded by the same limit. Prevents DoS via
proposals with millions of entries triggering lock contention on
the validator list and collector mutexes.
Add hasSignature() to ExportSigCollector — checks if a verified sig
already exists for a given (txHash, validator) pair. Both the
proposal ingestion path and the SHAMap merge path now check this
before calling verify(), avoiding redundant ed25519 verification
when the same sig arrives via multiple paths.
No external sig cache exists in rippled, so the collector itself
serves as the verification cache: once a sig is stored (always
post-verify), subsequent encounters skip the crypto work.
Three fixes from codex review:
1. Remove unsafe fallback in proposal ingestion path: reject export
sigs when the ttEXPORT tx is not in the open ledger instead of
storing them unverified. The tx must be in the open ledger for
validators to have signed it, so this is not a legitimate case.
2. Add full sig verification to the SHAMap merge path
(onAcquiredSidecarSet): verify each export sig entry against
buildMultiSigningData + verify() before storing in the collector.
Previously this path only checked trusted() on the pubkey,
allowing a malicious UNL validator to publish a sidecar set with
forged sigs for other validators.
3. Close cluster mode bypass: always call checkSign() and gate export
sig harvesting on sigValid, even when cluster() is true. Cluster
trust is for relay/resource charging, not for accepting on-chain
cryptographic artifacts.
C3: Cryptographically verify each export signature blob against the
inner transaction's signing data before storing in the collector.
Looks up the ttEXPORT tx from the open ledger, reconstructs the
signing data via buildMultiSigningData, and calls verify().
If the tx isn't in our open ledger yet (timing/relay), the sig is
stored unverified as a fallback — it can be verified later at the
SHAMap merge path or will be rejected at Export::doApply if invalid.
This runs on the jtPROPOSAL_t job queue thread (not the IO strand
or transactor), so the verify() cost has no impact on consensus
critical path performance.
C2 hardening: validate all export sig blobs before committing any,
preventing partial state if a later blob fails the sender binding
check. Also moves the trusted() check before the loop since senderPK
is constant.
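
The validate-all-then-commit pattern can be sketched like this.
ExportSigBlob and ingestExportSigs are hypothetical names, and a
plain vector stands in for the collector:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ExportSigBlob
{
    std::string validatorKey;  // pubkey embedded in the blob
    std::string sig;
};

// Pass 1 checks every blob against the sender; pass 2 commits.
// A single mismatch rejects the whole batch, so the store is never
// left with a partial set from one proposal.
bool
ingestExportSigs(
    std::string const& senderKey,
    std::vector<ExportSigBlob> const& blobs,
    std::vector<ExportSigBlob>& store)
{
    for (auto const& b : blobs)
        if (b.validatorKey != senderKey)
            return false;

    std::size_t const before = store.size();
    store.insert(store.end(), blobs.begin(), blobs.end());
    assert(store.size() == before + blobs.size());
    return true;
}
```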
H1: Add checkQuorumAndSnapshot() to ExportSigCollector that performs
the quorum threshold check and signature snapshot under a single lock
acquisition. Export::doApply now uses this instead of separate
signatureCount() + snapshotWithSigs() calls, eliminating the TOCTOU
window where concurrent overlay threads could mutate the collector
between the two operations.
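
A minimal sketch of doing the check and the snapshot under one lock.
CollectorSketch is hypothetical and uses strings in place of real
keys and signatures; only the method name checkQuorumAndSnapshot
comes from this log:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <optional>
#include <string>

class CollectorSketch
{
    mutable std::mutex mutex_;
    std::map<std::string, std::string> verified_;  // validator -> sig

public:
    void
    addVerified(std::string const& validator, std::string const& sig)
    {
        assert(!sig.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        verified_[validator] = sig;
    }

    // Threshold check and copy happen under the same lock, so no
    // writer can change the set between the two steps.
    std::optional<std::map<std::string, std::string>>
    checkQuorumAndSnapshot(std::size_t threshold) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (verified_.size() < threshold)
            return std::nullopt;
        return verified_;
    }
};
```

The caller works only on the returned copy, so later mutations of
the collector cannot affect the blob it assembles.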
C1: Move onTrustedPeerMessage() from the synchronous onMessage(TMProposeSet)
handler into checkPropose(), after checkSign() verifies the proposal's
cryptographic signature. Previously, export sigs were ingested before
signature verification, allowing any peer to inject forged sigs by
spoofing nodepubkey to a trusted validator's key.
C2: Add sender binding in onTrustedPeerMessage() — each export sig
blob's embedded validator pubkey must match the proposal sender's
nodepubkey. Reject the entire proposal's export sigs on any mismatch,
preventing a compromised validator from impersonating other validators
to single-handedly forge quorum.
Move ConsensusExtensionsTick.h from xrpld/app/consensus/ to
xrpld/consensus/ — it's a pure template with no app-layer deps.
Extract Peer::Extensions::onTick() definition into test/csf/PeerTick.h
so Peer.h no longer includes from xrpld/app/.
Eliminates the test.csf > xrpld.app levelization edge.
Add --explain flag to levelization.py for tracing dependency edges.
Replace the duplicated throwaway-STTx + real-STTx pattern with a
single STObject: set all fields including fee=0, serialise to compute
the fee, patch the fee, then serialise once into the final STTx.
20 lines shorter, no duplication.
testXportPayment now asserts the full emitted tx lifecycle:
- hook fires with ACCEPT, emitCount=1, returnCode=0
- sfHookEmissions present with 1 entry
- ltEMITTED_TXN in AffectedNodes
- emitted dir is not empty
- after close, emitted ttEXPORT appears in closed ledger
Also add FOCUSED_TEST env var gate for fast iteration during
development (set FOCUSED_TEST=1 to run only focused_test()).
Move export scenario tests from suite.yml into their own
export-suite.yml file. The defaults already set CE+Export features
so individual test entries no longer need to repeat them.
Add SuiteLogsWithOverrides test utility: a Logs subclass that routes
specified journal partitions to stderr (always visible) while keeping
others on suite_.log (only on failure). Useful for debugging specific
subsystems during test development.
Mutating the fee via const_cast after STTx construction left a stale
cached getTransactionID(). When the emitted ttEXPORT was serialised
into the emitted directory and later deserialised, the round-tripped
txid differed from the original, causing tefNONDIR_EMIT in
Transactor::preclaim (the emitted dir entry was keyed with the stale
hash).
Build a throwaway STTx with fee=0 to calculate the fee size, then
construct the real STTx with the correct fee from the start.