Previously, when shouldRetire was false (pre-TRACKING, before the
sticky caught-up flag flipped), retiredLedgers fell out of scope at
the end of setFullLedger and destructed synchronously on the advance
thread. Each publish past ledger_history cascaded through a
million-leaf destruction before doAdvance could loop to the next
publishable ledger, producing a stall-then-flurry pattern during
catch-up.
Always move retiredLedgers into the async job. Inside the job, the
shouldRetire capture gates only the bookkeeping side effects
(mCompleteLedgers / relational / LedgerHistory pruning). Destruction
of the captured shared_ptrs happens on the worker regardless, so the
advance thread stays on the publish hot path.
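
A minimal sketch of the hand-off described above, with a plain std::thread standing in for the JobQueue worker. FakeLedger, retireOnWorker and bookkeepingRuns are hypothetical names for illustration only; the real code captures retired Ledger shared_ptrs in a job closure.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a retired Ledger: records the destroying thread.
struct FakeLedger
{
    std::thread::id* destroyedOn;

    explicit FakeLedger(std::thread::id* out) : destroyedOn(out)
    {
        assert(out != nullptr);
    }

    ~FakeLedger()
    {
        *destroyedOn = std::this_thread::get_id();
    }
};

// Always moves the retired ledgers into the worker. shouldRetire gates only
// the bookkeeping; the captured pointers are released on the worker either way.
inline std::thread::id
retireOnWorker(
    std::vector<std::shared_ptr<FakeLedger>> retired,
    bool shouldRetire,
    int& bookkeepingRuns)
{
    std::thread worker(
        [retired = std::move(retired), shouldRetire, &bookkeepingRuns]() mutable {
            if (shouldRetire)
                ++bookkeepingRuns;  // stands in for the pruning side effects
            retired.clear();        // last references dropped on the worker
        });
    std::thread::id const id = worker.get_id();
    worker.join();  // joined here only so the sketch is deterministic
    return id;
}
```

With shouldRetire false, the bookkeeping counter stays at zero while the FakeLedger destructor still runs on the worker's thread id rather than the caller's.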
The FBC claim is tied to a hash, not to a canonical object. If the
canonical that established the claim dies and a fresh one is later
materialised from wire bytes, the fresh canonical has fullBelowGen_ == 0
and empty children_[i]. Liveness-only gating would anchor the empty
canonical and skip descent, and later reads through the unwired
branches would throw SHAMapMissingNode.
Add a fullBelowGen_ match to the null-mode short-circuit: fresh
canonicals fail the check and fall through to descent, which populates
children_ as it walks. Disk-backed mode is unchanged.
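
The extra check can be sketched as follows. CanonicalNode and canSkipDescent are hypothetical names, and the sketch assumes that a live cache generation is never 0, so a freshly materialised node (generation 0) can never match.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a canonical inner node.
struct CanonicalNode
{
    // 0 means "never marked full-below", e.g. fresh from wire bytes.
    std::uint32_t fullBelowGen = 0;
};

// Null-mode short-circuit: the node must be alive AND carry the cache's
// current generation. Liveness alone is not enough.
inline bool
canSkipDescent(
    std::shared_ptr<CanonicalNode> const& live,
    std::uint32_t cacheGeneration)
{
    assert(cacheGeneration != 0);  // assumption: live generations are non-zero
    return live && live->fullBelowGen == cacheGeneration;
}
```

A node that fails the check is simply descended into, which is the path that wires its children.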
prevMissing finds gaps just below the retention window that we'd
re-fetch only to immediately retire again, causing mCompleteLedgers to
flicker between ledger_history and ledger_history+1.
FULL requires validator participation — a tracking-only node never
reaches it, so the retire gate stayed false forever and mCompleteLedgers
grew unbounded. TRACKING is the correct threshold: "convinced we agree
with the network." OperatingMode is numerically ordered by how-caught-up
we are (DISCONNECTED=0 ... FULL=4), so >= TRACKING covers both
tracking-only nodes and validators.
Sticky behavior retained: once we've ever hit >= TRACKING, retirement
stays enabled for the process lifetime; transient drops don't leak
accumulation into mCompleteLedgers.
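
A sketch of the sticky gate, for illustration. StickyRetireGate is a hypothetical name, and the intermediate enum values are filled in on the assumption that the ordering is DISCONNECTED, CONNECTED, SYNCING, TRACKING, FULL. The flag is a member here so the sketch can be tested in isolation; the actual change may keep it elsewhere.

```cpp
#include <atomic>
#include <cassert>

enum class OperatingMode {
    DISCONNECTED = 0,
    CONNECTED = 1,
    SYNCING = 2,
    TRACKING = 3,
    FULL = 4
};

class StickyRetireGate
{
    std::atomic<bool> caughtUp_{false};

public:
    // Returns true once the mode has ever been >= TRACKING, and keeps
    // returning true afterwards even if the mode drops again.
    bool
    shouldRetire(OperatingMode mode)
    {
        assert(
            mode >= OperatingMode::DISCONNECTED && mode <= OperatingMode::FULL);
        if (mode >= OperatingMode::TRACKING)
            caughtUp_.store(true);
        return caughtUp_.load();
    }
};
```

Because the comparison is numeric, FULL also passes the gate, so validators and tracking-only nodes take the same path.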
Four related changes plus diagnostic logging:
1. Sticky FULL gate. Once OperatingMode::FULL has been observed at any
prior setFullLedger, retirement stays enabled even if the mode
briefly dips to TRACKING or SYNCING. Process-wide static atomic.
Fixes mCompleteLedgers drift past ledger_history across mode
flickers.
2. Atomic insert+prune on mCompleteLock. The new seq insert and the
bulk-prefix prune of retired seqs now run under one mCompleteLock
acquisition, inlined from clearPriorLedgers's body. Observers never
see the transient ledger_history+1 window. Peers get a stable
complete_ledgers range.
3. Skip tryFill in memory-resident mode. tryFill walks back the
parent-hash chain and marks seqs in mCompleteLedgers as "we have
these" based on DB / in-memory presence. Under memory-resident mode
we only retain ledger_history, so tryFill either duplicates the
setFullLedger bookkeeping we already did for retained seqs, or lies
by marking seqs outside retention. Gate its dispatch at the
fetchForHistory site.
4. Per-mutation logging. Every mCompleteLedgers mutation site now
emits an info-level JLOG on the LedgerMaster partition, tagged by
call site (clearLedger, tryFill/inner, tryFill/final, setFullLedger,
setFullLedger/insert+prune, setLedgerRangePresent, clearPriorLedgers).
Format: `mCompleteLedgers[site:op]: <args> -> <range_string>`.
Lets us attribute any transient drift to a specific code path.
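
Change 2 above can be sketched with a std::set standing in for the range set. CompleteLedgersSketch and insertAndPrune are hypothetical names; the point is only that the insert and the prefix prune share one lock acquisition, so a reader never observes ledger_history+1 entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>

class CompleteLedgersSketch
{
    mutable std::mutex lock_;
    std::set<std::uint32_t> seqs_;

public:
    // Insert seq and drop everything below keepFrom in one critical section.
    void
    insertAndPrune(std::uint32_t seq, std::uint32_t keepFrom)
    {
        assert(keepFrom <= seq);
        std::lock_guard<std::mutex> guard(lock_);
        seqs_.insert(seq);
        seqs_.erase(seqs_.begin(), seqs_.lower_bound(keepFrom));
    }

    std::size_t
    size() const
    {
        std::lock_guard<std::mutex> guard(lock_);
        return seqs_.size();
    }

    bool
    contains(std::uint32_t seq) const
    {
        std::lock_guard<std::mutex> guard(lock_);
        return seqs_.count(seq) != 0;
    }
};
```

Feeding it consecutive seqs with keepFrom = seq - 16 + 1 keeps size() at or below 16 after every call.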
Three related changes to the memory-resident retirement path exposed by
testing catch-up with ledger_history=16 (5-8 minute cold syncs felt
sluggish, with retire log lines firing during catch-up):
1. Gate retireLedgers on OperatingMode::FULL. During catch-up we let
mCompleteLedgers, LedgerHistory, and the relational tables accumulate
freely — mRetainedLedgers's own pop_front still caps structural
retention at ledger_history, so growth is bounded. This matches the
old disk-backed flow's healthWait() gating: no pruning while lagged.
2. Bulk-prefix clean-up in retireLedgers via clearPriorLedgers(maxSeq+1)
instead of per-seq clearLedger() in a loop. When the first retire
fires after FULL is reached, it collapses all the catch-up
accumulation below the retention window in one pass. Pinning is
preserved.
3. Sync/async split of retirement work in setFullLedger:
- Synchronous (on the publish thread): clearPriorLedgers prune of
mCompleteLedgers. Trivial range-set erase under mCompleteLock.
Keeps the reported complete_ledgers range tight with no transient
16↔17 over-advertising window.
- Asynchronous (JobQueue worker via jtLEDGER_DATA): LedgerHistory
cache eviction, relational deletes, and the shared_ptr destruction
cascade through the retired Ledgers' SHAMap spines. The heavy work
— thousands of shared_ptr decrements per retire for the ledger's
uniquely-held canonical nodes — stays off doAdvance's critical
path.
The retired Ledgers are kept alive in the job closure's captured
vector until the job runs, so destruction happens in the worker.
Disk-backed mode is byte-identical (memoryResidentMode() false).
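
For illustration, a bulk-prefix clear that preserves pinned seqs might look like the sketch below. clearPrior and the use of a second std::set to model pinning are assumptions, not the real clearPriorLedgers signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Erase every seq below bound in one pass, except seqs listed in pinned.
// Returns how many entries were removed.
inline std::size_t
clearPrior(
    std::set<std::uint32_t>& complete,
    std::uint32_t bound,
    std::set<std::uint32_t> const& pinned)
{
    std::size_t const before = complete.size();
    std::size_t erased = 0;
    auto const end = complete.lower_bound(bound);  // first seq >= bound
    for (auto it = complete.begin(); it != end;)
    {
        if (pinned.count(*it) != 0)
        {
            ++it;  // pinned: keep
        }
        else
        {
            it = complete.erase(it);
            ++erased;
        }
    }
    assert(complete.size() + erased == before);
    return erased;
}
```

One call with bound = maxSeq + 1 removes the whole catch-up accumulation below the window, however many seqs built up while retirement was gated off.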
Two cleanups landing together because they cross the same file:
1. SHAMapStore::retireLedger -> retireLedgers(vector). Caller in
LedgerMaster::setFullLedger collects all popped ledgers from the
pop_front loop and passes them in one call. The implementation
collapses N relational/cache prefix-deletes into a single call at
max(seq), so the plural form costs no more than the singular.
Steady-state remains size 1; bursty catch-up retirements get the
batched-prefix benefit for free.
2. Drop getClosestFullyWiredLedger from LedgerMaster and InboundLedgers
along with all supporting state — the recentHistoryLedgers_ deque,
the historyPrimingCacheSize_ field/helper, the file-local
sameChainDistance copy in InboundLedgers.cpp, plus the matching
header declarations. These were the "find a base ledger to delta
against for priming" machinery, used only by primeInboundLedgerForUse,
which itself is now gone. Test stub onLedgerFetched signature also
updated to match the current interface.
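
The batching in cleanup 1 can be sketched as below. The free function and the deleteBefore callback are hypothetical; the callback stands in for the relational/cache prefix-deletes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

// Collapse N per-ledger prefix-deletes into a single call at max(seq) + 1.
inline void
retireLedgers(
    std::vector<std::uint32_t> const& retiredSeqs,
    std::function<void(std::uint32_t)> const& deleteBefore)
{
    if (retiredSeqs.empty())
        return;
    std::uint32_t const maxSeq =
        *std::max_element(retiredSeqs.begin(), retiredSeqs.end());
    assert(maxSeq < std::numeric_limits<std::uint32_t>::max());
    deleteBefore(maxSeq + 1);  // one prefix delete covers every retired seq
}
```

Passing seqs 7, 9 and 8 results in exactly one callback invocation with bound 10; an empty vector results in none.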
After dropping primeInboundLedgerForUse from init() and done(), the
helper chain (findBestFullyWiredBase, chooseCloserBase, the local
sameChainDistance copy, wireCompleteSHAMap, primeInboundLedgerForUse)
became unused and produced -Wunused-function warnings. Remove them.
Keeps isRWDBNullMode() — still used by init() and done() to gate the
setFullyWired() call. The other sameChainDistance copy in
InboundLedgers.cpp remains in use by getClosestFullyWiredLedger.
In null-nodestore mode the SHAMapStore rotation thread does no useful
work: there is no disk I/O to amortize. The bursty rotation cadence also
causes mCompleteLedgers to over-report relative to mRetainedLedgers
(mCompleteLedgers prunes only on rotation; mRetainedLedgers caps
per-ledger via setFullLedger's pop_front loop). Peers consulting our
complete_ledgers advertisement get misled.
Replace the rotation thread with per-ledger retirement in null mode:
- Add memoryResidentMode() and retireLedger() to SHAMapStore interface.
- SHAMapStoreImp::memoryResidentMode_ is auto-derived from
isRWDBNullMode() (after type=none env-var propagation).
- start() skips spawning the rotation thread when memory-resident.
- working_ initialized false in memory-resident mode so rendezvous()
short-circuits without hanging.
- retireLedger synchronously prunes per-seq state for one ledger:
mCompleteLedgers (preserves pinning), LedgerHistory cache, and the
three relational tables (Transactions, AccountTransactions, Ledgers).
No batching, no backoff sleeps — RWDB-relational deletes are
microseconds.
- LedgerMaster::setFullLedger collects retired ledgers from the
pop_front loop and calls retireLedger on each (after releasing
m_mutex).
Disk-backed mode is unchanged: memoryResidentMode_ stays false, the
rotation thread runs as before, retireLedger short-circuits on the
flag check.
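
A sketch of the pop_front loop and the "retire after releasing the mutex" shape. RetainedWindow is a hypothetical class that tracks seqs only; the real window holds Ledger objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

class RetainedWindow
{
    std::size_t const history_;
    std::mutex mutex_;
    std::deque<std::uint32_t> retained_;

public:
    explicit RetainedWindow(std::size_t history) : history_(history)
    {
        assert(history > 0);
    }

    // Appends seq and pops anything beyond the window. The popped seqs are
    // returned so the caller can retire them after the lock is released.
    std::vector<std::uint32_t>
    push(std::uint32_t seq)
    {
        std::vector<std::uint32_t> retired;
        std::lock_guard<std::mutex> guard(mutex_);
        retained_.push_back(seq);
        while (retained_.size() > history_)
        {
            retired.push_back(retained_.front());
            retained_.pop_front();
        }
        return retired;
    }

    std::size_t
    size()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return retained_.size();
    }
};
```

In steady state each push returns at most one seq, which matches the per-ledger cadence described above.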
Prototype shape — minimum to validate the model on a live network.
Does not yet: skip state_db_ init in memory-resident mode, reject
explicit online_delete config, or remove the now-unused
healthWait/canDelete machinery for null mode.
Refs .ai-docs/null-nodestore-backend.md.j2 §"Rotation Is Vestigial in
Memory-Resident Mode" for the full reasoning.
NullFactory (type=none) already provides the exact null-backend
semantics: fetchNodeObject returns notFound, store is a no-op, no disk
I/O. Previously SHAMapStoreImp treated any non-"rwdb" type as
disk-backed and called dbPaths() unconditionally, crashing with
boost::filesystem::create_directories on an empty path.
- Recognise "none" alongside "rwdb" as a memory backend (skips
dbPaths() and takes the memory-backend rotation path).
- On type=none, set XAHAU_RWDB_NULL=1 (overwrite=0) so the existing
isRWDBNullMode() helpers in SHAMapSync, InboundLedger, Ledger etc.
detect null-mode semantics (FBC liveness+anchor, setFullyWired,
rotation-copy skip) without requiring the env var to be set
separately.
Makes type=none a first-class null-backend config declaration,
equivalent to type=rwdb + XAHAU_RWDB_NULL=1 but without the env-var
dance. Users can now write:
    [node_db]
    type = none
    online_delete = 16
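
A sketch of the two pieces, assuming a POSIX environment for setenv. isMemoryBackend and propagateNullMode are hypothetical helper names, and the exact lower-case string match is an assumption about how the type is compared.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdlib.h>  // POSIX setenv
#include <string>

// "none" is treated as a memory backend alongside "rwdb".
inline bool
isMemoryBackend(std::string const& type)
{
    return type == "rwdb" || type == "none";
}

// On type=none, export XAHAU_RWDB_NULL=1 without clobbering a value the
// operator already set (overwrite argument is 0).
inline void
propagateNullMode(std::string const& type)
{
    if (type == "none")
    {
        int const rc = ::setenv("XAHAU_RWDB_NULL", "1", 0);
        assert(rc == 0);
        (void)rc;
    }
}
```

With overwrite=0 an explicitly exported XAHAU_RWDB_NULL keeps whatever value it already had, so the config declaration only fills in the default.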
Re-enable FullBelowCache in null-nodestore mode. Previously disabled via
useFullBelowCache() returning false, forcing sync to walk every branch.
That was a workaround for the stale-claim problem where an FBC entry
could outlive the canonical node it vouches for, leading to
SHAMapMissingNode on later reads.
At the two FBC short-circuit sites (SHAMap::addKnownNode and
gmn_ProcessNodes), null mode now:
- validates the claim via TreeNodeCache::fetch (returns non-null iff the
canonical node is held alive anywhere in the system), and
- anchors the canonical into THIS SHAMap via canonicalizeChild, so
retention is structural and independent of whichever ledger originally
anchored the claim.
Disk-backed mode is byte-identical to before (gated on isRWDBNullMode()).
With the anchor rule in place, the post-sync wiring walks in
InboundLedger::init() and done() are redundant; drop both and call
setFullyWired() directly in null mode.
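
The validate-then-anchor rule can be sketched with a weak_ptr map standing in for TreeNodeCache. NodeCacheSketch, track and validateAndAnchor are hypothetical names, and the weak-only map is a simplification of whatever the real cache holds.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

struct TreeNode
{
};

// Simplified cache: fetch() returns non-null only while someone still
// holds the node alive.
class NodeCacheSketch
{
    std::unordered_map<std::string, std::weak_ptr<TreeNode>> map_;

public:
    void
    track(std::string const& hash, std::shared_ptr<TreeNode> const& node)
    {
        assert(node);
        map_[hash] = node;
    }

    std::shared_ptr<TreeNode>
    fetch(std::string const& hash) const
    {
        auto const it = map_.find(hash);
        if (it == map_.end())
            return nullptr;
        return it->second.lock();
    }
};

// Returns true if descent may be skipped. On success the live node is stored
// in childSlot, so this map now keeps it alive on its own.
inline bool
validateAndAnchor(
    NodeCacheSketch const& cache,
    std::string const& hash,
    std::shared_ptr<TreeNode>& childSlot)
{
    std::shared_ptr<TreeNode> live = cache.fetch(hash);
    if (!live)
        return false;  // stale claim: fall through to descent
    childSlot = std::move(live);
    return true;
}
```

After a successful call the original owner can drop its reference and the node stays reachable through the anchoring slot; once every holder is gone, the same claim fails validation.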
Adds projected-source markers at key points for the design doc at
.ai-docs/null-nodestore-backend.md.j2 (not tracked).
- Search both LedgerMaster and InboundLedgers for the closest fully wired base.
- Implement sameChainDistance helper to accurately calculate distance between ledgers on the same chain.
- Use findBestFullyWiredBase to minimize the 'prime walk' delta.
Introduces a 'NULL' node-store mode (via XAHAU_RWDB_NULL) that operates
entirely in-memory by leveraging a sliding window of retained Ledger objects.
Key changes:
- SHAMapSync: Bypass FullBelowCache in null mode to force full tree wiring.
- Ledger: Add 'fullyWired' state tracking and mandatory wiring before use.
- LedgerMaster: Implement 'mRetainedLedgers' sliding window to pin SHAMap graphs.
- PeerImp: Add fallbacks to TreeNodeCache and LedgerMaster for peer requests.
- contract: Add boost::stacktrace to LogThrow for easier debugging of misses.
- basics: Add ReaderPreferringSharedMutex to mitigate reader starvation.
Due to rounding, the LPTokenBalance of the last LP might not match the LP's trustline balance. This was fixed for `AMMWithdraw` in `fixAMMv1_1` by adjusting the LPTokenBalance to be the same as the trustline balance. Since `AMMClawback` also performs a withdrawal, the LPTokenBalance needs the same adjustment in `AMMClawback`.
This change includes:
1. Refactored the `verifyAndAdjustLPTokenBalance` function in `AMMUtils`, which both `AMMWithdraw` and `AMMClawback` call to adjust LPTokenBalance.
2. Added the unit test `testLastHolderLPTokenBalance` to test the scenario.
3. Modified the existing unit tests for `fixAMMClawbackRounding`.
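
A rough sketch of the adjustment idea, using `double` in place of the ledger's amount types. The function name `verifyAndAdjust`, the relative tolerance of `1e-3`, and the decision to reject larger gaps are assumptions for illustration; the caller is assumed to have already established that the holder is the last LP.

```cpp
#include <cassert>
#include <cmath>

// Returns true if the pool's LPTokenBalance agrees with the last LP's
// trustline balance, snapping the pool balance to the trustline balance when
// the two differ only by a small rounding gap. Returns false (and leaves the
// balance alone) when the gap is too large to be rounding.
inline bool
verifyAndAdjust(
    double lpTokensHeld,
    double& lpTokenBalance,
    double tolerance = 1e-3)
{
    assert(lpTokensHeld > 0 && lpTokenBalance > 0);
    if (lpTokensHeld == lpTokenBalance)
        return true;
    double const relativeGap = std::fabs(lpTokensHeld - lpTokenBalance) /
        std::fmax(lpTokensHeld, lpTokenBalance);
    if (relativeGap >= tolerance)
        return false;
    lpTokenBalance = lpTokensHeld;  // adopt the trustline balance
    return true;
}
```

Sharing one helper between the withdraw and clawback paths keeps the two transactions from drifting apart in how they treat the last holder.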
* Add AMM bid/create/deposit/swap/withdraw/vote invariants:
  - Deposit, Withdrawal invariants: `sqrt(asset1Balance * asset2Balance) >= LPTokens`.
  - Bid: `sqrt(asset1Balance * asset2Balance) > LPTokens` and the pool balances don't change.
  - Create: `sqrt(asset1Balance * asset2Balance) == LPTokens`.
  - Swap: `asset1BalanceAfter * asset2BalanceAfter >= asset1BalanceBefore * asset2BalanceBefore`
    and `LPTokens` don't change.
  - Vote: `LPTokens` and pool balances don't change.
  - All AMM and swap transactions: amounts and tokens are greater than zero, except on withdrawal if all tokens
    are withdrawn.
* Add AMM deposit and withdraw rounding to ensure AMM invariant:
  - On deposit, tokens out are rounded downward and deposit amount is rounded upward.
  - On withdrawal, tokens in are rounded upward and withdrawal amount is rounded downward.
* Add Order Book Offer invariant to verify consumed amounts. Consumed amounts are less than the offer.
* Fix Bid validation. `AuthAccount` can't have duplicate accounts or the submitter account.
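
The deposit/withdraw and swap invariants listed above can be sketched with `double` arithmetic. `Pool`, `geometricMean`, `depositOrWithdrawHolds` and `swapHolds` are hypothetical names, and real checks would use the ledger's number types rather than floating point.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical snapshot of an AMM pool.
struct Pool
{
    double asset1;
    double asset2;
    double lpTokens;
};

inline double
geometricMean(Pool const& p)
{
    assert(p.asset1 >= 0 && p.asset2 >= 0);
    return std::sqrt(p.asset1 * p.asset2);
}

// Deposit / withdrawal: sqrt(asset1 * asset2) >= LPTokens.
inline bool
depositOrWithdrawHolds(Pool const& after)
{
    return geometricMean(after) >= after.lpTokens;
}

// Swap: the balance product must not shrink and LPTokens must not change.
inline bool
swapHolds(Pool const& before, Pool const& after)
{
    return after.asset1 * after.asset2 >= before.asset1 * before.asset2 &&
        after.lpTokens == before.lpTokens;
}
```

Rounding tokens out downward and amounts in upward on deposit (and the reverse on withdrawal) pushes any rounding error toward the side where these inequalities still hold.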