Mirror of https://github.com/XRPLF/rippled.git, synced 2025-12-06 17:27:55 +00:00

Add more documentation of ledger acquisition (RIPD-373)

Capturing information from a seminar on the topic into the source tree.

Committed by Vinnie Falco
parent 6914aa3e27
commit 02c2029ac1
@@ -1,9 +1,9 @@
# Ledger Process #

## Introduction ##

## Life Cycle ##

Every server always has an open ledger. All received new transactions are
applied to the open ledger. The open ledger can't close until we reach
@@ -40,7 +40,7 @@ The purpose of the open ledger is as follows:

- Forms the basis of the initial proposal during consensus
- Used to decide if we can reject the transaction without relaying it

## Byzantine Failures ##

Byzantine failures are resolved as follows. If there is a supermajority ledger,
then a minority of validators will discover that the consensus round is
@@ -52,26 +52,167 @@ If there is no majority ledger, then starting on the next consensus round there
will not be a consensus on the last closed ledger. Another avalanche process
is started.

## Validators ##

The only meaningful difference between a validator and a 'regular' server is
that the validator sends its proposals and validations to the network.

---

# The Ledger Stream #

## Ledger Priorities ##
The two most important ledgers for a rippled server to have are:

- The consensus ledger and
- The last validated ledger.

If we need either of those two ledgers they are fetched with the highest
priority. Also, when they arrive, they replace their earlier counterparts
(if they exist).
The `LedgerMaster` object tracks

- the last published ledger,
- the last validated ledger, and
- ledger history.

So the `LedgerMaster` is at the center of fetching historical ledger data.

Specifically, the `LedgerMaster::doAdvance()` method triggers the code that
fetches historical data and controls the state machine for ledger acquisition.
The server tries to publish an ongoing stream of consecutive ledgers to its
clients. After the server has started and caught up with network
activity, say when ledger 500 is being settled, the server puts its best
effort into publishing validated ledger 500 followed by validated ledger 501
and then 502. This effort continues until the server is shut down.

But load or network connectivity may sometimes interfere with that ledger
stream. So suppose the server publishes validated ledger 600 and then
receives validated ledger 603. Then the server wants to back fill its ledger
history with ledgers 601 and 602.
The server prioritizes keeping up with current ledgers. But if it is caught
up on the current ledger, and there are no higher priority demands on the
server, then it will attempt to back fill its historical ledgers. It fills
in the historical ledger data first by attempting to retrieve it from the
local database. If the local database does not have all of the necessary data
then the server requests the remaining information from network peers.

Suppose the server is missing multiple historical ledgers. Take the previous
example where we have ledgers 603 and 600, but we're missing 601 and 602. In
that case the server requests information for ledger 602 first, before
back-filling ledger 601. We want to expand the contiguous range of
most-recent ledgers that the server has locally. There's also a limit to
how much historical ledger data is useful. So if we're on ledger 603, but
we're missing ledger 4, we may not bother asking for ledger 4.
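The back-fill order described above can be sketched as follows. This is a hypothetical helper, not rippled's `LedgerMaster` logic; `historyHorizon` is an assumed cutoff parameter for how far back history is still worth fetching.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical sketch: pick the next historical ledger to back-fill.
// We extend the contiguous range of most-recent ledgers downward, and
// skip ledgers older than the history horizon we still care about.
std::optional<std::uint32_t> nextLedgerToFetch(
    std::set<std::uint32_t> const& haveLedgers,
    std::uint32_t currentSeq,
    std::uint32_t historyHorizon)
{
    // Walk down from the current validated ledger; the first missing
    // sequence is the highest-priority back-fill target.
    for (std::uint32_t seq = currentSeq;
         seq > 0 && currentSeq - seq < historyHorizon; --seq)
    {
        if (haveLedgers.count(seq) == 0)
            return seq;
    }
    return std::nullopt;  // contiguous back to the horizon; nothing to fetch
}
```

With ledgers 600 and 603 on hand, this returns 602 before 601, matching the priority described above.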
## Assembling a Ledger ##

When data for a ledger arrives from a peer, it may take a while before the
server can apply that data. So when ledger data arrives we schedule a job
thread to apply that data. If more data arrives before the job starts we add
that data to the job. We defer requesting more ledger data until all of the
data we have for that ledger has been processed. Once all of that data is
processed we can intelligently request only the additional data that we need
to fill in the ledger. This reduces network traffic and minimizes the load
on peers supplying the data.
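The batching just described can be sketched in a single-threaded form. This is an illustration only; `LedgerDataJob` is a made-up type, and the real server coordinates this across job threads with locking.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the batching described above: data arriving before the job
// runs is appended to the same job, and only after everything on hand
// has been processed do we request more data.
struct LedgerDataJob
{
    std::vector<std::string> pending;  // raw data awaiting processing
    bool jobScheduled = false;
    int processed = 0;

    void onDataArrived(std::string const& data)
    {
        pending.push_back(data);
        jobScheduled = true;  // at most one job is scheduled at a time
    }

    // The job drains everything accumulated so far in one pass.
    void runJob()
    {
        processed += static_cast<int>(pending.size());
        pending.clear();
        jobScheduled = false;
        // ...now we can intelligently request only the data still missing
    }
};
```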
If we receive data for a ledger that is not currently under construction,
we don't just throw the data away. In particular the AccountStateNodes
may be useful, since they can be re-used across ledgers. This data is
stashed in memory (not the database) where the acquire process can find
it.

Peers deliver ledger data in the order in which the data can be validated.
Data arrives in the following order:
1. The hash of the ledger header
2. The ledger header
3. The root nodes of the transaction tree and state tree
4. The lower (non-root) nodes of the state tree
5. The lower (non-root) nodes of the transaction tree

Inner-most nodes are supplied before outer nodes. This allows the
requesting server to hook things up (and validate) in the order in which
data arrives.
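This arrival order works because each piece of data is announced (by hash) before it arrives, so it can be checked immediately. A minimal sketch of that acceptance rule follows, with `std::hash` standing in for the SHA-512Half hashes rippled actually uses; the `Acquire` type is purely illustrative.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Stand-in hash; real node hashes are cryptographic.
using Hash = std::size_t;

Hash hashOf(std::string const& data)
{
    return std::hash<std::string>{}(data);
}

// A server acquiring a ledger keeps the set of node hashes it expects
// next. A node can be accepted only if its hash was already announced
// by earlier data, so everything is validated in arrival order.
struct Acquire
{
    std::map<Hash, bool> expected;  // hash -> received?

    void expect(Hash h) { expected.emplace(h, false); }

    // Returns true if this blob matches a hash we were told to expect.
    bool accept(std::string const& data)
    {
        auto it = expected.find(hashOf(data));
        if (it == expected.end() || it->second)
            return false;  // unexpected or duplicate: reject
        it->second = true;
        return true;
    }
};
```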
If this process fails, then a server can also ask for ledger data by hash,
rather than by asking for specific nodes in a ledger. Asking for information
by hash is less efficient, but it allows a peer to return the information
even if the information is not assembled into a tree. All the peer needs is
the raw data.
## Which Peer To Ask ##

Peers go through state transitions as the network goes through its state
transitions. Peers provide their state to their directly connected peers.
By monitoring the state of each connected peer a server can tell which of
its peers has the information that it needs.

Therefore if a server suffers a byzantine failure the server can tell which
of its peers did not suffer that same failure. So the server knows which
peer(s) to ask for the missing information.

Peers also report their contiguous range of ledgers. This is another way that
a server can determine which peer to ask for a particular ledger or piece of
a ledger.
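A minimal sketch of range-based peer selection follows. The types and field names are hypothetical; real peers report their ranges through the protocol's status messages.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sketch: each peer reports the contiguous range of
// ledgers it holds; pick a peer whose range covers the sequence we need.
struct PeerRange
{
    std::string name;  // illustrative identifier, not a rippled field
    std::uint32_t minSeq;
    std::uint32_t maxSeq;
};

std::optional<std::string> peerForLedger(
    std::vector<PeerRange> const& peers, std::uint32_t seq)
{
    for (auto const& p : peers)
        if (p.minSeq <= seq && seq <= p.maxSeq)
            return p.name;
    return std::nullopt;  // no connected peer claims that ledger
}
```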
There are also indirect peer queries. If there have been timeouts while
acquiring ledger data then a server may issue indirect queries. In that
case the server receiving the indirect query passes the query along to any
of its peers that may have the requested data. This is important if the
network has a byzantine failure. It also helps protect the validation
network. A validator may need to get a peer set from one of the other
validators, and indirect queries improve the likelihood of success with
that.
## Kinds of Fetch Packs ##

A FetchPack is the way that peers send partial ledger data to other peers
so the receiving peer can reconstruct a ledger.

A 'normal' FetchPack is a bucket of nodes indexed by hash. The server
building the FetchPack puts information into the FetchPack that the
destination server is likely to need. Normally they contain all of the
missing nodes needed to fill in a ledger.
A 'compact' FetchPack, on the other hand, contains only leaf nodes, no
inner nodes. Because there are no inner nodes, the ledger information that
it contains cannot be validated as the ledger is assembled. We have to,
initially, take the accuracy of the FetchPack for granted and assemble the
ledger. Once the entire ledger is assembled the entire ledger can be
validated. But if the ledger does not validate then there's nothing to be
done but throw the entire FetchPack away; there's no way to save a portion
of the FetchPack.
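The all-or-nothing property just described can be sketched as follows. This is illustrative only: rippled validates real SHAMap hashes, while here an XOR of `std::hash` values stands in for assembling and hashing the tree.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Sketch: take the leaves on faith, assemble, then check one final
// hash; on mismatch the whole pack is discarded, nothing is salvaged.
bool applyCompactPack(std::vector<std::string> const& leaves,
                      std::size_t expectedHash,
                      std::vector<std::string>& ledger)
{
    std::size_t combined = 0;
    for (auto const& leaf : leaves)
        combined ^= std::hash<std::string>{}(leaf);  // stand-in for tree hashing

    if (combined != expectedHash)
        return false;  // throw the entire pack away

    ledger = leaves;   // only now do the leaves become ledger state
    return true;
}
```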
The FetchPacks just described could be termed 'reverse FetchPacks.' They
only provide historical data. There may be a use for what could be called a
'forward FetchPack.' A forward FetchPack would contain the information that
is needed to build a new ledger out of the preceding ledger.

A forward compact FetchPack would need to contain:

- The header for the new ledger,
- The leaf nodes of the transaction tree (if there is one),
- The index of deleted nodes in the state tree,
- The index and data for new nodes in the state tree, and
- The index and new data of modified nodes in the state tree.
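If such a message existed, its layout might look like the sketch below, mirroring the bullet list above. Every type and field name here is hypothetical; nothing like it is claimed to exist in rippled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Index = std::uint64_t;        // stand-in for a 256-bit node index
using Blob = std::vector<std::uint8_t>;

// Hypothetical forward compact FetchPack, one field per bullet above.
struct ForwardCompactFetchPack
{
    Blob newLedgerHeader;                // header for the new ledger
    std::vector<Blob> txLeaves;          // leaf nodes of the transaction tree
    std::vector<Index> deletedState;     // indexes deleted from the state tree
    std::map<Index, Blob> newState;      // index and data for new nodes
    std::map<Index, Blob> modifiedState; // index and new data for modified nodes
};
```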
---

# Definitions #

## Open Ledger ##

The open ledger is the ledger that the server applies all new incoming
transactions to.
## Last Validated Ledger ##

The most recent ledger that the server is certain will always remain part
of the permanent, public history.

## Last Closed Ledger ##

The most recent ledger that the server believes the network reached consensus
on. Different servers can arrive at a different conclusion about the last
@@ -79,17 +220,17 @@ closed ledger. This is a consequence of Byzantine failure. The purpose of
validations is to resolve the differences between servers and come to a common
conclusion about which last closed ledger is authoritative.

## Consensus ##

A distributed agreement protocol. Ripple uses the consensus process to solve
the problem of double-spending.
## Validation ##

A signed statement indicating that a server built a particular ledger as a
result of the consensus process.

## Proposal ##

A signed statement of which transactions a server believes should be included
in the next consensus ledger.
@@ -153,7 +294,7 @@ same value as a trust line between accounts B and A.
**Balance:**
- **currency:** String identifying a valid currency, e.g., "BTC".
- **issuer:** There is no issuer, really; this entry is "NoAccount".
- **value:**

**Flags:** ???
@@ -3363,6 +3363,15 @@ void NetworkOPsImp::makeFetchPack (
    reply.set_ledgerhash (request->ledgerhash ());
    reply.set_type (protocol::TMGetObjectByHash::otFETCH_PACK);

    // Building a fetch pack:
    //  1. Add the header for the requested ledger.
    //  2. Add the nodes for the AccountStateMap of that ledger.
    //  3. If there are transactions, add the nodes for the
    //     transactions of the ledger.
    //  4. If the FetchPack now contains greater than or equal to
    //     256 entries then stop.
    //  5. If not very much time has elapsed, then loop back and repeat
    //     the same process adding the previous ledger to the FetchPack.
    do
    {
        std::uint32_t lSeq = wantLedger->getLedgerSeq ();
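The numbered steps can be sketched as a standalone loop. This is illustrative only; `packedEntries` and its input are stand-ins, not the `NetworkOPsImp` code, and the real loop also stops when too much time has elapsed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each element of nodesPerLedger is the number of nodes one ledger
// would contribute (header + state nodes + transaction nodes). We keep
// adding previous ledgers until the pack reaches 256 entries.
std::size_t packedEntries(std::vector<std::size_t> const& nodesPerLedger)
{
    std::size_t entries = 0;
    for (std::size_t n : nodesPerLedger)
    {
        entries += n;        // steps 1-3: add this ledger's nodes
        if (entries >= 256)  // step 4: stop once the pack is full
            break;
        // step 5: loop back and add the previous ledger
    }
    return entries;
}
```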
@@ -103,8 +103,9 @@ void SHAMap::visitLeavesInternal (std::function<void (SHAMapItem::ref item)>& fu
    }
}

/** Get a list of node IDs and hashes for nodes that are part of this SHAMap
    but not available locally. The filter can hold alternate sources of
    nodes that are not permanently stored locally
*/
void SHAMap::getMissingNodes (std::vector<SHAMapNodeID>& nodeIDs, std::vector<uint256>& hashes, int max,
    SHAMapSyncFilter* filter)
@@ -144,6 +145,12 @@ void SHAMap::getMissingNodes (std::vector<SHAMapNodeID>& nodeIDs, std::vector<ui
    SHAMapTreeNode *node = root.get ();
    SHAMapNodeID nodeID;

    // The firstChild value is selected randomly so if multiple threads
    // are traversing the map, each thread will start at a different
    // (randomly selected) inner node. This increases the likelihood
    // that the two threads will produce different request sets (which is
    // more efficient than sending identical requests).
    int firstChild = rand() % 256;
    int currentChild = 0;
    bool fullBelow = true;
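The effect of the randomized start can be sketched as follows. This is an assumption-laden illustration: the real traversal walks a tree of SHAMap inner nodes, not a flat array of 256 slots, but the rotation idea is the same.

```cpp
#include <cassert>
#include <vector>

// Visit 256 child slots beginning at a random offset, so two
// concurrent traversals are likely to request different nodes first.
std::vector<int> childVisitOrder(int firstChild)
{
    std::vector<int> order;
    for (int i = 0; i < 256; ++i)
        order.push_back((firstChild + i) % 256);
    return order;
}
```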
@@ -649,6 +656,15 @@ static void addFPtoList (std::list<SHAMap::fetchPackEntry_t>& list, const uint25
    list.push_back (SHAMap::fetchPackEntry_t (hash, blob));
}

/**
    @param have A pointer to the map that the recipient already has (if any).
    @param includeLeaves True if leaf nodes should be included.
    @param max The maximum number of nodes to return.
    @param func The functor to call for each node added to the FetchPack.

    Note: a caller should set includeLeaves to false for transaction trees.
    There's no point in including the leaves of transaction trees.
*/
void SHAMap::getFetchPack (SHAMap* have, bool includeLeaves, int max,
    std::function<void (const uint256&, const Blob&)> func)
{