chore: Run prettier on all files (#5657)

This commit is contained in:
Mayukha Vadari
2025-08-11 12:15:42 -04:00
committed by GitHub
parent abf12db788
commit 97f0747e10
60 changed files with 6244 additions and 6127 deletions


@@ -1,4 +1,3 @@
# Unit Tests
## Running Tests
@@ -12,13 +11,13 @@ just `NoRippleCheckLimits`).
More than one suite or group of suites can be specified as a comma separated
list via the argument. For example, `--unittest=beast,OversizeMeta` will run
all suites in the `beast` library (root identifier) as well as the test suite
named `OversizeMeta`. All name matches are case sensitive.
Tests can be executed in parallel using several child processes by specifying
the `--unittest-jobs=N` parameter. The default behavior is to execute serially
using a single process.
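For example, the two options can be combined. Assuming a built `rippled` binary, the following invocation runs the suites named above across four child processes:

```
rippled --unittest=beast,OversizeMeta --unittest-jobs=4
```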
The order in which suites are executed is determined by the suite priority that
is optionally specified when the suite is declared in the code with one of the
`BEAST_DEFINE_TESTSUITE` macros. By default, suites have a priority of 0, and
other suites can choose to declare an integer priority value to make themselves

View File

@@ -26,7 +26,7 @@ collect when running the simulation. The specification includes:
- A collection of [`Peer`s](./Peer.h) that represent the participants in the
network, with each independently running the consensus algorithm.
- The `Peer` trust relationships as a `TrustGraph`. This is a directed graph
whose edges define what other `Peer`s a given `Peer` trusts. In other words,
the set of out edges for a `Peer` in the graph corresponds to the UNL of that
`Peer`.
- The network communication layer as a `BasicNetwork`. This models the overlay
@@ -45,6 +45,7 @@ eventually fully validating the consensus history of accepted transactions. Each
the registered `Collector`s.
## Example Simulation
Below is a basic simulation we can walk through to get an understanding of the
framework. This simulation is for a set of 5 validators that aren't directly
connected but rely on a single hub node for communication.
@@ -98,12 +99,12 @@ center[0]->runAsValidator = false;
The simulation code starts by creating a single instance of the [`Sim`
class](./Sim.h). This class is used to manage the overall simulation and
internally owns most other components, including the `Peer`s, `Scheduler`,
`BasicNetwork` and `TrustGraph`. The next two lines create two different
`PeerGroup`s of size 5 and 1. A [`PeerGroup`](./PeerGroup.h) is a convenient
way for configuring a set of related peers together and internally has a vector
of pointers to the `Peer`s which are owned by the `Sim`. `PeerGroup`s can be
combined using `+/-` operators to configure more complex relationships of nodes
as shown by `PeerGroup network`. Note that each call to `createGroup` adds that
many new `Peer`s to the simulation, but does not specify any trust or network
relationships for the new `Peer`s.
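A minimal sketch of this setup, assuming only the types and headers linked above (`Sim.h`, `PeerGroup.h`):

```
Sim sim;                                    // owns peers, scheduler, network and trust graph
PeerGroup validators = sim.createGroup(5);  // adds 5 new Peers to the simulation
PeerGroup center = sim.createGroup(1);      // adds the single hub Peer
PeerGroup network = validators + center;    // combined group; no trust or network links yet
```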
@@ -125,14 +126,14 @@ validators.connect(center, delay);
Although the `sim` object has accessible instances of
[TrustGraph](./TrustGraph.h) and [BasicNetwork](./BasicNetwork.h), it is more
convenient to manage the graphs via the `PeerGroup`s. The first two lines
create a trust topology in which all `Peer`s trust the 5 validating `Peer`s. Or
in the UNL perspective, all `Peer`s are configured with the same UNL listing the
5 validating `Peer`s. The two lines could've been rewritten as
`network.trust(validators)`.
The next lines create the network communication topology. Each of the validating
`Peer`s connects to the central hub `Peer` with a fixed delay of 200ms. Note
that the network connections are really undirected, but are represented
internally in a directed graph using edge pairs of inbound and outbound connections.
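Sketching those two steps together (the `SimDuration` type and the `200ms` literal are assumptions based on the scheduler's clock):

```
using namespace std::chrono_literals;
SimDuration const delay = 200ms;    // fixed link delay described above
network.trust(validators);          // every Peer's UNL lists the 5 validators
validators.connect(center, delay);  // undirected connections to the hub
```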
@@ -143,11 +144,11 @@ SimDurationCollector simDur;
sim.collectors.add(simDur);
```
The next lines add a single collector to the simulation. The
`SimDurationCollector` is a simple example collector which tracks the total
duration of the simulation. More generally, a collector is any class that
implements `void on(NodeID, SimTime, Event)` for all [Events](./events.h)
emitted by a Peer. Events are arbitrary types used to indicate some action or
change of state of a `Peer`. Other [existing collectors](./collectors.h) measure
latencies of transaction submission to validation or the rate of ledger closing
and monitor any jumps in ledger history.
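As a sketch, a collector can satisfy that requirement with a single generic member template; the names below follow the signature just quoted:

```
// Counts every event emitted by any Peer during the simulation.
struct EventCounter
{
    std::size_t events = 0;

    template <class Event>
    void
    on(NodeID, SimTime, Event const&)
    {
        ++events;
    }
};
```

It would be registered exactly like `simDur` above, e.g. `sim.collectors.add(counter);`.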
@@ -176,9 +177,9 @@ to send transactions in at fixed or random intervals to fixed or random `Peer`s.
## Run
The example has two calls to `sim.run(1)`. This call runs the simulation until
each `Peer` has closed one additional ledger. After closing the additional
ledger, the `Peer` stops participating in consensus. The first call is used to
ensure a more useful prior state of all `Peer`s. After the transaction
submission, the second call to `run` results in one additional ledger that
accepts those transactions.
@@ -188,4 +189,4 @@ Alternatively, you can specify a duration to run the simulation, e.g.
scheduler has elapsed 10 additional seconds. The `sim.scheduler.in` or
`sim.scheduler.at` methods can schedule arbitrary code to execute at a later
time in the simulation, for example removing a network connection or modifying
the trust graph.
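For instance, a sketch of scheduling such a change (assuming a `PeerGroup::disconnect` counterpart to `connect`, and a callable taking no arguments):

```
using namespace std::chrono_literals;
// After 10 more simulated seconds, sever the hub links, then keep running.
sim.scheduler.in(10s, [&]() { validators.disconnect(center); });
sim.run(10s);
```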


@@ -1,4 +1,5 @@
# Unit tests
This directory contains unit tests for the project. The difference from the existing `src/test` folder
is that these tests use a third-party testing framework (doctest). We intend to gradually migrate
existing tests from our own framework to doctest, moving them into this new folder as we do.


@@ -1,13 +1,12 @@
# RCL Consensus
This directory holds the types and classes needed
to connect the generic consensus algorithm to the
rippled-specific instance of consensus.
- `RCLCxTx` adapts a `SHAMapItem` transaction.
- `RCLCxTxSet` adapts a `SHAMap` to represent a set of transactions.
- `RCLCxLedger` adapts a `Ledger`.
- `RCLConsensus` implements the requirements of the generic
`Consensus` class by connecting to the rest of the `rippled`
application.

View File

@@ -1,9 +1,8 @@
# Ledger Process
## Introduction
## Life Cycle
Every server always has an open ledger. All received new transactions are
applied to the open ledger. The open ledger can't close until we reach
@@ -37,10 +36,11 @@ round. This is a "rebase": now that we know the real history, the current open
ledger is rebased against the last closed ledger.
The purpose of the open ledger is as follows:
- Forms the basis of the initial proposal during consensus
- Used to decide if we can reject the transaction without relaying it
## Byzantine Failures
Byzantine failures are resolved as follows. If there is a supermajority ledger,
then a minority of validators will discover that the consensus round is
@@ -52,167 +52,169 @@ If there is no majority ledger, then starting on the next consensus round there
will not be a consensus on the last closed ledger. Another avalanche process
is started.
## Validators
The only meaningful difference between a validator and a 'regular' server is
that the validator sends its proposals and validations to the network.
---
# The Ledger Stream
## Ledger Priorities
There are two ledgers that are the most important for a rippled server to have:
- The consensus ledger and
- The last validated ledger.
If we need either of those two ledgers they are fetched with the highest
priority. Also, when they arrive, they replace their earlier counterparts
(if they exist).
The `LedgerMaster` object tracks
- the last published ledger,
- the last validated ledger, and
- ledger history.
So the `LedgerMaster` is at the center of fetching historical ledger data.
Specifically, the `LedgerMaster::doAdvance()` method triggers the code that
fetches historical data and controls the state machine for ledger acquisition.
The server tries to publish an ongoing stream of consecutive ledgers to its
clients. After the server has started and caught up with network
activity, say when ledger 500 is being settled, then the server puts its best
effort into publishing validated ledger 500 followed by validated ledger 501
and then 502. This effort continues until the server is shut down.
But loading or network connectivity may sometimes interfere with that ledger
stream. So suppose the server publishes validated ledger 600 and then
receives validated ledger 603. Then the server wants to back fill its ledger
history with ledgers 601 and 602.
The server prioritizes keeping up with current ledgers. But if it is caught
up on the current ledger, and there are no higher priority demands on the
server, then it will attempt to back fill its historical ledgers. It fills
in the historical ledger data first by attempting to retrieve it from the
local database. If the local database does not have all of the necessary data
then the server requests the remaining information from network peers.
Suppose the server is missing multiple historical ledgers. Take the previous
example where we have ledgers 603 and 600, but we're missing 601 and 602. In
that case the server requests information for ledger 602 first, before
back-filling ledger 601. We want to expand the contiguous range of
most-recent ledgers that the server has locally. There's also a limit to
how much historical ledger data is useful. So if we're on ledger 603, but
we're missing ledger 4, we may not bother asking for ledger 4.
## Assembling a Ledger
When data for a ledger arrives from a peer, it may take a while before the
server can apply that data. So when ledger data arrives we schedule a job
thread to apply that data. If more data arrives before the job starts we add
that data to the job. We defer requesting more ledger data until all of the
data we have for that ledger has been processed. Once all of that data is
processed we can intelligently request only the additional data that we need
to fill in the ledger. This reduces network traffic and minimizes the load
on peers supplying the data.
If we receive data for a ledger that is not currently under construction,
we don't just throw the data away. In particular the AccountStateNodes
may be useful, since they can be re-used across ledgers. This data is
stashed in memory (not the database) where the acquire process can find
it.
Peers deliver ledger data in the order in which the data can be validated.
Data arrives in the following order:
1. The hash of the ledger header
2. The ledger header
3. The root nodes of the transaction tree and state tree
4. The lower (non-root) nodes of the state tree
5. The lower (non-root) nodes of the transaction tree
Inner-most nodes are supplied before outer nodes. This allows the
requesting server to hook things up (and validate) in the order in which
data arrives.
If this process fails, then a server can also ask for ledger data by hash,
rather than by asking for specific nodes in a ledger. Asking for information
by hash is less efficient, but it allows a peer to return the information
even if the information is not assembled into a tree. All the peer needs is
the raw data.
## Which Peer To Ask
Peers go through state transitions as the network goes through its state
transitions. Peers provide their state to their directly connected peers.
By monitoring the state of each connected peer a server can tell which of
its peers has the information that it needs.
Therefore if a server suffers a byzantine failure the server can tell which
of its peers did not suffer that same failure. So the server knows which
peer(s) to ask for the missing information.
Peers also report their contiguous range of ledgers. This is another way that
a server can determine which peer to ask for a particular ledger or piece of
a ledger.
There are also indirect peer queries. If there have been timeouts while
acquiring ledger data then a server may issue indirect queries. In that
case the server receiving the indirect query passes the query along to any
of its peers that may have the requested data. This is important if the
network has a byzantine failure. It also helps protect the validation
network. A validator may need to get a peer set from one of the other
validators, and indirect queries improve the likelihood of success with
that.
## Kinds of Fetch Packs
A FetchPack is the way that peers send partial ledger data to other peers
so the receiving peer can reconstruct a ledger.
A 'normal' FetchPack is a bucket of nodes indexed by hash. The server
building the FetchPack puts information into the FetchPack that the
destination server is likely to need. Normally they contain all of the
missing nodes needed to fill in a ledger.
A 'compact' FetchPack, on the other hand, contains only leaf nodes, no
inner nodes. Because there are no inner nodes, the ledger information that
it contains cannot be validated as the ledger is assembled. We have to,
initially, take the accuracy of the FetchPack for granted and assemble the
ledger. Once the entire ledger is assembled the entire ledger can be
validated. But if the ledger does not validate then there's nothing to be
done but throw the entire FetchPack away; there's no way to save a portion
of the FetchPack.
The FetchPacks just described could be termed 'reverse FetchPacks.' They
only provide historical data. There may be a use for what could be called a
'forward FetchPack.' A forward FetchPack would contain the information that
is needed to build a new ledger out of the preceding ledger.
A forward compact FetchPack would need to contain:
- The header for the new ledger,
- The leaf nodes of the transaction tree (if there is one),
- The index of deleted nodes in the state tree,
- The index and data for new nodes in the state tree, and
- The index and new data of modified nodes in the state tree.
---
# Definitions
## Open Ledger
The open ledger is the ledger that the server applies all new incoming
transactions to.
## Last Validated Ledger
The most recent ledger that the server is certain will always remain part
of the permanent, public history.
## Last Closed Ledger
The most recent ledger that the server believes the network reached consensus
on. Different servers can arrive at a different conclusion about the last
@@ -220,29 +222,29 @@ closed ledger. This is a consequence of Byzantine failure. The purpose of
validations is to resolve the differences between servers and come to a common
conclusion about which last closed ledger is authoritative.
## Consensus
A distributed agreement protocol. Ripple uses the consensus process to solve
the problem of double-spending.
## Validation
A signed statement from a validator indicating that it built a particular
ledger as a result of the consensus process.
## Proposal
A signed statement from a validator of which transactions it believes should
be included in the next consensus ledger.
## Ledger Header
The "ledger header" is the chunk of data that hashes to the
ledger's hash. It contains the sequence number, parent hash,
hash of the previous ledger, hash of the root node of the
state tree, and so on.
## Ledger Base
The term "ledger base" refers to a particular type of query
and response used in the ledger fetch process that includes
@@ -251,9 +253,9 @@ such as the root node of the state tree.
---
# Ledger Structures
## Account Root
**Account:** A 160-bit account ID.
@@ -264,8 +266,8 @@ such as the root node of the state tree.
**LedgerEntryType:** "AccountRoot"
**OwnerCount:** The number of items the account owns that are charged to the
account. Offers are charged to the account. Trust lines may be charged to
the account (but not necessarily). The OwnerCount determines the reserve on
the account.
**PreviousTxnID:** 256-bit index of the previous transaction on this account.
@@ -274,43 +276,45 @@ the account.
transaction on this account.
**Sequence:** Must be a value of 1 for the account to process a valid
transaction. The value initially matches the sequence number of the state
tree of the account that signed the transaction. The process of executing
the transaction increments the sequence number. This is how Ripple prevents
a transaction from executing more than once.
**index:** 256-bit hash of this AccountRoot.
## Trust Line
The trust line acts as an edge connecting two accounts: the accounts
represented by the HighNode and the LowNode. Which account is "high" and
"low" is determined by the values of the two 160-bit account IDs. The
account with the smaller 160-bit ID is always the low account. This
ordering makes the hash of a trust line between accounts A and B have the
same value as a trust line between accounts B and A.
**Balance:**
- **currency:** String identifying a valid currency, e.g., "BTC".
- **issuer:** There is no issuer, really, this entry is "NoAccount".
- **value:**
**Flags:** ???
**HighLimit:**
- **currency:** Same as for Balance.
- **issuer:** A 160-bit account ID.
- **value:** The largest amount this issuer will accept of the currency.
**HighNode:** A deletion hint.
**LedgerEntryType:** "RippleState".
**LowLimit:**
- **currency:** Same as for Balance.
- **issuer:** A 160-bit account ID.
- **value:** The largest amount of the currency this issuer will accept.
**LowNode:** A deletion hint.
@@ -321,8 +325,7 @@ transaction on this account.
**index:** 256-bit hash of this RippleState.
## Ledger Hashes
**Flags:** ???
@@ -334,8 +337,7 @@ transaction on this account.
**index:** 256-bit hash of this LedgerHashes.
## Owner Directory
Lists all of the offers and trust lines that are associated with an account.
@@ -351,8 +353,7 @@ Lists all of the offers and trust lines that are associated with an account.
**index:** A hash of the owner account.
## Book Directory
Lists one or more offers that have the same quality.
@@ -360,18 +361,18 @@ If a pair of Currency and Issuer fields are all zeros, then that pair is
dealing in XRP.
The code, at the moment, does not recognize that the Currency and Issuer
fields are currencies and issuers. So those values are presented in hex,
rather than as accounts and currencies. That's a bug and should be fixed
at some point.
**ExchangeRate:** A 64-bit value. The first 8 bits are the exponent and the
remaining bits are the mantissa. The format is such that a bigger 64-bit
value always represents a higher exchange rate.
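A sketch of that encoding (illustrative only, not rippled's code); because the exponent occupies the top byte, plain unsigned comparison of packed values orders the exchange rates:

```
#include <cstdint>

// Pack an 8-bit exponent and a 56-bit mantissa into a single 64-bit value.
std::uint64_t
packExchangeRate(std::uint8_t exponent, std::uint64_t mantissa)
{
    return (static_cast<std::uint64_t>(exponent) << 56) |
        (mantissa & ((1ULL << 56) - 1));
}
```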
Each type can compute its own hash. The hash of a book directory contains,
as its lowest 64 bits, the exchange rate. This means that if there are
multiple _almost_ identical book directories, but with different exchange
rates, then these book directories will sit together in the ledger. The best
exchange rate will be the first in the sequence of Book Directories.
**Flags:** ???
@@ -392,14 +393,14 @@ currencies described by this BookDirectory.
**TakerPaysIssuer:** Issuer of the PaysCurrency.
**index:** A 256-bit hash computed using the TakerGetsCurrency, TakerGetsIssuer,
TakerPaysCurrency, and TakerPaysIssuer in the top 192 bits. The lower 64-bits
are occupied by the exchange rate.
---
# Ledger Publication
## Overview
The Ripple server permits clients to subscribe to a continuous stream of
fully-validated ledgers. The publication code maintains this stream.
@@ -408,7 +409,7 @@ The server attempts to maintain this continuous stream unless it falls
too far behind, in which case it jumps to the current fully-validated
ledger and then attempts to resume a continuous stream.
## Implementation
`LedgerMaster::doAdvance` is invoked when work may need to be done to
publish ledgers to clients. This code loops until it cannot make further
@@ -430,17 +431,17 @@ the list of resident ledgers.
---
# The Ledger Cleaner
## Overview
The ledger cleaner checks and, if necessary, repairs the SQLite ledger and
transaction databases. It can also check for pieces of a ledger that should
be in the node back end but are missing. If it detects this case, it
triggers a fetch of the ledger. The ledger cleaner only operates by manual
request. It is never started automatically.
## Operations
The ledger cleaner can operate on a single ledger or a range of ledgers. It
always validates the ledger chain itself, ensuring that the SQLite database
@@ -448,7 +449,7 @@ contains a consistent chain of ledgers from the last validated ledger as far
back as the database goes.
If requested, it can additionally repair the SQLite entries for transactions
in each checked ledger. This was primarily intended to repair incorrect
entries created by a bug (since fixed) that could cause transactions from a
ledger other than the fully-validated ledger to appear in the SQLite
databases in addition to the transactions from the correct ledger.
@@ -460,7 +461,7 @@ To prevent the ledger cleaner from saturating the available I/O bandwidth
and excessively polluting caches with ancient information, the ledger
cleaner paces itself and does not attempt to get its work done quickly.
## Commands
The ledger cleaner can be controlled and monitored with the **ledger_cleaner**
RPC command. With no parameters, this command reports on the status of the
@@ -486,4 +487,4 @@ ledger(s) for missing nodes in the back end node store
---
# References


@@ -17,15 +17,16 @@ transactions into the open ledger, even during unfavorable conditions.
How fees escalate:
1. There is a base [fee level](#fee-level) of 256,
which is the minimum that a typical transaction
is required to pay. For a [reference
transaction](#reference-transaction), that corresponds to the
network base fee, which is currently 10 drops.
2. However, there is a limit on the number of transactions that
can get into an open ledger for that base fee level. The limit
will vary based on the [health](#consensus-health) of the
consensus process, but will be at least [5](#other-constants).
- If consensus stays [healthy](#consensus-health), the limit will
be the max of the number of transactions in the validated ledger
plus [20%](#other-constants) or the current limit until it gets
to [50](#other-constants), at which point, the limit will be the
@@ -35,50 +36,56 @@ consensus process, but will be at least [5](#other-constants).
decreases (i.e. a large ledger is no longer recent), the limit will
decrease to the new largest value by 10% each time the ledger has
more than 50 transactions.
- If consensus does not stay [healthy](#consensus-health),
the limit will clamp down to the smaller of the number of
transactions in the validated ledger minus [50%](#other-constants)
or the previous limit minus [50%](#other-constants).
- The intended effect of these mechanisms is to allow as many base fee
level transactions to get into the ledger as possible while the
network is [healthy](#consensus-health), but to respond quickly to
any condition that makes it [unhealthy](#consensus-health), including,
but not limited to, malicious attacks.
3. Once there are more transactions in the open ledger than indicated
by the limit, the required fee level jumps drastically.
- The formula is `( lastLedgerMedianFeeLevel *
TransactionsInOpenLedger^2 / limit^2 )`,
and returns a [fee level](#fee-level); see the sketch after this list.
4. That may still be pretty small, but as more transactions get
into the ledger, the fee level increases exponentially.
- For example, if the limit is 6, and the median fee is minimal,
and assuming all [reference transactions](#reference-transaction),
the 8th transaction only requires a [level](#fee-level) of about 174,000
or about 6800 drops,
but the 20th transaction requires a [level](#fee-level) of about
1,283,000 or about 50,000 drops.
5. Finally, as each ledger closes, the median fee level of that ledger is
computed and used as `lastLedgerMedianFeeLevel` (with a
[minimum value of 128,000](#other-constants))
in the fee escalation formula for the next open ledger.
- Continuing the example above, if ledger consensus completes with
only those 20 transactions, and all of those transactions paid the
minimum required fee at each step, the limit will be adjusted from
6 to 24, and the `lastLedgerMedianFeeLevel` will be about 322,000,
which is 12,600 drops for a
[reference transaction](#reference-transaction).
- This will only require 10 drops for the first 25 transactions,
but the 26th transaction will require a level of about 349,150
or about 13,649 drops.
- This example assumes a cold-start scenario, with a single, possibly
malicious, user willing to pay arbitrary amounts to get transactions
into the open ledger. It ignores the effects of the [Transaction
Queue](#transaction-queue). Any lower fee level transactions submitted
by other users at the same time as this user's transactions will go into
the transaction queue, and will have the first opportunity to be applied
to the _next_ open ledger. The next section describes how that works in
more detail.
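Here is the sketch referenced in step 3, reproducing the numbers from the example above (identifier names are illustrative, not rippled's):

```
#include <cstdint>
#include <iostream>

// requiredLevel = lastLedgerMedianFeeLevel * txInOpenLedger^2 / limit^2
std::uint64_t
requiredFeeLevel(
    std::uint64_t medianFeeLevel,
    std::uint64_t txInOpenLedger,
    std::uint64_t limit)
{
    return medianFeeLevel * txInOpenLedger * txInOpenLedger / (limit * limit);
}

int
main()
{
    // Limit of 6 and the minimum median fee level of 128,000.
    // The 8th transaction sees 7 already in the open ledger:
    std::cout << requiredFeeLevel(128000, 7, 6) << '\n';  // 174222, ~174,000
    // The 20th transaction sees 19 already in the open ledger:
    std::cout << requiredFeeLevel(128000, 19, 6) << '\n';  // 1283555, ~1,283,000
    // Converting a level back to drops for a reference transaction:
    std::cout << requiredFeeLevel(128000, 7, 6) * 10 / 256 << '\n';  // 6805 drops
}
```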
## Transaction Queue
@@ -92,33 +99,34 @@ traffic periods, and give those transactions a much better chance to
succeed.
1. If an incoming transaction meets both the base [fee
level](#fee-level) and the [load fee](#load-fee) minimum, but does not have a high
enough [fee level](#fee-level) to immediately go into the open ledger,
it is instead put into the queue and broadcast to peers. Each peer will
then make an independent decision about whether to put the transaction
into its open ledger or the queue. In principle, peers with identical
open ledgers will come to identical decisions. Any discrepancies will be
resolved as usual during consensus.
2. When consensus completes, the open ledger limit is adjusted, and
the required [fee level](#fee-level) drops back to the base
[fee level](#fee-level). Before the ledger is made available to
external transactions, transactions are applied from the queue to the
ledger from highest [fee level](#fee-level) to lowest. These transactions
count against the open ledger limit, so the required [fee level](#fee-level)
may start rising during this process.
3. Once the queue is empty, or the required [fee level](#fee-level)
rises too high for the remaining transactions in the queue, the ledger
is opened up for normal transaction processing.
4. A transaction in the queue can stay there indefinitely in principle,
but in practice, either
- it will eventually get applied to the ledger,
- it will attempt to apply to the ledger and fail,
- it will attempt to apply to the ledger and retry [10
times](#other-constants),
- its last ledger sequence number will expire,
- the user will replace it by submitting another transaction with the same
sequence number and at least a [25% higher fee](#other-constants), or
- it will get dropped when the queue fills up with more valuable transactions.
The size limit is computed dynamically, and can hold transactions for
the next [20 ledgers](#other-constants) (restricted to a minimum of
[2000 transactions](#other-constants)). The lower the transaction's
@@ -128,14 +136,15 @@ If a transaction is submitted for an account with one or more transactions
already in the queue, and a sequence number that is sequential with the other
transactions in the queue for that account, it will be considered
for the queue if it meets these additional criteria:
- the account has fewer than [10](#other-constants) transactions
already in the queue.
- all other queued transactions for that account, in the case where
they spend the maximum possible XRP, leave enough XRP balance to pay
the fee,
- the total fees for the other queued transactions are less than both
the network's minimum reserve and the account's XRP balance, and
- none of the prior queued transactions affect the ability of subsequent
transactions to claim a fee.
Currently, there is an additional restriction that the queue cannot work with
@@ -148,7 +157,7 @@ development will make the queue aware of `sfAccountTxnID` mechanisms.
### Fee Level
"Fee level" is used to allow the cost of different types of transactions
to be compared directly. For a [reference
transaction](#reference-transaction), the base fee
level is 256. If a transaction is submitted with a higher `Fee` field,
the fee level is scaled appropriately.
@@ -157,16 +166,16 @@ Examples, assuming a [reference transaction](#reference-transaction)
base fee of 10 drops:
1. A single-signed [reference transaction](#reference-transaction)
with `Fee=20` will have a fee level of
`20 drop fee * 256 fee level / 10 drop base fee = 512 fee level`.
2. A multi-signed [reference transaction](#reference-transaction) with
3 signatures (base fee = 40 drops) and `Fee=60` will have a fee level of
`60 drop fee * 256 fee level / ((1tx + 3sigs) * 10 drop base fee) = 384
fee level`.
3. A hypothetical future non-reference transaction with a base
fee of 15 drops multi-signed with 5 signatures and `Fee=90` will
have a fee level of
`90 drop fee * 256 fee level / ((1tx + 5sigs) * 15 drop base fee) = 256
fee level`.
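The scaling in these examples can be expressed as one small helper; this is a sketch assuming the 256 base level and the per-signature base-fee multiple shown above, with illustrative names:

```
#include <cstdint>

// multiSigners is 0 for a single-signed transaction; otherwise the base fee
// owed is multiplied by (1 tx + N sigs), as in the examples above.
std::uint64_t
feeLevel(std::uint64_t feeDrops, std::uint64_t baseFeeDrops, unsigned multiSigners)
{
    return feeDrops * 256 / ((1 + multiSigners) * baseFeeDrops);
}

// feeLevel(20, 10, 0) == 512   (example 1)
// feeLevel(60, 10, 3) == 384   (example 2)
// feeLevel(90, 15, 5) == 256   (example 3)
```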
This demonstrates that a simpler transaction paying less XRP can be more
@@ -194,7 +203,7 @@ For consensus to be considered healthy, the peers on the network
should largely remain in sync with one another. It is particularly
important for the validators to remain in sync, because that is required
for participation in consensus. However, the network tolerates some
validators being out of sync. Fundamentally, network health is a
function of validators reaching consensus on sets of recently submitted
transactions.
@@ -214,61 +223,61 @@ often coincides with new ledgers with zero transactions.
### Other Constants
- _Base fee transaction limit per ledger_. The minimum value of 5 was
chosen to ensure the limit never gets so small that the ledger becomes
unusable. The "target" value of 50 was chosen so the limit never gets large
enough to invite abuse, but keeps up if the network stays healthy and
active. These exact values were chosen experimentally, and can easily
change in the future.
- _Expected ledger size growth and reduction percentages_. The growth
value of 20% was chosen to allow the limit to grow quickly as load
increases, but not so quickly as to allow bad actors to run unrestricted.
The reduction value of 50% was chosen to cause the limit to drop
significantly, but not so drastically that the limit cannot quickly
recover if the problem is temporary. These exact values were chosen
experimentally, and can easily change in the future.
- _Minimum `lastLedgerMedianFeeLevel`_. The value of 500 was chosen to
ensure that the first escalated fee was more significant and noticeable
than what the default would allow. This exact value was chosen
experimentally, and can easily change in the future.
- _Transaction queue size limit_. The limit is computed based on the
base fee transaction limit per ledger, so that the queue can grow
automatically as the network's performance improves, allowing
more transactions per second, and thus more transactions per ledger
to process successfully. The limit of 20 ledgers was used to provide
a balance between resource (specifically memory) usage, and giving
transactions a realistic chance to be processed. The minimum size of
2000 transactions was chosen to allow a decent functional backlog during
network congestion conditions. These exact values were
chosen experimentally, and can easily change in the future.
- _Maximum retries_. A transaction in the queue can attempt to apply
to the open ledger, but get a retry (`ter`) code up to 10 times, at
which point, it will be removed from the queue and dropped. The
value was chosen to be large enough to allow temporary failures to clear
up, but small enough that the queue doesn't fill up with stale
transactions which prevent lower fee level, but more likely to succeed,
transactions from queuing.
- _Maximum transactions per account_. A single account can have up to 10
transactions in the queue at any given time. This is primarily to
mitigate the wasted cost of broadcasting multiple transactions if one of
the earlier ones fails or is otherwise removed from the queue without
being applied to the open ledger. The value was chosen arbitrarily, and
can easily change in the future.
- _Minimum last ledger sequence buffer_. If a transaction has a
`LastLedgerSequence` value, and cannot be processed into the open
ledger, that `LastLedgerSequence` must be at least 2 more than the
sequence number of the open ledger to be considered for the queue. The
value was chosen to provide a balance between letting the user control
the lifespan of the transaction, and giving a queued transaction a
chance to get processed out of the queue before getting discarded,
particularly since it may have dependent transactions also in the queue,
which will never succeed if this one is discarded.
- _Replaced transaction fee increase_. Any transaction in the queue can be
replaced by another transaction with the same sequence number and at
least a 25% higher fee level. The 25% increase is intended to cover the
resource cost incurred by broadcasting the original transaction to the
network. This value was chosen experimentally, and can easily change in
the future.
### `fee` command
@@ -287,6 +296,7 @@ ledger. It includes the sequence number of the current open ledger,
but may not make sense if rippled is not synced to the network.
Result format:
```
{
"result" : {
@@ -319,13 +329,13 @@ without warning.**
Up to two fields in `server_info` output are related to fee escalation.
1. `load_factor_fee_escalation`: The factor on base transaction cost
that a transaction must pay to get into the open ledger. This value can
change quickly as transactions are processed from the network and
ledgers are closed. If not escalated, the value is 1, so will not be
returned.
2. `load_factor_fee_queue`: If the queue is full, this is the factor on
base transaction cost that a transaction must pay to get into the queue.
If not full, the value is 1, so will not be returned.
In all cases, the transaction fee must be high enough to overcome both
`load_factor_fee_queue` and `load_factor` to be considered. It does not
@@ -341,22 +351,21 @@ without warning.**
Three fields in `server_state` output are related to fee escalation.
1. `load_factor_fee_escalation`: The factor on base transaction cost
that a transaction must pay to get into the open ledger. This value can
change quickly as transactions are processed from the network and
ledgers are closed. The ratio between this value and
`load_factor_fee_reference` determines the multiplier for transaction
fees to get into the current open ledger.
2. `load_factor_fee_queue`: This is the factor on base transaction cost
that a transaction must pay to get into the queue. The ratio between
this value and `load_factor_fee_reference` determines the multiplier for
transaction fees to get into the transaction queue to be considered for
a later ledger.
3. `load_factor_fee_reference`: Like `load_base`, this is the baseline
that is used to scale fee escalation computations.
In all cases, the transaction fee must be high enough to overcome both
`load_factor_fee_queue` and `load_factor` to be considered. It does not
need to overcome `load_factor_fee_escalation`, though if it does not, it
is more likely to be queued than immediately processed into the open
ledger.


@@ -71,18 +71,18 @@ Amendment must receive at least an 80% approval rate from validating nodes for
a period of two weeks before being accepted. The following example outlines the
process of an Amendment from its conception to approval and usage.
- A community member proposes to change transaction processing in some way.
The proposal is discussed amongst the community and receives its support,
creating a community or human consensus.
- Some members contribute their time and work to develop the Amendment.
- A pull request is created and the new code is folded into a rippled build
and made available for use.
- The consensus process begins with the validating nodes.
- If the Amendment holds an 80% majority for a two week period, nodes will begin
including the transaction to enable it in their initial sets.
Nodes may veto Amendments they consider undesirable by never announcing their
@@ -112,7 +112,7 @@ enabled.
Optional online deletion happens through the SHAMapStore. Records are deleted
from disk based on ledger sequence number. These records reside in the
key-value database as well as in the SQLite ledger and transaction databases.
Without online deletion, storage usage grows without bounds. It can only
be pruned by stopping, manually deleting data, and restarting the server.
Online deletion requires less operator intervention to manage the server.
@@ -142,14 +142,14 @@ server restarts.
Configuration:
- In the [node_db] configuration section, an optional online_delete parameter is
set. If not set or if set to 0, online delete is disabled. Otherwise, the
setting defines the number of ledgers between deletion cycles.
- Another optional parameter in [node_db] is advisory_delete. It is
disabled by default. If set to non-zero, an RPC call is required to activate the
deletion routine.
- online_delete must not be greater than the [ledger_history] parameter.
- [fetch_depth] will be silently set to equal the online_delete setting if
online_delete is greater than fetch_depth.
- In the [node_db] section, there is a performance tuning option, delete_batch,
which sets the maximum size in ledgers for each SQL DELETE query.
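For illustration, a hypothetical `[node_db]` stanza with deletion cycles of 2000 ledgers; the `type` and `path` values are placeholders:

```
[node_db]
type=NuDB
path=/var/lib/rippled/db/nudb
online_delete=2000
advisory_delete=0
```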

View File

@@ -2,8 +2,8 @@
The guiding principles of the Relational Database Interface are summarized below:
- All hard-coded SQL statements should be stored in the [files](#source-files) under the `xrpld/app/rdb` directory. With the exception of test modules, no hard-coded SQL should be added to any other file in rippled.
- The base class `RelationalDatabase` is inherited by derived classes that each provide an interface for operating on distinct relational database systems.
## Overview
@@ -45,36 +45,34 @@ src/xrpld/app/rdb/
```
### File Contents
| File | Contents |
| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Node.[h\|cpp]` | Defines/Implements methods used by `SQLiteDatabase` for interacting with SQLite node databases |
| `SQLiteDatabase.[h\|cpp]` | Defines/Implements the class `SQLiteDatabase`/`SQLiteDatabaseImp` which inherits from `RelationalDatabase` and is used to operate on the main stores |
| `PeerFinder.[h\|cpp]` | Defines/Implements methods for interacting with the PeerFinder SQLite database |
| `RelationalDatabase.cpp` | Implements the static method `RelationalDatabase::init` which is used to initialize an instance of `RelationalDatabase` |
| `RelationalDatabase.h` | Defines the abstract class `RelationalDatabase`, the primary class of the Relational Database Interface |
| `State.[h\|cpp]` | Defines/Implements methods for interacting with the State SQLite database which concerns ledger deletion and database rotation |
| `Vacuum.[h\|cpp]` | Defines/Implements a method for performing the `VACUUM` operation on SQLite databases |
| `Wallet.[h\|cpp]` | Defines/Implements methods for interacting with Wallet SQLite databases |
## Classes
The abstract class `RelationalDatabase` is the primary class of the Relational Database Interface and is defined in the eponymous header file. This class provides a static method `init()` which, when invoked, creates a concrete instance of a derived class whose type is specified by the system configuration. All other methods in the class are virtual. Presently there exist two classes that derive from `RelationalDatabase`, namely `SQLiteDatabase` and `PostgresDatabase`.
The abstract class `RelationalDatabase` is the primary class of the Relational Database Interface and is defined in the eponymous header file. This class provides a static method `init()` which, when invoked, creates a concrete instance of a derived class whose type is specified by the system configuration. All other methods in the class are virtual. Presently there exist two classes that derive from `RelationalDatabase`, namely `SQLiteDatabase` and `PostgresDatabase`.
## Database Methods
The Relational Database Interface provides three categories of methods for interacting with databases:
* Free functions for interacting with SQLite databases used by various components of the software. These methods feature a `soci::session` parameter which facilitates connecting to SQLite databases, and are defined and implemented in the following files:
- Free functions for interacting with SQLite databases used by various components of the software. These methods feature a `soci::session` parameter which facilitates connecting to SQLite databases, and are defined and implemented in the following files:
* `PeerFinder.[h\|cpp]`
* `State.[h\|cpp]`
* `Vacuum.[h\|cpp]`
* `Wallet.[h\|cpp]`
- `PeerFinder.[h\|cpp]`
- `State.[h\|cpp]`
- `Vacuum.[h\|cpp]`
- `Wallet.[h\|cpp]`
- Free functions used exclusively by `SQLiteDatabaseImp` for interacting with SQLite databases owned by the node store. Unlike the free functions in the files listed above, these are not intended to be invoked directly by clients. Rather, these methods are invoked by derived instances of `RelationalDatabase`. These methods are defined in the following files:
- `Node.[h|cpp]`
* Free functions used exclusively by `SQLiteDatabaseImp` for interacting with SQLite databases owned by the node store. Unlike the free functions in the files listed above, these are not intended to be invoked directly by clients. Rather, these methods are invoked by derived instances of `RelationalDatabase`. These methods are defined in the following files:
* `Node.[h|cpp]`
* Member functions of `RelationalDatabase`, `SQLiteDatabase`, and `PostgresDatabase` which are used to access the node store.
- Member functions of `RelationalDatabase`, `SQLiteDatabase`, and `PostgresDatabase` which are used to access the node store.
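As a rough schematic of the factory-and-interface arrangement described above
(all names and signatures here are illustrative, not the actual declarations
from `RelationalDatabase.h`):

```
#include <memory>
#include <stdexcept>
#include <string>

// Illustrative sketch only: an abstract interface with a static factory
// that picks the concrete backend named in the configuration.
struct RelationalDatabase
{
    virtual ~RelationalDatabase() = default;
    virtual std::string name() const = 0;  // stand-in for the virtual API

    static std::unique_ptr<RelationalDatabase> init(std::string const& type);
};

struct SQLiteDatabase : RelationalDatabase
{
    std::string name() const override { return "SQLite"; }
};

struct PostgresDatabase : RelationalDatabase
{
    std::string name() const override { return "Postgres"; }
};

std::unique_ptr<RelationalDatabase>
RelationalDatabase::init(std::string const& type)
{
    if (type == "sqlite")
        return std::make_unique<SQLiteDatabase>();
    if (type == "postgres")
        return std::make_unique<PostgresDatabase>();
    throw std::runtime_error("unknown backend: " + type);
}

int main()
{
    auto db = RelationalDatabase::init("sqlite");
    return db->name() == "SQLite" ? 0 : 1;
}
```

A caller holds only the interface pointer and stays agnostic of which backend
the configuration selected.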


@@ -1,9 +1,8 @@
# Consensus
This directory contains the implementation of a
generic consensus algorithm. The implementation
generic consensus algorithm. The implementation
follows a CRTP design, requiring client code to implement
specific functions and types to use consensus in their
application. The interface is undergoing refactoring and
application. The interface is undergoing refactoring and
is not yet finalized.


@@ -1,6 +1,7 @@
# Database Documentation
* [NodeStore](#nodestore)
* [Benchmarks](#benchmarks)
- [NodeStore](#nodestore)
- [Benchmarks](#benchmarks)
# NodeStore
@@ -12,41 +13,43 @@ identified by the hash, which is a 256 bit hash of the blob. The blob is a
variable length block of serialized data. The type identifies what the blob
contains. The fields are as follows:
* `mType`
- `mType`
An enumeration that determines what the blob holds. There are four
different types of objects stored.
An enumeration that determines what the blob holds. There are four
different types of objects stored.
* **ledger**
- **ledger**
A ledger header.
A ledger header.
* **transaction**
- **transaction**
A signed transaction.
A signed transaction.
* **account node**
- **account node**
A node in a ledger's account state tree.
A node in a ledger's account state tree.
* **transaction node**
- **transaction node**
A node in a ledger's transaction tree.
A node in a ledger's transaction tree.
* `mHash`
- `mHash`
A 256-bit hash of the blob.
A 256-bit hash of the blob.
* `mData`
- `mData`
A blob containing the payload. Stored in the following format.
A blob containing the payload. Stored in the following format.
| Byte    | Field  | Description                |
| :------ | :----- | :------------------------- |
| 0...7 | unused | |
| 8 | type | NodeObjectType enumeration |
| 9...end | data | body of the object data |
|Byte | | |
|:------|:--------------------|:-------------------------|
|0...7 |unused | |
|8 |type |NodeObjectType enumeration|
|9...end|data |body of the object data |
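A minimal sketch of decoding this layout (the enum and struct names are
illustrative, not rippled's actual declarations):

```
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decodes the layout in the table above: bytes 0-7 unused, byte 8 the
// type, bytes 9 onward the payload.
enum class NodeObjectType : std::uint8_t {
    Ledger,
    Transaction,
    AccountNode,
    TransactionNode
};

struct DecodedObject
{
    NodeObjectType type;
    std::vector<std::uint8_t> data;
};

DecodedObject decode(std::vector<std::uint8_t> const& blob)
{
    if (blob.size() < 9)
        throw std::runtime_error("blob too short");
    DecodedObject out;
    out.type = static_cast<NodeObjectType>(blob[8]);
    out.data.assign(blob.begin() + 9, blob.end());
    return out;
}
```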
---
The `NodeStore` provides an interface that stores, in a persistent database, a
collection of NodeObjects that rippled uses as its primary representation of
ledger entries. All ledger entries are stored as NodeObjects and as such, need
@@ -64,41 +67,42 @@ the configuration file [node_db] section as follows.
One or more lines of key / value pairs
Example:
```
type=RocksDB
path=rocksdb
compression=1
```
Choices for 'type' (not case-sensitive)
* **HyperLevelDB**
- **HyperLevelDB**
An improved version of LevelDB (preferred).
An improved version of LevelDB (preferred).
* **LevelDB**
- **LevelDB**
Google's LevelDB database (deprecated).
Google's LevelDB database (deprecated).
* **none**
- **none**
Use no backend.
Use no backend.
* **RocksDB**
- **RocksDB**
Facebook's RocksDB database, builds on LevelDB.
Facebook's RocksDB database, builds on LevelDB.
* **SQLite**
- **SQLite**
Use SQLite.
Use SQLite.
'path' specifies where the backend will store its data files.
Choices for 'compression'
* **0** off
* **1** on (default)
- **0** off
- **1** on (default)
# Benchmarks
@@ -129,48 +133,48 @@ RocksDBQuickFactory is intended to provide a testbed for comparing potential
rocksdb performance with the existing recommended configuration in rippled.cfg.
Through various executions and profiling, some conclusions are presented below.
* If the write ahead log is enabled, insert speed soon clogs up under load. The
BatchWriter class intends to stop this from blocking the main threads by queuing
up writes and running them in a separate thread. However, rocksdb already has
separate threads dedicated to flushing the memtable to disk and the memtable is
itself an in-memory queue. The result is two queues with a guarantee of
durability in between. However if the memtable was used as the sole queue and
the rocksdb::Flush() call was manually triggered at opportune moments, possibly
just after ledger close, then that would provide similar, but more predictable
guarantees. It would also remove an unneeded thread and unnecessary memory
usage. An alternative point of view is that because there will always be many
other rippled instances running, there is no need for such guarantees. The nodes
will always be available from another peer.
- If the write ahead log is enabled, insert speed soon clogs up under load. The
BatchWriter class intends to stop this from blocking the main threads by queuing
up writes and running them in a separate thread. However, rocksdb already has
separate threads dedicated to flushing the memtable to disk and the memtable is
itself an in-memory queue. The result is two queues with a guarantee of
durability in between. However if the memtable was used as the sole queue and
the rocksdb::Flush() call was manually triggered at opportune moments, possibly
just after ledger close, then that would provide similar, but more predictable
guarantees. It would also remove an unneeded thread and unnecessary memory
usage. An alternative point of view is that because there will always be many
other rippled instances running, there is no need for such guarantees. The nodes
will always be available from another peer.
* Lookup in a block previously used binary search. With rippled's use case
it is highly unlikely that two adjacent key/values will ever be requested one
after the other. Therefore hash indexing of blocks makes much more sense.
Rocksdb has a number of options for hash indexing both memtables and blocks and
these need more testing to find the best choice.
- Lookup in a block previously used binary search. With rippled's use case
it is highly unlikely that two adjacent key/values will ever be requested one
after the other. Therefore hash indexing of blocks makes much more sense.
Rocksdb has a number of options for hash indexing both memtables and blocks and
these need more testing to find the best choice.
* The current Database implementation has two forms of caching, so the LRU cache
of blocks at Factory level does not make any sense. However, if the hash
indexing and potentially the new [bloom
filter](http://rocksdb.org/blog/1427/new-bloom-filter-format/) can provide
faster lookup for non-existent keys, then potentially the caching could exist at
Factory level.
- The current Database implementation has two forms of caching, so the LRU cache
of blocks at Factory level does not make any sense. However, if the hash
indexing and potentially the new [bloom
filter](http://rocksdb.org/blog/1427/new-bloom-filter-format/) can provide
faster lookup for non-existent keys, then potentially the caching could exist at
Factory level.
* Multiple runs of the benchmarks can yield surprisingly different results. This
can perhaps be attributed to the asynchronous nature of rocksdb's compaction
process. The benchmarks are artificial and create highly unlikely write load to
create the dataset to measure different read access patterns. Therefore multiple
runs of the benchmarks are required to get a feel for the effectiveness of the
changes. This contrasts sharply with the keyvadb benchmarking where highly
repeatable timings were discovered. Also realistically sized datasets are
required to get a correct insight. The number of 2,000,000 key/values (actually
4,000,000 after the two insert benchmarks complete) is too low to get a full
picture.
- Multiple runs of the benchmarks can yield surprisingly different results. This
can perhaps be attributed to the asynchronous nature of rocksdb's compaction
process. The benchmarks are artificial and create highly unlikely write load to
create the dataset to measure different read access patterns. Therefore multiple
runs of the benchmarks are required to get a feel for the effectiveness of the
changes. This contrasts sharply with the keyvadb benchmarking where highly
repeatable timings were discovered. Also realistically sized datasets are
required to get a correct insight. The number of 2,000,000 key/values (actually
4,000,000 after the two insert benchmarks complete) is too low to get a full
picture.
* An interesting side effect of running the benchmarks in a profiler was that a
clear pattern of what RocksDB does under the hood was observable. This led to
the decision to trial hash indexing and also the discovery of the native CRC32
instruction not being used.
- An interesting side effect of running the benchmarks in a profiler was that a
clear pattern of what RocksDB does under the hood was observable. This led to
the decision to trial hash indexing and also the discovery of the native CRC32
instruction not being used.
* An important point to note is that if this factory is tested with an existing set
of sst files none of the old sst files will benefit from indexing changes until
they are compacted at a future point in time.
- An important point to note is that if this factory is tested with an existing set
of sst files none of the old sst files will benefit from indexing changes until
they are compacted at a future point in time.


@@ -39,10 +39,10 @@ The HTTP [request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html) must:
- Use HTTP version 1.1.
- Specify a request URI consisting of a single forward slash character ("/")
indicating the server root. Requests using different URIs are reserved for
future protocol implementations.
indicating the server root. Requests using different URIs are reserved for
future protocol implementations.
- Use the [_HTTP/1.1 Upgrade_][upgrade_header] mechanism with additional custom
fields to communicate protocol specific information related to the upgrade.
fields to communicate protocol specific information related to the upgrade.
HTTP requests which do not conform to these requirements must generate an
appropriate HTTP error and result in the connection being closed.
@@ -72,7 +72,6 @@ Previous-Ledger: q4aKbP7sd5wv+EXArwCmQiWZhq9AwBl2p/hCtpGJNsc=
##### Example HTTP Upgrade Response (Success)
```
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
@@ -102,9 +101,9 @@ Content-Type: application/json
#### Standard Fields
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `User-Agent` | :heavy_check_mark: | |
| Field Name | Request | Response |
| ------------ | :----------------: | :------: |
| `User-Agent` | :heavy_check_mark: | |
The `User-Agent` field indicates the version of the software that the
peer making the HTTP request is using. No semantic meaning is
@@ -113,9 +112,9 @@ specify the version of the software that is used.
See [RFC2616 §14.43](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Server` | | :heavy_check_mark: |
| Field Name | Request | Response |
| ---------- | :-----: | :----------------: |
| `Server` | | :heavy_check_mark: |
The `Server` field indicates the version of the software that the
peer processing the HTTP request is using. No semantic meaning is
@@ -124,18 +123,18 @@ specify the version of the software that is used.
See [RFC2616 §14.38](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.38).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Connection` | :heavy_check_mark: | :heavy_check_mark: |
| Field Name | Request | Response |
| ------------ | :----------------: | :----------------: |
| `Connection` | :heavy_check_mark: | :heavy_check_mark: |
The `Connection` field should have a value of `Upgrade` to indicate that a
request to upgrade the connection is being performed.
See [RFC2616 §14.10](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.10).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Upgrade` | :heavy_check_mark: | :heavy_check_mark: |
| Field Name | Request | Response |
| ---------- | :----------------: | :----------------: |
| `Upgrade` | :heavy_check_mark: | :heavy_check_mark: |
The `Upgrade` field is part of the standard connection upgrade mechanism and
must be present in both requests and responses. It is used to negotiate the
@@ -156,12 +155,11 @@ equal to 2 and the minor is greater than or equal to 0.
See [RFC 2616 §14.42](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.42)
#### Custom Fields
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Connect-As` | :heavy_check_mark: | :heavy_check_mark: |
| Field Name | Request | Response |
| ------------ | :----------------: | :----------------: |
| `Connect-As` | :heavy_check_mark: | :heavy_check_mark: |
The mandatory `Connect-As` field is used to specify the type of connection
that is being requested.
@@ -175,10 +173,9 @@ elements specified in the request. If a server processing a request does not
recognize any of the connection types, the request should fail with an
appropriate HTTP error code (e.g. by sending an HTTP 400 "Bad Request" response).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Remote-IP` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| ----------- | :----------------: | :----------------: |
| `Remote-IP` | :white_check_mark: | :white_check_mark: |
The optional `Remote-IP` field contains the string representation of the IP
address of the remote end of the connection as seen from the peer that is
@@ -187,10 +184,9 @@ sending the field.
By observing values of this field from a sufficient number of different
servers, a peer making outgoing connections can deduce its own IP address.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Local-IP` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| ---------- | :----------------: | :----------------: |
| `Local-IP` | :white_check_mark: | :white_check_mark: |
The optional `Local-IP` field contains the string representation of the IP
address that the peer sending the field believes to be its own.
@@ -198,10 +194,9 @@ address that the peer sending the field believes to be its own.
Servers receiving this field can detect IP address mismatches, which may
indicate a potential man-in-the-middle attack.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Network-ID` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| ------------ | :----------------: | :----------------: |
| `Network-ID` | :white_check_mark: | :white_check_mark: |
The optional `Network-ID` field can be used to identify to which of several
[parallel networks](https://xrpl.org/parallel-networks.html) the server
@@ -217,10 +212,9 @@ If a server configured to join one network receives a connection request from a
server configured to join another network, the request should fail with an
appropriate HTTP error code (e.g. by sending an HTTP 400 "Bad Request" response).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Network-Time` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| -------------- | :----------------: | :----------------: |
| `Network-Time` | :white_check_mark: | :white_check_mark: |
The optional `Network-Time` field reports the current [time](https://xrpl.org/basic-data-types.html#specifying-time)
according to the sender's internal clock.
@@ -232,20 +226,18 @@ each other with an appropriate HTTP error code (e.g. by sending an HTTP 400
It is highly recommended that servers synchronize their clocks using time
synchronization software. For more on this topic, please visit [ntp.org](http://www.ntp.org/).
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Public-Key` | :heavy_check_mark: | :heavy_check_mark: |
| Field Name | Request | Response |
| ------------ | :----------------: | :----------------: |
| `Public-Key` | :heavy_check_mark: | :heavy_check_mark: |
The mandatory `Public-Key` field identifies the sending server's public key,
encoded in base58 using the standard encoding for node public keys.
See: https://xrpl.org/base58-encodings.html
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Server-Domain` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| --------------- | :----------------: | :----------------: |
| `Server-Domain` | :white_check_mark: | :white_check_mark: |
The optional `Server-Domain` field allows a server to report the domain that
it is operating under. The value is configured by the server administrator in
@@ -259,10 +251,9 @@ under the specified domain and locating the public key of this server under the
Sending a malformed domain will prevent a connection from being established.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Session-Signature` | :heavy_check_mark: | :heavy_check_mark: |
| Field Name | Request | Response |
| ------------------- | :----------------: | :----------------: |
| `Session-Signature` | :heavy_check_mark: | :heavy_check_mark: |
The `Session-Signature` field is mandatory and is used to secure the peer link
against certain types of attack. For more details see "Session Signature" below.
@@ -272,36 +263,35 @@ should support both **Base64** and **HEX** encoding for this value.
For more details on this field, please see **Session Signature** below.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Crawl` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| ---------- | :----------------: | :----------------: |
| `Crawl` | :white_check_mark: | :white_check_mark: |
The optional `Crawl` field can be used by a server to indicate whether peers
should include it in crawl reports.
The field can take two values:
- **`Public`**: The server's IP address and port should be included in crawl
reports.
reports.
- **`Private`**: The server's IP address and port should not be included in
crawl reports. _This is the default, if the field is omitted._
crawl reports. _This is the default, if the field is omitted._
For more on the Peer Crawler, please visit https://xrpl.org/peer-crawler.html.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Closed-Ledger` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| --------------- | :----------------: | :----------------: |
| `Closed-Ledger` | :white_check_mark: | :white_check_mark: |
If present, identifies the hash of the last ledger that the sending server
considers to be closed.
The value is encoded as **HEX**, but implementations should support both
**Base64** and **HEX** encoding for this value for legacy purposes.
| Field Name | Request | Response |
|--------------------- |:-----------------: |:-----------------: |
| `Previous-Ledger` | :white_check_mark: | :white_check_mark: |
| Field Name | Request | Response |
| ----------------- | :----------------: | :----------------: |
| `Previous-Ledger` | :white_check_mark: | :white_check_mark: |
If present, identifies the hash of the parent ledger that the sending server
considers to be closed.
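Putting several of these fields together, a hypothetical upgrade request might
look like the following (every value is a placeholder; real keys, signatures,
and version strings will differ):

```
GET / HTTP/1.1
User-Agent: rippled-x.y.z
Connection: Upgrade
Upgrade: XRPL/2.0
Connect-As: Peer
Network-ID: 0
Network-Time: 772458725
Public-Key: n9<placeholder-public-key>
Session-Signature: <placeholder-signature>
Crawl: private
```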
@@ -317,7 +307,6 @@ and values in both requests and responses.
Implementations should not reject requests because of the presence of fields
that they do not understand.
### Session Signature
Even for SSL/TLS encrypted connections, it is possible for an attacker to mount
@@ -365,8 +354,7 @@ transferred between A and B and will not be able to intelligently tamper with th
message stream between Alice and Bob, although she may still be able to inject
delays or terminate the link.
# Ripple Clustering #
# Ripple Clustering
A cluster consists of more than one Ripple server under common
administration that share load information, distribute cryptography
@@ -374,19 +362,19 @@ operations, and provide greater response consistency.
Cluster nodes are identified by their public node keys. Cluster nodes
exchange information about endpoints that are imposing load upon them.
Cluster nodes share information about their internal load status. Cluster
Cluster nodes share information about their internal load status. Cluster
nodes do not have to verify the cryptographic signatures on messages
received from other cluster nodes.
## Configuration ##
## Configuration
A server's public key can be determined from the output of the `server_info`
command. The key is in the `pubkey_node` value, and is a text string
beginning with the letter `n`. The key is maintained across runs in a
command. The key is in the `pubkey_node` value, and is a text string
beginning with the letter `n`. The key is maintained across runs in a
database.
Cluster members are configured in the `rippled.cfg` file under
`[cluster_nodes]`. Each member should be configured on a line beginning
`[cluster_nodes]`. Each member should be configured on a line beginning
with the node public key, followed optionally by a space and a friendly
name.
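For example, a hypothetical `[cluster_nodes]` stanza (the keys shown are
placeholders, not valid node public keys):

```
[cluster_nodes]
n9<placeholder-key-1> hub1
n9<placeholder-key-2> hub2
n9<placeholder-key-3>
```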
@@ -404,23 +392,23 @@ New spokes can be added as follows:
- Restart each hub, one by one
- Restart the spoke
## Transaction Behavior ##
## Transaction Behavior
When a transaction is received from a cluster member, several normal checks
are bypassed:
Signature checking is bypassed because we trust that a cluster member would
not relay a transaction with an incorrect signature. Validators may wish to
not relay a transaction with an incorrect signature. Validators may wish to
disable this feature, preferring the additional load to get the additional
security of having validators check each transaction.
Local transaction checks are also bypassed. For example, a
server will not reject a transaction from a cluster peer because the fee
does not meet its current relay fee. It is preferable to keep the cluster
does not meet its current relay fee. It is preferable to keep the cluster
in agreement and permit confirmation from one cluster member to more
reliably indicate the transaction's acceptance by the cluster.
## Server Load Information ##
## Server Load Information
Cluster members exchange information on their server's load level. The load
level is essentially the amount by which the normal fee levels are multiplied
@@ -431,22 +419,22 @@ fee, is the highest of its local load level, the network load level, and the
cluster load level. The cluster load level is the median load level reported
by a cluster member.
## Gossip ##
## Gossip
Gossip is the mechanism by which cluster members share information about
endpoints (typically IPv4 addresses) that are imposing unusually high load
on them. The endpoint load manager takes into account gossip to reduce the
on them. The endpoint load manager takes into account gossip to reduce the
amount of load the endpoint is permitted to impose on the local server
before it is warned, disconnected, or banned.
Suppose, for example, that an attacker controls a large number of IP
addresses, and with these, he can send sufficient requests to overload a
server. Without gossip, he could use these same addresses to overload all
the servers in a cluster. With gossip, if he chooses to use the same IP
server. Without gossip, he could use these same addresses to overload all
the servers in a cluster. With gossip, if he chooses to use the same IP
address to impose load on more than one server, he will find that the amount
of load he can impose before getting disconnected is much lower.
## Monitoring ##
## Monitoring
The `peers` command will report on the status of the cluster. The `cluster`
object will contain one entry for each member of the cluster (either configured


@@ -1,4 +1,3 @@
# PeerFinder
## Introduction
@@ -31,23 +30,23 @@ slots_.
PeerFinder has these responsibilities
* Maintain a persistent set of endpoint addresses suitable for bootstrapping
- Maintain a persistent set of endpoint addresses suitable for bootstrapping
into the peer to peer overlay, ranked by relative locally observed utility.
* Send and receive protocol messages for discovery of endpoint addresses.
- Send and receive protocol messages for discovery of endpoint addresses.
* Provide endpoint addresses to new peers that need them.
- Provide endpoint addresses to new peers that need them.
* Maintain connections to a configured set of fixed peers.
- Maintain connections to a configured set of fixed peers.
* Impose limits on the various slots consumed by peer connections.
- Impose limits on the various slots consumed by peer connections.
* Initiate outgoing connection attempts to endpoint addresses to maintain the
- Initiate outgoing connection attempts to endpoint addresses to maintain the
overlay connectivity and fixed peer policies.
* Verify the connectivity of neighbors who advertise inbound connection slots.
- Verify the connectivity of neighbors who advertise inbound connection slots.
* Prevent duplicate connections and connections to self.
- Prevent duplicate connections and connections to self.
---
@@ -79,28 +78,28 @@ The `Config` structure defines the operational parameters of the PeerFinder.
Some values come from the configuration file while others are calculated via
tuned heuristics. The fields are as follows:
* `autoConnect`
- `autoConnect`
A flag indicating whether or not the Autoconnect feature is enabled.
* `wantIncoming`
- `wantIncoming`
A flag indicating whether or not the peer desires inbound connections. When
this flag is turned off, a peer will not advertise itself in Endpoint
messages.
* `listeningPort`
- `listeningPort`
The port number to use when creating the listening socket for peer
connections.
* `maxPeers`
- `maxPeers`
The largest number of active peer connections to allow. This includes inbound
and outbound connections, but excludes fixed and cluster peers. There is an
implementation defined floor on this value.
* `outPeers`
- `outPeers`
The number of automatic outbound connections that PeerFinder will maintain
when the Autoconnect feature is enabled. The value is computed with fractional
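Pausing here, a schematic of the fields described so far as a plain struct
(illustrative only, including the default values; the real definition belongs
to PeerFinder and may differ):

```
#include <cstddef>
#include <cstdint>

// Illustrative only: the Config fields described above. The defaults
// shown are placeholders, not PeerFinder's tuned values.
struct Config
{
    bool autoConnect = true;              // enable the Autoconnect feature
    bool wantIncoming = true;             // advertise in Endpoint messages
    std::uint16_t listeningPort = 51235;  // peer listening socket port
    std::size_t maxPeers = 21;            // active-peer ceiling (excludes
                                          // fixed and cluster slots)
    std::size_t outPeers = 10;            // automatic outbound connections
};
```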
@@ -161,8 +160,8 @@ Endpoint messages are received from the overlay over time.
The `Bootcache` stores IP addresses useful for gaining initial connections.
Each address is associated with the following metadata:
* **Valence**
- **Valence**
A signed integer which represents the number of successful
consecutive connection attempts when positive, and the number of
@@ -202,30 +201,30 @@ a slot. Slots have properties and state associated with them:
The slot state represents the current stage of the connection as it passes
through the business logic for establishing peer connections.
* `accept`
- `accept`
The accept state is an initial state resulting from accepting an incoming
connection request on a listening socket. The remote IP address and port
are known, and a handshake is expected next.
* `connect`
- `connect`
The connect state is an initial state used when actively establishing outbound
connection attempts. The desired remote IP address and port are known.
* `connected`
- `connected`
When an outbound connection attempt succeeds, it moves to the connected state.
The handshake is initiated but not completed.
* `active`
- `active`
The state becomes Active when a connection in either the Accepted or Connected
state completes the handshake process, and a slot is available based on the
properties. If no slot is available when the handshake completes, the socket
is gracefully closed.
* `closing`
- `closing`
The Closing state represents a connected socket in the process of being
gracefully closed.
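As a compact summary, the lifecycle stages above could be modeled as an
enumeration (illustrative only; PeerFinder's actual state representation may
differ):

```
// Illustrative summary of the slot lifecycle described above.
enum class SlotState {
    accept,     // inbound connection accepted, handshake expected next
    connect,    // outbound connection attempt initiated
    connected,  // outbound attempt succeeded, handshake in progress
    active,     // handshake complete and a slot assigned
    closing     // graceful close in progress
};
```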
@@ -234,13 +233,13 @@ through the business logic for establishing peer connections.
Slot properties may be combined and are not mutually exclusive.
* **Inbound**
- **Inbound**
An inbound slot is the condition of a socket which has accepted an incoming
connection request. A connection which is not inbound is by definition
outbound.
* **Fixed**
- **Fixed**
A fixed slot is a desired connection to a known peer identified by IP address,
usually entered manually in the configuration file. For the purpose of
@@ -248,14 +247,14 @@ Slot properties may be combined and are not mutually exclusive.
although only the IP address is checked to determine if the fixed peer is
already connected. Fixed slots do not count towards connection limits.
* **Cluster**
- **Cluster**
A cluster slot is a connection which has completed the handshake stage, whose
public key matches a known public key usually entered manually in the
configuration file or learned through overlay messages from other trusted
peers. Cluster slots do not count towards connection limits.
* **Superpeer** (forthcoming)
- **Superpeer** (forthcoming)
A superpeer slot is a connection to a peer which can accept incoming
connections, meets certain resource availability requirements (such as
@@ -279,7 +278,7 @@ Cluster slots are identified by the public key and set up during the
initialization of the manager or discovered upon receipt of messages in the
overlay from trusted connections.
--------------------------------------------------------------------------------
---
# Algorithms
@@ -295,8 +294,8 @@ This stage is invoked when the number of active fixed connections is below the
number of fixed connections specified in the configuration, and one of the
following is true:
* There are eligible fixed addresses to try
* Any outbound connection attempts are in progress
- There are eligible fixed addresses to try
- Any outbound connection attempts are in progress
Each fixed address is associated with a retry timer. On a fixed connection
failure, the timer is reset so that the address is not tried for some amount
@@ -317,8 +316,8 @@ The Livecache is invoked when Stage 1 is not active, autoconnect is enabled,
and the number of active outbound connections is below the number desired. The
stage remains active while:
* The Livecache has addresses to try
* Any outbound connection attempts are in progress
- The Livecache has addresses to try
- Any outbound connection attempts are in progress
PeerFinder makes its best effort to exhaust addresses in the Livecache before
moving on to the Bootcache, because Livecache addresses are highly likely
@@ -333,7 +332,7 @@ The Bootcache is invoked when Stage 1 and Stage 2 are not active, autoconnect
is enabled, and the number of active outbound connections is below the number
desired. The stage remains active while:
* There are addresses in the cache that have not been tried recently.
- There are addresses in the cache that have not been tried recently.
Entries in the Bootcache are ranked, with highly connectible addresses preferred
over others. Connection attempts to Bootcache addresses are very likely to
@@ -342,7 +341,7 @@ not have open slots. Before the remote peer closes the connection it will send
a handful of addresses from its Livecache to help the new peer coming online
obtain connections.
--------------------------------------------------------------------------------
---
# References
@@ -352,10 +351,11 @@ Much of the work in PeerFinder was inspired by earlier work in Gnutella:
_By Christopher Rohrs and Vincent Falco_
[Gnutella 0.6 Protocol:](http://rfc-gnutella.sourceforge.net/src/rfc-0_6-draft.html) Sections:
* 2.2.2 Ping (0x00)
* 2.2.3 Pong (0x01)
* 2.2.4 Use of Ping and Pong messages
* 2.2.4.1 A simple pong caching scheme
* 2.2.4.2 Other pong caching schemes
- 2.2.2 Ping (0x00)
- 2.2.3 Pong (0x01)
- 2.2.4 Use of Ping and Pong messages
- 2.2.4.1 A simple pong caching scheme
- 2.2.4.2 Other pong caching schemes
[overlay_network]: http://en.wikipedia.org/wiki/Overlay_network


@@ -2,15 +2,16 @@
## Introduction.
By default, an RPC handler runs as an uninterrupted task on the JobQueue. This
By default, an RPC handler runs as an uninterrupted task on the JobQueue. This
is fine for commands that are fast to compute but might not be acceptable for
tasks that require multiple parts or are large, like a full ledger.
For this purpose, the rippled RPC handler allows *suspension with continuation*
For this purpose, the rippled RPC handler allows _suspension with continuation_
- a request to suspend execution of the RPC response and to continue it after
some function or job has been executed. A default continuation is supplied
which simply reschedules the job on the JobQueue, or the programmer can supply
their own.
some function or job has been executed. A default continuation is supplied
which simply reschedules the job on the JobQueue, or the programmer can supply
their own.
## The classes.
@@ -28,16 +29,16 @@ would prevent any other task from making forward progress when you call a
`Callback`.
A `Continuation` is a function that is given a `Callback` and promises to call
it later. A `Continuation` guarantees to call the `Callback` exactly once at
it later. A `Continuation` guarantees to call the `Callback` exactly once at
some point in the future, but it does not have to be immediately or even in the
current thread.
A `Suspend` is a function belonging to a `Coroutine`. A `Suspend` runs a
A `Suspend` is a function belonging to a `Coroutine`. A `Suspend` runs a
`Continuation`, passing it a `Callback` that continues execution of the
`Coroutine`.
And finally, a `Coroutine` is a `std::function` which is given a
`Suspend`. This is what the RPC handler gives to the coroutine manager,
`Suspend`. This is what the RPC handler gives to the coroutine manager,
expecting to get called back with a `Suspend` and to be able to start execution.
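A minimal sketch of how these four pieces fit together, assuming plain
`std::function` aliases (the actual rippled types are richer than this):

```
#include <functional>
#include <iostream>

using Callback     = std::function<void()>;
using Continuation = std::function<void(Callback)>;
using Suspend      = std::function<void(Continuation)>;
using Coroutine    = std::function<void(Suspend)>;

int main()
{
    // A Suspend that runs the given Continuation, handing it the Callback
    // that resumes the coroutine (here the "resume" merely prints).
    Suspend suspend = [](Continuation cont) {
        cont([] { std::cout << "resumed\n"; });
    };

    // An RPC-handler-style Coroutine: it emits part of a response,
    // suspends, and trusts the Continuation to run the Callback once.
    Coroutine handler = [](Suspend s) {
        std::cout << "first part of the response\n";
        s([](Callback resume) {
            resume();  // immediate continuation; the default in rippled
                       // would instead reschedule on the JobQueue
        });
        std::cout << "second part of the response\n";
    };

    handler(suspend);
}
```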
## The flow of control.


@@ -1,4 +1,4 @@
# SHAMap Introduction #
# SHAMap Introduction
March 2020
@@ -30,20 +30,20 @@ The root node is always a SHAMapInnerNode.
A given `SHAMap` always stores only one of three kinds of data:
* Transactions with metadata
* Transactions without metadata, or
* Account states.
- Transactions with metadata
- Transactions without metadata, or
- Account states.
So all of the leaf nodes of a particular `SHAMap` will always have a uniform type.
The inner nodes carry no data other than the hash of the nodes beneath them.
All nodes are owned by shared_ptrs resident in either other nodes, or in case of
the root node, a shared_ptr in the `SHAMap` itself. The use of shared_ptrs
permits more than one `SHAMap` at a time to share ownership of a node. This
the root node, a shared_ptr in the `SHAMap` itself. The use of shared_ptrs
permits more than one `SHAMap` at a time to share ownership of a node. This
occurs (for example), when a copy of a `SHAMap` is made.
Copies are made with the `snapShot` function as opposed to the `SHAMap` copy
constructor. See the section on `SHAMap` creation for more details about
constructor. See the section on `SHAMap` creation for more details about
`snapShot`.
Sequence numbers are used to further customize the node ownership strategy. See
@@ -51,62 +51,62 @@ the section on sequence numbers for details on sequence numbers.
![node diagram](https://user-images.githubusercontent.com/46455409/77350005-1ef12c80-6cf9-11ea-9c8d-56410f442859.png)
## Mutability ##
## Mutability
There are two different ways of building and using a `SHAMap`:
1. A mutable `SHAMap` and
2. An immutable `SHAMap`
1. A mutable `SHAMap` and
2. An immutable `SHAMap`
The distinction here is not the classic C++ immutable-means-unchanging sense.
An immutable `SHAMap` contains *nodes* that are immutable. Also, once a node has
An immutable `SHAMap` contains _nodes_ that are immutable. Also, once a node has
been located in an immutable `SHAMap`, that node is guaranteed to persist in that
`SHAMap` for the lifetime of the `SHAMap`.
So, somewhat counter-intuitively, an immutable `SHAMap` may grow as new nodes are
introduced. But an immutable `SHAMap` will never get smaller (until it entirely
evaporates when it is destroyed). Nodes, once introduced to the immutable
`SHAMap`, also never change their location in memory. So nodes in an immutable
introduced. But an immutable `SHAMap` will never get smaller (until it entirely
evaporates when it is destroyed). Nodes, once introduced to the immutable
`SHAMap`, also never change their location in memory. So nodes in an immutable
`SHAMap` can be handled using raw pointers (if you're careful).
One consequence of this design is that an immutable `SHAMap` can never be
"trimmed". There is no way to identify unnecessary nodes in an immutable `SHAMap`
that could be removed. Once a node has been brought into the in-memory `SHAMap`,
"trimmed". There is no way to identify unnecessary nodes in an immutable `SHAMap`
that could be removed. Once a node has been brought into the in-memory `SHAMap`,
that node stays in memory for the life of the `SHAMap`.
Most `SHAMap`s are immutable, in the sense that they don't modify or remove their
contained nodes.
An example where a mutable `SHAMap` is required is when we want to apply
transactions to the last closed ledger. To do so we'd make a mutable snapshot
transactions to the last closed ledger. To do so we'd make a mutable snapshot
of the state trie and then start applying transactions to it. Because the
snapshot is mutable, changes to nodes in the snapshot will not affect nodes in
other `SHAMap`s.
An example using a immutable ledger would be when there's an open ledger and
some piece of code wishes to query the state of the ledger. In this case we
some piece of code wishes to query the state of the ledger. In this case we
don't wish to change the state of the `SHAMap`, so we'd use an immutable snapshot.
## Sequence numbers ##
## Sequence numbers
Both `SHAMap`s and their nodes carry a sequence number. This is simply an
Both `SHAMap`s and their nodes carry a sequence number. This is simply an
unsigned number that indicates ownership or membership, or a non-membership.
A `SHAMap`'s sequence number normally starts out as 1. However, when a snapshot of
A `SHAMap`'s sequence number normally starts out as 1. However, when a snapshot of
a `SHAMap` is made, the copy's sequence number is 1 greater than the original.
The nodes of a `SHAMap` have their own copy of a sequence number. If the `SHAMap`
The nodes of a `SHAMap` have their own copy of a sequence number. If the `SHAMap`
is mutable, meaning it can change, then all of its nodes must have the
same sequence number as the `SHAMap` itself. This enforces an invariant that none
same sequence number as the `SHAMap` itself. This enforces an invariant that none
of the nodes are shared with other `SHAMap`s.
When a `SHAMap` needs to have a private copy of a node, not shared by any other
`SHAMap`, it first clones it and then sets the new copy to have a sequence number
equal to the `SHAMap` sequence number. `unshareNode` is a private utility
equal to the `SHAMap` sequence number. `unshareNode` is a private utility
which automates the task of first checking if the node is already sharable, and
if so, cloning it and giving it the proper sequence number. An example case
if so, cloning it and giving it the proper sequence number. An example case
where a private copy is needed is when an inner node needs to have a child
pointer altered. Any modification to a node will require a non-shared node.
pointer altered. Any modification to a node will require a non-shared node.
When a `SHAMap` decides that it is safe to share a node of its own, it sets the
node's sequence number to 0 (a `SHAMap` never has a sequence number of 0). This
@@ -116,40 +116,40 @@ Note that other objects in rippled also have sequence numbers (e.g. ledgers).
The `SHAMap` and node sequence numbers should not be confused with these other
sequence numbers (no relation).
## SHAMap Creation ##
## SHAMap Creation
A `SHAMap` is usually not created from scratch. Once an initial `SHAMap` is
A `SHAMap` is usually not created from scratch. Once an initial `SHAMap` is
constructed, later `SHAMap`s are usually created by calling snapShot(bool
isMutable) on the original `SHAMap`. The returned `SHAMap` has the expected
isMutable) on the original `SHAMap`. The returned `SHAMap` has the expected
characteristics (mutable or immutable) based on the passed in flag.
It is cheaper to make an immutable snapshot of a `SHAMap` than to make a mutable
snapshot. If the `SHAMap` snapshot is mutable then sharable nodes must be
snapshot. If the `SHAMap` snapshot is mutable then sharable nodes must be
copied before they are placed in the mutable map.
A new `SHAMap` is created with each new ledger round. Transactions not executed
A new `SHAMap` is created with each new ledger round. Transactions not executed
in the previous ledger populate the `SHAMap` for the new ledger.
## Storing SHAMap data in the database ##
## Storing SHAMap data in the database
When consensus is reached, the ledger is closed. As part of this process, the
When consensus is reached, the ledger is closed. As part of this process, the
`SHAMap` is stored to the database by calling `SHAMap::flushDirty`.
Both `unshare()` and `flushDirty` walk the `SHAMap` by calling
`SHAMap::walkSubTree`. As `unshare()` walks the trie, nodes are not written to
`SHAMap::walkSubTree`. As `unshare()` walks the trie, nodes are not written to
the database, and as `flushDirty` walks the trie nodes are written to the
database. `walkSubTree` visits every node in the trie. This process must ensure
that each node is only owned by this trie, and so "unshares" as it walks each
node (from the root down). This is done in the `preFlushNode` function by
ensuring that the node has a sequence number equal to that of the `SHAMap`. If
node (from the root down). This is done in the `preFlushNode` function by
ensuring that the node has a sequence number equal to that of the `SHAMap`. If
the node doesn't, it is cloned.
For each inner node encountered (starting with the root node), each of the
children is inspected (from 1 to 16). For each child, if it has a non-zero
sequence number (unshareable), the child is first copied. Then if the child is
an inner node, we recurse down to that node's children. Otherwise we've found a
leaf node and that node is written to the database. A count of each leaf node
that is visited is kept. The hash of the data in the leaf node is computed at
children is inspected (from 1 to 16). For each child, if it has a non-zero
sequence number (unshareable), the child is first copied. Then if the child is
an inner node, we recurse down to that node's children. Otherwise we've found a
leaf node and that node is written to the database. A count of each leaf node
that is visited is kept. The hash of the data in the leaf node is computed at
this time, and the child is reassigned back into the parent inner node just in
case the COW operation created a new pointer to this leaf node.
@@ -157,22 +157,22 @@ After processing each node, the node is then marked as sharable again by setting
its sequence number to 0.
After all of an inner node's children are processed, then its hash is updated
and the inner node is written to the database. Then this inner node is assigned
and the inner node is written to the database. Then this inner node is assigned
back into its parent node, again in case the COW operation created a new
pointer to it.
## Walking a SHAMap ##
## Walking a SHAMap
The private function `SHAMap::walkTowardsKey` is a good example of *how* to walk
The private function `SHAMap::walkTowardsKey` is a good example of _how_ to walk
a `SHAMap`, and the various functions that call `walkTowardsKey` are good examples
of *why* one would want to walk a `SHAMap` (e.g. `SHAMap::findKey`).
of _why_ one would want to walk a `SHAMap` (e.g. `SHAMap::findKey`).
`walkTowardsKey` always starts at the root of the `SHAMap` and traverses down
through the inner nodes, looking for a leaf node along a path in the trie
designated by a `uint256`.
As one walks the trie, one can *optionally* keep a stack of nodes that one has
passed through. This isn't necessary for walking the trie, but many clients
will use the stack after finding the desired node. For example if one is
As one walks the trie, one can _optionally_ keep a stack of nodes that one has
passed through. This isn't necessary for walking the trie, but many clients
will use the stack after finding the desired node. For example if one is
deleting a node from the trie, the stack is handy for repairing invariants in
the trie after the deletion.
@@ -189,10 +189,10 @@ how we use a `SHAMapNodeID` to select a "branch" (child) by indexing into a
path at a given depth.
While the current node is an inner node, traversing down the trie from the root
continues, unless the path indicates a child that does not exist. And in this
continues, unless the path indicates a child that does not exist. And in this
case, `nullptr` is returned to indicate no leaf node along the given path
exists. Otherwise a leaf node is found and a (non-owning) pointer to it is
returned. At each step, if a stack is requested, a
exists. Otherwise a leaf node is found and a (non-owning) pointer to it is
returned. At each step, if a stack is requested, a
`pair<shared_ptr<SHAMapTreeNode>, SHAMapNodeID>` is pushed onto the stack.
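A plausible sketch of this branch selection, assuming the conventional
one-nibble-per-depth layout (illustrative; this is not `selectBranch`'s actual
implementation):

```
#include <array>
#include <cstdint>

// Hypothetical nibble-per-depth branch selection: with up to 16 children
// per inner node, depth d selects the d-th nibble of the 256-bit key,
// yielding a branch number in the range [0, 15].
int selectBranchSketch(std::array<std::uint8_t, 32> const& key, int depth)
{
    std::uint8_t const byte = key[depth / 2];
    return (depth % 2 == 0) ? (byte >> 4) : (byte & 0x0f);
}
```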
When a child node is found by `selectBranch`, the traversal to that node
@@ -210,35 +210,35 @@ The first step consists of several attempts to find the node in various places:
If the node is not found in the trie, then it is installed into the trie as part
of the traversal process.
## Late-arriving Nodes ##
## Late-arriving Nodes
As we noted earlier, `SHAMap`s (even immutable ones) may grow. If a `SHAMap` is
As we noted earlier, `SHAMap`s (even immutable ones) may grow. If a `SHAMap` is
searching for a node and runs into an empty spot in the trie, then the `SHAMap`
looks to see if the node exists but has not yet been made part of the map. This
operation is performed in the `SHAMap::fetchNodeNT()` method. The *NT*
looks to see if the node exists but has not yet been made part of the map. This
operation is performed in the `SHAMap::fetchNodeNT()` method. The _NT_
in this case stands for 'No Throw'.
The `fetchNodeNT()` method goes through three phases:
1. By calling `cacheLookup()` we attempt to locate the missing node in the
TreeNodeCache. The TreeNodeCache is a cache of immutable SHAMapTreeNodes
1. By calling `cacheLookup()` we attempt to locate the missing node in the
TreeNodeCache. The TreeNodeCache is a cache of immutable SHAMapTreeNodes
that are shared across all `SHAMap`s.
Any SHAMapLeafNode that is immutable has a sequence number of zero
(sharable). When a mutable `SHAMap` is created then its SHAMapTreeNodes are
given non-zero sequence numbers (unsharable). But all nodes in the
given non-zero sequence numbers (unsharable). But all nodes in the
TreeNodeCache are immutable, so if one is found here, its sequence number
will be 0.
2. If the node is not in the TreeNodeCache, we attempt to locate the node
in the historic data stored by the database. The call to
2. If the node is not in the TreeNodeCache, we attempt to locate the node
in the historic data stored by the database. The call to
`fetchNodeFromDB(hash)` does that work for us.
3. Finally if a filter exists, we check if it can supply the node. This is
3. Finally if a filter exists, we check if it can supply the node. This is
typically the LedgerMaster which tracks the current ledger and ledgers
in the process of closing.
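A hypothetical sketch of these three phases (stand-in types and stubbed
helpers; the real logic lives in `SHAMap::fetchNodeNT` and its callees):

```
#include <memory>
#include <string>

// Stand-in types so the sketch is self-contained; the real code uses
// uint256 keys and SHAMapTreeNode values.
using Hash = std::string;
struct Node {};

// Each phase is stubbed out; in rippled these consult the TreeNodeCache,
// the node database, and (optionally) a filter such as LedgerMaster.
std::shared_ptr<Node> cacheLookup(Hash const&)     { return nullptr; }
std::shared_ptr<Node> fetchNodeFromDB(Hash const&) { return nullptr; }
std::shared_ptr<Node> filterFetch(Hash const&)     { return nullptr; }

// "No Throw": a missing node yields nullptr rather than an exception.
std::shared_ptr<Node> fetchNodeNT(Hash const& hash)
{
    if (auto n = cacheLookup(hash))      // 1. shared TreeNodeCache
        return n;
    if (auto n = fetchNodeFromDB(hash))  // 2. historic database
        return n;
    return filterFetch(hash);            // 3. optional filter
}
```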
## Canonicalize ##
## Canonicalize
`canonicalize()` is called every time a node is introduced into the `SHAMap`.
@@ -251,51 +251,50 @@ by favoring the copy already in the `TreeNodeCache`.
By using `canonicalize()` we manage a thread race condition where two different
threads might both recognize the lack of a SHAMapLeafNode at the same time
(during a fetch). If they both attempt to insert the node into the `SHAMap`, then
(during a fetch). If they both attempt to insert the node into the `SHAMap`, then
`canonicalize` makes sure that the first node in wins and the slower thread
receives back a pointer to the node inserted by the faster thread. Recall
receives back a pointer to the node inserted by the faster thread. Recall
that these two `SHAMap`s will share the same `TreeNodeCache`.
## `TreeNodeCache` ##
## `TreeNodeCache`
The `TreeNodeCache` is a `std::unordered_map` keyed on the hash of the
`SHAMap` node. The stored type consists of `shared_ptr<SHAMapTreeNode>`,
`SHAMap` node. The stored type consists of `shared_ptr<SHAMapTreeNode>`,
`weak_ptr<SHAMapTreeNode>`, and a time point indicating the most recent
access of this node in the cache. The time point is based on
access of this node in the cache. The time point is based on
`std::chrono::steady_clock`.
The container uses a cryptographically secure hash that is randomly seeded.
The `TreeNodeCache` also carries with it various data used for statistics
and logging, and a target age for the contained nodes. When the target age
and logging, and a target age for the contained nodes. When the target age
for a node is exceeded, and there are no more references to the node, the
node is removed from the `TreeNodeCache`.
## `FullBelowCache` ##
## `FullBelowCache`
This cache remembers which trie keys have all of their children resident in a
`SHAMap`. This optimizes the process of acquiring a complete trie. This is used
when creating the missing nodes list. Missing nodes are those nodes that a
`SHAMap`. This optimizes the process of acquiring a complete trie. This is used
when creating the missing nodes list. Missing nodes are those nodes that a
`SHAMap` refers to but that are not stored in the local database.
As a depth-first walk of a `SHAMap` is performed, if an inner node answers true to
`isFullBelow()` then it is known that none of this node's children are missing
nodes, and thus that subtree does not need to be walked. These nodes are stored
in the FullBelowCache. Subsequent walks check the FullBelowCache first when
nodes, and thus that subtree does not need to be walked. These nodes are stored
in the FullBelowCache. Subsequent walks check the FullBelowCache first when
encountering a node, and ignore that subtree if found.
## `SHAMapTreeNode` ##
## `SHAMapTreeNode`
This is an abstract base class for the concrete node types. It holds the
This is an abstract base class for the concrete node types. It holds the
following common data:
1. A hash
2. An identifier used to perform copy-on-write operations
### `SHAMapInnerNode`
### `SHAMapInnerNode` ###
`SHAMapInnerNode` publicly inherits directly from `SHAMapTreeNode`. It holds
`SHAMapInnerNode` publicly inherits directly from `SHAMapTreeNode`. It holds
the following data:
1. Up to 16 child nodes, each held with a shared_ptr.
@@ -304,36 +303,34 @@ the following data:
4. An identifier used to determine whether the map below this node is
fully populated
### `SHAMapLeafNode` ###
### `SHAMapLeafNode`
`SHAMapLeafNode` is an abstract class which publicly inherits directly from
`SHAMapTreeNode`. It holds the
`SHAMapTreeNode`. It holds the
following data:
1. A shared_ptr to a const SHAMapItem.
#### `SHAMapAccountStateLeafNode` ####
#### `SHAMapAccountStateLeafNode`
`SHAMapAccountStateLeafNode` is a class which publicly inherits directly from
`SHAMapLeafNode`. It is used to represent entries (i.e. account objects, escrow
objects, trust lines, etc.) in a state map.
#### `SHAMapTxLeafNode` ####
#### `SHAMapTxLeafNode`
`SHAMapTxLeafNode` is a class which publicly inherits directly from
`SHAMapLeafNode`. It is used to represent transactions in a state map.
#### `SHAMapTxPlusMetaLeafNode` ####
#### `SHAMapTxPlusMetaLeafNode`
`SHAMapTxPlusMetaLeafNode` is a class which publicly inherits directly from
`SHAMapLeafNode`. It is used to represent transactions along with metadata
associated with this transaction in a state map.
## SHAMapItem ##
## SHAMapItem
This holds the following data:
1. uint256. The hash of the data.
2. vector<unsigned char>. The data (transactions, account info).
1. uint256. The hash of the data.
2. vector<unsigned char>. The data (transactions, account info).
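Schematically (illustrative only; the real `SHAMapItem` also handles
allocation and copy-on-write concerns):

```
#include <array>
#include <cstdint>
#include <vector>

// Illustrative schematic of the two members listed above.
struct SHAMapItemSketch
{
    std::array<std::uint8_t, 32> key;  // uint256: the hash of the data
    std::vector<unsigned char> data;   // serialized transaction/account info
};
```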