chore: Run prettier on all files (#5657)

This commit is contained in:
Mayukha Vadari
2025-08-11 12:15:42 -04:00
committed by GitHub
parent abf12db788
commit 97f0747e10
60 changed files with 6244 additions and 6127 deletions

View File

@@ -30,7 +30,7 @@ the ledger (so the entire network has the same view). This will help the network
see which validators are **currently** unreliable, and adjust their quorum
calculation accordingly.
*Improving the liveness of the network is the main motivation for the negative UNL.*
_Improving the liveness of the network is the main motivation for the negative UNL._
### Targeted Faults
@@ -53,16 +53,17 @@ even if the number of remaining validators gets to 60%. Say we have a network
with 10 validators on the UNL and everything is operating correctly. The quorum
required for this network would be 8 (80% of 10). When validators fail, the
quorum required would be as low as 6 (60% of 10), which is the absolute
***minimum quorum***. We need the absolute minimum quorum to be strictly greater
**_minimum quorum_**. We need the absolute minimum quorum to be strictly greater
than 50% of the original UNL so that there cannot be two partitions of
well-behaved nodes headed in different directions. We arbitrarily choose 60% as
the minimum quorum to give a margin of safety.
Consider these events in the absence of negative UNL:
1. 1:00pm - validator1 fails, votes vs. quorum: 9 >= 8, we have quorum
1. 3:00pm - validator2 fails, votes vs. quorum: 8 >= 8, we have quorum
1. 5:00pm - validator3 fails, votes vs. quorum: 7 < 8, we dont have quorum
* **network cannot validate new ledgers with 3 failed validators**
- **network cannot validate new ledgers with 3 failed validators**
We're below 80% agreement, so new ledgers cannot be validated. This is how the
XRP Ledger operates today, but if the negative UNL was enabled, the events would
@@ -70,18 +71,20 @@ happen as follows. (Please note that the events below are from a simplified
version of our protocol.)
1. 1:00pm - validator1 fails, votes vs. quorum: 9 >= 8, we have quorum
1. 1:40pm - network adds validator1 to negative UNL, quorum changes to ceil(9 * 0.8), or 8
1. 1:40pm - network adds validator1 to negative UNL, quorum changes to ceil(9 \* 0.8), or 8
1. 3:00pm - validator2 fails, votes vs. quorum: 8 >= 8, we have quorum
1. 3:40pm - network adds validator2 to negative UNL, quorum changes to ceil(8 * 0.8), or 7
1. 3:40pm - network adds validator2 to negative UNL, quorum changes to ceil(8 \* 0.8), or 7
1. 5:00pm - validator3 fails, votes vs. quorum: 7 >= 7, we have quorum
1. 5:40pm - network adds validator3 to negative UNL, quorum changes to ceil(7 * 0.8), or 6
1. 5:40pm - network adds validator3 to negative UNL, quorum changes to ceil(7 \* 0.8), or 6
1. 7:00pm - validator4 fails, votes vs. quorum: 6 >= 6, we have quorum
* **network can still validate new ledgers with 4 failed validators**
- **network can still validate new ledgers with 4 failed validators**
## External Interactions
### Message Format Changes
This proposal will:
1. add a new pseudo-transaction type
1. add the negative UNL to the ledger data structure.
@@ -89,19 +92,20 @@ Any tools or systems that rely on the format of this data will have to be
updated.
### Amendment
This feature **will** need an amendment to activate.
## Design
This section discusses the following topics about the Negative UNL design:
* [Negative UNL protocol overview](#Negative-UNL-Protocol-Overview)
* [Validator reliability measurement](#Validator-Reliability-Measurement)
* [Format Changes](#Format-Changes)
* [Negative UNL maintenance](#Negative-UNL-Maintenance)
* [Quorum size calculation](#Quorum-Size-Calculation)
* [Filter validation messages](#Filter-Validation-Messages)
* [High level sequence diagram of code
- [Negative UNL protocol overview](#Negative-UNL-Protocol-Overview)
- [Validator reliability measurement](#Validator-Reliability-Measurement)
- [Format Changes](#Format-Changes)
- [Negative UNL maintenance](#Negative-UNL-Maintenance)
- [Quorum size calculation](#Quorum-Size-Calculation)
- [Filter validation messages](#Filter-Validation-Messages)
- [High level sequence diagram of code
changes](#High-Level-Sequence-Diagram-of-Code-Changes)
### Negative UNL Protocol Overview
@@ -114,9 +118,9 @@ with V in their UNL adjust the quorum and Vs validation message is not counte
when verifying if a ledger is fully validated. Vs flow of messages and network
interactions, however, will remain the same.
We define the ***effective UNL** = original UNL - negative UNL*, and the
***effective quorum*** as the quorum of the *effective UNL*. And we set
*effective quorum = Ceiling(80% * effective UNL)*.
We define the **\*effective UNL** = original UNL - negative UNL\*, and the
**_effective quorum_** as the quorum of the _effective UNL_. And we set
_effective quorum = Ceiling(80% _ effective UNL)\*.
### Validator Reliability Measurement
@@ -126,16 +130,16 @@ measure about its validators, but we have chosen ledger validation messages.
This is because every validator shall send one and only one signed validation
message per ledger. This keeps the measurement simple and removes
timing/clock-sync issues. A node will measure the percentage of agreeing
validation messages (*PAV*) received from each validator on the node's UNL. Note
validation messages (_PAV_) received from each validator on the node's UNL. Note
that the node will only count the validation messages that agree with its own
validations.
We define the **PAV** as the **P**ercentage of **A**greed **V**alidation
messages received for the last N ledgers, where N = 256 by default.
When the PAV drops below the ***low-water mark***, the validator is considered
When the PAV drops below the **_low-water mark_**, the validator is considered
unreliable, and is a candidate to be disabled by being added to the negative
UNL. A validator must have a PAV higher than the ***high-water mark*** to be
UNL. A validator must have a PAV higher than the **_high-water mark_** to be
re-enabled. The validator is re-enabled by removing it from the negative UNL. In
the implementation, we plan to set the low-water mark as 50% and the high-water
mark as 80%.
@@ -143,22 +147,24 @@ mark as 80%.
### Format Changes
The negative UNL component in a ledger contains three fields.
* ***NegativeUNL***: The current negative UNL, a list of unreliable validators.
* ***ToDisable***: The validator to be added to the negative UNL on the next
- **_NegativeUNL_**: The current negative UNL, a list of unreliable validators.
- **_ToDisable_**: The validator to be added to the negative UNL on the next
flag ledger.
* ***ToReEnable***: The validator to be removed from the negative UNL on the
- **_ToReEnable_**: The validator to be removed from the negative UNL on the
next flag ledger.
All three fields are optional. When the *ToReEnable* field exists, the
*NegativeUNL* field cannot be empty.
All three fields are optional. When the _ToReEnable_ field exists, the
_NegativeUNL_ field cannot be empty.
A new pseudo-transaction, ***UNLModify***, is added. It has three fields
* ***Disabling***: A flag indicating whether the modification is to disable or
A new pseudo-transaction, **_UNLModify_**, is added. It has three fields
- **_Disabling_**: A flag indicating whether the modification is to disable or
to re-enable a validator.
* ***Seq***: The ledger sequence number.
* ***Validator***: The validator to be disabled or re-enabled.
- **_Seq_**: The ledger sequence number.
- **_Validator_**: The validator to be disabled or re-enabled.
There would be at most one *disable* `UNLModify` and one *re-enable* `UNLModify`
There would be at most one _disable_ `UNLModify` and one _re-enable_ `UNLModify`
transaction per flag ledger. The full machinery is described further on.
### Negative UNL Maintenance
@@ -167,19 +173,19 @@ The negative UNL can only be modified on the flag ledgers. If a validator's
reliability status changes, it takes two flag ledgers to modify the negative
UNL. Let's see an example of the algorithm:
* Ledger seq = 100: A validator V goes offline.
* Ledger seq = 256: This is a flag ledger, and V's reliability measurement *PAV*
- Ledger seq = 100: A validator V goes offline.
- Ledger seq = 256: This is a flag ledger, and V's reliability measurement _PAV_
is lower than the low-water mark. Other validators add `UNLModify`
pseudo-transactions `{true, 256, V}` to the transaction set which goes through
the consensus. Then the pseudo-transaction is applied to the negative UNL
ledger component by setting `ToDisable = V`.
* Ledger seq = 257 ~ 511: The negative UNL ledger component is copied from the
- Ledger seq = 257 ~ 511: The negative UNL ledger component is copied from the
parent ledger.
* Ledger seq=512: This is a flag ledger, and the negative UNL is updated
- Ledger seq=512: This is a flag ledger, and the negative UNL is updated
`NegativeUNL = NegativeUNL + ToDisable`.
The negative UNL may have up to `MaxNegativeListed = floor(original UNL * 25%)`
validators. The 25% is because of 75% * 80% = 60%, where 75% = 100% - 25%, 80%
validators. The 25% is because of 75% \* 80% = 60%, where 75% = 100% - 25%, 80%
is the quorum of the effective UNL, and 60% is the absolute minimum quorum of
the original UNL. Adding more than 25% validators to the negative UNL does not
improve the liveness of the network, because adding more validators to the
@@ -187,52 +193,43 @@ negative UNL cannot lower the effective quorum.
The following is the detailed algorithm:
* **If** the ledger seq = x is a flag ledger
- **If** the ledger seq = x is a flag ledger
1. Compute `NegativeUNL = NegativeUNL + ToDisable - ToReEnable` if they
exist in the parent ledger
1. Compute `NegativeUNL = NegativeUNL + ToDisable - ToReEnable` if they
exist in the parent ledger
1. Try to find a candidate to disable if `sizeof NegativeUNL < MaxNegativeListed`
1. Try to find a candidate to disable if `sizeof NegativeUNL < MaxNegativeListed`
1. Find a validator V that has a _PAV_ lower than the low-water
mark, but is not in `NegativeUNL`.
1. Find a validator V that has a *PAV* lower than the low-water
mark, but is not in `NegativeUNL`.
1. If two or more are found, their public keys are XORed with the hash
of the parent ledger and the one with the lowest XOR result is chosen.
1. If V is found, create a `UNLModify` pseudo-transaction
`TxDisableValidator = {true, x, V}`
1. Try to find a candidate to re-enable if `sizeof NegativeUNL > 0`:
1. Find a validator U that is in `NegativeUNL` and has a _PAV_ higher
than the high-water mark.
1. If U is not found, try to find one in `NegativeUNL` but not in the
local _UNL_.
1. If two or more are found, their public keys are XORed with the hash
of the parent ledger and the one with the lowest XOR result is chosen.
1. If U is found, create a `UNLModify` pseudo-transaction
`TxReEnableValidator = {false, x, U}`
1. If two or more are found, their public keys are XORed with the hash
of the parent ledger and the one with the lowest XOR result is chosen.
1. If V is found, create a `UNLModify` pseudo-transaction
`TxDisableValidator = {true, x, V}`
1. Try to find a candidate to re-enable if `sizeof NegativeUNL > 0`:
1. Find a validator U that is in `NegativeUNL` and has a *PAV* higher
than the high-water mark.
1. If U is not found, try to find one in `NegativeUNL` but not in the
local *UNL*.
1. If two or more are found, their public keys are XORed with the hash
of the parent ledger and the one with the lowest XOR result is chosen.
1. If U is found, create a `UNLModify` pseudo-transaction
`TxReEnableValidator = {false, x, U}`
1. If any `UNLModify` pseudo-transactions are created, add them to the
transaction set. The transaction set goes through the consensus algorithm.
1. If have enough support, the `UNLModify` pseudo-transactions remain in the
transaction set agreed by the validators. Then the pseudo-transactions are
applied to the ledger:
1. If have `TxDisableValidator`, set `ToDisable=TxDisableValidator.V`.
Else clear `ToDisable`.
1. If have `TxReEnableValidator`, set
`ToReEnable=TxReEnableValidator.U`. Else clear `ToReEnable`.
* **Else** (not a flag ledger)
1. If any `UNLModify` pseudo-transactions are created, add them to the
transaction set. The transaction set goes through the consensus algorithm.
1. If have enough support, the `UNLModify` pseudo-transactions remain in the
transaction set agreed by the validators. Then the pseudo-transactions are
applied to the ledger:
1. Copy the negative UNL ledger component from the parent ledger
1. If have `TxDisableValidator`, set `ToDisable=TxDisableValidator.V`.
Else clear `ToDisable`.
1. If have `TxReEnableValidator`, set
`ToReEnable=TxReEnableValidator.U`. Else clear `ToReEnable`.
- **Else** (not a flag ledger)
1. Copy the negative UNL ledger component from the parent ledger
The negative UNL is stored on each ledger because we don't know when a validator
may reconnect to the network. If the negative UNL was stored only on every flag
@@ -273,31 +270,26 @@ not counted when checking if the ledger is fully validated.
The diagram below is the sequence of one round of consensus. Classes and
components with non-trivial changes are colored green.
* The `ValidatorList` class is modified to compute the quorum of the effective
- The `ValidatorList` class is modified to compute the quorum of the effective
UNL.
* The `Validations` class provides an interface for querying the validation
- The `Validations` class provides an interface for querying the validation
messages from trusted validators.
* The `ConsensusAdaptor` component:
* The `RCLConsensus::Adaptor` class is modified for creating `UNLModify`
Pseudo-Transactions.
* The `Change` class is modified for applying `UNLModify`
Pseudo-Transactions.
* The `Ledger` class is modified for creating and adjusting the negative UNL
ledger component.
* The `LedgerMaster` class is modified for filtering out validation messages
from negative UNL validators when verifying if a ledger is fully
validated.
- The `ConsensusAdaptor` component:
- The `RCLConsensus::Adaptor` class is modified for creating `UNLModify`
Pseudo-Transactions.
- The `Change` class is modified for applying `UNLModify`
Pseudo-Transactions.
- The `Ledger` class is modified for creating and adjusting the negative UNL
ledger component.
- The `LedgerMaster` class is modified for filtering out validation messages
from negative UNL validators when verifying if a ledger is fully
validated.
![Sequence diagram](./negativeUNL_highLevel_sequence.png?raw=true "Negative UNL
Changes")
## Roads Not Taken
### Use a Mechanism Like Fee Voting to Process UNLModify Pseudo-Transactions
@@ -311,7 +303,7 @@ and different quorums for the same ledger. As a result, the network's safety is
impacted.
This updated version does not impact safety though operates a bit more slowly.
The negative UNL modifications in the *UNLModify* pseudo-transaction approved by
The negative UNL modifications in the _UNLModify_ pseudo-transaction approved by
the consensus will take effect at the next flag ledger. The extra time of the
256 ledgers should be enough for nodes to be in sync of the negative UNL
modifications.
@@ -334,29 +326,28 @@ expiration approach cannot be simply applied.
### Validator Reliability Measurement and Flag Ledger Frequency
If the ledger time is about 4.5 seconds and the low-water mark is 50%, then in
the worst case, it takes 48 minutes *((0.5 * 256 + 256 + 256) * 4.5 / 60 = 48)*
the worst case, it takes 48 minutes _((0.5 _ 256 + 256 + 256) _ 4.5 / 60 = 48)_
to put an offline validator on the negative UNL. We considered lowering the flag
ledger frequency so that the negative UNL can be more responsive. We also
considered decoupling the reliability measurement and flag ledger frequency to
be more flexible. In practice, however, their benefits are not clear.
## New Attack Vectors
A group of malicious validators may try to frame a reliable validator and put it
on the negative UNL. But they cannot succeed. Because:
1. A reliable validator sends a signed validation message every ledger. A
sufficient peer-to-peer network will propagate the validation messages to other
validators. The validators will decide if another validator is reliable or not
only by its local observation of the validation messages received. So an honest
validators vote on another validators reliability is accurate.
sufficient peer-to-peer network will propagate the validation messages to other
validators. The validators will decide if another validator is reliable or not
only by its local observation of the validation messages received. So an honest
validators vote on another validators reliability is accurate.
1. Given the votes are accurate, and one vote per validator, an honest validator
will not create a UNLModify transaction of a reliable validator.
will not create a UNLModify transaction of a reliable validator.
1. A validator can be added to a negative UNL only through a UNLModify
transaction.
transaction.
Assuming the group of malicious validators is less than the quorum, they cannot
frame a reliable validator.
@@ -365,32 +356,32 @@ frame a reliable validator.
The bullet points below briefly summarize the current proposal:
* The motivation of the negative UNL is to improve the liveness of the network.
- The motivation of the negative UNL is to improve the liveness of the network.
* The targeted faults are the ones frequently observed in the production
- The targeted faults are the ones frequently observed in the production
network.
* Validators propose negative UNL candidates based on their local measurements.
- Validators propose negative UNL candidates based on their local measurements.
* The absolute minimum quorum is 60% of the original UNL.
- The absolute minimum quorum is 60% of the original UNL.
* The format of the ledger is changed, and a new *UNLModify* pseudo-transaction
- The format of the ledger is changed, and a new _UNLModify_ pseudo-transaction
is added. Any tools or systems that rely on the format of these data will have
to be updated.
* The negative UNL can only be modified on the flag ledgers.
- The negative UNL can only be modified on the flag ledgers.
* At most one validator can be added to the negative UNL at a flag ledger.
- At most one validator can be added to the negative UNL at a flag ledger.
* At most one validator can be removed from the negative UNL at a flag ledger.
- At most one validator can be removed from the negative UNL at a flag ledger.
* If a validator's reliability status changes, it takes two flag ledgers to
- If a validator's reliability status changes, it takes two flag ledgers to
modify the negative UNL.
* The quorum is the larger of 80% of the effective UNL and 60% of the original
- The quorum is the larger of 80% of the effective UNL and 60% of the original
UNL.
* If a validator is on the negative UNL, its validation messages are ignored
- If a validator is on the negative UNL, its validation messages are ignored
when the local node verifies if a ledger is fully validated.
## FAQ
@@ -415,7 +406,7 @@ lower quorum size while keeping the network safe.
validator removed from the negative UNL? </h3>
A validators reliability is measured by other validators. If a validator
becomes unreliable, at a flag ledger, other validators propose *UNLModify*
becomes unreliable, at a flag ledger, other validators propose _UNLModify_
pseudo-transactions which vote the validator to add to the negative UNL during
the consensus session. If agreed, the validator is added to the negative UNL at
the next flag ledger. The mechanism of removing a validator from the negative
@@ -423,32 +414,32 @@ UNL is the same.
### Question: Given a negative UNL, what happens if the UNL changes?
Answer: Lets consider the cases:
Answer: Lets consider the cases:
1. A validator is added to the UNL, and it is already in the negative UNL. This
case could happen when not all the nodes have the same UNL. Note that the
negative UNL on the ledger lists unreliable nodes that are not necessarily the
validators for everyone.
1. A validator is added to the UNL, and it is already in the negative UNL. This
case could happen when not all the nodes have the same UNL. Note that the
negative UNL on the ledger lists unreliable nodes that are not necessarily the
validators for everyone.
In this case, the liveness is affected negatively. Because the minimum
quorum could be larger but the usable validators are not increased.
In this case, the liveness is affected negatively. Because the minimum
quorum could be larger but the usable validators are not increased.
1. A validator is removed from the UNL, and it is in the negative UNL.
1. A validator is removed from the UNL, and it is in the negative UNL.
In this case, the liveness is affected positively. Because the quorum could
be smaller but the usable validators are not reduced.
1. A validator is added to the UNL, and it is not in the negative UNL.
1. A validator is removed from the UNL, and it is not in the negative UNL.
1. A validator is added to the UNL, and it is not in the negative UNL.
1. A validator is removed from the UNL, and it is not in the negative UNL.
Case 3 and 4 are not affected by the negative UNL protocol.
### Question: Can we simply lower the quorum to 60% without the negative UNL?
### Question: Can we simply lower the quorum to 60% without the negative UNL?
Answer: No, because the negative UNL approach is safer.
First lets compare the two approaches intuitively, (1) the *negative UNL*
approach, and (2) *lower quorum*: simply lowering the quorum from 80% to 60%
First lets compare the two approaches intuitively, (1) the _negative UNL_
approach, and (2) _lower quorum_: simply lowering the quorum from 80% to 60%
without the negative UNL. The negative UNL approach uses consensus to come up
with a list of unreliable validators, which are then removed from the effective
UNL temporarily. With this approach, the list of unreliable validators is agreed
@@ -462,75 +453,75 @@ Next we compare the two approaches quantitatively with examples, and apply
Theorem 8 of [Analysis of the XRP Ledger Consensus
Protocol](https://arxiv.org/abs/1802.07242) paper:
*XRP LCP guarantees fork safety if **O<sub>i,j</sub> > n<sub>j</sub> / 2 +
_XRP LCP guarantees fork safety if **O<sub>i,j</sub> > n<sub>j</sub> / 2 +
n<sub>i</sub> q<sub>i</sub> + t<sub>i,j</sub>** for every pair of nodes
P<sub>i</sub>, P<sub>j</sub>,*
P<sub>i</sub>, P<sub>j</sub>,_
where *O<sub>i,j</sub>* is the overlapping requirement, n<sub>j</sub> and
where _O<sub>i,j</sub>_ is the overlapping requirement, n<sub>j</sub> and
n<sub>i</sub> are UNL sizes, q<sub>i</sub> is the quorum size of P<sub>i</sub>,
*t<sub>i,j</sub> = min(t<sub>i</sub>, t<sub>j</sub>, O<sub>i,j</sub>)*, and
_t<sub>i,j</sub> = min(t<sub>i</sub>, t<sub>j</sub>, O<sub>i,j</sub>)_, and
t<sub>i</sub> and t<sub>j</sub> are the number of faults can be tolerated by
P<sub>i</sub> and P<sub>j</sub>.
We denote *UNL<sub>i</sub>* as *P<sub>i</sub>'s UNL*, and *|UNL<sub>i</sub>|* as
the size of *P<sub>i</sub>'s UNL*.
We denote _UNL<sub>i</sub>_ as _P<sub>i</sub>'s UNL_, and _|UNL<sub>i</sub>|_ as
the size of _P<sub>i</sub>'s UNL_.
Assuming *|UNL<sub>i</sub>| = |UNL<sub>j</sub>|*, let's consider the following
Assuming _|UNL<sub>i</sub>| = |UNL<sub>j</sub>|_, let's consider the following
three cases:
1. With 80% quorum and 20% faults, *O<sub>i,j</sub> > 100% / 2 + 100% - 80% +
20% = 90%*. I.e. fork safety requires > 90% UNL overlaps. This is one of the
results in the analysis paper.
1. With 80% quorum and 20% faults, _O<sub>i,j</sub> > 100% / 2 + 100% - 80% +
20% = 90%_. I.e. fork safety requires > 90% UNL overlaps. This is one of the
results in the analysis paper.
1. If the quorum is 60%, the relationship between the overlapping requirement
and the faults that can be tolerated is *O<sub>i,j</sub> > 90% +
t<sub>i,j</sub>*. Under the same overlapping condition (i.e. 90%), to guarantee
the fork safety, the network cannot tolerate any faults. So under the same
overlapping condition, if the quorum is simply lowered, the network can tolerate
fewer faults.
1. If the quorum is 60%, the relationship between the overlapping requirement
and the faults that can be tolerated is _O<sub>i,j</sub> > 90% +
t<sub>i,j</sub>_. Under the same overlapping condition (i.e. 90%), to guarantee
the fork safety, the network cannot tolerate any faults. So under the same
overlapping condition, if the quorum is simply lowered, the network can tolerate
fewer faults.
1. With the negative UNL approach, we want to argue that the inequation
*O<sub>i,j</sub> > n<sub>j</sub> / 2 + n<sub>i</sub> q<sub>i</sub> +
t<sub>i,j</sub>* is always true to guarantee fork safety, while the negative UNL
protocol runs, i.e. the effective quorum is lowered without weakening the
network's fault tolerance. To make the discussion easier, we rewrite the
inequation as *O<sub>i,j</sub> > n<sub>j</sub> / 2 + (n<sub>i</sub>
q<sub>i</sub>) + min(t<sub>i</sub>, t<sub>j</sub>)*, where O<sub>i,j</sub> is
dropped from the definition of t<sub>i,j</sub> because *O<sub>i,j</sub> >
min(t<sub>i</sub>, t<sub>j</sub>)* always holds under the parameters we will
use. Assuming a validator V is added to the negative UNL, now let's consider the
4 cases:
1. With the negative UNL approach, we want to argue that the inequation
_O<sub>i,j</sub> > n<sub>j</sub> / 2 + n<sub>i</sub> q<sub>i</sub> +
t<sub>i,j</sub>_ is always true to guarantee fork safety, while the negative UNL
protocol runs, i.e. the effective quorum is lowered without weakening the
network's fault tolerance. To make the discussion easier, we rewrite the
inequation as _O<sub>i,j</sub> > n<sub>j</sub> / 2 + (n<sub>i</sub>
q<sub>i</sub>) + min(t<sub>i</sub>, t<sub>j</sub>)_, where O<sub>i,j</sub> is
dropped from the definition of t<sub>i,j</sub> because _O<sub>i,j</sub> >
min(t<sub>i</sub>, t<sub>j</sub>)_ always holds under the parameters we will
use. Assuming a validator V is added to the negative UNL, now let's consider the
4 cases:
1. V is not on UNL<sub>i</sub> nor UNL<sub>j</sub>
1. V is not on UNL<sub>i</sub> nor UNL<sub>j</sub>
The inequation holds because none of the variables change.
The inequation holds because none of the variables change.
1. V is on UNL<sub>i</sub> but not on UNL<sub>j</sub>
1. V is on UNL<sub>i</sub> but not on UNL<sub>j</sub>
The value of *(n<sub>i</sub> q<sub>i</sub>)* is smaller. The value of
*min(t<sub>i</sub>, t<sub>j</sub>)* could be smaller too. Other
variables do not change. Overall, the left side of the inequation does
not change, but the right side is smaller. So the inequation holds.
1. V is not on UNL<sub>i</sub> but on UNL<sub>j</sub>
The value of *(n<sub>i</sub> q<sub>i</sub>)* is smaller. The value of
*min(t<sub>i</sub>, t<sub>j</sub>)* could be smaller too. Other
variables do not change. Overall, the left side of the inequation does
not change, but the right side is smaller. So the inequation holds.
The value of *n<sub>j</sub> / 2* is smaller. The value of
*min(t<sub>i</sub>, t<sub>j</sub>)* could be smaller too. Other
variables do not change. Overall, the left side of the inequation does
not change, but the right side is smaller. So the inequation holds.
1. V is on both UNL<sub>i</sub> and UNL<sub>j</sub>
1. V is not on UNL<sub>i</sub> but on UNL<sub>j</sub>
The value of *O<sub>i,j</sub>* is reduced by 1. The values of
*n<sub>j</sub> / 2*, *(n<sub>i</sub> q<sub>i</sub>)*, and
*min(t<sub>i</sub>, t<sub>j</sub>)* are reduced by 0.5, 0.2, and 1
respectively. The right side is reduced by 1.7. Overall, the left side
of the inequation is reduced by 1, and the right side is reduced by 1.7.
So the inequation holds.
The value of *n<sub>j</sub> / 2* is smaller. The value of
*min(t<sub>i</sub>, t<sub>j</sub>)* could be smaller too. Other
variables do not change. Overall, the left side of the inequation does
not change, but the right side is smaller. So the inequation holds.
The inequation holds for all the cases. So with the negative UNL approach,
the network's fork safety is preserved, while the quorum is lowered that
increases the network's liveness.
1. V is on both UNL<sub>i</sub> and UNL<sub>j</sub>
The value of *O<sub>i,j</sub>* is reduced by 1. The values of
*n<sub>j</sub> / 2*, *(n<sub>i</sub> q<sub>i</sub>)*, and
*min(t<sub>i</sub>, t<sub>j</sub>)* are reduced by 0.5, 0.2, and 1
respectively. The right side is reduced by 1.7. Overall, the left side
of the inequation is reduced by 1, and the right side is reduced by 1.7.
So the inequation holds.
The inequation holds for all the cases. So with the negative UNL approach,
the network's fork safety is preserved, while the quorum is lowered that
increases the network's liveness.
<h3> Question: We have observed that occasionally a validator wanders off on its
own chain. How is this case handled by the negative UNL algorithm? </h3>
@@ -565,11 +556,11 @@ will be used after that. We want to see the test cases still pass with real
network delay. A test case specifies:
1. a UNL with different number of validators for different test cases,
1. a network with zero or more non-validator nodes,
1. a network with zero or more non-validator nodes,
1. a sequence of validator reliability change events (by killing/restarting
nodes, or by running modified rippled that does not send all validation
messages),
1. the correct outcomes.
1. the correct outcomes.
For all the test cases, the correct outcomes are verified by examining logs. We
will grep the log to see if the correct negative UNLs are generated, and whether
@@ -579,6 +570,7 @@ timing parameters of rippled will be changed to have faster ledger time. Most if
not all test cases do not need client transactions.
For example, the test cases for the prototype:
1. A 10-validator UNL.
1. The network does not have other nodes.
1. The validators will be started from the genesis. Once they start to produce
@@ -587,11 +579,11 @@ For example, the test cases for the prototype:
1. A sequence of events (or the lack of events) such as a killed validator is
added to the negative UNL.
#### Roads Not Taken: Test with Extended CSF
#### Roads Not Taken: Test with Extended CSF
We considered testing with the current unit test framework, specifically the
[Consensus Simulation
Framework](https://github.com/ripple/rippled/blob/develop/src/test/csf/README.md)
(CSF). However, the CSF currently can only test the generic consensus algorithm
as in the paper: [Analysis of the XRP Ledger Consensus
Protocol](https://arxiv.org/abs/1802.07242).
Protocol](https://arxiv.org/abs/1802.07242).

View File

@@ -82,7 +82,9 @@ pattern and the way coroutines are implemented, where every yield saves the spot
in the code where it left off and every resume jumps back to that spot.
### Sequence Diagram
![Sequence diagram](./ledger_replay_sequence.png?raw=true "A successful ledger replay")
### Class Diagram
![Class diagram](./ledger_replay_classes.png?raw=true "Ledger replay classes")

View File

@@ -16,5 +16,5 @@
## Function
- Minimize external dependencies
* Pass options in the ctor instead of using theConfig
* Use as few other classes as possible
- Pass options in the ctor instead of using theConfig
- Use as few other classes as possible

View File

@@ -1,18 +1,18 @@
# Coding Standards
Coding standards used here gradually evolve and propagate through
Coding standards used here gradually evolve and propagate through
code reviews. Some aspects are enforced more strictly than others.
## Rules
These rules only apply to our own code. We can't enforce any sort of
These rules only apply to our own code. We can't enforce any sort of
style on the external repositories and libraries we include. The best
guideline is to maintain the standards that are used in those libraries.
* Tab inserts 4 spaces. No tab characters.
* Braces are indented in the [Allman style][1].
* Modern C++ principles. No naked ```new``` or ```delete```.
* Line lengths limited to 80 characters. Exceptions limited to data and tables.
- Tab inserts 4 spaces. No tab characters.
- Braces are indented in the [Allman style][1].
- Modern C++ principles. No naked `new` or `delete`.
- Line lengths limited to 80 characters. Exceptions limited to data and tables.
## Guidelines
@@ -21,17 +21,17 @@ why you're doing it. Think, use common sense, and consider that this
your changes will probably need to be maintained long after you've
moved on to other projects.
* Use white space and blank lines to guide the eye and keep your intent clear.
* Put private data members at the top of a class, and the 6 public special
members immediately after, in the following order:
* Destructor
* Default constructor
* Copy constructor
* Copy assignment
* Move constructor
* Move assignment
* Don't over-inline by defining large functions within the class
declaration, not even for template classes.
- Use white space and blank lines to guide the eye and keep your intent clear.
- Put private data members at the top of a class, and the 6 public special
members immediately after, in the following order:
- Destructor
- Default constructor
- Copy constructor
- Copy assignment
- Move constructor
- Move assignment
- Don't over-inline by defining large functions within the class
declaration, not even for template classes.
## Formatting
@@ -39,44 +39,44 @@ The goal of source code formatting should always be to make things as easy to
read as possible. White space is used to guide the eye so that details are not
overlooked. Blank lines are used to separate code into "paragraphs."
* Always place a space before and after all binary operators,
- Always place a space before and after all binary operators,
especially assignments (`operator=`).
* The `!` operator should be preceded by a space, but not followed by one.
* The `~` operator should be preceded by a space, but not followed by one.
* The `++` and `--` operators should have no spaces between the operator and
- The `!` operator should be preceded by a space, but not followed by one.
- The `~` operator should be preceded by a space, but not followed by one.
- The `++` and `--` operators should have no spaces between the operator and
the operand.
* A space never appears before a comma, and always appears after a comma.
* Don't put spaces after a parenthesis. A typical member function call might
- A space never appears before a comma, and always appears after a comma.
- Don't put spaces after a parenthesis. A typical member function call might
look like this: `foobar (1, 2, 3);`
* In general, leave a blank line before an `if` statement.
* In general, leave a blank line after a closing brace `}`.
* Do not place code on the same line as any opening or
- In general, leave a blank line before an `if` statement.
- In general, leave a blank line after a closing brace `}`.
- Do not place code on the same line as any opening or
closing brace.
* Do not write `if` statements all-on-one-line. The exception to this is when
- Do not write `if` statements all-on-one-line. The exception to this is when
you've got a sequence of similar `if` statements, and are aligning them all
vertically to highlight their similarities.
* In an `if-else` statement, if you surround one half of the statement with
- In an `if-else` statement, if you surround one half of the statement with
braces, you also need to put braces around the other half, to match.
* When writing a pointer type, use this spacing: `SomeObject* myObject`.
- When writing a pointer type, use this spacing: `SomeObject* myObject`.
Technically, a more correct spacing would be `SomeObject *myObject`, but
it makes more sense for the asterisk to be grouped with the type name,
since being a pointer is part of the type, not the variable name. The only
time that this can lead to any problems is when you're declaring multiple
pointers of the same type in the same statement - which leads on to the next
rule:
* When declaring multiple pointers, never do so in a single statement, e.g.
- When declaring multiple pointers, never do so in a single statement, e.g.
`SomeObject* p1, *p2;` - instead, always split them out onto separate lines
and write the type name again, to make it quite clear what's going on, and
avoid the danger of missing out any vital asterisks.
* The previous point also applies to references, so always put the `&` next to
- The previous point also applies to references, so always put the `&` next to
the type rather than the variable, e.g. `void foo (Thing const& thing)`. And
don't put a space on both sides of the `*` or `&` - always put a space after
it, but never before it.
* The word `const` should be placed to the right of the thing that it modifies,
- The word `const` should be placed to the right of the thing that it modifies,
for consistency. For example `int const` refers to an int which is const.
`int const*` is a pointer to an int which is const. `int *const` is a const
pointer to an int.
* Always place a space in between the template angle brackets and the type
- Always place a space in between the template angle brackets and the type
name. Template code is already hard enough to read!
[1]: http://en.wikipedia.org/wiki/Indent_style#Allman_style

View File

@@ -31,7 +31,7 @@ and header under /opt/local/include:
$ scons clang profile-jemalloc=/opt/local
----------------------
---
## Using the jemalloc library from within the code
@@ -60,4 +60,3 @@ Linking against the jemalloc library will override
the system's default `malloc()` and related functions with jemalloc's
implementation. This is the case even if the code is not instrumented
to use jemalloc's specific API.

View File

@@ -7,7 +7,6 @@ Install these dependencies:
- [Doxygen](http://www.doxygen.nl): All major platforms have [official binary
distributions](http://www.doxygen.nl/download.html#srcbin), or you can
build from [source](http://www.doxygen.nl/download.html#srcbin).
- MacOS: We recommend installing via Homebrew: `brew install doxygen`.
The executable will be installed in `/usr/local/bin` which is already
in the default `PATH`.
@@ -21,18 +20,15 @@ Install these dependencies:
$ ln -s /Applications/Doxygen.app/Contents/Resources/doxygen /usr/local/bin/doxygen
```
- [PlantUML](http://plantuml.com):
- [PlantUML](http://plantuml.com):
1. Install a functioning Java runtime, if you don't already have one.
2. Download [`plantuml.jar`](http://sourceforge.net/projects/plantuml/files/plantuml.jar/download).
- [Graphviz](https://www.graphviz.org):
- Linux: Install from your package manager.
- Windows: Use an [official installer](https://graphviz.gitlab.io/_pages/Download/Download_windows.html).
- MacOS: Install via Homebrew: `brew install graphviz`.
## Docker
Instead of installing the above dependencies locally, you can use the official
@@ -40,14 +36,16 @@ build environment Docker image, which has all of them installed already.
1. Install [Docker](https://docs.docker.com/engine/installation/)
2. Pull the image:
```
sudo docker pull rippleci/rippled-ci-builder:2944b78d22db
```
3. Run the image from the project folder:
```
sudo docker run -v $PWD:/opt/rippled --rm rippleci/rippled-ci-builder:2944b78d22db
```
```
sudo docker pull rippleci/rippled-ci-builder:2944b78d22db
```
3. Run the image from the project folder:
```
sudo docker run -v $PWD:/opt/rippled --rm rippleci/rippled-ci-builder:2944b78d22db
```
## Build

14
docs/build/conan.md vendored
View File

@@ -5,7 +5,6 @@ we should first understand _why_ we use Conan,
and to understand that,
we need to understand how we use CMake.
### CMake
Technically, you don't need CMake to build this project.
@@ -33,9 +32,9 @@ Parameters include:
- where to find the compiler and linker
- where to find dependencies, e.g. libraries and headers
- how to link dependencies, e.g. any special compiler or linker flags that
need to be used with them, including preprocessor definitions
need to be used with them, including preprocessor definitions
- how to compile translation units, e.g. with optimizations, debug symbols,
position-independent code, etc.
position-independent code, etc.
- on Windows, which runtime library to link with
For some of these parameters, like the build system and compiler,
@@ -54,7 +53,6 @@ Most humans prefer to put them into a configuration file, once, that
CMake can read every time it is configured.
For CMake, that file is a [toolchain file][toolchain].
### Conan
These next few paragraphs on Conan are going to read much like the ones above
@@ -79,10 +77,10 @@ Those files include:
- A single toolchain file.
- For every dependency, a CMake [package configuration file][pcf],
[package version file][pvf], and for every build type, a package
targets file.
Together, these files implement version checking and define `IMPORTED`
targets for the dependencies.
[package version file][pvf], and for every build type, a package
targets file.
Together, these files implement version checking and define `IMPORTED`
targets for the dependencies.
The toolchain file itself amends the search path
([`CMAKE_PREFIX_PATH`][prefix_path]) so that [`find_package()`][find_package]

View File

@@ -2,8 +2,7 @@ We recommend two different methods to depend on libxrpl in your own [CMake][]
project.
Both methods add a CMake library target named `xrpl::libxrpl`.
## Conan requirement
## Conan requirement
The first method adds libxrpl as a [Conan][] requirement.
With this method, there is no need for a Git [submodule][].
@@ -48,7 +47,6 @@ cmake \
cmake --build . --parallel
```
## CMake subdirectory
The second method adds the [rippled][] project as a CMake
@@ -90,7 +88,6 @@ cmake \
cmake --build . --parallel
```
[add_subdirectory]: https://cmake.org/cmake/help/latest/command/add_subdirectory.html
[submodule]: https://git-scm.com/book/en/v2/Git-Tools-Submodules
[rippled]: https://github.com/ripple/rippled

View File

@@ -5,7 +5,6 @@ platforms: Linux, macOS, or Windows.
[BUILD.md]: ../../BUILD.md
## Linux
Package ecosystems vary across Linux distributions,
@@ -53,11 +52,10 @@ clang --version
### Install Xcode Specific Version (Optional)
If you develop other applications using XCode you might be consistently updating to the newest version of Apple Clang.
If you develop other applications using XCode you might be consistently updating to the newest version of Apple Clang.
This will likely cause issues building rippled. You may want to install a specific version of Xcode:
1. **Download Xcode**
- Visit [Apple Developer Downloads](https://developer.apple.com/download/more/)
- Sign in with your Apple Developer account
- Search for an Xcode version that includes **Apple Clang (Expected Version)**

42
docs/build/install.md vendored
View File

@@ -6,7 +6,6 @@ like CentOS.
Installing from source is an option for all platforms,
and the only supported option for installing custom builds.
## From source
From a source build, you can install rippled and libxrpl using CMake's
@@ -21,25 +20,23 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
[1]: https://cmake.org/cmake/help/latest/variable/CMAKE_INSTALL_PREFIX.html
## With the APT package manager
1. Update repositories:
1. Update repositories:
sudo apt update -y
2. Install utilities:
2. Install utilities:
sudo apt install -y apt-transport-https ca-certificates wget gnupg
3. Add Ripple's package-signing GPG key to your list of trusted keys:
3. Add Ripple's package-signing GPG key to your list of trusted keys:
sudo mkdir /usr/local/share/keyrings/
wget -q -O - "https://repos.ripple.com/repos/api/gpg/key/public" | gpg --dearmor > ripple-key.gpg
sudo mv ripple-key.gpg /usr/local/share/keyrings
4. Check the fingerprint of the newly-added key:
4. Check the fingerprint of the newly-added key:
gpg /usr/local/share/keyrings/ripple-key.gpg
@@ -51,37 +48,34 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
uid TechOps Team at Ripple <techops+rippled@ripple.com>
sub rsa3072 2019-02-14 [E] [expires: 2026-02-17]
In particular, make sure that the fingerprint matches. (In the above example, the fingerprint is on the third line, starting with `C001`.)
4. Add the appropriate Ripple repository for your operating system version:
5. Add the appropriate Ripple repository for your operating system version:
echo "deb [signed-by=/usr/local/share/keyrings/ripple-key.gpg] https://repos.ripple.com/repos/rippled-deb focal stable" | \
sudo tee -a /etc/apt/sources.list.d/ripple.list
The above example is appropriate for **Ubuntu 20.04 Focal Fossa**. For other operating systems, replace the word `focal` with one of the following:
- `jammy` for **Ubuntu 22.04 Jammy Jellyfish**
- `bionic` for **Ubuntu 18.04 Bionic Beaver**
- `bullseye` for **Debian 11 Bullseye**
- `buster` for **Debian 10 Buster**
If you want access to development or pre-release versions of `rippled`, use one of the following instead of `stable`:
- `unstable` - Pre-release builds ([`release` branch](https://github.com/ripple/rippled/tree/release))
- `nightly` - Experimental/development builds ([`develop` branch](https://github.com/ripple/rippled/tree/develop))
**Warning:** Unstable and nightly builds may be broken at any time. Do not use these builds for production servers.
5. Fetch the Ripple repository.
6. Fetch the Ripple repository.
sudo apt -y update
6. Install the `rippled` software package:
7. Install the `rippled` software package:
sudo apt -y install rippled
7. Check the status of the `rippled` service:
8. Check the status of the `rippled` service:
systemctl status rippled.service
@@ -89,24 +83,22 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
sudo systemctl start rippled.service
8. Optional: allow `rippled` to bind to privileged ports.
9. Optional: allow `rippled` to bind to privileged ports.
This allows you to serve incoming API requests on port 80 or 443. (If you want to do so, you must also update the config file's port settings.)
sudo setcap 'cap_net_bind_service=+ep' /opt/ripple/bin/rippled
## With the YUM package manager
1. Install the Ripple RPM repository:
1. Install the Ripple RPM repository:
Choose the appropriate RPM repository for the stability of releases you want:
- `stable` for the latest production release (`master` branch)
- `unstable` for pre-release builds (`release` branch)
- `nightly` for experimental/development builds (`develop` branch)
*Stable*
_Stable_
cat << REPOFILE | sudo tee /etc/yum.repos.d/ripple.repo
[ripple-stable]
@@ -118,7 +110,7 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
gpgkey=https://repos.ripple.com/repos/rippled-rpm/stable/repodata/repomd.xml.key
REPOFILE
*Unstable*
_Unstable_
cat << REPOFILE | sudo tee /etc/yum.repos.d/ripple.repo
[ripple-unstable]
@@ -130,7 +122,7 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
gpgkey=https://repos.ripple.com/repos/rippled-rpm/unstable/repodata/repomd.xml.key
REPOFILE
*Nightly*
_Nightly_
cat << REPOFILE | sudo tee /etc/yum.repos.d/ripple.repo
[ripple-nightly]
@@ -142,18 +134,18 @@ The default [prefix][1] is typically `/usr/local` on Linux and macOS and
gpgkey=https://repos.ripple.com/repos/rippled-rpm/nightly/repodata/repomd.xml.key
REPOFILE
2. Fetch the latest repo updates:
2. Fetch the latest repo updates:
sudo yum -y update
3. Install the new `rippled` package:
3. Install the new `rippled` package:
sudo yum install -y rippled
4. Configure the `rippled` service to start on boot:
4. Configure the `rippled` service to start on boot:
sudo systemctl enable rippled.service
5. Start the `rippled` service:
5. Start the `rippled` service:
sudo systemctl start rippled.service

View File

@@ -3,7 +3,7 @@
**This section is a work in progress!!**
Consensus is the task of reaching agreement within a distributed system in the
presence of faulty or even malicious participants. This document outlines the
presence of faulty or even malicious participants. This document outlines the
[XRP Ledger Consensus Algorithm](https://arxiv.org/abs/1802.07242)
as implemented in [rippled](https://github.com/ripple/rippled), but
focuses on its utility as a generic consensus algorithm independent of the
@@ -15,38 +15,38 @@ collectively trusted subnetworks.
## Distributed Agreement
A challenge for distributed systems is reaching agreement on changes in shared
state. For the Ripple network, the shared state is the current ledger--account
information, account balances, order books and other financial data. We will
state. For the Ripple network, the shared state is the current ledger--account
information, account balances, order books and other financial data. We will
refer to shared distributed state as a /ledger/ throughout the remainder of this
document.
![Ledger Chain](images/consensus/ledger_chain.png "Ledger Chain")
As shown above, new ledgers are made by applying a set of transactions to the
prior ledger. For the Ripple network, transactions include payments,
prior ledger. For the Ripple network, transactions include payments,
modification of account settings, updates to offers and more.
In a centralized system, generating the next ledger is trivial since there is a
single unique arbiter of which transactions to include and how to apply them to
a ledger. For decentralized systems, participants must resolve disagreements on
a ledger. For decentralized systems, participants must resolve disagreements on
the set of transactions to include, the order to apply those transactions, and
even the resulting ledger after applying the transactions. This is even more
even the resulting ledger after applying the transactions. This is even more
difficult when some participants are faulty or malicious.
The Ripple network is a decentralized and **trust-full** network. Anyone is free
The Ripple network is a decentralized and **trust-full** network. Anyone is free
to join and participants are free to choose a subset of peers that are
collectively trusted to not collude in an attempt to defraud the participant.
Leveraging this network of trust, the Ripple algorithm has two main components.
* *Consensus* in which network participants agree on the transactions to apply
- _Consensus_ in which network participants agree on the transactions to apply
to a prior ledger, based on the positions of their chosen peers.
* *Validation* in which network participants agree on what ledger was
- _Validation_ in which network participants agree on what ledger was
generated, based on the ledgers generated by chosen peers.
These phases are continually repeated to process transactions submitted to the
network, generating successive ledgers and giving rise to the blockchain ledger
history depicted below. In this diagram, time is flowing to the right, but
links between ledgers point backward to the parent. Also note the alternate
history depicted below. In this diagram, time is flowing to the right, but
links between ledgers point backward to the parent. Also note the alternate
Ledger 2 that was generated by some participants, but which failed validation
and was abandoned.
@@ -54,7 +54,7 @@ and was abandoned.
The remainder of this section describes the Consensus and Validation algorithms
in more detail and is meant as a companion guide to understanding the generic
implementation in `rippled`. The document **does not** discuss correctness,
implementation in `rippled`. The document **does not** discuss correctness,
fault-tolerance or liveness properties of the algorithms or the full details of
how they integrate within `rippled` to support the Ripple Consensus Ledger.
@@ -62,76 +62,76 @@ how they integrate within `rippled` to support the Ripple Consensus Ledger.
### Definitions
* The *ledger* is the shared distributed state. Each ledger has a unique ID to
distinguish it from all other ledgers. During consensus, the *previous*,
*prior* or *last-closed* ledger is the most recent ledger seen by consensus
- The _ledger_ is the shared distributed state. Each ledger has a unique ID to
distinguish it from all other ledgers. During consensus, the _previous_,
_prior_ or _last-closed_ ledger is the most recent ledger seen by consensus
and is the basis upon which it will build the next ledger.
* A *transaction* is an instruction for an atomic change in the ledger state. A
- A _transaction_ is an instruction for an atomic change in the ledger state. A
unique ID distinguishes a transaction from other transactions.
* A *transaction set* is a set of transactions under consideration by consensus.
The goal of consensus is to reach agreement on this set. The generic
- A _transaction set_ is a set of transactions under consideration by consensus.
The goal of consensus is to reach agreement on this set. The generic
consensus algorithm does not rely on an ordering of transactions within the
set, nor does it specify how to apply a transaction set to a ledger to
generate a new ledger. A unique ID distinguishes a set of transactions from
generate a new ledger. A unique ID distinguishes a set of transactions from
all other sets of transactions.
* A *node* is one of the distributed actors running the consensus algorithm. It
- A _node_ is one of the distributed actors running the consensus algorithm. It
has a unique ID to distinguish it from all other nodes.
* A *peer* of a node is another node that it has chosen to follow and which it
believes will not collude with other chosen peers. The choice of peers is not
- A _peer_ of a node is another node that it has chosen to follow and which it
believes will not collude with other chosen peers. The choice of peers is not
symmetric, since participants can decide on their chosen sets independently.
* A /position/ is the current belief of the next ledger's transaction set and
- A /position/ is the current belief of the next ledger's transaction set and
close time. Position can refer to the node's own position or the position of a
peer.
* A *proposal* is one of a sequence of positions a node shares during consensus.
- A _proposal_ is one of a sequence of positions a node shares during consensus.
An initial proposal contains the starting position taken by a node before it
considers any peer positions. If a node subsequently updates its position in
response to its peers, it will issue an updated proposal. A proposal is
considers any peer positions. If a node subsequently updates its position in
response to its peers, it will issue an updated proposal. A proposal is
uniquely identified by the ID of the proposing node, the ID of the position
taken, the ID of the prior ledger the proposal is for, and the sequence number
of the proposal.
* A *dispute* is a transaction that is either not part of a node's position or
- A _dispute_ is a transaction that is either not part of a node's position or
not in a peer's position. During consensus, the node will add or remove
disputed transactions from its position based on that transaction's support
amongst its peers.
Note that most types have an ID as a lightweight identifier of instances of that
type. Consensus often operates on the IDs directly since the underlying type is
potentially expensive to share over the network. For example, proposal's only
contain the ID of the position of a peer. Since many peers likely have the same
type. Consensus often operates on the IDs directly since the underlying type is
potentially expensive to share over the network. For example, proposal's only
contain the ID of the position of a peer. Since many peers likely have the same
position, this reduces the need to send the full transaction set multiple times.
Instead, a node can request the transaction set from the network if necessary.
### Overview
### Overview
![Consensus Overview](images/consensus/consensus_overview.png "Consensus Overview")
The diagram above is an overview of the consensus process from the perspective
of a single participant. Recall that during a single consensus round, a node is
of a single participant. Recall that during a single consensus round, a node is
trying to agree with its peers on which transactions to apply to its prior
ledger when generating the next ledger. It also attempts to agree on the
[network time when the ledger closed](#effective_close_time). There are
ledger when generating the next ledger. It also attempts to agree on the
[network time when the ledger closed](#effective_close_time). There are
3 main phases to a consensus round:
* A call to `startRound` places the node in the `Open` phase. In this phase,
the node is waiting for transactions to include in its open ledger.
* At some point, the node will `Close` the open ledger and transition to the
`Establish` phase. In this phase, the node shares/receives peer proposals on
which transactions should be accepted in the closed ledger.
* At some point, the node determines it has reached consensus with its peers on
which transactions to include. It transitions to the `Accept` phase. In this
phase, the node works on applying the transactions to the prior ledger to
generate a new closed ledger. Once the new ledger is completed, the node shares
the validated ledger hash with the network and makes a call to `startRound` to
start the cycle again for the next ledger.
- A call to `startRound` places the node in the `Open` phase. In this phase,
the node is waiting for transactions to include in its open ledger.
- At some point, the node will `Close` the open ledger and transition to the
`Establish` phase. In this phase, the node shares/receives peer proposals on
which transactions should be accepted in the closed ledger.
- At some point, the node determines it has reached consensus with its peers on
which transactions to include. It transitions to the `Accept` phase. In this
phase, the node works on applying the transactions to the prior ledger to
generate a new closed ledger. Once the new ledger is completed, the node shares
the validated ledger hash with the network and makes a call to `startRound` to
start the cycle again for the next ledger.
Throughout, a heartbeat timer calls `timerEntry` at a regular frequency to drive
the process forward. Although the `startRound` call occurs at arbitrary times
based on when the initial round began and the time it takes to apply
transactions, the transitions from `Open` to `Establish` and `Establish` to
`Accept` only occur during calls to `timerEntry`. Similarly, transactions can
`Accept` only occur during calls to `timerEntry`. Similarly, transactions can
arrive at arbitrary times, independent of the heartbeat timer. Transactions
received after the `Open` to `Close` transition and not part of peer proposals
won't be considered until the next consensus round. They are represented above
won't be considered until the next consensus round. They are represented above
by the light green triangles.
Peer proposals are issued by a node during a `timerEntry` call, but since peers
@@ -139,16 +139,16 @@ do not synchronize `timerEntry` calls, they are received by other peers at
arbitrary times. Peer proposals are only considered if received prior to the
`Establish` to `Accept` transition, and only if the peer is working on the same
prior ledger. Peer proposals received after consensus is reached will not be
meaningful and are represented above by the circle with the X in it. Only
meaningful and are represented above by the circle with the X in it. Only
proposals from chosen peers are considered.
### Effective Close Time ### {#effective_close_time}
### Effective Close Time ### {#effective_close_time}
In addition to agreeing on a transaction set, each consensus round tries to
agree on the time the ledger closed. Each node calculates its own close time
when it closes the open ledger. This exact close time is rounded to the nearest
multiple of the current *effective close time resolution*. It is this
*effective close time* that nodes seek to agree on. This allows servers to
agree on the time the ledger closed. Each node calculates its own close time
when it closes the open ledger. This exact close time is rounded to the nearest
multiple of the current _effective close time resolution_. It is this
_effective close time_ that nodes seek to agree on. This allows servers to
derive a common time for a ledger without the need for perfectly synchronized
clocks. As depicted below, the 3 pink arrows represent exact close times from 3
consensus nodes that round to the same effective close time given the current
@@ -158,9 +158,9 @@ different effective close time given the current resolution.
![Effective Close Time](images/consensus/EffCloseTime.png "Effective Close Time")
The effective close time is part of the node's position and is shared with peers
in its proposals. Just like the position on the consensus transaction set, a
in its proposals. Just like the position on the consensus transaction set, a
node will update its close time position in response to its peers' effective
close time positions. Peers can agree to disagree on the close time, in which
close time positions. Peers can agree to disagree on the close time, in which
case the effective close time is taken as 1 second past the prior close.
The close time resolution is itself dynamic, decreasing (coarser) resolution in
@@ -173,12 +173,12 @@ reach close time consensus.
Internally, a node operates under one of the following consensus modes. Either
of the first two modes may be chosen when a consensus round starts.
* *Proposing* indicates the node is a full-fledged consensus participant. It
- _Proposing_ indicates the node is a full-fledged consensus participant. It
takes on positions and sends proposals to its peers.
* *Observing* indicates the node is a passive consensus participant. It
- _Observing_ indicates the node is a passive consensus participant. It
maintains a position internally, but does not propose that position to its
peers. Instead, it receives peer proposals and updates its position
to track the majority of its peers. This may be preferred if the node is only
to track the majority of its peers. This may be preferred if the node is only
being used to track the state of the network or during a start-up phase while
it is still synchronizing with the network.
@@ -186,14 +186,14 @@ The other two modes are set internally during the consensus round when the node
believes it is no longer working on the dominant ledger chain based on peer
validations. It checks this on every call to `timerEntry`.
* *Wrong Ledger* indicates the node is not working on the correct prior ledger
and does not have it available. It requests that ledger from the network, but
continues to work towards consensus this round while waiting. If it had been
*proposing*, it will send a special "bowout" proposal to its peers to indicate
- _Wrong Ledger_ indicates the node is not working on the correct prior ledger
and does not have it available. It requests that ledger from the network, but
continues to work towards consensus this round while waiting. If it had been
_proposing_, it will send a special "bowout" proposal to its peers to indicate
its change in mode for the rest of this round. For the duration of the round,
it defers to peer positions for determining the consensus outcome as if it
were just *observing*.
* *Switch Ledger* indicates that the node has acquired the correct prior ledger
were just _observing_.
- _Switch Ledger_ indicates that the node has acquired the correct prior ledger
from the network. Although it now has the correct prior ledger, the fact that
it had the wrong one at some point during this round means it is likely behind
and should defer to peer positions for determining the consensus outcome.
@@ -201,7 +201,7 @@ validations. It checks this on every call to `timerEntry`.
![Consensus Modes](images/consensus/consensus_modes.png "Consensus Modes")
Once either wrong ledger or switch ledger are reached, the node cannot
return to proposing or observing until the next consensus round. However,
return to proposing or observing until the next consensus round. However,
the node could change its view of the correct prior ledger, so going from
switch ledger to wrong ledger and back again is possible.
@@ -215,16 +215,16 @@ decide how best to generate the next ledger once it declares consensus.
### Phases
As depicted in the overview diagram, consensus is best viewed as a progression
through 3 phases. There are 4 public methods of the generic consensus algorithm
through 3 phases. There are 4 public methods of the generic consensus algorithm
that determine this progression
* `startRound` begins a consensus round.
* `timerEntry` is called at a regular frequency (`LEDGER_MIN_CLOSE`) and is the
only call to consensus that can change the phase from `Open` to `Establish`
- `startRound` begins a consensus round.
- `timerEntry` is called at a regular frequency (`LEDGER_MIN_CLOSE`) and is the
only call to consensus that can change the phase from `Open` to `Establish`
or `Accept`.
* `peerProposal` is called whenever a peer proposal is received and is what
- `peerProposal` is called whenever a peer proposal is received and is what
allows a node to update its position in a subsequent `timerEntry` call.
* `gotTxSet` is called when a transaction set is received from the network. This
- `gotTxSet` is called when a transaction set is received from the network. This
is typically in response to a prior request from the node to acquire the
transaction set corresponding to a disagreeing peer's position.
@@ -234,13 +234,13 @@ actions are taken in response to these calls.
#### Open
The `Open` phase is a quiescent period to allow transactions to build up in the
node's open ledger. The duration is a trade-off between latency and throughput.
node's open ledger. The duration is a trade-off between latency and throughput.
A shorter window reduces the latency to generating the next ledger, but also
reduces transaction throughput due to fewer transactions accepted into the
ledger.
A call to `startRound` would forcibly begin the next consensus round, skipping
completion of the current round. This is not expected during normal operation.
completion of the current round. This is not expected during normal operation.
Calls to `peerProposal` or `gotTxSet` simply store the proposal or transaction
set for use in the coming `Establish` phase.
@@ -254,28 +254,27 @@ the ledger.
Under normal circumstances, the open ledger period ends when one of the following
is true
* if there are transactions in the open ledger and more than `LEDGER_MIN_CLOSE`
have elapsed. This is the typical behavior.
* if there are no open transactions and a suitably longer idle interval has
elapsed. This increases the opportunity to get some transaction into
- if there are transactions in the open ledger and more than `LEDGER_MIN_CLOSE`
have elapsed. This is the typical behavior.
- if there are no open transactions and a suitably longer idle interval has
elapsed. This increases the opportunity to get some transaction into
the next ledger and avoids doing useless work closing an empty ledger.
* if more than half the number of prior round peers have already closed or finished
- if more than half the number of prior round peers have already closed or finished
this round. This indicates the node is falling behind and needs to catch up.
When closing the ledger, the node takes its initial position based on the
transactions in the open ledger and uses the current time as
its initial close time estimate. If in the proposing mode, the node shares its
initial position with peers. Now that the node has taken a position, it will
consider any peer positions for this round that arrived earlier. The node
its initial close time estimate. If in the proposing mode, the node shares its
initial position with peers. Now that the node has taken a position, it will
consider any peer positions for this round that arrived earlier. The node
generates disputed transactions for each transaction not in common with a peer's
position. The node also records the vote of each peer for each disputed
position. The node also records the vote of each peer for each disputed
transaction.
In the example below, we suppose our node has closed with transactions 1,2 and 3. It creates disputes
In the example below, we suppose our node has closed with transactions 1,2 and 3. It creates disputes
for transactions 2,3 and 4, since at least one peer position differs on each.
##### disputes ##### {#disputes_image}
##### disputes ##### {#disputes_image}
![Disputes](images/consensus/disputes.png "Disputes")
@@ -286,22 +285,22 @@ exchanges proposals with peers in an attempt to reach agreement on the consensus
transactions and effective close time.
A call to `startRound` would forcibly begin the next consensus round, skipping
completion of the current round. This is not expected during normal operation.
completion of the current round. This is not expected during normal operation.
Calls to `peerProposal` or `gotTxSet` that reflect new positions will generate
disputed transactions for any new disagreements and will update the peer's vote
for all disputed transactions.
A call to `timerEntry` first checks that the node is working from the correct
prior ledger. If not, the node will update the mode and request the correct
ledger. Otherwise, the node updates the node's position and considers whether
to switch to the `Accepted` phase and declare consensus reached. However, at
least `LEDGER_MIN_CONSENSUS` time must have elapsed before doing either. This
prior ledger. If not, the node will update the mode and request the correct
ledger. Otherwise, the node updates the node's position and considers whether
to switch to the `Accepted` phase and declare consensus reached. However, at
least `LEDGER_MIN_CONSENSUS` time must have elapsed before doing either. This
allows peers an opportunity to take an initial position and share it.
##### Update Position
In order to achieve consensus, the node is looking for a transaction set that is
supported by a super-majority of peers. The node works towards this set by
supported by a super-majority of peers. The node works towards this set by
adding or removing disputed transactions from its position based on an
increasing threshold for inclusion.
@@ -310,23 +309,23 @@ increasing threshold for inclusion.
By starting with a lower threshold, a node initially allows a wide set of
transactions into its position. If the establish round continues and the node is
"stuck", a higher threshold can focus on accepting transactions with the most
support. The constants that define the thresholds and durations at which the
support. The constants that define the thresholds and durations at which the
thresholds change are given by `AV_XXX_CONSENSUS_PCT` and
`AV_XXX_CONSENSUS_TIME` respectively, where `XXX` is `INIT`,`MID`,`LATE` and
`STUCK`. The effective close time position is updated using the same
`STUCK`. The effective close time position is updated using the same
thresholds.
Given the [example disputes above](#disputes_image) and an initial threshold
of 50%, our node would retain its position since transaction 1 was not in
dispute and transactions 2 and 3 have 75% support. Since its position did not
change, it would not need to send a new proposal to peers. Peer C would not
dispute and transactions 2 and 3 have 75% support. Since its position did not
change, it would not need to send a new proposal to peers. Peer C would not
change either. Peer A would add transaction 3 to its position and Peer B would
remove transaction 4 from its position; both would then send an updated
position.
Conversely, if the diagram reflected a later call to =timerEntry= that occurs in
the stuck region with a threshold of say 95%, our node would remove transactions
2 and 3 from its candidate set and send an updated position. Likewise, all the
2 and 3 from its candidate set and send an updated position. Likewise, all the
other peers would end up with only transaction 1 in their position.
Lastly, if our node were not in the proposing mode, it would not include its own
@@ -336,7 +335,7 @@ our node would maintain its position of transactions 1, 2 and 3.
##### Checking Consensus
After updating its position, the node checks for supermajority agreement with
its peers on its current position. This agreement is of the exact transaction
its peers on its current position. This agreement is of the exact transaction
set, not just the support of individual transactions. That is, if our position
is a subset of a peer's position, that counts as a disagreement. Also recall
that effective close time agreement allows a supermajority of participants
@@ -344,10 +343,10 @@ agreeing to disagree.
Consensus is declared when the following 3 clauses are true:
* `LEDGER_MIN_CONSENSUS` time has elapsed in the establish phase
* At least 75% of the prior round proposers have proposed OR this establish
- `LEDGER_MIN_CONSENSUS` time has elapsed in the establish phase
- At least 75% of the prior round proposers have proposed OR this establish
phase is `LEDGER_MIN_CONSENSUS` longer than the last round's establish phase
* `minimumConsensusPercentage` of ourself and our peers share the same position
- `minimumConsensusPercentage` of ourself and our peers share the same position
The middle condition ensures slower peers have a chance to share positions, but
prevents waiting too long on peers that have disconnected. Additionally, a node
@@ -364,22 +363,22 @@ logic.
Once consensus is reached (or moved on), the node switches to the `Accept` phase
and signals to the implementing code that the round is complete. That code is
responsible for using the consensus transaction set to generate the next ledger
and calling `startRound` to begin the next round. The implementation has total
and calling `startRound` to begin the next round. The implementation has total
freedom on ordering transactions, deciding what to do if consensus moved on,
determining whether to retry or abandon local transactions that did not make the
consensus set and updating any internal state based on the consensus progress.
#### Accept
The `Accept` phase is the terminal phase of the consensus algorithm. Calls to
The `Accept` phase is the terminal phase of the consensus algorithm. Calls to
`timerEntry`, `peerProposal` and `gotTxSet` will not change the internal
consensus state while in the accept phase. The expectation is that the
consensus state while in the accept phase. The expectation is that the
application specific code is working to generate the new ledger based on the
consensus outcome. Once complete, that code should make a call to `startRound`
to kick off the next consensus round. The `startRound` call includes the new
prior ledger, prior ledger ID and whether the round should begin in the
proposing or observing mode. After setting some initial state, the phase
transitions to `Open`. The node will also check if the provided prior ledger
proposing or observing mode. After setting some initial state, the phase
transitions to `Open`. The node will also check if the provided prior ledger
and ID are correct, updating the mode and requesting the proper ledger from the
network if necessary.
@@ -448,9 +447,9 @@ struct TxSet
### Ledger
The `Ledger` type represents the state shared amongst the
distributed participants. Notice that the details of how the next ledger is
distributed participants. Notice that the details of how the next ledger is
generated from the prior ledger and the consensus accepted transaction set is
not part of the interface. Within the generic code, this type is primarily used
not part of the interface. Within the generic code, this type is primarily used
to know that peers are working on the same tip of the ledger chain and to
provide some basic timing data for consensus.
@@ -626,7 +625,7 @@ struct Adaptor
// Called when consensus operating mode changes
void onModeChange(ConsensuMode before, ConsensusMode after);
// Called when ledger closes. Implementation should generate an initial Result
// with position based on the current open ledger's transactions.
ConsensusResult onClose(Ledger const &, Ledger const & prev, ConsensusMode mode);
@@ -657,27 +656,24 @@ struct Adaptor
The implementing class hides many details of the peer communication
model from the generic code.
* The `share` member functions are responsible for sharing the given type with a
- The `share` member functions are responsible for sharing the given type with a
node's peers, but are agnostic to the mechanism. Ideally, messages are delivered
faster than `LEDGER_GRANULARITY`.
* The generic code does not specify how transactions are submitted by clients,
faster than `LEDGER_GRANULARITY`.
- The generic code does not specify how transactions are submitted by clients,
propagated through the network or stored in the open ledger. Indeed, the open
ledger is only conceptual from the perspective of the generic code---the
initial position and transaction set are opaquely generated in a
`Consensus::Result` instance returned from the `onClose` callback.
* The calls to `acquireLedger` and `acquireTxSet` only have non-trivial return
if the ledger or transaction set of interest is available. The implementing
- The calls to `acquireLedger` and `acquireTxSet` only have non-trivial return
if the ledger or transaction set of interest is available. The implementing
class is free to block while acquiring, or return the empty option while
servicing the request asynchronously. Due to legacy reasons, the two calls
servicing the request asynchronously. Due to legacy reasons, the two calls
are not symmetric. `acquireTxSet` requires the host application to call
`gotTxSet` when an asynchronous `acquire` completes. Conversely,
`acquireLedger` will be called again later by the consensus code if it still
desires the ledger with the hope that the asynchronous acquisition is
complete.
## Validation
Coming Soon!