The existing logic involves every server sending every transaction
that it receives to all its peers (except the one that it received
a transaction from).
This commit instead uses a randomized algorithm, where a node will
randomly select peers to relay a given transaction to, caching the
list of transaction hashes that are not relayed and forwading them
to peers once every second. Peers can then determine whether there
are transactions that they have not seen and can request them from
the node which has them.
It is expected that this feature will further reduce the bandwidth
needed to operate a server.
The following changes were made:
- Removed dependency on template defined in beast detail namespace.
- Removed Section::find() method which had an obsolete interface.
- Made Section::get<>() easier to use for the common case of
retrieving a std::string. The revised get() method replaces old
calls to Section::find().
- Provided a default template parameter to free function
get<>(Section config, std::string name) so it stays similar to
Section::get<>().
Then the rest of the code was adapted to these changes.
- Calls to Section::find() were replaced with calls to Section::get.
- Unnecessary get<std::string>() arguments were reduced to get().
These changes dug up an interesting artifact in the SHAMap unit
tests. I'm not sure why the tests were working before, but there
was a problem with the case of a Section key. The unit test is
fixed.
The `[node_size]` configuration parameter is used to tune various
parameters based on the hardware that the code is running on. The
parameter can take five distinct values: `tiny`, `small`, `medium`,
`large` and `huge`.
The default value in the code is `tiny` but the default configuration
file sets the value to `medium`. This commit attempts to detect the
amount of RAM on the system and adjusts the node size default value
based on the amount of RAM and the number of hardware execution
threads on the system.
The decision matrix currently used is:
| | 1 | 2 or 3 | ≥ 4 |
|:-------:|:----:|:------:|:------:|
| > ~8GB | tiny | tiny | tiny |
| > ~12GB | tiny | small | small |
| > ~16GB | tiny | small | medium |
| > ~24GB | tiny | small | large |
| > ~32GB | tiny | small | huge |
Some systems exclude memory reserved by the the hardware, the kernel
or the underlying hypervisor so the automatic detection code may end
up determining the node_size to be one less than "appropriate" given
the above table.
The detection algorithm is simplistic and does not take into account
other relevant factors. Therefore, for production-quality servers it
is recommended that server operators examine the system holistically
and determine what the appropriate size is instead of relying on the
automatic detection code.
To aid server operators, the node size will now be reported in the
`server_info` API as `node_size` when the command is invoked in
'admin' mode.
* Add a new operating mode to rippled called reporting mode
* Add ETL mechanism for a reporting node to extract data from a p2p node
* Add new gRPC methods to faciliate ETL
* Use Postgres in place of SQLite in reporting mode
* Add Cassandra as a nodestore option
* Update logic of RPC handlers when running in reporting mode
* Add ability to forward RPCs to a p2p node
- Add validation/proposal reduce-relay feature negotiation to
the handshake
- Make squelch duration proportional to a number of peers that
can be squelched
- Refactor makeRequest()/makeResponse() to facilitate handshake
unit-testing
- Fix compression enable flag for inbound peer
- Fix compression algorithm parsing in the header parser
- Fix squelch duration in onMessage(TMSquelch)
This commit fixes 3624, fixes 3639 and fixes 3641
This comment explains this patch and the associated patches
that should be folded into it. This paragraph should be removed
when the patches are folded after review.
This change significantly improves ledger sync and fetch
times while reducing memory consumption. The change affects
the code from that begins with SHAMap::getMissingNodes and runs
through to Database::threadEntry.
The existing code issues a number of async fetches which are then
handed off to the Database's pool of read threads to execute.
The results of each read are placed in the Database's positive
and negative caches. The caller waits for all reads to complete
and then retrieves the results out of these caches.
Among other issues, this means that the results of the first read
cannot be processed until the last read completes. Additionally,
all the results must sit in memory.
This patch changes the behavior so that each read operation has a
completion handler associated with it. The completion of the read
calls the handler, allowing the results of each read to be
processed as it completes. As this was the only reason the
negative and positive caches were needed, they can now be removed.
The read generation code is also no longer needed and is removed.
The batch fetch logic was never implemented or supported and is
removed.
When evaluating the fitness and usefulness of an outbound peer, the code
would incorrectly calculate the amount of time that the peer spent in
a non-useful state.
This commit, if merged, corrects the calculation and makes the timeout
values configurable by server operators.
Two new options are introduced in the 'overlay' stanza of the config
file. The default values, in seconds, are:
[overlay]
max_unknown_time = 600
max_diverged_time = 300
This commit replaces the `peers_max` configuration element which had
a predetermined split between incoming and outgoing connections with
two new configuration options, `peers_in_max` and `peers_out_max`,
which server operators can use to explicitly control the number of
incoming and outgoing peer slots.
This commit introduces a new configuration option that server
operators can set. The value is communicated to other servers
and is also reported via the `server_info` API.
The value is meant to allow third-party applications or tools
to group servers together. For example, a tool that visualizes
the network's topology can group servers together.
Similar to the "Domain" field in validator manifests, an operator
can claim any domain. Prior to relying on the value returned, the
domain should be verified by retrieving the xrp-ledger.toml file
from the domain and looking for the server's public key in the
`nodes` array.
The job queue can impose limits of how many jobs of a particular
type can be queued.
This commit makes the previously hard-coded limit associated with
transactions configurable by the server's operator. Servers that
have increased memory capacity or which expect to see an influx
of transactions can increase the number of transactions their
server will be able to queue.
This commit fixes#3556.
* Distinguish between recent and historical shards
* Allow multiple storage paths for historical shards
* Add documentation for this feature
* Add unit tests
With few exceptions, servers will typically receive multiple copies
of any given message from its directly connected peers. For servers
with several peers this can impact the processing latency and force
it to do redundant work. Proposal and validation messages are often
relayed with extremely high redundancy.
This commit, if merged, introduces experimental code that attempts
to optimize the relaying of proposals and validations by allowing
servers to instruct their peers to "squelch" delivery of selected
proposals and validations. Servers making squelching decisions by
a process that evaluates the fitness and performance of a given
server and randomly selecting a subset of the best candidates.
The experimental code is presently disabled and must be explicitly
enabled by server operators that wish to test it.
The checkpointer class had assumed that the database would exist for the
lifetime of the application. This is no long true. These changes resolve bugs
involving dangling pointers.
* Document delete_batch, back_off_milliseconds, age_threshold_seconds.
* Convert those time values to chrono types.
* Fix bug that ignored age_threshold_seconds.
* Add a "recovery buffer" to the config that gives the node a chance to
recover before aborting online delete.
* Add begin/end log messages around the SQL queries.
* Add a new configuration section: [sqlite] to allow tuning the sqlite
database operations. Ignored on full/large history servers.
* Update documentation of [node_db] and [sqlite] in the
rippled-example.cfg file.
Resolves#3321
* The amendment ballot counting code contained a minor technical
flaw, caused by the use of integer arithmetic and rounding
semantics, that could allow amendments to reach majority with
slightly less than 80% support. This commit introduces an
amendment which, if enabled, will ensure that activation
requires at least 80% support.
* This commit also introduces a configuration option to adjust
the amendment activation hysteresis. This option is useful on
test networks, but should not be used on the main network as
is a network-wide consensus parameter that should not be
changed on a per-server basis; doing so can result in a
hard-fork.
Fixes#3396
In deciding whether to relay a proposal or validation, a server would
consider whether it was issued by a validator on that server's UNL.
While both trusted proposals and validations were always relayed,
the code prioritized relaying of untrusted proposals over untrusted
validations. While not technically incorrect, validations are
generally more "valuable" because they are required during the
consensus process, whereas proposals are not, strictly, required.
The commit introduces two new configuration options, allowing server
operators to fine-tune the relaying behavior:
The `[relay_proposals]` option controls the relaying behavior for
proposals received by this server. It has two settings: "trusted"
and "all" and the default is "trusted".
The `[relay_validations]` options controls the relaying behavior for
validations received by this server. It has two settings: "trusted"
and "all" and the default is "all".
This change does not require an amendment as it does not affect
transaction processing.
This commit introduces no functional changes but cleans up the
code and shrinks the surface area by removing dead and unused
code, leveraging std:: alternatives to hand-rolled code and
improving comments and documentation.
This commit removes obsolete comments, dead or no longer useful
code, and workarounds for several issues that were present in older
compilers that we no longer support.
Specifically:
- It improves the transaction metadata handling class, simplifying
its use and making it less error-prone.
- It reduces the footprint of the Serializer class by consolidating
code and leveraging templates.
- It cleanups the ST* class hierarchy, removing dead code, improving
and consolidating code to reduce complexity and code duplication.
- It shores up the handling of currency codes and the conversation
between 160-bit currency codes and their string representation.
- It migrates beast::secure_erase to the ripple namespace and uses
a call to OpenSSL_cleanse instead of the custom implementation.
- Add missing `#include` in `ripple/core/JobTypeInfo.h`
- Protect version string from clang-format in
`ripple/protocol/impl/BuildInfo.cpp`.
`Builds/CMake/RippledVersion.cmake` searches for this line by pattern.
* scoped_lock is now a std name with subtly different semantics
compared to lock_guard. Namely it can be used to lock 0 or
more mutexes. This is valuable, but can also be accidentally
used to lock 0 mutexes when 1 was intended, creating a
run-time error.
Therefore, if and when we use scoped_lock, extra care needs to
be taken in reviewing that code to ensure it doesn't
accidentally lock 0 mutexes when 1 was intended. To aid in
such careful reviewing, the use of the name scoped_lock should
be limited to those cases where the number of mutexes is not
exactly one.
* Peers negotiate compression via HTTP Header "X-Offer-Compression: lz4"
* Messages greater than 70 bytes and protocol type messages MANIFESTS,
ENDPOINTS, TRANSACTION, GET_LEDGER, LEDGER_DATA, GET_OBJECT,
and VALIDATORLIST are compressed
* If the compressed message is larger than the uncompressed message
then the uncompressed message is sent
* Compression flag and the compression algorithm type are included
in the message header
* Only LZ4 block compression is currently supported
* Make ShardArchiveHandler a singleton.
* Add state database for ShardArchiveHandler.
* Use temporary database for SSLHTTPDownloader downloads.
* Make ShardArchiveHandler a Stoppable class.
* Automatically resume interrupted downloads at server start.
* Reduce lock scope on all public functions
* Use TaskQueue to process shard finalization in separate thread
* Store shard last ledger hash and other info in backend
* Use temp SQLite DB versus control file when acquiring
* Remove boost serialization from cmake files
Prior to this commit, the queue and execution times for individual jobs
were reported indepedently and could, potentially, be out of sync. This
change reports both values when either one of the exceeds the reporting
threshold.
Treat all `#` characters in config files as comments (and remove)
*unless* the `#` is immediately preceded by `\`. Write a warning
to log file when trailing comments are found/ignored in the config
to let operators know that the treatment of trailing `#` has changed.
Fixes#3121
The `node_size` configuration option is used to automatically
configure various parameters (cache sizes, timeouts, etc) for
the server.
A previous commit included changes that caused incorrect values
to be returned which can result in sub-optimal performance that
can manifest as difficulty syncing to the network, or increased
disk I/O and/or memory usage. The problem was introduced with
commit 66fad62e66.
This commit, if merged, fixes the code to ensure that the correct
values are returned and introduces a compile-time check to prevent
this issue from reoccurring.
The existing platform detection code was derived from the old Beast
library, which was, itself, derived from JUCE.
This commit removes that code and replaces it with the Boost.Predef
library which defines a consistent set of compiler, architecture,
operating system, library, and other version numbers.
For more on Boost.Predef, please see the Boost documentation. The
documentation for the current version as of this writing is at:
https://www.boost.org/doc/libs/1_71_0/doc/html/predef.html
* replace boost::beast::detail::iequals with boost::iequals
* replace deprecated `buffers` function with `make_printable`
* replace boost::beast::detail::ascii_tolower with lambda
* add missing includes