While there should never be a missing node when copying the SHAMap,
rippled should not terminate when there's an error rotating the
database. This patch aborts the database rotation rather than aborting rippled.
This commit addresses minor bugs introduced with commit
6faaa91850:
- The number of threads used by the database engine was
incorrectly clamped to the lower possible value, such
that the database was effectively operating in single
threaded mode.
- The number of requests to extract at once was so high
that it could result in increased latency. The bundle
size is now limited to 4 and can be adjusted by a new
configuration option `rq_bundle` in the `[node_db]`
stanza. This is an advanced tunable and adjusting it
should not be needed.
The nodestore includes a built-in cache to reduce the disk I/O
load but, by default, this cache was not initialized unless it
was explicitly configured by the server operator.
This commit introduces sensible defaults based on the server's
configured node size.
It remains possible to completely disable the cache if desired
by explicitly configuring it the cache size and age parameters
to 0:
[node_db]
...
cache_size = 0
cache_age = 0
- Only duplicate records from archive to writable during online_delete.
- Log duration of nodestore reads.
- Include nodestore counters in perf_log output.
- Remove gratuitous nodestore activity counting.
- Report initial sync duration in server_info and perfLog.
- Report state_accounting in perfLog.
- Make state_accounting durations more accurate.
- Parallel ledger loader.
- Config parameter to load ledgers on start.
The following changes were made:
- Removed dependency on template defined in beast detail namespace.
- Removed Section::find() method which had an obsolete interface.
- Made Section::get<>() easier to use for the common case of
retrieving a std::string. The revised get() method replaces old
calls to Section::find().
- Provided a default template parameter to free function
get<>(Section config, std::string name) so it stays similar to
Section::get<>().
Then the rest of the code was adapted to these changes.
- Calls to Section::find() were replaced with calls to Section::get.
- Unnecessary get<std::string>() arguments were reduced to get().
These changes dug up an interesting artifact in the SHAMap unit
tests. I'm not sure why the tests were working before, but there
was a problem with the case of a Section key. The unit test is
fixed.
* Add a new operating mode to rippled called reporting mode
* Add ETL mechanism for a reporting node to extract data from a p2p node
* Add new gRPC methods to faciliate ETL
* Use Postgres in place of SQLite in reporting mode
* Add Cassandra as a nodestore option
* Update logic of RPC handlers when running in reporting mode
* Add ability to forward RPCs to a p2p node
This comment explains this patch and the associated patches
that should be folded into it. This paragraph should be removed
when the patches are folded after review.
This change significantly improves ledger sync and fetch
times while reducing memory consumption. The change affects
the code from that begins with SHAMap::getMissingNodes and runs
through to Database::threadEntry.
The existing code issues a number of async fetches which are then
handed off to the Database's pool of read threads to execute.
The results of each read are placed in the Database's positive
and negative caches. The caller waits for all reads to complete
and then retrieves the results out of these caches.
Among other issues, this means that the results of the first read
cannot be processed until the last read completes. Additionally,
all the results must sit in memory.
This patch changes the behavior so that each read operation has a
completion handler associated with it. The completion of the read
calls the handler, allowing the results of each read to be
processed as it completes. As this was the only reason the
negative and positive caches were needed, they can now be removed.
The read generation code is also no longer needed and is removed.
The batch fetch logic was never implemented or supported and is
removed.
This commit combines a number of cleanups, targeting both the
code structure and the code logic. Large changes include:
- Using more strongly-typed classes for SHAMap nodes, instead of relying
on runtime-time detection of class types. This change saves 16 bytes
of memory per node.
- Improving the interface of SHAMap::addGiveItem and SHAMap::addItem to
avoid the need for passing two bool arguments.
- Documenting the "copy-on-write" semantics that SHAMap uses to
efficiently track changes in individual nodes.
- Removing unused code and simplifying several APIs.
- Improving function naming.
* Document delete_batch, back_off_milliseconds, age_threshold_seconds.
* Convert those time values to chrono types.
* Fix bug that ignored age_threshold_seconds.
* Add a "recovery buffer" to the config that gives the node a chance to
recover before aborting online delete.
* Add begin/end log messages around the SQL queries.
* Add a new configuration section: [sqlite] to allow tuning the sqlite
database operations. Ignored on full/large history servers.
* Update documentation of [node_db] and [sqlite] in the
rippled-example.cfg file.
Resolves#3321
Commit e257a22 introduced changes in the logic used to acquire historical
ledgers. The logic could cause historical ledgers to be acquired only since
the last online deletion interval instead of the configured value to allow
deletion.
* Reduce lock scope on all public functions
* Use TaskQueue to process shard finalization in separate thread
* Store shard last ledger hash and other info in backend
* Use temp SQLite DB versus control file when acquiring
* Remove boost serialization from cmake files
The `node_size` configuration option is used to automatically
configure various parameters (cache sizes, timeouts, etc) for
the server.
A previous commit included changes that caused incorrect values
to be returned which can result in sub-optimal performance that
can manifest as difficulty syncing to the network, or increased
disk I/O and/or memory usage. The problem was introduced with
commit 66fad62e66.
This commit, if merged, fixes the code to ensure that the correct
values are returned and introduces a compile-time check to prevent
this issue from reoccurring.
* replace boost::beast::detail::iequals with boost::iequals
* replace deprecated `buffers` function with `make_printable`
* replace boost::beast::detail::ascii_tolower with lambda
* add missing includes
The DatabaseImp has threads that asynchronously call JobQueue to
perform database reads. Formerly these threads had the same
lifespan as Database, which was until the end-of-life of
ApplicationImp. During shutdown these threads could call JobQueue
after JobQueue had already stopped. Or, even worse, occasionally
call JobQueue after JobQueue's destructor had run.
To avoid these shutdown conditions, Database is made a Stoppable,
with JobQueue as its parent. When Database stops, it shuts down
its asynchronous read threads. This prevents Database from
accessing JobQueue after JobQueue has stopped, but allows
Database to perform stores for the remainder of shutdown.
During development it was noted that the Database::close()
method was never called. So that method is removed from Database
and all derived classes.
Stoppable is also adjusted so it can be constructed using either
a char const* or a std::string.
For those files touched for other reasons, unneeded #includes
are removed.
All uses of beast::Thread were previously removed from the code
base, so beast::Thread is removed. One piece of beast::Thread
needed to be preserved: the ability to set the current thread's
name. So there's now a beast::CurrentThreadName that allows the
current thread's name to be set and returned.
Thread naming is also cleaned up a bit. ThreadName.h and .cpp
are removed since beast::CurrentThreadName does a better job.
ThreadEntry is also removed, but its terminateHandler() is
preserved in TerminateHandler.cpp. The revised terminateHandler()
uses beast::CurrentThreadName to recover the name of the running
thread.
Finally, the NO_LOG_UNHANDLED_EXCEPTIONS #define is removed since
it was discovered that the MacOS debugger preserves the stack
of the original throw even if the terminateHandler() rethrows.