Add Reporting Mode

* Add a new operating mode to rippled called reporting mode
* Add ETL mechanism for a reporting node to extract data from a p2p node
* Add new gRPC methods to faciliate ETL
* Use Postgres in place of SQLite in reporting mode
* Add Cassandra as a nodestore option
* Update logic of RPC handlers when running in reporting mode
* Add ability to forward RPCs to a p2p node
This commit is contained in:
CJ Cobb
2020-09-15 16:54:43 -04:00
committed by manojsdoshi
parent b0a39c5f86
commit 27543170d0
136 changed files with 9943 additions and 755 deletions

10
cfg/initdb.sh Executable file
View File

@@ -0,0 +1,10 @@
#!/bin/sh
# Execute this script with a running Postgres server on the current host.
# It should work with the most generic installation of Postgres,
# and is necessary for rippled to store data in Postgres.
# usage: sudo -u postgres ./initdb.sh
psql -c "CREATE USER rippled"
psql -c "CREATE DATABASE rippled WITH OWNER = rippled"

View File

@@ -13,15 +13,17 @@
#
# 4. HTTPS Client
#
# 5. Database
# 5. Reporting Mode
#
# 6. Diagnostics
# 6. Database
#
# 7. Voting
# 7. Diagnostics
#
# 8. Misc Settings
# 8. Voting
#
# 9. Example Settings
# 9. Misc Settings
#
# 10. Example Settings
#
#-------------------------------------------------------------------------------
#
@@ -827,18 +829,122 @@
# certificates that the server will accept for verifying HTTP servers.
# Used only for outbound HTTPS client connections.
#
#-------------------------------------------------------------------------------
#
# 5. Reporting Mode
#
#------------
#
# rippled has an optional operating mode called Reporting Mode. In Reporting
# Mode, rippled does not connect to the peer to peer network. Instead, rippled
# will continuously extract data from one or more rippled servers that are
# connected to the peer to peer network (referred to as an ETL source).
# Reporting mode servers will forward RPC requests that require access to the
# peer to peer network (submit, fee, etc) to an ETL source.
#
# [reporting] Settings for Reporting Mode. If and only if this section is
# present, rippled will start in reporting mode. This section
# contains a list of ETL source names, and key-value pairs. The
# ETL source names each correspond to a configuration file
# section; the names must match exactly. The key-value pairs are
# optional.
#
#
# [<name>]
#
# A series of key/value pairs that specify an ETL source.
#
# source_ip = <IP-address>
#
# Required. IP address of the ETL source
#
# source_ws_port = <number>
#
# Required. Port on which ETL source is accepting unencrypted websocket
# connections.
#
# source_grpc_port = <number>
#
# Required for ETL. Port on which ETL source is accepting gRPC requests.
# If this option is ommitted, this ETL source cannot actually be used for
# ETL; the Reporting Mode server can still forward RPCs to this ETL
# source, but cannot extract data from this ETL source.
#
#
# Key-value pairs (all optional):
#
# read_only Valid values: 0, 1. Default is 0. If set to 1, the server
# will start in strict read-only mode, and will not perform
# ETL. The server will still handle RPC requests, and will
# still forward RPC requests that require access to the p2p
# network.
#
# start_sequence
# Sequence of first ledger to extract if the database is empty.
# ETL extracts ledgers in order. If this setting is absent and
# the database is empty, ETL will start with the next ledger
# validated by the network. If this setting is present and the
# database is not empty, an exception is thrown.
#
# num_markers Degree of parallelism used during the initial ledger
# download. Only used if the database is empty. Valid values
# are 1-256. A higher degree of parallelism results in a
# faster download, but puts more load on the ETL source.
# Default is 2.
#
# Example:
#
# [reporting]
# etl_source1
# etl_source2
# read_only=0
# start_sequence=32570
# num_markers=8
#
# [etl_source1]
# source_ip=1.2.3.4
# source_ws_port=6005
# source_grpc_port=50051
#
# [etl_source2]
# source_ip=5.6.7.8
# source_ws_port=6005
# source_grpc_port=50051
#
# Minimal Example:
#
# [reporting]
# etl_source1
#
# [etl_source1]
# source_ip=1.2.3.4
# source_ws_port=6005
# source_grpc_port=50051
#
#
# Notes:
#
# Reporting Mode requires Postgres (instead of SQLite). The Postgres
# connection info is specified under the [ledger_tx_tables] config section;
# see the Database section for further documentation.
#
#
#-------------------------------------------------------------------------------
#
# 5. Database
# 6. Database
#
#------------
#
# rippled creates 4 SQLite database to hold bookkeeping information
# about transactions, local credentials, and various other things.
# It also creates the NodeDB, which holds all the objects that
# make up the current and historical ledgers.
# make up the current and historical ledgers. In Reporting Mode, rippled
# uses a Postgres database instead of SQLite.
#
# The simplest way to work with Postgres is to install it locally.
# When it is running, execute the initdb.sh script in the current
# directory as: sudo -u postgres ./initdb.sh
# This will create the rippled user and an empty database of the same name.
#
# The size of the NodeDB grows in proportion to the amount of new data and the
# amount of historical data (a configurable setting) so the performance of the
@@ -880,12 +986,52 @@
# keeping full history is not advised, and using online delete is
# recommended.
#
# Required keys:
# path Location to store the database (all types)
# type = Cassandra
#
# Optional keys:
# Apache Cassandra is an open-source, distributed key-value store - see
# https://cassandra.apache.org/ for more details.
#
# These keys are possible for any type of backend:
# Cassandra is an alternative backend to be used only with Reporting Mode.
# See the Reporting Mode section for more details about Reporting Mode.
#
# Required keys for NuDB and RocksDB:
#
# path Location to store the database
#
# Required keys for Cassandra:
#
# contact_points IP of a node in the Cassandra cluster
#
# port CQL Native Transport Port
#
# secure_connect_bundle
# Absolute path to a secure connect bundle. When using
# a secure connect bundle, contact_points and port are
# not required.
#
# keyspace Name of Cassandra keyspace to use
#
# table_name Name of table in above keyspace to use
#
# Optional keys
#
# cache_size Size of cache for database records. Default is 16384.
# Setting this value to 0 will use the default value.
#
# cache_age Length of time in minutes to keep database records
# cached. Default is 5 minutes. Setting this value to
# 0 will use the default value.
#
# Note: if neither cache_size nor cache_age is
# specified, the cache for database records will not
# be created. If only one of cache_size or cache_age
# is specified, the cache will be created using the
# default value for the unspecified parameter.
#
# Note: the cache will not be created if online_delete
# is specified, or if shards are used.
#
# Optional keys for NuDB or RocksDB:
#
# earliest_seq The default is 32570 to match the XRP ledger
# network's earliest allowed sequence. Alternate
@@ -947,6 +1093,22 @@
# delete process is unable to finish.
# Default is unset.
#
# Optional keys for Cassandra:
#
# username Username to use if Cassandra cluster requires
# authentication
#
# password Password to use if Cassandra cluster requires
# authentication
#
# max_requests_outstanding
# Limits the maximum number of concurrent database
# writes. Default is 10 million. For slower clusters,
# large numbers of concurrent writes can overload the
# cluster. Setting this option can help eliminate
# write timeouts and other write errors due to the
# cluster being overloaded.
#
# Notes:
# The 'node_db' entry configures the primary, persistent storage.
#
@@ -1068,12 +1230,45 @@
# This setting may not be combined with the
# "safety_level" setting.
#
# [ledger_tx_tables] (optional)
#
# conninfo Info for connecting to Postgres. Format is
# postgres://[username]:[password]@[ip]/[database].
# The database and user must already exist. If this
# section is missing and rippled is running in
# Reporting Mode, rippled will connect as the
# user running rippled to a database with the
# same name. On Linux and Mac OS X, the connection
# will take place using the server's UNIX domain
# socket. On Windows, through the localhost IP
# address. Default is empty.
#
# use_tx_tables Valid values: 1, 0
# The default is 1 (true). Determines whether to use
# the SQLite transaction database. If set to 0,
# rippled will not write to the transaction database,
# and will reject tx, account_tx and tx_history RPCs.
# In Reporting Mode, this setting is ignored.
#
# max_connections Valid values: any positive integer up to 64 bit
# storage length. This configures the maximum
# number of concurrent connections to postgres.
# Default is the maximum possible value to
# fit in a 64 bit integer.
#
# timeout Number of seconds after which idle postgres
# connections are discconnected. If set to 0,
# connections never timeout. Default is 600.
#
#
# remember_ip Value values: 1, 0
# Default is 1 (true). Whether to cache host and
# port connection settings.
#
#
#-------------------------------------------------------------------------------
#
# 6. Diagnostics
# 7. Diagnostics
#
#---------------
#
@@ -1147,7 +1342,7 @@
#
#-------------------------------------------------------------------------------
#
# 7. Voting
# 8. Voting
#
#----------
#
@@ -1198,7 +1393,7 @@
#
#-------------------------------------------------------------------------------
#
# 8. Misc Settings
# 9. Misc Settings
#
#-----------------
#
@@ -1285,7 +1480,7 @@
#
#-------------------------------------------------------------------------------
#
# 9. Example Settings
# 10. Example Settings
#
#--------------------
#
@@ -1409,6 +1604,16 @@ advisory_delete=0
[database_path]
/var/lib/rippled/db
# To use Postgres, uncomment this section and fill in the appropriate connection
# info. Postgres can only be used in Reporting Mode.
# To disable writing to the transaction database, uncomment this section, and
# set use_tx_tables=0
# [ledger_tx_tables]
# conninfo = postgres://[username]:[password]@[ip]/[database]
# use_tx_tables=1
# This needs to be an absolute directory reference, not a relative one.
# Modify this value as required.
[debug_logfile]
@@ -1442,3 +1647,16 @@ validators.txt
# set to ssl_verify to 0.
[ssl_verify]
1
# To run in Reporting Mode, uncomment this section and fill in the appropriate
# connection info for one or more ETL sources.
# [reporting]
# etl_source
#
#
# [etl_source]
# source_grpc_port=50051
# source_ws_port=6005
# source_ip=127.0.0.1