Add support for deterministic database shards (#2688):

Add support to allow multiple indepedent nodes to produce a binary identical
shard for a given range of ledgers. The advantage is that servers can use
content-addressable storage, and can more efficiently retrieve shards by
downloading from multiple peers at once and then verifying the integrity of
a shard by cross-checking its checksum with the checksum other servers report.
This commit is contained in:
cdy20
2020-04-01 18:30:40 -04:00
committed by manojsdoshi
parent 06bd16c928
commit f91b568069
10 changed files with 892 additions and 54 deletions

View File

@@ -517,6 +517,7 @@ target_sources (rippled PRIVATE
src/ripple/nodestore/impl/DatabaseNodeImp.cpp
src/ripple/nodestore/impl/DatabaseRotatingImp.cpp
src/ripple/nodestore/impl/DatabaseShardImp.cpp
src/ripple/nodestore/impl/DeterministicShard.cpp
src/ripple/nodestore/impl/DecodedBlob.cpp
src/ripple/nodestore/impl/DummyScheduler.cpp
src/ripple/nodestore/impl/EncodedBlob.cpp

View File

@@ -154,6 +154,7 @@ test.nodestore > ripple.basics
test.nodestore > ripple.beast
test.nodestore > ripple.core
test.nodestore > ripple.nodestore
test.nodestore > ripple.protocol
test.nodestore > ripple.unity
test.nodestore > test.jtx
test.nodestore > test.toplevel

View File

@@ -88,6 +88,21 @@ public:
virtual bool
isOpen() = 0;
/** Open the backend.
@param createIfMissing Create the database files if necessary.
@param appType Deterministic appType used to create a backend.
@param uid Deterministic uid used to create a backend.
@param salt Deterministic salt used to create a backend.
@throws std::runtime_error is function is called not for NuDB backend.
*/
virtual void
open(bool createIfMissing, uint64_t appType, uint64_t uid, uint64_t salt)
{
Throw<std::runtime_error>(
"Deterministic appType/uid/salt not supported by backend " +
getName());
}
/** Close the backend.
This allows the caller to catch exceptions.
*/

View File

@@ -0,0 +1,163 @@
# Deterministic Database Shards
This doc describes the standard way to assemble the database shard.
A shard assembled using this approach becomes deterministic i.e.
if two independent sides assemble a shard consisting of the same ledgers,
accounts and transactions, then they will obtain the same shard files
`nudb.dat` and `nudb.key`. The approach deals with the `NuDB` database
format only, refer to `https://github.com/vinniefalco/NuDB`.
## Headers
Due to NuDB database definition, the following headers are used for
database files:
nudb.key:
```
char[8] Type The characters "nudb.key"
uint16 Version Holds the version number
uint64 UID Unique ID generated on creation
uint64 Appnum Application defined constant
uint16 KeySize Key size in bytes
uint64 Salt A random seed
uint64 Pepper The salt hashed
uint16 BlockSize Size of a file block in bytes
uint16 LoadFactor Target fraction in 65536ths
uint8[56] Reserved Zeroes
uint8[] Reserved Zero-pad to block size
```
nudb.dat:
```
char[8] Type The characters "nudb.dat"
uint16 Version Holds the version number
uint64 UID Unique ID generated on creation
uint64 Appnum Application defined constant
uint16 KeySize Key size in bytes
uint8[64] (reserved) Zeroes
```
All of these fields are saved using network byte order
(bigendian: most significant byte first).
To make the shard deterministic the following parameters are used
as values of header field both for `nudb.key` and `nudb.dat` files.
```
Version 2
UID digest(0)
Appnum digest(2) | 0x5348524400000000 /* 'SHRD' */
KeySize 32
Salt digest(1)
Pepper XXH64(Salt)
BlockSize 0x1000 (4096 bytes)
LoadFactor 0.5 (numeric 0x8000)
```
Note: XXH64() is well-known hash algorithm.
The `digest(i)` mentioned above defined as the follows:
First, RIPEMD160 hash `H` calculated of the following structure
(the same as final Key of the shard):
```
uint32 version Version of shard, 2 at the present
uint32 firstSeq Sequence number of first ledger in the shard
uint32 lastSeq Sequence number of last ledger in the shard
uint256 lastHash Hash of last ledger in shard
```
there all 32-bit integers are hashed in network byte order
(bigendian: most significant byte first).
Then, `digest(i)` is defined as the following part of the above hash `H`:
```
digest(0) = H[0] << 56 | H[1] << 48 | ... | H[7] << 0,
digest(1) = H[8] << 56 | H[9] << 48 | ... | H[15] << 0,
digest(2) = H[16] << 24 | H[17] << 16 | ... | H[19] << 0,
```
where `H[i]` denotes `i`-th byte of hash `H`.
## Contents
After deterministic shard is created using the above mentioned headers,
it filled with objects using the following steps.
1. All objects within the shard are visited in the order described in the
next section. Here the objects are: ledger headers, SHAmap tree nodes
including state and transaction nodes, final key.
2. Set of all visited objects is divided into groups. Each group except of
the last contains 16384 objects in the order of their visiting. Last group
may contain less than 16384 objects.
3. All objects within each group are sorted in according to their hashes.
Objects are sorted by increasing of their hashes, precisely, by increasing
of hex representations of hashes in lexicographic order. For example,
the following is an example of sorted hashes in their hex representation:
```
0000000000000000000000000000000000000000000000000000000000000000
154F29A919B30F50443A241C466691B046677C923EE7905AB97A4DBE8A5C2429
2231553FC01D37A66C61BBEEACBB8C460994493E5659D118E19A8DDBB1444273
272DCBFD8E4D5D786CF11A5444B30FB35435933B5DE6C660AA46E68CF0F5C441
3C062FD9F0BCDCA31ACEBCD8E530D0BDAD1F1D1257B89C435616506A3EE6CB9E
58A0E5AE427CDDC1C7C06448E8C3E4BF718DE036D827881624B20465C3E1336F
...
```
4. Finally, objects added to the deterministic shard group by group in the
sorted order within each group from low to high hashes.
## Order of visiting objects
The shard consists of 16384 ledgers and the final key with the hash 0.
Each ledger has the header object and two SMAmaps: state and transaction.
SHAmap is a rooted tree in which each node has maximum of 16 descendants
enumerating by indexes 0..15. Visiting each node in the SHAmap
is performing by functions visitNodes and visitDifferences implemented
in the file `ripple/shamap/impl/ShaMapSync.cpp`.
Here is how the function visitNodes works: it visit the root at first.
Then it visit all nodes in the 1st layer, i. e. the nodes which are
immediately descendants of the root sequentially from index 0 to 15.
Then it visit all nodes in 2nd layer i.e. the nodes which are immediately
descendants the nodes from 1st layer. The order of visiting 2nd layer nodes
is the following. First, descendants of the 1st layer node with index 0
are visited sequintially from index 0 to 15. Then descendents of 1st layer
node with index 1 are visited etc. After visiting all nodes of 2nd layer
the nodes from 3rd layer are visited etc.
The function visitDifferences works similar to visitNodes with the following
exceptions. The first exception is that visitDifferences get 2 arguments:
current SHAmap and previous SHAmap and visit only the nodes from current
SHAmap which and not present in previous SHAmap. The second exception is
that visitDifferences visits all non-leaf nodes in the order of visitNodes
function, but all leaf nodes are visited immedeately after visiting of their
parent node sequentially from index 0 to 15.
Finally, all objects within the shard are visited in the following order.
All ledgers are visited from the ledger with high index to the ledger with
low index in descending order. For each ledger the state SHAmap is visited
first using visitNode function for the ledger with highest index and
visitDifferences function for other ledgers. Then transaction SHAmap is visited
using visitNodes function. At last, the ledger header object is visited.
Final key of the shard is visited at the end.
## Tests
To perform test to deterministic shards implementation one can enter
the following command:
```
rippled --unittest ripple.NodeStore.DatabaseShard
```
The following is the right output of deterministic shards test:
```
ripple.NodeStore.DatabaseShard DatabaseShard deterministic_shard
with backend nudb
Iteration 0: RIPEMD160[nudb.key] = F96BF2722AB2EE009FFAE4A36AAFC4F220E21951
Iteration 0: RIPEMD160[nudb.dat] = FAE6AE84C15968B0419FDFC014931EA12A396C71
Iteration 1: RIPEMD160[nudb.key] = F96BF2722AB2EE009FFAE4A36AAFC4F220E21951
Iteration 1: RIPEMD160[nudb.dat] = FAE6AE84C15968B0419FDFC014931EA12A396C71
```

View File

@@ -38,7 +38,11 @@ namespace NodeStore {
class NuDBBackend : public Backend
{
public:
static constexpr std::size_t currentType = 1;
static constexpr std::uint64_t currentType = 1;
static constexpr std::uint64_t deterministicMask = 0xFFFFFFFF00000000ull;
/* "SHRD" in ASCII */
static constexpr std::uint64_t deterministicType = 0x5348524400000000ull;
beast::Journal const j_;
size_t const keyBytes_;
@@ -98,7 +102,8 @@ public:
}
void
open(bool createIfMissing) override
open(bool createIfMissing, uint64_t appType, uint64_t uid, uint64_t salt)
override
{
using namespace boost::filesystem;
if (db_.is_open())
@@ -119,8 +124,9 @@ public:
dp,
kp,
lp,
currentType,
nudb::make_salt(),
appType,
uid,
salt,
keyBytes_,
nudb::block_size(kp),
0.50,
@@ -133,7 +139,17 @@ public:
db_.open(dp, kp, lp, ec);
if (ec)
Throw<nudb::system_error>(ec);
if (db_.appnum() != currentType)
/** Old value currentType is accepted for appnum in traditional
* databases, new value is used for deterministic shard databases.
* New 64-bit value is constructed from fixed and random parts.
* Fixed part is bounded by bitmask deterministicMask,
* and the value of fixed part is deterministicType.
* Random part depends on the contents of the shard and may be any.
* The contents of appnum field should match either old or new rule.
*/
if (db_.appnum() != currentType &&
(db_.appnum() & deterministicMask) != deterministicType)
Throw<std::runtime_error>("nodestore: unknown appnum");
db_.set_burst(burstSize_);
}
@@ -144,6 +160,12 @@ public:
return db_.is_open();
}
void
open(bool createIfMissing) override
{
open(createIfMissing, currentType, nudb::make_uid(), nudb::make_salt());
}
void
close() override
{

View File

@@ -0,0 +1,216 @@
//------------------------------------------------------------------------------
/*
This file is part of rippled: https://github.com/ripple/rippled
Copyright (c) 2020 Ripple Labs Inc.
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL , DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
//==============================================================================
#include <ripple/app/main/Application.h>
#include <ripple/beast/hash/hash_append.h>
#include <ripple/core/ConfigSections.h>
#include <ripple/nodestore/Manager.h>
#include <ripple/nodestore/impl/DeterministicShard.h>
#include <ripple/nodestore/impl/Shard.h>
#include <ripple/protocol/digest.h>
#include <fstream>
#include <nudb/detail/format.hpp>
#include <nudb/nudb.hpp>
#include <openssl/ripemd.h>
namespace ripple {
namespace NodeStore {
DeterministicShard::DeterministicShard(
Application& app,
boost::filesystem::path const& dir,
std::uint32_t index,
beast::Journal j)
: app_(app)
, index_(index)
, dir_(dir / "tmp")
, ctx_(std::make_unique<nudb::context>())
, j_(j)
, curMemObjs_(0)
, maxMemObjs_(
app_.getShardStore()->ledgersPerShard() <= 256 ? maxMemObjsTest
: maxMemObjsDefault)
{
}
DeterministicShard::~DeterministicShard()
{
close(true);
}
bool
DeterministicShard::init(Serializer const& finalKey)
{
auto db = app_.getShardStore();
auto fail = [&](std::string const& msg) {
JLOG(j_.error()) << "deterministic shard " << index_
<< " not created: " << msg;
backend_.reset();
try
{
remove_all(dir_);
}
catch (std::exception const& e)
{
JLOG(j_.error()) << "deterministic shard " << index_
<< ". Exception caught in function " << __func__
<< ". Error: " << e.what();
}
return false;
};
if (!db)
return fail("shard store not exists");
if (index_ < db->earliestShardIndex())
return fail("Invalid shard index");
Config const& config{app_.config()};
Section section{config.section(ConfigSection::shardDatabase())};
auto const type{get<std::string>(section, "type", "nudb")};
auto const factory{Manager::instance().find(type)};
if (!factory)
return fail("failed to find factory for " + type);
section.set("path", dir_.string());
backend_ = factory->createInstance(
NodeObject::keyBytes, section, 1, scheduler_, *ctx_, j_);
if (!backend_)
return fail("failed to create database");
ripemd160_hasher h;
h(finalKey.data(), finalKey.size());
auto const result{static_cast<ripemd160_hasher::result_type>(h)};
auto const hash{uint160::fromVoid(result.data())};
auto digest = [&](int n) {
auto const data{hash.data()};
std::uint64_t result{0};
switch (n)
{
case 0:
case 1:
// Construct 64 bits from sequential eight bytes
for (int i = 0; i < 8; i++)
result = (result << 8) + data[n * 8 + i];
break;
case 2:
// Construct 64 bits using the last four bytes of data
result = (static_cast<std::uint64_t>(data[16]) << 24) +
(static_cast<std::uint64_t>(data[17]) << 16) +
(static_cast<std::uint64_t>(data[18]) << 8) +
(static_cast<std::uint64_t>(data[19]));
break;
}
return result;
};
auto const uid{digest(0)};
auto const salt{digest(1)};
auto const appType{digest(2) | deterministicType};
// Open or create the NuDB key/value store
try
{
if (exists(dir_))
remove_all(dir_);
backend_->open(true, appType, uid, salt);
}
catch (std::exception const& e)
{
return fail(
std::string(". Exception caught in function ") + __func__ +
". Error: " + e.what());
}
return true;
}
std::shared_ptr<DeterministicShard>
make_DeterministicShard(
Application& app,
boost::filesystem::path const& shardDir,
std::uint32_t shardIndex,
Serializer const& finalKey,
beast::Journal j)
{
std::shared_ptr<DeterministicShard> dShard(
new DeterministicShard(app, shardDir, shardIndex, j));
if (!dShard->init(finalKey))
return {};
return dShard;
}
void
DeterministicShard::close(bool cancel)
{
try
{
if (cancel)
{
backend_.reset();
remove_all(dir_);
}
else
{
ctx_->flush();
curMemObjs_ = 0;
backend_.reset();
}
}
catch (std::exception const& e)
{
JLOG(j_.error()) << "deterministic shard " << index_
<< ". Exception caught in function " << __func__
<< ". Error: " << e.what();
}
}
bool
DeterministicShard::store(std::shared_ptr<NodeObject> const& nodeObject)
{
try
{
backend_->store(nodeObject);
// Flush to the backend if at threshold
if (++curMemObjs_ >= maxMemObjs_)
{
ctx_->flush();
curMemObjs_ = 0;
}
}
catch (std::exception const& e)
{
JLOG(j_.error()) << "deterministic shard " << index_
<< ". Exception caught in function " << __func__
<< ". Error: " << e.what();
return false;
}
return true;
}
} // namespace NodeStore
} // namespace ripple

View File

@@ -0,0 +1,174 @@
//------------------------------------------------------------------------------
/*
This file is part of rippled: https://github.com/ripple/rippled
Copyright (c) 2020 Ripple Labs Inc.
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL , DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
//==============================================================================
#ifndef RIPPLE_NODESTORE_DETERMINISTICSHARD_H_INCLUDED
#define RIPPLE_NODESTORE_DETERMINISTICSHARD_H_INCLUDED
#include <ripple/nodestore/DatabaseShard.h>
#include <ripple/nodestore/DummyScheduler.h>
#include <nudb/nudb.hpp>
#include <set>
namespace ripple {
namespace NodeStore {
/** DeterministicShard class.
*
* 1. The init() method creates temporary folder dir_,
* and the deterministic shard is initialized in that folder.
* 2. The store() method adds object to memory pool.
* 3. The flush() method stores all objects from memory pool to the shard
* located in dir_ in sorted order.
* 4. The close(true) method closes the backend and removes the directory.
*/
class DeterministicShard
{
constexpr static std::uint32_t maxMemObjsDefault = 16384u;
constexpr static std::uint32_t maxMemObjsTest = 16u;
/* "SHRD" in ASCII */
constexpr static std::uint64_t deterministicType = 0x5348524400000000ll;
private:
DeterministicShard(DeterministicShard const&) = delete;
DeterministicShard&
operator=(DeterministicShard const&) = delete;
/** Creates the object for shard database
*
* @param app Application object
* @param dir Directory where shard is located
* @param index Index of the shard
* @param j Journal to logging
*/
DeterministicShard(
Application& app,
boost::filesystem::path const& dir,
std::uint32_t index,
beast::Journal j);
/** Initializes the deterministic shard.
*
* @param finalKey Serializer of shard's final key which consists of:
* shard version (32 bit)
* first ledger sequence in the shard (32 bit)
* last ledger sequence in the shard (32 bit)
* hash of last ledger (256 bits)
* @return true if no error, false if error
*/
bool
init(Serializer const& finalKey);
public:
~DeterministicShard();
/** Finalizes and closes the shard.
*/
void
close()
{
close(false);
}
[[nodiscard]] boost::filesystem::path const&
getDir() const
{
return dir_;
}
/** Store a node object in memory.
*
* @param nodeObject The node object to store
* @return true on success.
* @note Flushes all objects in memory to the backend when the number
* of node objects held in memory exceed a threshold
*/
[[nodiscard]] bool
store(std::shared_ptr<NodeObject> const& nodeObject);
private:
/** Finalizes and closes the shard.
*
* @param cancel True if reject the shard and delete all files,
* false if finalize the shard and store them
*/
void
close(bool cancel);
// Application reference
Application& app_;
// Shard Index
std::uint32_t const index_;
// Path to temporary database files
boost::filesystem::path const dir_;
// Dummy scheduler for deterministic write
DummyScheduler scheduler_;
// NuDB context
std::unique_ptr<nudb::context> ctx_;
// NuDB key/value store for node objects
std::shared_ptr<Backend> backend_;
// Journal
beast::Journal const j_;
// Current number of in-cache objects
std::uint32_t curMemObjs_;
// Maximum number of in-cache objects
std::uint32_t const maxMemObjs_;
friend std::shared_ptr<DeterministicShard>
make_DeterministicShard(
Application& app,
boost::filesystem::path const& shardDir,
std::uint32_t shardIndex,
Serializer const& finalKey,
beast::Journal j);
};
/** Creates shared pointer to deterministic shard and initializes it.
*
* @param app Application object
* @param shardDir Directory where shard is located
* @param shardIndex Index of the shard
* @param finalKey Serializer of shard's ginal key which consists of:
* shard version (32 bit)
* first ledger sequence in the shard (32 bit)
* last ledger sequence in the shard (32 bit)
* hash of last ledger (256 bits)
* @param j Journal to logging
* @return Shared pointer to deterministic shard or {} in case of error.
*/
std::shared_ptr<DeterministicShard>
make_DeterministicShard(
Application& app,
boost::filesystem::path const& shardDir,
std::uint32_t shardIndex,
Serializer const& finalKey,
beast::Journal j);
} // namespace NodeStore
} // namespace ripple
#endif

View File

@@ -22,6 +22,7 @@
#include <ripple/basics/StringUtilities.h>
#include <ripple/core/ConfigSections.h>
#include <ripple/nodestore/Manager.h>
#include <ripple/nodestore/impl/DeterministicShard.h>
#include <ripple/nodestore/impl/Shard.h>
#include <ripple/protocol/digest.h>
@@ -558,10 +559,14 @@ Shard::isLegacy() const
}
bool
Shard::finalize(
bool const writeSQLite,
boost::optional<uint256> const& expectedHash)
Shard::finalize(bool writeSQLite, boost::optional<uint256> const& referenceHash)
{
auto const scopedCount{makeBackendCount()};
if (!scopedCount)
return false;
state_ = finalizing;
uint256 hash{0};
std::uint32_t ledgerSeq{0};
auto fail = [&](std::string const& msg) {
@@ -575,14 +580,8 @@ Shard::finalize(
return false;
};
auto const scopedCount{makeBackendCount()};
if (!scopedCount)
return false;
try
{
state_ = finalizing;
/*
TODO MP
A lock is required when calling the NuDB verify function. Because
@@ -613,7 +612,7 @@ Shard::finalize(
else
{
// In the absence of a final key, an acquire SQLite database
// must be present in order to validate the shard
// must be present in order to verify the shard
if (!acquireInfo_)
return fail("missing acquire SQLite database");
@@ -660,12 +659,12 @@ Shard::finalize(
". Error: " + e.what());
}
// Validate the last ledger hash of a downloaded shard
// Verify the last ledger hash of a downloaded shard
// using a ledger hash obtained from the peer network
if (expectedHash && *expectedHash != hash)
if (referenceHash && *referenceHash != hash)
return fail("invalid last ledger hash");
// Validate every ledger stored in the backend
// Verify every ledger stored in the backend
Config const& config{app_.config()};
std::shared_ptr<Ledger> ledger;
std::shared_ptr<Ledger const> next;
@@ -678,6 +677,17 @@ Shard::finalize(
fullBelowCache->reset();
treeNodeCache->reset();
Serializer s;
s.add32(version);
s.add32(firstSeq_);
s.add32(lastSeq_);
s.addBitString(lastLedgerHash);
std::shared_ptr<DeterministicShard> dShard{
make_DeterministicShard(app_, dir_, index_, s, j_)};
if (!dShard)
return fail("Failed to create deterministic shard");
// Start with the last ledger in the shard and walk backwards from
// child to parent until we reach the first ledger
ledgerSeq = lastSeq_;
@@ -714,8 +724,11 @@ Shard::finalize(
return fail("missing root TXN node");
}
if (!verifyLedger(ledger, next))
return fail("failed to validate ledger");
if (!verifyLedger(ledger, next, dShard))
return fail("failed to verify ledger");
if (!dShard->store(nodeObject))
return fail("failed to store node object");
if (writeSQLite && !storeSQLite(ledger))
return fail("failed storing to SQLite databases");
@@ -764,20 +777,32 @@ Shard::finalize(
}
*/
// Store final key's value, may already be stored
Serializer s;
s.add32(version);
s.add32(firstSeq_);
s.add32(lastSeq_);
s.addBitString(lastLedgerHash);
auto nodeObject{
auto const nodeObject{
NodeObject::createObject(hotUNKNOWN, std::move(s.modData()), finalKey)};
if (!dShard->store(nodeObject))
return fail("failed to store node object");
try
{
// Store final key's value, may already be stored
backend_->store(nodeObject);
// Do not allow all other threads work with the shard
busy_ = true;
// Wait until all other threads leave the shard
while (backendCount_ > 1)
std::this_thread::yield();
std::lock_guard lock(mutex_);
// Close original backend
backend_->close();
// Close SQL databases
lgrSQLiteDB_.reset();
txSQLiteDB_.reset();
// Remove the acquire SQLite database
if (acquireInfo_)
{
@@ -785,13 +810,21 @@ Shard::finalize(
remove_all(dir_ / AcquireShardDBName);
}
lastAccess_ = std::chrono::steady_clock::now();
state_ = final;
// Close deterministic backend
dShard->close();
if (!initSQLite(lock))
return fail("failed to initialize SQLite databases");
// Replace original backend with deterministic backend
remove(dir_ / "nudb.key");
remove(dir_ / "nudb.dat");
rename(dShard->getDir() / "nudb.key", dir_ / "nudb.key");
rename(dShard->getDir() / "nudb.dat", dir_ / "nudb.dat");
setFileStats(lock);
// Re-open deterministic shard
if (!open(lock))
return false;
// Allow all other threads work with the shard
busy_ = false;
}
catch (std::exception const& e)
{
@@ -1184,7 +1217,8 @@ Shard::setFileStats(std::lock_guard<std::mutex> const&)
bool
Shard::verifyLedger(
std::shared_ptr<Ledger const> const& ledger,
std::shared_ptr<Ledger const> const& next) const
std::shared_ptr<Ledger const> const& next,
std::shared_ptr<DeterministicShard> const& dShard) const
{
auto fail = [j = j_, index = index_, &ledger](std::string const& msg) {
JLOG(j.error()) << "shard " << index << ". " << msg
@@ -1203,15 +1237,18 @@ Shard::verifyLedger(
return fail("Invalid ledger account hash");
bool error{false};
auto visit = [this, &error](SHAMapTreeNode const& node) {
auto visit = [this, &error, &dShard](SHAMapTreeNode const& node) {
if (stop_)
return false;
if (!verifyFetch(node.getHash().as_uint256()))
auto nodeObject{verifyFetch(node.getHash().as_uint256())};
if (!nodeObject || !dShard->store(nodeObject))
error = true;
return !error;
};
// Validate the state map
// Verify the state map
if (ledger->stateMap().getHash().isNonZero())
{
if (!ledger->stateMap().isValid())
@@ -1237,7 +1274,7 @@ Shard::verifyLedger(
return fail("Invalid state map");
}
// Validate the transaction map
// Verify the transaction map
if (ledger->info().txHash.isNonZero())
{
if (!ledger->txMap().isValid())
@@ -1253,6 +1290,7 @@ Shard::verifyLedger(
std::string(". Exception caught in function ") + __func__ +
". Error: " + e.what());
}
if (stop_)
return false;
if (error)
@@ -1303,7 +1341,7 @@ Shard::verifyFetch(uint256 const& hash) const
Shard::Count
Shard::makeBackendCount()
{
if (stop_)
if (stop_ || busy_)
return {nullptr};
std::lock_guard lock(mutex_);

View File

@@ -26,6 +26,7 @@
#include <ripple/core/DatabaseCon.h>
#include <ripple/nodestore/NodeObject.h>
#include <ripple/nodestore/Scheduler.h>
#include <ripple/nodestore/impl/DeterministicShard.h>
#include <boost/filesystem.hpp>
#include <nudb/nudb.hpp>
@@ -176,7 +177,8 @@ public:
[[nodiscard]] bool
isLegacy() const;
/** Finalize shard by walking its ledgers and verifying each Merkle tree.
/** Finalize shard by walking its ledgers, verifying each Merkle tree and
creating a deterministic backend.
@param writeSQLite If true, SQLite entries will be rewritten using
verified backend data.
@@ -184,9 +186,7 @@ public:
of the last ledger in the shard.
*/
[[nodiscard]] bool
finalize(
bool const writeSQLite,
boost::optional<uint256> const& referenceHash);
finalize(bool writeSQLite, boost::optional<uint256> const& referenceHash);
/** Enables removal of the shard directory on destruction.
*/
@@ -298,6 +298,9 @@ private:
// Determines if the shard needs to stop processing for shutdown
std::atomic<bool> stop_{false};
// Determines if the shard busy with replacing by deterministic one
std::atomic<bool> busy_{false};
std::atomic<State> state_{State::acquire};
// Determines if the shard directory should be removed in the destructor
@@ -324,11 +327,13 @@ private:
void
setFileStats(std::lock_guard<std::mutex> const&);
// Validate this ledger by walking its SHAMaps and verifying Merkle trees
// Verify this ledger by walking its SHAMaps and verifying its Merkle trees
// Every node object verified will be stored in the deterministic shard
[[nodiscard]] bool
verifyLedger(
std::shared_ptr<Ledger const> const& ledger,
std::shared_ptr<Ledger const> const& next) const;
std::shared_ptr<Ledger const> const& next,
std::shared_ptr<DeterministicShard> const& dShard) const;
// Fetches from backend and log errors based on status codes
[[nodiscard]] std::shared_ptr<NodeObject>

View File

@@ -19,20 +19,143 @@
#include <ripple/app/ledger/LedgerMaster.h>
#include <ripple/app/ledger/LedgerToJson.h>
#include <ripple/beast/hash/hash_append.h>
#include <ripple/beast/utility/temp_dir.h>
#include <ripple/core/ConfigSections.h>
#include <ripple/nodestore/DatabaseShard.h>
#include <ripple/nodestore/DummyScheduler.h>
#include <ripple/nodestore/impl/DecodedBlob.h>
#include <ripple/nodestore/impl/Shard.h>
#include <ripple/protocol/digest.h>
#include <boost/algorithm/hex.hpp>
#include <chrono>
#include <fstream>
#include <iostream>
#include <numeric>
#include <openssl/ripemd.h>
#include <test/jtx.h>
#include <test/nodestore/TestBase.h>
namespace ripple {
namespace NodeStore {
/** std::uniform_int_distribution is platform dependent.
* Unit test for deterministic shards is the following: it generates
* predictable accounts and transactions, packs them into ledgers
* and makes the shard. The hash of this shard should be equal to the
* given value. On different platforms (precisely, Linux and Mac)
* hashes of the resulting shard was different. It was unvestigated
* that the problem is in the class std::uniform_int_distribution
* which generates different pseudorandom sequences on different
* platforms, but we need predictable sequence.
*/
template <class IntType = int>
struct uniformIntDistribution
{
using resultType = IntType;
const resultType A, B;
struct paramType
{
const resultType A, B;
paramType(resultType aa, resultType bb) : A(aa), B(bb)
{
}
};
explicit uniformIntDistribution(
const resultType a = 0,
const resultType b = std::numeric_limits<resultType>::max())
: A(a), B(b)
{
}
explicit uniformIntDistribution(const paramType& params)
: A(params.A), B(params.B)
{
}
template <class Generator>
resultType
operator()(Generator& g) const
{
return rnd(g, A, B);
}
template <class Generator>
resultType
operator()(Generator& g, const paramType& params) const
{
return rnd(g, params.A, params.B);
}
resultType
a() const
{
return A;
}
resultType
b() const
{
return B;
}
resultType
min() const
{
return A;
}
resultType
max() const
{
return B;
}
private:
template <class Generator>
resultType
rnd(Generator& g, const resultType a, const resultType b) const
{
static_assert(
std::is_convertible<typename Generator::result_type, resultType>::
value,
"Ups...");
static_assert(
Generator::min() == 0, "If non-zero we have handle the offset");
const resultType range = b - a + 1;
assert(Generator::max() >= range); // Just for safety
const resultType rejectLim = g.max() % range;
resultType n;
do
n = g();
while (n <= rejectLim);
return (n % range) + a;
}
};
template <class Engine, class Integral>
Integral
randInt(Engine& engine, Integral min, Integral max)
{
assert(max > min);
// This should have no state and constructing it should
// be very cheap. If that turns out not to be the case
// it could be hand-optimized.
return uniformIntDistribution<Integral>(min, max)(engine);
}
template <class Engine, class Integral>
Integral
randInt(Engine& engine, Integral max)
{
return randInt(engine, Integral(0), max);
}
// Tests DatabaseShard class
//
class DatabaseShard_test : public TestBase
@@ -87,7 +210,7 @@ class DatabaseShard_test : public TestBase
{
int p;
if (n >= 2)
p = rand_int(rng_, 2 * dataSize);
p = randInt(rng_, 2 * dataSize);
else
p = 0;
@@ -99,27 +222,27 @@ class DatabaseShard_test : public TestBase
int from, to;
do
{
from = rand_int(rng_, n - 1);
to = rand_int(rng_, n - 1);
from = randInt(rng_, n - 1);
to = randInt(rng_, n - 1);
} while (from == to);
pay.push_back(std::make_pair(from, to));
}
n += !rand_int(rng_, nLedgers / dataSize);
n += !randInt(rng_, nLedgers / dataSize);
if (n > accounts_.size())
{
char str[9];
for (int j = 0; j < 8; ++j)
str[j] = 'a' + rand_int(rng_, 'z' - 'a');
str[j] = 'a' + randInt(rng_, 'z' - 'a');
str[8] = 0;
accounts_.emplace_back(str);
}
nAccounts_.push_back(n);
payAccounts_.push_back(std::move(pay));
xrpAmount_.push_back(rand_int(rng_, 90) + 10);
xrpAmount_.push_back(randInt(rng_, 90) + 10);
}
}
@@ -663,7 +786,7 @@ class DatabaseShard_test : public TestBase
for (std::uint32_t i = 0; i < nTestShards * 2; ++i)
{
std::uint32_t n = rand_int(data.rng_, nTestShards - 1) + 1;
std::uint32_t n = randInt(data.rng_, nTestShards - 1) + 1;
if (bitMask & (1ll << n))
{
db->removePreShard(n);
@@ -968,6 +1091,85 @@ class DatabaseShard_test : public TestBase
}
}
std::string
ripemd160File(std::string filename)
{
using beast::hash_append;
std::ifstream input(filename, std::ios::in | std::ios::binary);
char buf[4096];
ripemd160_hasher h;
while (input.read(buf, 4096), input.gcount() > 0)
hash_append(h, buf, input.gcount());
auto const binResult = static_cast<ripemd160_hasher::result_type>(h);
const auto charDigest = binResult.data();
std::string result;
boost::algorithm::hex(
charDigest,
charDigest + sizeof(binResult),
std::back_inserter(result));
return result;
}
void
testDeterministicShard(std::uint64_t const seedValue)
{
testcase("Deterministic shards");
using namespace test::jtx;
std::string ripemd160Key("B2F9DB61F714A82889966F097CD615C36DB2B01D"),
ripemd160Dat("6DB1D02CD019F09198FE80DB5A7D707F0C6BFF4C");
for (int i = 0; i < 2; i++)
{
beast::temp_dir shardDir;
{
Env env{*this, testConfig(shardDir.path())};
DatabaseShard* db = env.app().getShardStore();
BEAST_EXPECT(db);
TestData data(seedValue, 4);
if (!BEAST_EXPECT(data.makeLedgers(env)))
return;
if (createShard(data, *db) < 0)
return;
}
{
Env env{*this, testConfig(shardDir.path())};
DatabaseShard* db = env.app().getShardStore();
BEAST_EXPECT(db);
TestData data(seedValue, 4);
if (!BEAST_EXPECT(data.makeLedgers(env)))
return;
waitShard(*db, 1);
for (std::uint32_t j = 0; j < ledgersPerShard; ++j)
checkLedger(data, *db, *data.ledgers_[j]);
}
boost::filesystem::path path(shardDir.path());
path /= "1";
boost::filesystem::path keypath = path / "nudb.key";
std::string key = ripemd160File(keypath.string());
boost::filesystem::path datpath = path / "nudb.dat";
std::string dat = ripemd160File(datpath.string());
std::cerr << "Iteration " << i << ": RIPEMD160[nudb.key] = " << key
<< std::endl;
std::cerr << "Iteration " << i << ": RIPEMD160[nudb.dat] = " << dat
<< std::endl;
BEAST_EXPECT(key == ripemd160Key);
BEAST_EXPECT(dat == ripemd160Dat);
}
}
void
testImportWithHistoricalPaths(std::uint64_t const seedValue)
{
@@ -1339,9 +1541,10 @@ public:
testCorruptedDatabase(seedValue + 50);
testIllegalFinalKey(seedValue + 60);
testImport(seedValue + 70);
testImportWithHistoricalPaths(seedValue + 80);
testPrepareWithHistoricalPaths(seedValue + 90);
testOpenShardManagement(seedValue + 100);
testDeterministicShard(seedValue + 80);
testImportWithHistoricalPaths(seedValue + 90);
testPrepareWithHistoricalPaths(seedValue + 100);
testOpenShardManagement(seedValue + 110);
}
};