mirror of
https://github.com/XRPLF/rippled.git
synced 2025-12-06 17:27:55 +00:00
Add a new mem-table representation based on cuckoo hash.
Summary:
= Major Changes =
* Add a new mem-table representation, HashCuckooRep, which is based on cuckoo hashing.
Cuckoo hash uses multiple hash functions. This allows each key to have multiple
possible locations in the mem-table.
- Put: When inserting a key, it first checks whether one of the key's
possible locations is vacant and, if so, stores the key there. If none
of its possible locations is available, it kicks out a victim key and
stores the new key at that location. The kicked-out victim key is then
stored at a vacant one of its own possible locations, or kicks out
another victim in turn. In this diff, the kick-out path (known as the
cuckoo path) is found using BFS, which guarantees that it is the shortest.
- Get: Simply tries all possible locations of a key, which guarantees
worst-case constant time complexity.
- Time complexity: O(1) for Get, and average O(1) for Put if the
fullness of the mem-table is below 80%.
- Uses two hash functions by default. The number of hash functions used
by the cuckoo hash may increase dynamically if it fails to find a
short-enough kick-out path.
- Currently, HashCuckooRep does not support iteration or snapshots,
as its main purpose for now is to optimize point access.
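The Put and Get behaviour described above can be sketched as a toy cuckoo hash set. This is a minimal illustration under stated assumptions, not the HashCuckooRep code from this diff: the class name `ToyCuckooSet`, the `Mix` hash, the `max_visited` bound, and the use of plain 64-bit keys are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy cuckoo hash set of 64-bit keys (hypothetical; illustration only).
// bucket_count must be greater than zero.
class ToyCuckooSet {
 public:
  explicit ToyCuckooSet(size_t bucket_count, unsigned hash_count = 2)
      : buckets_(bucket_count), hash_count_(hash_count) {}

  // Get: probe each candidate bucket of the key, at most hash_count_ probes.
  bool Contains(uint64_t key) const {
    for (unsigned i = 0; i < hash_count_; ++i) {
      const Slot& slot = buckets_[Bucket(key, i)];
      if (slot.used && slot.key == key) return true;
    }
    return false;
  }

  // Put: breadth-first search from the key's candidate buckets for the
  // nearest empty bucket, then shift every key on that path by one step.
  // Returns false if no path is found within max_visited buckets.
  bool Insert(uint64_t key, size_t max_visited = 64) {
    if (Contains(key)) return true;
    struct Node { size_t bucket; int parent; };
    std::vector<Node> visited;  // doubles as the BFS queue
    std::vector<bool> seen(buckets_.size(), false);
    auto visit = [&](size_t bucket, int parent) {
      if (seen[bucket] || visited.size() >= max_visited) return;
      seen[bucket] = true;
      visited.push_back({bucket, parent});
    };
    for (unsigned i = 0; i < hash_count_; ++i) visit(Bucket(key, i), -1);

    for (size_t head = 0; head < visited.size(); ++head) {
      const size_t bucket = visited[head].bucket;
      if (!buckets_[bucket].used) {
        // Walk back to the root, moving each victim one step down the path.
        int cur = static_cast<int>(head);
        while (visited[cur].parent >= 0) {
          const int parent = visited[cur].parent;
          buckets_[visited[cur].bucket] = buckets_[visited[parent].bucket];
          cur = parent;
        }
        buckets_[visited[cur].bucket].used = true;
        buckets_[visited[cur].bucket].key = key;
        return true;
      }
      // Occupied: the resident key is a potential victim, so enqueue its
      // other candidate buckets (already-seen buckets are skipped).
      const uint64_t victim = buckets_[bucket].key;
      for (unsigned i = 0; i < hash_count_; ++i) {
        visit(Bucket(victim, i), static_cast<int>(head));
      }
    }
    return false;
  }

 private:
  struct Slot {
    bool used = false;
    uint64_t key = 0;
  };

  // Simple 64-bit mixer; any reasonable hash would do here.
  static uint64_t Mix(uint64_t x) {
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
  }

  // The i-th candidate bucket of a key: one seeded hash per function index.
  size_t Bucket(uint64_t key, unsigned i) const {
    return static_cast<size_t>(Mix(key + 0x9e3779b97f4a7c15ULL * (i + 1)) %
                               buckets_.size());
  }

  std::vector<Slot> buckets_;
  unsigned hash_count_;
};
```

Because the search is breadth-first and each bucket is visited at most once, the first empty bucket reached lies on a shortest kick-out path within the visited set. A failed insert leaves the table unchanged, so a caller could retry with more hash functions or fall back to another structure.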
= Minor Changes =
* Add IsSnapshotSupported() to DB to indicate whether the current DB
supports snapshots. If it returns false, then DB::GetSnapshot() will
always return nullptr.
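A caller that may run against a snapshot-less mem-table therefore has to check the result of GetSnapshot() for nullptr. The sketch below shows that contract with stand-in types; `ToySnapshot`, `ToyRep`, `ToyCuckooRep`, `ToyDB`, and `ReadMode` are hypothetical names made up for this example and are not the real DB or MemTableRep classes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for a snapshot handle (hypothetical).
struct ToySnapshot {
  unsigned long long sequence;
};

// Stand-in for a mem-table representation; snapshots supported by default.
class ToyRep {
 public:
  virtual ~ToyRep() = default;
  virtual bool IsSnapshotSupported() const { return true; }
};

// Stand-in for a representation that opts out of snapshots.
class ToyCuckooRep : public ToyRep {
 public:
  bool IsSnapshotSupported() const override { return false; }
};

// Stand-in for the DB: GetSnapshot() returns nullptr when the
// representation reports that snapshots are not supported.
class ToyDB {
 public:
  explicit ToyDB(std::unique_ptr<ToyRep> rep) : rep_(std::move(rep)) {}

  const ToySnapshot* GetSnapshot() {
    if (!rep_->IsSnapshotSupported()) return nullptr;
    return new ToySnapshot{++sequence_};
  }

  void ReleaseSnapshot(const ToySnapshot* snapshot) { delete snapshot; }

 private:
  std::unique_ptr<ToyRep> rep_;
  unsigned long long sequence_ = 0;
};

// Caller side: fall back to reading the latest state when no snapshot
// can be taken, and release the snapshot otherwise.
std::string ReadMode(ToyDB& db) {
  const ToySnapshot* snapshot = db.GetSnapshot();
  if (snapshot == nullptr) return "latest";
  db.ReleaseSnapshot(snapshot);
  return "snapshot";
}
```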
Test Plan:
Run existing tests. Will develop a test specifically for cuckoo hash in
the next diff.
Reviewers: sdong, haobo
Reviewed By: sdong
CC: leveldb, dhruba, igor
Differential Revision: https://reviews.facebook.net/D16155
@@ -275,6 +275,9 @@ class DB {
  // this handle will all observe a stable snapshot of the current DB
  // state. The caller must call ReleaseSnapshot(result) when the
  // snapshot is no longer needed.
  //
  // nullptr will be returned if the DB fails to take a snapshot or does
  // not support snapshot.
  virtual const Snapshot* GetSnapshot() = 0;

  // Release a previously acquired snapshot. The caller must not
@@ -152,6 +152,14 @@ class MemTableRep {
  // a Seek might only include keys with the same prefix as the target key.
  virtual Iterator* GetDynamicPrefixIterator() { return GetIterator(); }

  // Return true if the current MemTableRep supports merge operator.
  // Default: true
  virtual bool IsMergeOperatorSupported() const { return true; }

  // Return true if the current MemTableRep supports snapshot
  // Default: true
  virtual bool IsSnapshotSupported() const { return true; }

 protected:
  // When *key is an internal key concatenated with the value, returns the
  // user key.
@@ -219,6 +227,39 @@ extern MemTableRepFactory* NewHashSkipListRepFactory(
extern MemTableRepFactory* NewHashLinkListRepFactory(
    size_t bucket_count = 50000);

// This factory creates a cuckoo-hashing based mem-table representation.
// Cuckoo-hash is a closed-hash strategy, in which all key/value pairs
// are stored in the bucket array itself instead of in some data structures
// external to the bucket array. In addition, each key in cuckoo hash
// has a constant number of possible buckets in the bucket array. These
// two properties together make cuckoo hash more memory efficient and
// give it a constant worst-case read time. Cuckoo hash is best suited for
// point-lookup workloads.
//
// When inserting a key / value, it first checks whether one of its possible
// buckets is empty. If so, the key / value will be inserted to that vacant
// bucket. Otherwise, one of the keys originally stored in one of these
// possible buckets will be "kicked out" and moved to one of its possible
// buckets (possibly kicking out another victim). In the current
// implementation, such a "kick-out" path is bounded. If it cannot find a
// "kick-out" path for a specific key, this key will be stored in a backup
// structure, and the current memtable will be forced to become immutable.
//
// Note that currently this mem-table representation does not support
// snapshot (i.e., it only queries latest state) and iterators. In addition,
// MultiGet operation might also lose its atomicity due to the lack of
// snapshot support.
//
// Parameters:
//   write_buffer_size: the write buffer size in bytes.
//   average_data_size: the average size of key + value in bytes. This value
//     together with write_buffer_size will be used to compute the number
//     of buckets.
//   hash_function_count: the number of hash functions that will be used by
//     the cuckoo-hash. The number also equals the number of possible
//     buckets each key will have.
extern MemTableRepFactory* NewHashCuckooRepFactory(
    size_t write_buffer_size, size_t average_data_size = 64,
    unsigned int hash_function_count = 4);
#endif  // ROCKSDB_LITE

}  // namespace rocksdb
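The parameter comment says that write_buffer_size and average_data_size together determine the number of buckets, but the diff shown here does not include that computation. One plausible estimate, offered purely as an assumption, divides the buffer size by the average entry size and then leaves headroom for a target fullness. The helper name `EstimateBucketCount` and the `fullness_percent` parameter are hypothetical; the 80% default is borrowed from the fullness figure in the commit message, and the real implementation may also account for per-bucket overhead.

```cpp
#include <cstddef>

// Hypothetical helper: estimate how many buckets a cuckoo mem-table needs
// so that a full write buffer fills about fullness_percent of the table.
// Returns 0 for degenerate inputs.
size_t EstimateBucketCount(size_t write_buffer_size, size_t average_data_size,
                           unsigned fullness_percent = 80) {
  if (average_data_size == 0 || fullness_percent == 0) return 0;
  // Expected number of key/value entries that fit in the write buffer.
  const size_t entries = write_buffer_size / average_data_size;
  // Integer ceiling of entries / (fullness_percent / 100).
  return (entries * 100 + fullness_percent - 1) / fullness_percent;
}
```

Under these assumptions, a 64 KB buffer with the default 64-byte average entry holds 1024 entries and would get 1280 buckets.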