Make rocksdb-deletes faster using bloom filter

Summary:
Added a new function, KeyMayExist, in db_impl.cc. It calls Get with a new parameter turned on, which makes Get return false only if the bloom filters can guarantee that the key is not in the database. Delete calls this function; if the option deletes_check_filter_first is turned on and KeyMayExist returns false, the delete is dropped, saving:
1. Put of delete type
2. Space in the db, and
3. Compaction time
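
The delete path described above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the code in this commit: ToyBloomFilter, ToyDB, WrittenRecords and FilteredDeletes are hypothetical names, the two-hash filter is a stand-in for the real bloom filters, and the real KeyMayExist also consults the memtable and immutable memtable.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical toy filter: may return false positives, never false negatives.
class ToyBloomFilter {
 public:
  explicit ToyBloomFilter(std::size_t bits) : bits_(bits, false) {}

  void Add(const std::string& key) {
    bits_[Hash1(key)] = true;
    bits_[Hash2(key)] = true;
  }

  // false means the key was definitely never added.
  bool MayContain(const std::string& key) const {
    return bits_[Hash1(key)] && bits_[Hash2(key)];
  }

 private:
  std::size_t Hash1(const std::string& key) const {
    return std::hash<std::string>{}(key) % bits_.size();
  }
  std::size_t Hash2(const std::string& key) const {
    return std::hash<std::string>{}(key + "#salt") % bits_.size();
  }

  std::vector<bool> bits_;
};

// Hypothetical stand-in for the DB; it only models the write log.
class ToyDB {
 public:
  explicit ToyDB(bool deletes_check_filter_first)
      : deletes_check_filter_first_(deletes_check_filter_first),
        filter_(1024) {}

  void Put(const std::string& key, const std::string& value) {
    filter_.Add(key);
    log_.push_back({false, key, value});
  }

  // In-memory check only; no IO in this model.
  bool KeyMayExist(const std::string& key) const {
    return filter_.MayContain(key);
  }

  void Delete(const std::string& key) {
    if (deletes_check_filter_first_ && !KeyMayExist(key)) {
      ++filtered_deletes_;  // counterpart of NUMBER_FILTERED_DELETES
      return;               // delete dropped: no record is written
    }
    log_.push_back({true, key, ""});  // a Put of delete type
  }

  std::size_t WrittenRecords() const { return log_.size(); }
  std::uint64_t FilteredDeletes() const { return filtered_deletes_; }

 private:
  struct Record {
    bool is_delete;
    std::string key;
    std::string value;
  };

  bool deletes_check_filter_first_;
  ToyBloomFilter filter_;
  std::vector<Record> log_;
  std::uint64_t filtered_deletes_ = 0;
};
```

In this model, deleting a key from an empty ToyDB with the option on writes nothing and bumps the filtered-delete counter, while deleting a key that was previously Put still writes a delete record. A false positive from the filter only costs an unnecessary delete record; it never causes a needed delete to be dropped.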

Test Plan:
make all check;
will run db_stress and db_bench and enhance unit-test once the basic design gets approved

Reviewers: dhruba, haobo, vamsi

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11607
Mayank Agarwal
2013-07-05 18:49:18 -07:00
parent 8a5341ec7d
commit 2a986919d6
21 changed files with 166 additions and 22 deletions


@@ -120,6 +120,11 @@ class DB {
 const std::vector<Slice>& keys,
 std::vector<std::string>* values) = 0;
+// If the key definitely does not exist in the database, then this method
+// returns false. Otherwise it returns true. This check is potentially
+// lighter-weight than invoking DB::Get(). No IO is performed.
+virtual bool KeyMayExist(const Slice& key) = 0;
 // Return a heap-allocated iterator over the contents of the database.
 // The result of NewIterator() is initially invalid (caller must
 // call one of the Seek methods on the iterator before using it).


@@ -465,6 +465,15 @@ struct Options {
 // Default: 0
 uint64_t bytes_per_sync;
+// Use bloom-filter for deletes when this is true.
+// db->Delete first calls KeyMayExist, which checks the memtable, the
+// immutable-memtable and bloom-filters to determine if the key does not
+// exist in the database. If the key definitely does not exist, then the
+// delete is a noop. KeyMayExist only incurs in-memory lookups. This
+// optimization avoids writing the delete to storage when appropriate.
+// Default: false
+bool deletes_check_filter_first;
 };
 // Options that control read operations


@@ -58,7 +58,9 @@ enum Tickers {
 NUMBER_MULTIGET_KEYS_READ = 19,
 NUMBER_MULTIGET_BYTES_READ = 20,
-TICKER_ENUM_MAX = 21
+NUMBER_FILTERED_DELETES = 21,
+TICKER_ENUM_MAX = 22
 };
 const std::vector<std::pair<Tickers, std::string>> TickersNameMap = {
@@ -82,7 +84,8 @@ const std::vector<std::pair<Tickers, std::string>> TickersNameMap = {
 { NO_ITERATORS, "rocksdb.num.iterators" },
 { NUMBER_MULTIGET_CALLS, "rocksdb.number.multiget.get" },
 { NUMBER_MULTIGET_KEYS_READ, "rocksdb.number.multiget.keys.read" },
-{ NUMBER_MULTIGET_BYTES_READ, "rocksdb.number.multiget.bytes.read" }
+{ NUMBER_MULTIGET_BYTES_READ, "rocksdb.number.multiget.bytes.read" },
+{ NUMBER_FILTERED_DELETES, "rocksdb.number.deletes.filtered" }
 };
 /**