dynamic_bloom: replace some divide (remainder) operations with shifts in locality mode, and other improvements

Summary:
This patch changes the meaning of options.bloom_locality: 0 disables the cache line optimization, and any positive number uses CACHE_LINE_SIZE as the block size (previously the block size was CACHE_LINE_SIZE*options.bloom_locality). With this change, the divide operations inside a block can be replaced by shifts. Performance is improved: https://reviews.facebook.net/P471

Also, improve the basic algorithm in two ways:
(1) make sure the number of blocks is odd
(2) rotate bytes after every probe in locality mode. Because the divisor is 2^n, without the rotation we would never be able to use all the bits.
Improvements in false positive rate: https://reviews.facebook.net/P459

Test Plan: make all check

Reviewers: ljin, haobo

Reviewed By: haobo

Subscribers: dhruba, yhchiang, igor, leveldb

Differential Revision: https://reviews.facebook.net/D18843
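
The locality-mode ideas in the summary can be illustrated with a small standalone sketch. This is not the code from the patch: the class name BlockedBloomSketch, the helpers OddBlockCount and RotateRight, the 64-byte kCacheLineSize, and the 9-bit rotation are all assumptions made for illustration. It shows a power-of-two block size letting `h % bits_per_block` become a mask, an odd block count, and a rotation of the hash after every probe so that later probes draw on different hash bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed values for this sketch; the patch uses CACHE_LINE_SIZE.
constexpr uint32_t kCacheLineSize = 64;                 // bytes per block
constexpr uint32_t kBitsPerBlock = kCacheLineSize * 8;  // 512 bits
constexpr uint32_t kBlockShift = 9;                     // log2(kBitsPerBlock)
static_assert((1u << kBlockShift) == kBitsPerBlock, "block must be 2^n bits");

// Round the block count up and force it to be odd, so that
// h % num_blocks does not depend only on the low bits of h.
inline uint32_t OddBlockCount(uint32_t total_bits) {
  const uint32_t n = (total_bits + kBitsPerBlock - 1) / kBitsPerBlock;
  return n | 1u;
}

// Rotate a 32-bit value right by s bits (0 < s < 32).
inline uint32_t RotateRight(uint32_t h, uint32_t s) {
  return (h >> s) | (h << (32 - s));
}

class BlockedBloomSketch {
 public:
  BlockedBloomSketch(uint32_t total_bits, uint32_t num_probes)
      : num_blocks_(OddBlockCount(total_bits)),
        num_probes_(num_probes),
        bits_(static_cast<std::size_t>(num_blocks_) * kCacheLineSize, 0) {}

  void Add(uint32_t h) {
    // All probes for one key stay inside a single cache-line-sized block.
    const uint32_t base = (h % num_blocks_) * kBitsPerBlock;
    for (uint32_t i = 0; i < num_probes_; ++i) {
      // Mask instead of remainder: valid because kBitsPerBlock is 2^n.
      const uint32_t bit = base + (h & (kBitsPerBlock - 1));
      bits_[bit >> 3] |= static_cast<uint8_t>(1u << (bit & 7));
      // Rotate so the next probe uses different bits of the hash.
      h = RotateRight(h, kBlockShift);
    }
  }

  bool MayContain(uint32_t h) const {
    const uint32_t base = (h % num_blocks_) * kBitsPerBlock;
    for (uint32_t i = 0; i < num_probes_; ++i) {
      const uint32_t bit = base + (h & (kBitsPerBlock - 1));
      if ((bits_[bit >> 3] & (1u << (bit & 7))) == 0) {
        return false;  // a probed bit is clear: definitely not present
      }
      h = RotateRight(h, kBlockShift);
    }
    return true;  // all probed bits set: possibly present
  }

 private:
  uint32_t num_blocks_;
  uint32_t num_probes_;
  std::vector<uint8_t> bits_;
};
```

Without the rotation, every probe would take the same low 9 bits of the hash and hit the same bit. The real implementation may mix the hash differently between probes; the sketch only shows why some form of rotation is needed once the divisor is a power of two.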
2025-12-06 17:27:55 +00:00 · 2014-06-02 16:52:29 -07:00
parent 91ddd587cc
commit 462796697c
4 changed files with 58 additions and 45 deletions
--- a/include/rocksdb/options.h
+++ b/include/rocksdb/options.h
@@ -547,12 +547,9 @@ struct ColumnFamilyOptions {

  // Control locality of bloom filter probes to improve cache miss rate.
  // This option only applies to memtable prefix bloom and plaintable
-  // prefix bloom. It essentially limits the max number of cache lines each
-  // bloom filter check can touch.
-  // This optimization is turned off when set to 0. The number should never
-  // be greater than number of probes. This option can boost performance
-  // for in-memory workload but should use with care since it can cause
-  // higher false positive rate.
+  // prefix bloom. It essentially limits every bloom checking to one cache line.
+  // This optimization is turned off when set to 0, and positive number to turn
+  // it on.
  // Default: 0
  uint32_t bloom_locality;