Member-only story

Implementing Bloom Filters: Probabilistic Data Structures

In my last post, we talked about what Bloom Filters are, how they work and what are the factors influencing the false positivity rate of Bloom Filters. In this post, we’ll dive into the implementation details and then look into how we can alter different input parameters of Bloom Filter to offer lower false positives!

Implementation

  1. Let’s define our Bloom Filter. It comprises the attributes, size(m) and hashCount(k), along with the bit array to store the values.
type BloomFilter struct {
bitArray []bool
size int
hashCount int
}

2. Each Bloom filter has two functions, one to add an element to the Bloom filter and the other to validate if an element exists in the Bloom filter.

func (bf *BloomFilter) Add(str string) {
for i := 0; i < bf.hashCount; i++ {
hash := hash(str, i) % bf.size
bf.bitArray[hash] = true
}
}

func (bf *BloomFilter) Contains(str string) bool {
for i := 0; i < bf.hashCount; i++ {
hash := hash(str, i) % bf.size
if !bf.bitArray[hash] {
return false
}
}
return true
}

3. Let’s define a basic hash function that we’ll be using to hash the elements and map them to the bits. You can choose to have different hash functions or use something other than murmur3, it is completely…

--

--

Pratik Pandey - https://pratikpandey.substack.com
Pratik Pandey - https://pratikpandey.substack.com

Written by Pratik Pandey - https://pratikpandey.substack.com

Senior Engineer with experience in designing and architecting large scale distributed systems.

No responses yet