-
Notifications
You must be signed in to change notification settings - Fork 14
Open
Description
Implementing the following test wouldn't make the test passing anymore.
--- FAIL: TestBugFalsePositives (1.24s)
bloom_test.go:330: False positive probability is too high at 20.18335038131724 % vs 0.602097140729787 %
FAIL
exit status 1
FAIL github.com/DCSO/bloom 1.763s
func TestBugFalsePositives(t *testing.T) {
// this capacity + p would produce a power of 2 bit size
capacity := uint64(109397)
p := float64(0.01)
fillingFactor := 0.9
N := uint64(float64(capacity) * fillingFactor)
filter, _ := GenerateExampleFilter(capacity, p, N)
pAcceptable := math.Pow(1-math.Exp(-float64(filter.k)*float64(N)/float64(filter.m)), float64(filter.k))
fingerprint := make([]uint64, filter.k)
cnt := 0.0
matches := 0.0
for {
cnt++
value := GenerateTestValue(100)
filter.Fingerprint(value, fingerprint)
if filter.CheckFingerprint(fingerprint) {
matches++
}
if cnt > float64(capacity)*10 {
break
}
}
//this might still fail sometimes...
//we allow for a probability that is two times higher than the normally acceptable probability
if matches/cnt > pAcceptable*2 {
t.Error("False positive probability is too high at ", matches/cnt*100, "% vs ", pAcceptable*100, "%")
}
}After analysis, it is possible there is a flaw in the fingerprint generation.
Here is my theory (not verified mathematically). Your algorithm generates fingerprints with:
for i := uint64(0); i < s.k; i++ {
hn = (hn * g) % m
fingerprint[i] = uint64(hn % s.m)However the value of m used (i.e. 0xffffffffffffffc5) is so big that what the code does is equivalent for any x = (hn * g); x < m (which is very likely as we are working with uint64) to the following
for i := uint64(0); i < s.k; i++ {
hn = hn * g
fingerprint[i] = uint64(hn % s.m)So under those conditions hn is always a multiple of h0 (fnv hash) multiplied by g power i, very likely creating lots of collisions for very specific cases, such as this one.
NB: I did not verify if other bit sizes create the same behavior
Metadata
Metadata
Assignees
Labels
No labels