rachelb (6) [Avatar] Offline
#1
Few notes:

* worth mentioning that the range of the hash functions must match m
* for the ESTIMATED COUNT formula, h1, h2, h3, h4 appear to be the names of the counters. However, hash 1 needs to be applied and then looked up in counter h1, ditto hash 2. This is not made clear by the formula
* "probability of the estimate to be correct is 99.6%". I don't think this is the case. It is likely to be that the probability that the relative error is less than 1.5% is 99.6%. I could not see these figures in the original paper.
* The figures for a width 8, count 128 count min sketch don't make much sense without quoting at least the number of elements consumed from the stream and the number of distinct elements in the stream.
* A discussion on how adjusting the width and count effects the error bound and the probability that the error lies within that bound would also be useful.
* Worth mentioning that count min sketch will never under count - but could over count.
andrew.psaltis (33) [Avatar] Offline
#2
Thank you for your notes -- all of this has been taken into consideration and will appear in the final copy of the book. Sorry it took me this long to respond.