Forthcoming Events
Solving the Classic Distinct Element Problem with a Strikingly Simple Algorithm that Captivated Donald Knuth
Prof Sourav Chakraborty (ISI Kolkata)
Location : AB2-5B
Abstract: We will present a very simple streaming algorithm on F_0 estimation that also caught the eye of Donald E. Knuth. In a recent article, Donald E. Knuth started with the following two paragraphs:
" Sourav Chakraborty, N. V. Vinodchandran, and Kuldeep S. Meel have recently proposed an interesting algorithm for the following problem: A stream of elements (a1, a2,...,am) is input, one at a time, and we want to know how many of them are distinct. In other words, if A = {a1, a2,...,am} is the set of elements in the stream, with multiplicities ignored, we want to know |A|, the size of that set. But we don’t have much memory; in fact, |A| is probably a lot larger than the number of elements that we can hold in memory at any one time. What is a good strategy for computing an unbiased estimate of |A|?
Their algorithm is not only interesting, it is extremely simple. Furthermore, it’s wonderfully suited to teaching students who are learning the basics of computer science. (Indeed, ever since I saw it, a few days ago, I’ve been unable to resist trying to explain the ideas to just about everybody I meet.) Therefore I’m pretty sure that something like this will eventually become a standard textbook topic. This note is an initial approximation to what I might write about it if I were preparing a textbook about data streams."
Their algorithm is not only interesting, it is extremely simple. Furthermore, it’s wonderfully suited to teaching students who are learning the basics of computer science. (Indeed, ever since I saw it, a few days ago, I’ve been unable to resist trying to explain the ideas to just about everybody I meet.) Therefore I’m pretty sure that something like this will eventually become a standard textbook topic. This note is an initial approximation to what I might write about it if I were preparing a textbook about data streams."