Cardinality Estimation

From Algorithm Wiki
Latest revision as of 10:08, 28 April 2023

Description

Given a multiset of (possibly hashed) values, estimate the number of distinct elements in the multiset. The primary objective is to minimize storage usage.

Parameters

$N$: number of values in the multiset, counting repetitions

$n$: cardinality of the multiset, i.e. the number of distinct values (not known in advance)
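To make the two parameters concrete, here is a minimal sketch of an exact count that stores every distinct value in a hash set. The function name `exact_cardinality` and the sample stream are illustrative assumptions, not part of the original page. It processes $N$ values while keeping $n$ of them in memory, which is the storage cost the approximate algorithms try to avoid.

```python
def exact_cardinality(values):
    """Return the exact number of distinct values (hypothetical helper).

    Uses O(N) expected time and O(n) space, since every distinct
    value is kept in the set.
    """
    seen = set()
    for v in values:
        seen.add(v)
    return len(seen)


stream = ["a", "b", "a", "c", "b", "a"]
print(len(stream))                # N = 6 values in the multiset
print(exact_cardinality(stream))  # n = 3 distinct values
```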

Table of Algorithms

| Name | Year | Time | Space | Approximation Factor | Model | Reference |
|---|---|---|---|---|---|---|
| Naive solution | 1940 | $O(N)$ | $O(n)$ | Exact | Deterministic | |
| Flajolet–Martin algorithm | 1984 | $O(N)$ | $O(\log n)$ | | Randomized | Time & Space |
| LogLog algorithm | 2003 | $O(N)$ | $O(\log \log n)$ | | Randomized | Time & Space |
| HyperLogLog algorithm | 2007 | $O(N)$ | $O(\epsilon^{-2} \log \log n + \log n)$ | | Randomized | Time & Space |
| HyperLogLog++ | 2014 | $O(N)$ | $O(\epsilon^{-2} \log \log n + \log n)$ | | Randomized | Time |
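As a rough illustration of how the randomized entries trade accuracy for space, the following is a simplified HyperLogLog-style sketch. It is not the reference implementation from any of the cited papers. The class name `HyperLogLogSketch`, the choice of SHA-256 truncated to 64 bits as the hash, and the default of $2^{10}$ registers are all assumptions made for this example, and the large-range correction is omitted. Each register stores only a small rank value, which is where the $\log \log n$ term in the space bound comes from.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Illustrative HyperLogLog-style estimator (hypothetical class name)."""

    def __init__(self, p: int = 10):
        if p < 7:
            raise ValueError("this sketch assumes p >= 7 (at least 128 registers)")
        self.p = p                     # number of hash bits used as register index
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m  # each register holds a small rank value

    @staticmethod
    def _hash64(value) -> int:
        # Assumption: SHA-256 truncated to 64 bits stands in for a uniform hash.
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value) -> None:
        h = self._hash64(value)
        width = 64 - self.p
        index = h >> width               # top p bits select a register
        rest = h & ((1 << width) - 1)    # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1.0 + 1.079 / m)  # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


sketch = HyperLogLogSketch(p=10)
for i in range(30000):       # N = 30000 values
    sketch.add(i % 10000)    # n = 10000 distinct values
print(round(sketch.estimate()))
```

With $2^{10}$ registers the expected relative error is about $1.04/\sqrt{1024} \approx 3\%$, so the printed estimate should land near 10000 without being exact. Adding the same value again never changes the registers, so duplicates do not affect the estimate.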

Time Complexity Graph

[Figure: Cardinality Estimation - Time.png]