Duplicate Elimination: Difference between revisions
Jump to navigation
Jump to search
(Created page with "== Problem Description== SQL does not eliminate duplicates implicitly. It allows to enter duplicate values on columns other than candidate key or if did not specified any keys. If the user wants to eliminate duplicate records, he has to use DISTINCT keyword in the query. Databases, therefore, can have duplicate entries. The problem deals with identifying and removing duplicates from a database. == Bounds Chart == 1050px ==...") |
No edit summary |
||
Line 1: | Line 1: | ||
== | {{DISPLAYTITLE:Duplicate Elimination (Duplicate Elimination)}} | ||
== Description == | |||
SQL does not eliminate duplicates implicitly. It allows to enter duplicate values on columns other than candidate key or if did not specified any keys. If the user wants to eliminate duplicate records, he has to use DISTINCT keyword in the query. | SQL does not eliminate duplicates implicitly. It allows to enter duplicate values on columns other than candidate key or if did not specified any keys. If the user wants to eliminate duplicate records, he has to use DISTINCT keyword in the query. | ||
Databases, therefore, can have duplicate entries. The problem deals with identifying and removing duplicates from a database. | Databases, therefore, can have duplicate entries. The problem deals with identifying and removing duplicates from a database. | ||
== | == Parameters == | ||
No parameters found. | |||
== Table of Algorithms == | |||
== | {| class="wikitable sortable" style="text-align:center;" width="100%" | ||
! Name !! Year !! Time !! Space !! Approximation Factor !! Model !! Reference | |||
|- | |- | ||
| | |||
| | | [[Sorting based (Merge Sort) (Duplicate Elimination Duplicate Elimination)|Sorting based (Merge Sort)]] || 1964 || $O(nlogn)$ || $O(n)$ || Exact || Deterministic || | ||
| | |||
|- | |- | ||
| | | [[Sorting based (Merge Sort) + real-time elimination (Duplicate Elimination Duplicate Elimination)|Sorting based (Merge Sort) + real-time elimination]] || 1964 || $O(nlogn)$ || || Exact || Deterministic || | ||
| | |||
| | |||
|- | |- | ||
| | | [[BST Algorithm (Duplicate Elimination Duplicate Elimination)|BST Algorithm]] || 1999 || $O(nlogn)$ || $O(\log n)$ || Exact || Deterministic || | ||
| | |||
|- | |- | ||
| | | [[Priority Queue Algorithm (Duplicate Elimination Duplicate Elimination)|Priority Queue Algorithm]] || 1976 || $O(n^{2})$ || $O(n)$ || Exact || Deterministic || | ||
| [ | |- | ||
| [[Sorted Neighborhood Algorithm (SNA) (Duplicate Elimination Duplicate Elimination)|Sorted Neighborhood Algorithm (SNA)]] || 1998 || $O(n^{2})$ || $O(n)$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/article/10.1023/A:1009761603038 Time] | |||
|- | |||
| [[Duplicate Elimination Sorted Neighborhood Algorithm (DE-SNA) (Duplicate Elimination Duplicate Elimination)|Duplicate Elimination Sorted Neighborhood Algorithm (DE-SNA)]] || 2002 || $O(nlogn)$ || || Exact || Deterministic || | |||
|- | |||
| [[Adaptive Duplicate Detection Algorithm (ADD) (Duplicate Elimination Duplicate Elimination)|Adaptive Duplicate Detection Algorithm (ADD)]] || 2003 || $O(n^{3})$ || $O({1})$ || Exact || Deterministic || [https://dl-acm-org.ezproxy.canberra.edu.au/doi/10.1145/956750.956759 Time] | |||
|- | |||
|} | |||
== Time Complexity graph == | |||
[[File:Duplicate Elimination - Time.png|1000px]] | |||
== Space Complexity graph == | |||
[ | [[File:Duplicate Elimination - Space.png|1000px]] | ||
== Pareto Decades graph == | |||
[ | [[File:Duplicate Elimination - Pareto Frontier.png|1000px]] | ||
Revision as of 10:24, 15 February 2023
Description
SQL does not eliminate duplicates implicitly. It allows to enter duplicate values on columns other than candidate key or if did not specified any keys. If the user wants to eliminate duplicate records, he has to use DISTINCT keyword in the query.
Databases, therefore, can have duplicate entries. The problem deals with identifying and removing duplicates from a database.
Parameters
No parameters found.
Table of Algorithms
Name | Year | Time | Space | Approximation Factor | Model | Reference |
---|---|---|---|---|---|---|
Sorting based (Merge Sort) | 1964 | $O(nlogn)$ | $O(n)$ | Exact | Deterministic | |
Sorting based (Merge Sort) + real-time elimination | 1964 | $O(nlogn)$ | Exact | Deterministic | ||
BST Algorithm | 1999 | $O(nlogn)$ | $O(\log n)$ | Exact | Deterministic | |
Priority Queue Algorithm | 1976 | $O(n^{2})$ | $O(n)$ | Exact | Deterministic | |
Sorted Neighborhood Algorithm (SNA) | 1998 | $O(n^{2})$ | $O(n)$ | Exact | Deterministic | Time |
Duplicate Elimination Sorted Neighborhood Algorithm (DE-SNA) | 2002 | $O(nlogn)$ | Exact | Deterministic | ||
Adaptive Duplicate Detection Algorithm (ADD) | 2003 | $O(n^{3})$ | $O({1})$ | Exact | Deterministic | Time |
Time Complexity graph
Error creating thumbnail: Unable to save thumbnail to destination
Space Complexity graph
Error creating thumbnail: Unable to save thumbnail to destination
Pareto Decades graph
Error creating thumbnail: Unable to save thumbnail to destination