Motif Search: Difference between revisions

From Algorithm Wiki
Jump to navigation Jump to search
(Created page with "== Problem Description== == Bounds Chart == 350px == Step Chart == 350px == Improvement Table == {| class="wikitable" style="text-align:center;" width="100%" !width="20%" | Complexity Classes !! width="40%" | Algorithm Paper Links !! width="40%" | Lower Bounds Paper Links |- | rowspan="1" | Exp/Factorial | | |- | rowspan="1" | Polynomial > 3 | | |- | rowspan="1" | Cubic | | |- | rowspan="1" |...")
 
No edit summary
Line 1: Line 1:
== Problem Description==
{{DISPLAYTITLE:Motif Search (Motif Search)}}
== Description ==  


Motif search is the problem of identifying motifs, recurring or conserved patterns, in the strings (typically biological sequence data sets).


== Bounds Chart ==
== Parameters ==  
[[File:Motif_SearchBoundsChart.png|350px]]


== Step Chart ==
<pre>$n$: size of set of input strings
[[File:Motif_SearchStepChart.png|350px]]
$m$: size of input strings
$k$: length of substrings
$\sigma$: function $V(k, m)$ defined as the number of $k$-mers that are at most $m$ Hamming distance from the motif space</pre>


== Improvement Table ==
== Table of Algorithms ==  
{| class="wikitable" style="text-align:center;" width="100%"  
 
!width="20%" | Complexity Classes !! width="40%" | Algorithm Paper Links !! width="40%" | Lower Bounds Paper Links
{| class="wikitable sortable" style="text-align:center;" width="100%"
  |-
 
| rowspan="1" | Exp/Factorial
! Name !! Year !! Time !! Space !! Approximation Factor !! Model !! Reference
|
 
|
|-
 
| [[Lawrence, Reilly (Motif Search Motif Search)|Lawrence, Reilly]] || 1990 || $O(nm)$ || $O(nm)$ ||  || Deterministic || [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pubmed/2184437 Time & Space]
|-
| [[Lawrence Gibbs Sampling (Motif Search Motif Search)|Lawrence Gibbs Sampling]] || 1993 || $O(nm)$ || $O(n + m)$ ||  || Deterministic || [https://science-sciencemag-org.ezproxy.canberra.edu.au/content/262/5131/208 Time]
|-
| [[MotifSampler  (Motif Search Motif Search)|MotifSampler ]] || 2001 || $O(nm)$ || $O(n + m)$ ||  || Deterministic || [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pubmed/12015892 Time]
|-
| [[Speller (Motif Search Motif Search)|Speller]] || 1998 || $O(mn^{2} \sigma)$ || $O(mn^{2}/w)$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/chapter/10.1007/BFb0054337 Time] & [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[Mitra (Motif Search Motif Search)|Mitra]] || 2002 || $O(k nm \sigma)$ || $O(mnk)$ || Exact || Deterministic || [https://pubmed-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/12169566/ Time] & [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[Census (Motif Search Motif Search)|Census]] || 2003 || $O(k nm \sigma)$ || $O(mnk)$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/chapter/10.1007/978-3-540-45078-8_5 Time] & [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[Risotto (Motif Search Motif Search)|Risotto]] || 2006 || $O(mn^{2} \sigma)$ || $O(mn^{2})$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/chapter/10.1007/11682462_69 Time] & [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[PMS (Motif Search Motif Search)|PMS]] || 2007 || $O(nm^{2} \sigma)$ || $O(m^{2} n)$ || Exact || Deterministic || [https://ieeexplore-ieee-org.ezproxy.canberra.edu.au/abstract/document/4359890 Time] & [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[Roth AlignACE (Motif Search Motif Search)|Roth AlignACE]] || 1998 || $O(nm)$ || $O(n + m)$ || || Deterministic || [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pubmed/9788350 Time]
|-
| [[Helden Oligo-Analysis ( Motif Search)|Helden Oligo-Analysis]] || 1998 || $O(nm)$ || $O(m)$ || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/9719638 Time]
|-
| [[van Helden J; Rios AF; Collado-Vides J ( Motif Search)|van Helden J; Rios AF; Collado-Vides J]] || 2000 || $O(nm)$ || $O(m)$ || Exact || Deterministic || [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pmc/articles/PMC102821/ Time]
|-
|-
| rowspan="1" | Polynomial > 3
| [[Tompa M ( Motif Search)|Tompa M]] || 1999 || $O(nm)$ || $O(m^{2})$ || Exact || Deterministic || [https://www.aaai.org/Papers/ISMB/1999/ISMB99-030.pdf Time]
|
|
|-
|-
| rowspan="1" | Cubic
| [[Sinha S; Tompa M YMF (Yeast Motif Finder) ( Motif Search)|Sinha S; Tompa M YMF (Yeast Motif Finder)]] || 2000 || $O(n^{0.{6}6} m)$ || $O(m)$ || Exact || Deterministic || [https://www-ncbi-nlm-nih-gov.ezproxy.canberra.edu.au/pubmed/10977095 Time]
|
|
|-
|-
| rowspan="1" | Quadratic
| [[Bailey TL; Elkan C MEME ( Motif Search)|Bailey TL; Elkan C MEME]] || 1995 || $O(n^{2}m^{2})$ || $O(nm)$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/article/10.1007/BF00993379 Time]
|
|
|-
|-
| rowspan="1" | nlogn
| [[Sagot M ( Motif Search)|Sagot M]] || 1988 || $O(n logn m^{1.{4}5})$ || $O(mn^{2}/w)$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/chapter/10.1007/BFb0054337 Time & Space]
|
|
|-
|-
| rowspan="1" | Linear
| [[Liang Cwinnower ( Motif Search)|Liang Cwinnower]] || 2003 || $O(nm^{0.5})$ || $O(m^{2})$ || Exact || Deterministic || [https://www.worldscientific.com/doi/10.1142/S0219720004000466 Time]
|
|
|-
|-
| rowspan="1" | logn
| [[Kingsford ( Motif Search)|Kingsford]] || 2006 || $O(mn)$ || $O(m^{2}n^{2})$ || Exact || Deterministic || [https://link-springer-com.ezproxy.canberra.edu.au/chapter/10.1007/11780441_22 Time]
|
|-
|
|}
|-|}
 
== Time Complexity graph ==
 
[[File:Motif Search - Time.png|1000px]]
 
== Space Complexity graph ==
 
[[File:Motif Search - Space.png|1000px]]
 
== Pareto Decades graph ==
 
[[File:Motif Search - Pareto Frontier.png|1000px]]

Revision as of 10:24, 15 February 2023

Description

Motif search is the problem of identifying motifs, recurring or conserved patterns, in the strings (typically biological sequence data sets).

Parameters

$n$: size of set of input strings
$m$: size of input strings
$k$: length of substrings
$\sigma$: function $V(k, m)$ defined as the number of $k$-mers that are at most $m$ Hamming distance from the motif space

Table of Algorithms

Name Year Time Space Approximation Factor Model Reference
Lawrence, Reilly 1990 $O(nm)$ $O(nm)$ Deterministic Time & Space
Lawrence Gibbs Sampling 1993 $O(nm)$ $O(n + m)$ Deterministic Time
MotifSampler 2001 $O(nm)$ $O(n + m)$ Deterministic Time
Speller 1998 $O(mn^{2} \sigma)$ $O(mn^{2}/w)$ Exact Deterministic Time & Space
Mitra 2002 $O(k nm \sigma)$ $O(mnk)$ Exact Deterministic Time & Space
Census 2003 $O(k nm \sigma)$ $O(mnk)$ Exact Deterministic Time & Space
Risotto 2006 $O(mn^{2} \sigma)$ $O(mn^{2})$ Exact Deterministic Time & Space
PMS 2007 $O(nm^{2} \sigma)$ $O(m^{2} n)$ Exact Deterministic Time & Space
Roth AlignACE 1998 $O(nm)$ $O(n + m)$ Deterministic Time
Helden Oligo-Analysis 1998 $O(nm)$ $O(m)$ Exact Deterministic Time
van Helden J; Rios AF; Collado-Vides J 2000 $O(nm)$ $O(m)$ Exact Deterministic Time
Tompa M 1999 $O(nm)$ $O(m^{2})$ Exact Deterministic Time
Sinha S; Tompa M YMF (Yeast Motif Finder) 2000 $O(n^{0.{6}6} m)$ $O(m)$ Exact Deterministic Time
Bailey TL; Elkan C MEME 1995 $O(n^{2}m^{2})$ $O(nm)$ Exact Deterministic Time
Sagot M 1988 $O(n logn m^{1.{4}5})$ $O(mn^{2}/w)$ Exact Deterministic Time & Space
Liang Cwinnower 2003 $O(nm^{0.5})$ $O(m^{2})$ Exact Deterministic Time
Kingsford 2006 $O(mn)$ $O(m^{2}n^{2})$ Exact Deterministic Time

Time Complexity graph

Error creating thumbnail: Unable to save thumbnail to destination

Space Complexity graph

Error creating thumbnail: Unable to save thumbnail to destination

Pareto Decades graph

Error creating thumbnail: Unable to save thumbnail to destination