TY - GEN
T1 - EDAM
T2 - 4th Asia-Pacific Bioinformatics Conference, APBC 2006
AU - Ma, Yifei
AU - Wang, Guoren
AU - Li, Yongguang
AU - Zhao, Yuehai
PY - 2006
Y1 - 2006
N2 - Finding motifs in DNA sequences plays an important role in deciphering transcriptional regulatory mechanisms and in drug target identification. In this paper, we propose an efficient algorithm, EDAM, for finding motifs based on frequency transformation and Minimum Bounding Rectangle (MBR) techniques. It works in three phases: frequency transformation, MBR-clique searching, and motif discovery. In frequency transformation, EDAM divides the sample sequences into a set of substrings using sliding windows, then transforms them into frequency vectors, which are stored in MBRs. In MBR-clique searching, EDAM uses the frequency distance theorems to search for the MBR-cliques used in motif discovery. In motif discovery, EDAM discovers larger cliques by extending smaller cliques with their neighbors. To accelerate clique discovery, we propose a range query facility that avoids unnecessary computations during clique extension. The experimental results show that EDAM effectively addresses the running-time bottleneck of the motif discovery problem in large DNA databases.
UR - https://www.scopus.com/pages/publications/84863053520
M3 - Conference contribution
AN - SCOPUS:84863053520
SN - 1860946232
SN - 9781860946233
T3 - Series on Advances in Bioinformatics and Computational Biology
SP - 119
EP - 128
BT - Proceedings of the 4th Asia-Pacific Bioinformatics Conference, APBC 2006
Y2 - 13 February 2006 through 16 February 2006
ER -