Abstract
This paper studies the problem of finding approximate repetitions in DNA sequences, an important problem in gene analysis. Based on an analysis of approximate repetitions and of existing similarity measures, two new measures built on the Hamming distance, pattern-similarity and segment-similarity, are proposed. On the basis of these two definitions, a new concept of approximate repetition is given: the segment-similarity based approximate tandem repeat (SATR). In addition, the succeeding unit array (SUA) is introduced as a lightweight index, and an algorithm that uses this index to find SATRs in DNA sequences is designed. Theoretical analysis and experimental results both show that the SUA-based SATR finding algorithm is superior to other methods in the results it finds and in running time.
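The abstract does not state the definitions of pattern-similarity, segment-similarity, or the SUA, so the following is only a minimal sketch of the underlying idea: using the Hamming distance to decide whether adjacent units of a DNA sequence are approximately equal. The function names, the fixed unit length, and the mismatch threshold are all assumptions made for illustration. The brute-force scan is a stand-in and is not the SATR or SUA algorithm of the paper.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def naive_tandem_repeats(seq: str, unit_len: int, max_mismatch: int,
                         min_copies: int = 2):
    """Brute-force scan for approximate tandem repeats (illustrative only).

    Reports (start, copies) for each run in which every unit of length
    `unit_len` differs from the unit immediately before it by at most
    `max_mismatch` positions. This is a hypothetical similarity rule,
    not the paper's segment-similarity.
    """
    results = []
    n = len(seq)
    i = 0
    while i + 2 * unit_len <= n:
        copies = 1
        j = i
        # Extend the run while the next unit stays close to the current one.
        while (j + 2 * unit_len <= n and
               hamming(seq[j:j + unit_len],
                       seq[j + unit_len:j + 2 * unit_len]) <= max_mismatch):
            copies += 1
            j += unit_len
        if copies >= min_copies:
            results.append((i, copies))
            i = j + unit_len  # continue after the reported run
        else:
            i += 1
    return results


if __name__ == "__main__":
    # "ACG", "ACG", "ACT" form a run when one mismatch per unit is allowed.
    print(naive_tandem_repeats("ACGACGACTTTT", unit_len=3, max_mismatch=1))
```

This scan recomputes a Hamming distance at nearly every position and handles only one unit length per call, which is the kind of repeated work an index is meant to avoid.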
Original language | English |
---|---|
Pages (from-to) | 184-188 |
Number of pages | 5 |
Journal | Dongbei Daxue Xuebao/Journal of Northeastern University |
Volume | 28 |
Issue number | 2 |
Publication status | Published - Feb 2007 |
Externally published | Yes |
Keywords
- Approximate repetitions
- DNA sequence
- SATR
- Segment-similarity
- Succeeding unit array (SUA)