Abstract
Searching for repetitions is an important topic in bio-sequence analysis, but the indices currently used for it, such as the suffix tree, are bottlenecked by excessive space consumption. To address this bottleneck, the succeeding unit array (SUA), a lightweight index structure, is proposed on the basis of an analysis of repetitions in DNA sequences. SUA is constructed using radix sorting and is also suitable for multi-sequence analysis. Theoretical analysis shows the advantage of SUA in space consumption: for a sequence of length n, the space consumption of SUA was only about 5n in the experiments. Its construction is also faster than that of other indices such as the suffix tree.
| Original language | English |
|---|---|
| Pages (from-to) | 209-212+225 |
| Journal | Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition) |
| Volume | 33 |
| Issue number | SUPPL. |
| Publication status | Published - Dec 2005 |
| Externally published | Yes |
Keywords
- DNA sequences
- Repetition
- SUA