Abstract
Motivation: Regulatory DNA elements are associated with DNase I hypersensitive sites (DHSs). Accordingly, identifying DHSs will provide useful insights for in-depth investigation into the function of noncoding genomic regions. Results: In this study, we used an ensemble learning framework to develop a new predictor, called iDHS-EL, for identifying the locations of DHSs in the human genome. It was formed by fusing three individual Random Forest (RF) classifiers into an ensemble predictor. The three RF classifiers were based, respectively, on three special modes of the general pseudo nucleotide composition (PseKNC): (i) kmer, (ii) reverse complement kmer and (iii) pseudo dinucleotide composition. The new predictor was shown to remarkably outperform the relevant state-of-the-art methods in both accuracy and stability.
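As a rough illustration of the first two feature modes named in the abstract, the sketch below computes normalized kmer frequencies and reverse complement kmer frequencies for a DNA sequence using only the Python standard library. The function names, the default `k=2`, and the normalization are assumptions made for illustration, not the paper's implementation. The pseudo dinucleotide composition mode is omitted because it needs physicochemical property values that the abstract does not give. The `majority_vote` helper is likewise hypothetical: the abstract says the three RF classifiers are fused but does not state the fusion rule.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def _windows(seq, k):
    """Yield every overlapping length-k substring of seq, uppercased."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_features(seq, k=2):
    """Normalized frequency of each of the 4**k kmers, in lexicographic order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(_windows(seq, k))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def revc_kmer_features(seq, k=2):
    """Like kmer_features, but a kmer and its reverse complement share one bin.

    Each pair is represented by its lexicographically smaller member.
    """
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    canonical = sorted({min(m, reverse_complement(m)) for m in all_kmers})
    counts = Counter(min(w, reverse_complement(w)) for w in _windows(seq, k))
    total = sum(counts[m] for m in canonical)
    return [counts[m] / total if total else 0.0 for m in canonical]


def majority_vote(labels):
    """Hypothetical fusion rule: return the most common predicted label."""
    return Counter(labels).most_common(1)[0][0]
```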
| Original language | English |
|---|---|
| Pages (from-to) | 2411-2418 |
| Number of pages | 8 |
| Journal | Bioinformatics |
| Volume | 32 |
| Issue number | 16 |
| Publication status | Published - 15 Aug 2016 |
| Externally published | Yes |