TY - JOUR
T1 - ProtDec-LTR3.0
T2 - Protein remote homology detection by incorporating profile-based features into learning to rank
AU - Liu, Bin
AU - Zhu, Yulin
N1 - Publisher Copyright: © 2019 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
PY - 2019
Y1 - 2019
AB - Protein remote homology detection is one of the most challenging problems in protein sequence analysis and an important step for both theoretical research (such as understanding the structures and functions of proteins) and drug design. Previous studies have shown that combining different ranking methods via the learning to rank algorithm is an effective strategy for protein remote homology detection, and that performance can be further improved by protein similarity networks. In this paper, we improved the ProtDec-LTR1.0 and ProtDec-LTR2.0 predictors by incorporating three profile-based features (Top-1-gram, Top-2-gram, and ACC) into the learning to rank framework via feature mapping strategies. The predictive performance was further refined by the PageRank (PR) algorithm and the hyperlink-induced topic search (HITS) algorithm. The resulting predictor is called ProtDec-LTR3.0. Rigorous tests on two widely used benchmark datasets showed that ProtDec-LTR3.0 outperformed both ProtDec-LTR1.0 and ProtDec-LTR2.0, as well as nine other existing state-of-the-art predictors, indicating that ProtDec-LTR3.0 is an efficient method for protein remote homology detection and will become a useful tool for protein sequence analysis. A user-friendly web server for the ProtDec-LTR3.0 predictor is available at http://bliulab.net/ProtDec-LTR3.0/.
KW - Feature mapping strategy
KW - Hyperlink-induced topic search
KW - Learning to rank
KW - PageRank
KW - Profile-based features
KW - Protein remote homology detection
UR - http://www.scopus.com/inward/record.url?scp=85072162412&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2929363
DO - 10.1109/ACCESS.2019.2929363
M3 - Article
AN - SCOPUS:85072162412
SN - 2169-3536
VL - 7
SP - 102499
EP - 102507
JO - IEEE Access
JF - IEEE Access
M1 - 8765711
ER -