TY - JOUR
T1 - iMFP-LG
T2 - Identify Novel Multi-functional Peptides Using Protein Language Models and Graph-based Deep Learning
AU - Luo, Jiawei
AU - Zhao, Kejuan
AU - Chen, Junjie
AU - Yang, Caihua
AU - Qu, Fuchuan
AU - Liu, Yumeng
AU - Jin, Xiaopeng
AU - Yan, Ke
AU - Zhang, Yang
AU - Liu, Bin
N1 - Publisher Copyright:
© 2024 The Author(s).
PY - 2024/12/1
Y1 - 2024/12/1
N2 - Functional peptides are short amino acid fragments that have a wide range of beneficial functions for living organisms. The majority of previous studies have focused on mono-functional peptides, but an increasing number of multi-functional peptides have been discovered. Although there have been enormous experimental efforts to assay multi-functional peptides, only a small portion of millions of known peptides has been explored. The development of effective and accurate techniques for identifying multi-functional peptides can facilitate their discovery and mechanistic understanding. In this study, we presented iMFP-LG, a method for multi-functional peptide identification based on protein language models (pLMs) and graph attention networks (GATs). Our comparative analyses demonstrated that iMFP-LG outperformed the state-of-the-art methods in identifying both multi-functional bioactive peptides and multi-functional therapeutic peptides. The interpretability of iMFP-LG was also illustrated by visualizing attention patterns in pLMs and GATs. Given the outstanding performance of iMFP-LG on the identification of multi-functional peptides, we employed iMFP-LG to screen novel peptides with both anti-microbial and anti-cancer functions from millions of known peptides in the UniRef90 database. As a result, eight candidate peptides were identified, among which one candidate was validated to possess both anti-bacterial and anti-cancer properties through molecular structure alignment and biological experiments. We anticipate that iMFP-LG can assist in the discovery of multi-functional peptides and contribute to the advancement of peptide drug design.
AB - Functional peptides are short amino acid fragments that have a wide range of beneficial functions for living organisms. The majority of previous studies have focused on mono-functional peptides, but an increasing number of multi-functional peptides have been discovered. Although there have been enormous experimental efforts to assay multi-functional peptides, only a small portion of millions of known peptides has been explored. The development of effective and accurate techniques for identifying multi-functional peptides can facilitate their discovery and mechanistic understanding. In this study, we presented iMFP-LG, a method for multi-functional peptide identification based on protein language models (pLMs) and graph attention networks (GATs). Our comparative analyses demonstrated that iMFP-LG outperformed the state-of-the-art methods in identifying both multi-functional bioactive peptides and multi-functional therapeutic peptides. The interpretability of iMFP-LG was also illustrated by visualizing attention patterns in pLMs and GATs. Given the outstanding performance of iMFP-LG on the identification of multi-functional peptides, we employed iMFP-LG to screen novel peptides with both anti-microbial and anti-cancer functions from millions of known peptides in the UniRef90 database. As a result, eight candidate peptides were identified, among which one candidate was validated to possess both anti-bacterial and anti-cancer properties through molecular structure alignment and biological experiments. We anticipate that iMFP-LG can assist in the discovery of multi-functional peptides and contribute to the advancement of peptide drug design.
KW - Deep learning
KW - Graph attention network
KW - Multi-functional peptide discovery
KW - Protein language model
KW - Therapeutic peptide screening
UR - http://www.scopus.com/inward/record.url?scp=105003109436&partnerID=8YFLogxK
U2 - 10.1093/gpbjnl/qzae084
DO - 10.1093/gpbjnl/qzae084
M3 - Article
C2 - 39585308
AN - SCOPUS:105003109436
SN - 1672-0229
VL - 22
JO - Genomics, Proteomics and Bioinformatics
JF - Genomics, Proteomics and Bioinformatics
IS - 6
M1 - qzae084
ER -