DisoFLAG: accurate prediction of protein intrinsic disorder and its functions using graph-based interaction protein language model

Yihe Pang, Bin Liu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Intrinsically disordered proteins and regions (IDPs/IDRs) are functionally important proteins and regions that exist as highly dynamic conformations under natural physiological conditions. IDPs/IDRs exhibit a broad range of molecular functions, which involve binding interactions with partners and the retention of native structural flexibility. The rapid increase in the number of proteins in sequence databases and the diversity of disordered functions challenge existing computational methods for predicting protein intrinsic disorder and disordered functions. A single disordered region can interact with different partners to perform multiple functions, and these disordered functions exhibit different dependencies and correlations. In this study, we introduce DisoFLAG, a computational method that leverages a graph-based interaction protein language model (GiPLM) to jointly predict disorder and its multiple potential functions. GiPLM integrates protein semantic information from pre-trained protein language models into graph-based interaction units to enhance the correlation of the semantic representations of multiple disordered functions. The DisoFLAG predictor takes amino acid sequences as its only input and predicts intrinsic disorder and six disordered functions for proteins: protein-binding, DNA-binding, RNA-binding, ion-binding, lipid-binding, and flexible linker. We evaluated the predictive performance of DisoFLAG following the Critical Assessment of protein Intrinsic Disorder (CAID) experiments, and the results demonstrated that DisoFLAG offers accurate and comprehensive predictions of disordered functions, extending the current coverage of computationally predicted disordered function categories. A standalone package and a web server for DisoFLAG have been established to provide accurate prediction tools for intrinsic disorder and its associated functions.
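
The abstract describes a predictor that maps one amino acid sequence to per-residue predictions for disorder and six disordered functions. As a rough illustration of how such output could be consumed, the following is a minimal sketch that is not part of DisoFLAG: the function names `regions_above` and `summarize`, the score layout (one list of per-residue scores per task), and the 0.5 cutoff are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# The seven per-residue outputs named in the abstract.
TASKS = [
    "disorder", "protein-binding", "DNA-binding", "RNA-binding",
    "ion-binding", "lipid-binding", "flexible linker",
]


def regions_above(scores: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where score >= threshold.

    The 0.5 default is an illustrative assumption, not a published cutoff.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i  # a new run begins at this residue
        elif start is not None:
            regions.append((start, i - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # the run extends to the last residue
    return regions


def summarize(
    sequence: str,
    predictions: Dict[str, List[float]],
    threshold: float = 0.5,
) -> Dict[str, List[Tuple[int, int]]]:
    """Collapse hypothetical per-residue scores into regions for each task."""
    for task in TASKS:
        if len(predictions[task]) != len(sequence):
            raise ValueError(f"{task}: expected one score per residue")
    return {task: regions_above(predictions[task], threshold) for task in TASKS}
```

Given made-up scores for a five-residue sequence, `summarize` returns one list of residue ranges per task, which makes it easy to see where predicted disorder and predicted function regions overlap.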

Original language: English
Article number: 3
Journal: BMC Biology
Volume: 22
Issue number: 1
Publication status: Published - Dec 2024

Keywords

  • Disordered function prediction
  • Graph-based interaction protein language model
  • Protein intrinsic disorder
  • Protein language model
