TY - JOUR
T1 - MuFGPS
T2 - enhancing liquid–liquid phase separation protein prediction through multi-level features and ensemble learning
AU - Xian, Lei
AU - Zou, Quan
AU - Qi, Ren
AU - Niu, Mengting
AU - Wang, Yansu
N1 - Publisher Copyright:
© The Author(s) 2026. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact journals.permissions@oup.com.
PY - 2026/5
Y1 - 2026/5
AB - Liquid–liquid phase separation (LLPS) is a key mechanism driving the assembly of membrane-less organelles and is increasingly recognized for its involvement in essential cellular functions and various diseases. However, existing computational approaches largely rely on sequence-level descriptors and often fail to explicitly incorporate structural topology information, limiting their ability to capture the complex determinants of LLPS behavior. Accurate identification of LLPS-capable proteins remains challenging due to their sequence diversity and complex structural determinants. Here, we present MuFGPS (Multi-level Feature Graph-based Predictor for Phase-Separating proteins), a predictive framework integrating sequence-derived physicochemical features, Define Secondary Structure of Proteins (DSSP)-annotated secondary structures, and graph-based structural embeddings from AlphaFold residue contact maps via a multi-head Graph Attention Network. Class imbalance is addressed using the Synthetic Minority Oversampling Technique (SMOTE), and classification is performed through a stacking ensemble of Random Forest, XGBoost, and LightGBM. Benchmarks against six representative methods demonstrate that MuFGPS achieves superior performance across all metrics, with notable gains in F1-score and Matthews correlation coefficient (MCC). Ablation analyses confirm the synergistic contributions of structural features and ensemble learning to accuracy and robustness. MuFGPS offers a scalable and high-accuracy framework for proteome-wide LLPS protein prediction.
KW - AlphaFold
KW - ensemble learning
KW - graph attention network
KW - liquid–liquid phase separation (LLPS)
KW - multi-level feature integration
UR - https://www.scopus.com/pages/publications/105039090147
U2 - 10.1093/bib/bbag235
DO - 10.1093/bib/bbag235
M3 - Article
AN - SCOPUS:105039090147
SN - 1467-5463
VL - 27
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 3
M1 - bbag235
ER -