TY - GEN
T1 - Transfer-DDG
T2 - 2nd IEEE Industrial Electronics Society Annual On-Line Conference, ONCON 2023
AU - Wang, Yuxiang
AU - Shi, Xiumin
AU - Zhou, Han
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - The majority of proteins in organisms function through protein-protein interactions (PPIs). Specific amino acid mutations in these proteins may alter their function, which in turn may affect their interactions with other proteins. The effect of a mutation on protein interactions can be quantified by the change in protein-protein binding affinity (ΔΔG). Conventional laboratory experiments to determine ΔΔG values are both inefficient and costly. In recent years, machine learning and deep learning techniques have been developed for rapid prediction of ΔΔG values. However, many of these approaches fail to extract the deep information embedded in protein amino acid sequences, which leads to unstable predictions, and they do not generalize well across different types of protein ΔΔG prediction tasks. In this paper, we introduce TransferDDG, a deep learning framework for predicting ΔΔG from protein amino acid sequences. The framework uses large pre-trained models to learn features at the level of individual amino acids and a BiLSTM module to learn features at the level of the amino acid sequence. This enables the extraction of deep semantic information from protein sequences across multiple dimensions and channels. The model achieves good results on the SKP1102s and SKP1400m datasets, both of which contain single amino acid mutations, and surpasses the baseline models.
KW - amino acid mutations
KW - amino acid sequences
KW - deep learning
KW - pre-trained models
KW - protein-protein binding affinity
UR - https://www.scopus.com/pages/publications/85186744677
U2 - 10.1109/ONCON60463.2023.10430682
DO - 10.1109/ONCON60463.2023.10430682
M3 - Conference contribution
AN - SCOPUS:85186744677
T3 - 2023 IEEE 2nd Industrial Electronics Society Annual On-Line Conference, ONCON 2023
BT - 2023 IEEE 2nd Industrial Electronics Society Annual On-Line Conference, ONCON 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 8 December 2023 through 10 December 2023
ER -