Graph embeddings on gene ontology annotations for protein–protein interaction prediction

Xiaoshi Zhong; Jagath C. Rajapakse

doi:10.1186/s12859-020-03816-8

Xiaoshi Zhong*, Jagath C. Rajapakse

*Corresponding author for this work

School of Computer Science and Technology

Nanyang Technological University

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

Background: Protein–protein interaction (PPI) prediction is an important step toward understanding many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous studies on PPI prediction do not consider the missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, missing PPI prediction and spurious PPI prediction, and propose a method that uses graph embeddings to learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to perform the two tasks. Our method leverages information from both term–term relations among GO terms and term–protein annotations between GO terms and proteins, and preserves both local and global structural information of the GOA graph.

Results: We compare our method with methods based on information content (IC) and with one method based on word embeddings, in experiments on three PPI datasets from the STRING database. The experimental results show that our method is more effective than the compared methods.

Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for the missing and spurious PPI prediction tasks defined here.

Original language	English
Article number	560
Journal	BMC Bioinformatics
Volume	21
DOIs	https://doi.org/10.1186/s12859-020-03816-8
Publication status	Published - Dec 2020

Keywords

Gene Ontology annotations
Graph embeddings
Missing PPIs
Protein–protein interactions
Spurious PPIs
Vector representations

Cite this

@article{8f7259e3f0d44bb499ed499ddd3a356b,

title = "Graph embeddings on gene ontology annotations for protein–protein interaction prediction",

abstract = "Background: Protein–protein interaction (PPI) prediction is an important step toward understanding many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous studies on PPI prediction do not consider the missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, missing PPI prediction and spurious PPI prediction, and propose a method that uses graph embeddings to learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to perform the two tasks. Our method leverages information from both term–term relations among GO terms and term–protein annotations between GO terms and proteins, and preserves both local and global structural information of the GOA graph. Results: We compare our method with methods based on information content (IC) and with one method based on word embeddings, in experiments on three PPI datasets from the STRING database. The experimental results show that our method is more effective than the compared methods. Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for the missing and spurious PPI prediction tasks defined here.",

keywords = "Gene Ontology annotations, Graph embeddings, Missing PPIs, Protein–protein interactions, Spurious PPIs, Vector representations",

author = "Zhong, Xiaoshi and Rajapakse, {Jagath C.}",

note = "Publisher Copyright: {\textcopyright} 2020, The Author(s).",

year = "2020",

month = dec,

doi = "10.1186/s12859-020-03816-8",

language = "English",

volume = "21",

journal = "BMC Bioinformatics",

issn = "1471-2105",

publisher = "BioMed Central Ltd.",

}

TY - JOUR

T1 - Graph embeddings on gene ontology annotations for protein–protein interaction prediction

AU - Zhong, Xiaoshi

AU - Rajapakse, Jagath C.

PY - 2020/12

Y1 - 2020/12

AB - Background: Protein–protein interaction (PPI) prediction is an important step toward understanding many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous studies on PPI prediction do not consider the missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, missing PPI prediction and spurious PPI prediction, and propose a method that uses graph embeddings to learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to perform the two tasks. Our method leverages information from both term–term relations among GO terms and term–protein annotations between GO terms and proteins, and preserves both local and global structural information of the GOA graph. Results: We compare our method with methods based on information content (IC) and with one method based on word embeddings, in experiments on three PPI datasets from the STRING database. The experimental results show that our method is more effective than the compared methods. Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for the missing and spurious PPI prediction tasks defined here.

KW - Gene Ontology annotations

KW - Graph embeddings

KW - Missing PPIs

KW - Protein–protein interactions

KW - Spurious PPIs

KW - Vector representations

UR - http://www.scopus.com/inward/record.url?scp=85097563903&partnerID=8YFLogxK

U2 - 10.1186/s12859-020-03816-8

DO - 10.1186/s12859-020-03816-8

M3 - Article

C2 - 33323115

AN - SCOPUS:85097563903

SN - 1471-2105

VL - 21

JO - BMC Bioinformatics

JF - BMC Bioinformatics

M1 - 560

ER -
