TY - JOUR
T1 - Biomedical Text Classification Method Based on Hypergraph Attention Network
AU - Bai, Simeng
AU - Niu, Zhendong
AU - He, Hui
AU - Shi, Kaize
AU - Yi, Kun
AU - Ma, Yuanchi
N1 - Publisher Copyright: © 2022, Chinese Academy of Sciences. All rights reserved.
PY - 2022/11/25
Y1 - 2022/11/25
AB - [Objective] This paper proposes a new model that integrates label semantics. It uses a text-level hypergraph and a cross-attention mechanism to capture the organizational structure and grammatical semantics of the literature, aiming to improve the classification of biomedical texts. [Methods] First, we used a fine-tuned BioBERT to extract feature vectors from the biomedical texts. Then, we constructed a text-level hypergraph to capture the word order, semantics, and syntax of the texts. Finally, we fused the text-level hypergraph features with the label semantics through a cross-attention network to perform the text classification. [Results] On the PM-Sentence dataset, the proposed model scores 2.34 percentage points higher than the baseline model on the comprehensive F1 metric. [Limitations] The experimental dataset needs to be expanded to evaluate the model’s performance in other fields. [Conclusions] The proposed model improves the classification of biomedical texts and provides effective support for knowledge retrieval and mining.
KW - Biomedical Field Label Information Fusion
KW - Cross Attention Mechanism
KW - Text Classification
KW - Text-Level Hypergraph
UR - http://www.scopus.com/inward/record.url?scp=85146450233&partnerID=8YFLogxK
U2 - 10.11925/infotech.2096-3467.2022.0145
DO - 10.11925/infotech.2096-3467.2022.0145
M3 - Article
AN - SCOPUS:85146450233
SN - 2096-3467
VL - 6
SP - 13
EP - 24
JO - Data Analysis and Knowledge Discovery
JF - Data Analysis and Knowledge Discovery
IS - 11
ER -