Nested Named Entity Recognition in Chinese Electronic Medical Records

Maolin Yang, Zeran Lu, Yucong Lin, Hong Song*, Jian Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Nested named entity recognition (NER) is crucial in processing Chinese electronic medical records (EMRs). Recently, a BERT-based model using a CNN and a multi-head Biaffine decoder has shown promising results in nested NER on news datasets. However, this model has difficulty with the complex and unevenly distributed entities in Chinese EMRs, which leads to prediction errors. This paper proposes MC-BERT-CGC, a model built on MC-BERT semantic features that combines Context-Gated Convolution with a multi-head Biaffine decoder. The model first incorporates Chinese medical language knowledge by using MC-BERT to represent medical descriptions as sentence vectors. It then uses Context-Gated Convolution to delineate the boundaries of nested entities by learning the overlapping relationships between different entities. Finally, it uses Focal Loss to classify difficult-to-distinguish entities. Experiments on our Chinese EMRs and the CMeEE-V2 dataset show that the model outperforms existing baseline models on Chinese medical NER tasks. The study matters for patients because more accurate and detailed medical information can be extracted from EMRs, potentially leading to improved diagnoses, personalized treatment recommendations, and proactive identification of health risks. Our code is available at https://github.com/ymlmorning/MC-BERT-CGC.
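The abstract names Focal Loss as the mechanism for handling difficult-to-distinguish entities but gives no configuration. As a rough illustration only, the sketch below implements the standard multi-class focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), for a single example. The values gamma=2.0 and alpha=1.0, the helper names, and the toy logits are assumptions made for this sketch; they are not taken from the paper or from the MC-BERT-CGC repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one example: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults, not the paper's settings.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical label scores for two candidate spans; the true label index is 0.
easy = [4.0, 0.0, 0.0]  # the model is already confident in the true label
hard = [0.0, 1.0, 0.0]  # the true label is not the top-scoring one

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.4f}")
```

The ratio of focal loss to cross-entropy is much smaller for the easy span than for the hard one, which is the intended effect: well-classified examples contribute little, so training concentrates on the ambiguous ones. Setting gamma to 0 recovers ordinary cross-entropy.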

Original language: English
Title of host publication: Computational Intelligence Methods for Bioinformatics and Biostatistics - 18th International Meeting, CIBB 2023, Revised Selected Papers
Editors: Martina Vettoretti, Erica Tavazzi, Enrico Longato, Giacomo Baruzzo, Massimo Bellato
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 58-69
Number of pages: 12
ISBN (Print): 9783031907135
Publication status: Published - 2025
Externally published: Yes
Event: 18th International Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2023 - Padova, Italy
Duration: 6 Sept 2023 to 8 Sept 2023

Publication series

Name: Lecture Notes in Computer Science
Volume: 14513 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2023
Country/Territory: Italy
City: Padova
Period: 6/09/23 to 8/09/23

Keywords

  • Chinese electronic medical records
  • Context-Gated convolution
  • Focal Loss
  • MC-BERT
  • Nested named entity recognition
