A multi-layer soft lattice based model for Chinese clinical named entity recognition

Shuli Guo, Wentao Yang, Lina Han, Xiaowei Song, Guowei Wang

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

OBJECTIVE: Named entity recognition (NER) is a key and fundamental part of many medical and clinical tasks, including the construction of medical knowledge graphs, decision-making support, and question answering systems. When extracting entities from electronic health records (EHRs), NER models mostly apply long short-term memory (LSTM) and achieve surprisingly strong performance in clinical NER. However, these LSTM-based models often require deeper networks to capture long-distance dependencies. As a result, LSTM-based models that achieve high accuracy generally require long training times and extensive training data, which has obstructed their adoption in clinical scenarios with limited training time. METHOD: Inspired by the Transformer, we combine the Transformer with a Soft Term Position Lattice to form a soft lattice structure Transformer, which models long-distance dependencies in a manner similar to LSTM. Our model consists of four components: the WordPiece module, the BERT module, the soft lattice structure Transformer module, and the CRF module. RESULT: Our experiments demonstrate that this approach increases F1 by 1-5% on the CCKS NER task compared with other LSTM-CRF-based models, while consuming less training time. Additional evaluations show that the soft lattice structure Transformer performs well on long medical terms, abbreviations, and numbers: the proposed model achieves a 91.6% F-measure on long medical terms and a 90.36% F-measure on abbreviations and numbers. CONCLUSIONS: By using the soft lattice structure Transformer, the proposed method captures Chinese word-to-lattice information, making our model well suited to Chinese clinical medical records. A Transformer with multilayer soft lattice Chinese word construction can capture potential interactions between Chinese characters and words.
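The abstract's pipeline ends with a CRF module that decodes the best tag sequence over the Transformer's token scores. As a minimal, self-contained sketch of that decoding step only (the scores and tag set below are toy values, not outputs of the paper's model), here is a linear-chain Viterbi decoder in pure Python:

```python
# Minimal Viterbi decoding for a linear-chain CRF output layer.
# emissions: per-token tag scores (here hand-made toy values, not BERT outputs);
# transitions: scores for moving from one tag to the next.

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions: list of dicts {tag: score}, one dict per token.
    transitions: dict {(prev_tag, tag): score}.
    """
    tags = list(emissions[0].keys())
    # Best cumulative score for each tag at the first token.
    score = {t: emissions[0][t] for t in tags}
    back = []  # backpointers: one dict {tag: best_prev_tag} per later token
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: score[p] + transitions[(p, t)])
            new_score[t] = score[best_prev] + transitions[(best_prev, t)] + em[t]
            ptr[t] = best_prev
        back.append(ptr)
        score = new_score
    # Backtrack from the best final tag.
    last = max(tags, key=lambda t: score[t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Toy example: BIO tagging of a three-token span.
emissions = [
    {"B": 2.0, "I": 0.1, "O": 0.5},
    {"B": 0.2, "I": 1.5, "O": 0.3},
    {"B": 0.1, "I": 0.2, "O": 1.8},
]
# Reward B->I and I->I transitions, which keeps entity spans contiguous.
transitions = {(p, t): (0.5 if (p, t) in {("B", "I"), ("I", "I")} else 0.0)
               for p in "BIO" for t in "BIO"}
print(viterbi_decode(emissions, transitions))  # -> ['B', 'I', 'O']
```

The transition scores are what distinguish a CRF layer from per-token softmax classification: they let the decoder reject tag sequences (such as an `I` with no preceding `B`) that are locally plausible but globally inconsistent.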

Original language: English
Pages (from-to): 201
Number of pages: 1
Journal: BMC Medical Informatics and Decision Making
Volume: 22
Issue number: 1
Publication status: Published - 30 Jul 2022

Keywords

  • Clinical named entity recognition
  • Clinical text mining
  • Fine-tuning BERT
  • Medical information processing
  • Transformer
  • Word-character lattice
