Clinical-BERT: Vision-Language Pre-training for Radiograph Diagnosis and Reports Generation

Bin Yan, Mingtao Pei*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

32 Citations (Scopus)

Abstract

In this paper, we propose a vision-language pre-training model, Clinical-BERT, for the medical domain, and devise three domain-specific tasks: Clinical Diagnosis (CD), Masked MeSH Modeling (MMM), and Image-MeSH Matching (IMM), together with one general pre-training task, Masked Language Modeling (MLM), to pre-train the model. The CD task helps the model learn medical domain knowledge by predicting diseases from radiographs. Medical Subject Headings (MeSH) words are important semantic components in radiograph reports, and the MMM task helps the model focus on predicting MeSH words. The IMM task helps the model learn the alignment of MeSH words with radiographs through matching scores obtained by a two-level sparse attention: region sparse attention and word sparse attention. Region sparse attention generates corresponding visual features for each word, and word sparse attention enhances the contribution of image-MeSH matching to the matching scores. To the best of our knowledge, this is the first attempt to learn domain knowledge during pre-training for the medical domain. We evaluate the pre-trained model on Radiograph Diagnosis and Reports Generation tasks across four challenging datasets: MIMIC-CXR, IU X-Ray, COV-CTR, and NIH, and achieve state-of-the-art results on all tasks, which demonstrates the effectiveness of our pre-training model.

Original language: English
Title of host publication: AAAI-22 Technical Tracks 3
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 2982-2990
Number of pages: 9
ISBN (Electronic): 1577358767, 9781577358763
Publication status: Published - 30 Jun 2022
Event: 36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration: 22 Feb 2022 to 1 Mar 2022

Publication series

Name: Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume: 36

Conference

Conference: 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Virtual, Online
Period: 22/02/22 to 1/03/22
