Improving Radiology Report Generation with Adaptive Attention

Lin Wang, Jie Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

1 Citation (Scopus)

Abstract

To relieve radiologists of tedious and laborious report writing, automatic radiology report generation has drawn great attention in recent years. As a vision-to-language task, visual features and language features are equally important for radiology report generation. However, previous methods mainly focus on generating fluent reports and neglect the importance of better extracting and utilizing visual information. With this in mind, we propose a novel architecture with a CLIP-based visual extractor and a Multi-Head Adaptive Attention (MHAA) module to address these two issues: the vision-language pretrained encoders extract richer visual information, and during report generation, MHAA controls how much visual information participates in the generation of each word. Experiments conducted on two public datasets demonstrate that our method outperforms state-of-the-art methods on all metrics.
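The abstract does not give the exact MHAA formulation, so the following is only a minimal PyTorch sketch of the general mechanism it describes: multi-head attention over visual features augmented with a learned "sentinel" slot (in the spirit of adaptive attention, Lu et al., CVPR 2017) so that each head can route attention mass away from the image, gating how much visual information feeds each generated word. All names here (`MultiHeadAdaptiveAttention`, `sentinel_k`, the 512/8 dimensions) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAdaptiveAttention(nn.Module):
    """Sketch: multi-head attention with a learned sentinel key/value pair.
    Attention mass assigned to the sentinel is visual information withheld
    from the current word, i.e. the word relies on language context instead."""

    def __init__(self, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Learned non-visual slot appended to the visual keys/values.
        self.sentinel_k = nn.Parameter(torch.randn(1, 1, d_model) * 0.02)
        self.sentinel_v = nn.Parameter(torch.zeros(1, 1, d_model))

    def forward(self, word_state: torch.Tensor, visual_feats: torch.Tensor):
        # word_state:   (B, T, d_model) decoder hidden states, one per word
        # visual_feats: (B, N, d_model) CLIP-style patch/region features
        B = visual_feats.size(0)
        k_in = torch.cat([visual_feats, self.sentinel_k.expand(B, 1, -1)], dim=1)
        v_in = torch.cat([visual_feats, self.sentinel_v.expand(B, 1, -1)], dim=1)

        def split(x):  # (B, L, d_model) -> (B, heads, L, d_head)
            return x.view(B, -1, self.num_heads, self.d_head).transpose(1, 2)

        q = split(self.q_proj(word_state))
        k = split(self.k_proj(k_in))
        v = split(self.v_proj(v_in))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (B, H, T, N+1)
        attn = F.softmax(scores, dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(B, -1, self.num_heads * self.d_head)
        # attn[..., -1] is a per-word, per-head gate: values near 1 mean the
        # word is generated mostly from language context, not the image.
        return self.out_proj(ctx), attn[..., -1]

# Usage sketch: 8 decoder steps attending over 49 visual tokens.
mhaa = MultiHeadAdaptiveAttention()
ctx, visual_gate = mhaa(torch.randn(2, 8, 512), torch.randn(2, 49, 512))
print(ctx.shape, visual_gate.shape)  # torch.Size([2, 8, 512]) torch.Size([2, 8, 8])
```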

Original language: English
Title of host publication: Studies in Computational Intelligence
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 293-305
Number of pages: 13
DOIs
Publication status: Published - 2023
Externally published: Yes

Publication series

Name: Studies in Computational Intelligence
Volume: 1060
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503

Keywords

  • Adaptive attention mechanism
  • Radiology report generation
  • Visual encoder

