TY - GEN
T1 - Mitigating Data Imbalance in Medical Report Generation Through Visual Data Resampling
AU - Chen, Haoquan
AU - Yan, Bin
AU - Pei, Mingtao
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
AB - The generation of accurate medical reports plays an important role in effective healthcare communication and precise patient treatment. However, a significant challenge arises due to the imbalanced distribution, with considerable variation of different diseases within the unhealthy data. This imbalanced data distribution hampers the learning ability of models and results in sub-optimal performance when dealing with rare diseases. In this paper, we propose BERT-VDR, a novel approach that leverages a BERT-based single-stream encoder coupled with a Visual Data Resampling (VDR) module, to mitigate the data imbalance in medical report generation. Specifically, we employ multi-label data resampling (MLSMOTE) to identify the nearest neighbors among minority-class samples and create new instances through linear interpolation. By integrating this approach with a classification task during the pre-training process, we aim to enhance the semantic precision of visual feature representations and mitigate learning performance degradation. Our method's efficacy is validated on two prominent medical imaging datasets, MIMIC-CXR and IU X-Ray. Our method clearly outperforms the baseline model and achieves state-of-the-art results across multiple metrics. Our findings highlight the potential of data resampling in enhancing medical report generation facing imbalanced data distribution.
KW - BERT
KW - data resampling
KW - medical report generation
KW - semantic precision
KW - visual feature representation
UR - http://www.scopus.com/inward/record.url?scp=85201222477&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-5692-6_23
DO - 10.1007/978-981-97-5692-6_23
M3 - Conference contribution
AN - SCOPUS:85201222477
SN - 9789819756919
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 255
EP - 266
BT - Advanced Intelligent Computing in Bioinformatics - 20th International Conference, ICIC 2024, Proceedings
A2 - Huang, De-Shuang
A2 - Pan, Yijie
A2 - Zhang, Qinhu
PB - Springer Science and Business Media Deutschland GmbH
T2 - 20th International Conference on Intelligent Computing, ICIC 2024
Y2 - 5 August 2024 through 8 August 2024
ER -