Abstract
Radiology report generation automates the creation of clinically accurate and coherent paragraphs from medical images, reducing the heavy burden of report writing for radiologists. However, current research in this field still faces limitations, particularly in extracting knowledge-rich image features and in fusing them into generation models. In this paper, we propose MedKit, a radiology report generation framework that integrates high-information-density knowledge fusion with multi-level task feature distillation. We fuse knowledge embeddings through a knowledge graph to reduce semantic hallucinations. In addition, feature extraction within a multi-level task feature distillation architecture supplies comprehensive image feature information to the primary task. To accommodate both 2D and 3D images, we propose separate visual encoders that resolve the issue of inconsistent shapes in medical images. Finally, a multimodal large model framework enables the generated radiology reports to closely approximate the fluent expression of medical experts. Our proposed model significantly outperforms the state-of-the-art model on the MIMIC-CXR dataset, raising the BLEU-4 score from 0.134 to 0.161, a 20.1% improvement. We also achieve the best result on the private Liver-CT dataset. Our code is available at https://github.com/sujaly/MedKit.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 129003 |
| Journal | Expert Systems with Applications |
| Volume | 296 |
| DOIs | |
| Publication status | Published - 15 Jan 2026 |
| Externally published | Yes |
Keywords
- High information density knowledge injection
- Knowledge distillation
- Large language model
- Radiology report generation
Title
MedKit: Multi-level feature distillation with knowledge injection for radiology report generation