TY - JOUR
T1 - Enhanced ADHD detection
T2 - Frequency information embedded in a visual-language framework
AU - Hu, Runze
AU - Zhu, Kaishi
AU - Hou, Zhenzhe
AU - Wang, Ruideng
AU - Liu, Feifei
N1 - Publisher Copyright:
© 2024
PY - 2024/7
Y1 - 2024/7
N2 - This paper presents the Frequency-Integrated Visual-Language Network (FIVLNet), a deep learning (DL) framework tailored to improve diagnostic accuracy for Attention Deficit Hyperactivity Disorder (ADHD) using magnetic resonance imaging (MRI) scans. Traditional DL approaches to ADHD diagnosis often overlook the sequential dependencies of MRI images or fail to adequately capture their complex structural details, resulting in suboptimal classification accuracy. To address this, the proposed FIVLNet synergistically integrates high- and low-frequency information from MRI images using a Convolutional Neural Network (CNN) and a cross-attention mechanism, thereby achieving more comprehensive representations of the MRI images. Furthermore, to enrich the model's learning capacity, textual embeddings from Contrastive Language-Image Pre-training (CLIP) are introduced to provide an additional modality of information. FIVLNet also preserves a lightweight architecture, requiring fewer learnable parameters than existing models.
AB - This paper presents the Frequency-Integrated Visual-Language Network (FIVLNet), a deep learning (DL) framework tailored to improve diagnostic accuracy for Attention Deficit Hyperactivity Disorder (ADHD) using magnetic resonance imaging (MRI) scans. Traditional DL approaches to ADHD diagnosis often overlook the sequential dependencies of MRI images or fail to adequately capture their complex structural details, resulting in suboptimal classification accuracy. To address this, the proposed FIVLNet synergistically integrates high- and low-frequency information from MRI images using a Convolutional Neural Network (CNN) and a cross-attention mechanism, thereby achieving more comprehensive representations of the MRI images. Furthermore, to enrich the model's learning capacity, textual embeddings from Contrastive Language-Image Pre-training (CLIP) are introduced to provide an additional modality of information. FIVLNet also preserves a lightweight architecture, requiring fewer learnable parameters than existing models.
KW - Deep learning
KW - MRI image processing
KW - Multimodal learning
UR - http://www.scopus.com/inward/record.url?scp=85190164172&partnerID=8YFLogxK
U2 - 10.1016/j.displa.2024.102712
DO - 10.1016/j.displa.2024.102712
M3 - Article
AN - SCOPUS:85190164172
SN - 0141-9382
VL - 83
JO - Displays
JF - Displays
M1 - 102712
ER -