TY - JOUR
T1 - Uncertainty-Driven Parallel Transformer-Based Segmentation for Oral Disease Dataset
AU - Peng, Lintao
AU - Liu, Wenhui
AU - Xie, Siyu
AU - Ye, Lin
AU - Ye, Peng
AU - Xiao, Fei
AU - Bian, Liheng
N1 - Publisher Copyright: © 1992-2012 IEEE. All rights reserved.
PY - 2025
Y1 - 2025
N2 - Accurate oral disease segmentation is challenging for three major reasons: 1) the same type of oral disease varies widely in size, color and texture; 2) the boundary between oral lesions and the surrounding mucosa is not sharp; 3) public large-scale oral disease segmentation datasets are lacking. To address these issues, we first report an oral disease segmentation network termed Oralformer, which can handle multiple oral diseases. Specifically, we use a parallel design to combine local-window self-attention (LWSA) with channel-wise convolution (CWC), modeling cross-window connections to enlarge the receptive field while maintaining linear complexity. We connect these two branches with bi-directional interactions to form a basic parallel Transformer block, namely the LC-block, and use it as the main building block in a U-shaped encoder-decoder architecture to form Oralformer. Second, we introduce an uncertainty-driven self-adaptive loss function that reinforces the network’s attention on easily confused lesion edge regions, thereby improving segmentation accuracy in these regions. Third, we construct a large-scale oral disease segmentation (ODS) dataset containing 2602 image pairs. It covers three common oral diseases (dental plaque, calculus and caries) and all age groups, and we hope it will advance the field. Extensive experiments on six challenging datasets show that Oralformer achieves state-of-the-art segmentation accuracy and offers advantages in generalizability and real-time segmentation efficiency (35 fps). The code and ODS dataset will be publicly available at https://github.com/LintaoPeng/Oralformer.
AB - Accurate oral disease segmentation is challenging for three major reasons: 1) the same type of oral disease varies widely in size, color and texture; 2) the boundary between oral lesions and the surrounding mucosa is not sharp; 3) public large-scale oral disease segmentation datasets are lacking. To address these issues, we first report an oral disease segmentation network termed Oralformer, which can handle multiple oral diseases. Specifically, we use a parallel design to combine local-window self-attention (LWSA) with channel-wise convolution (CWC), modeling cross-window connections to enlarge the receptive field while maintaining linear complexity. We connect these two branches with bi-directional interactions to form a basic parallel Transformer block, namely the LC-block, and use it as the main building block in a U-shaped encoder-decoder architecture to form Oralformer. Second, we introduce an uncertainty-driven self-adaptive loss function that reinforces the network’s attention on easily confused lesion edge regions, thereby improving segmentation accuracy in these regions. Third, we construct a large-scale oral disease segmentation (ODS) dataset containing 2602 image pairs. It covers three common oral diseases (dental plaque, calculus and caries) and all age groups, and we hope it will advance the field. Extensive experiments on six challenging datasets show that Oralformer achieves state-of-the-art segmentation accuracy and offers advantages in generalizability and real-time segmentation efficiency (35 fps). The code and ODS dataset will be publicly available at https://github.com/LintaoPeng/Oralformer.
KW - Medical image segmentation
KW - oral disease dataset
KW - transformer
KW - uncertainty driven learning
UR - http://www.scopus.com/inward/record.url?scp=105001060510&partnerID=8YFLogxK
U2 - 10.1109/TIP.2025.3544139
DO - 10.1109/TIP.2025.3544139
M3 - Article
C2 - 40036515
AN - SCOPUS:105001060510
SN - 1057-7149
VL - 34
SP - 1632
EP - 1644
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -