Geometric Boundary Guided Feature Fusion and Spatial-Semantic Context Aggregation for Semantic Segmentation of Remote Sensing Images

Yupei Wang; Haoran Zhang; Yongkang Hu; Xiaoxing Hu; Liang Chen; Shanqing Hu

doi:10.1109/TIP.2023.3326400

Yupei Wang^*, Haoran Zhang, Yongkang Hu, Xiaoxing Hu, Liang Chen, Shanqing Hu

^*Corresponding author for this work

School of Information and Electronics

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

6 citations (Scopus)

Abstract

Semantic segmentation of remote sensing images aims to achieve pixel-level semantic category assignment for input images. This task has achieved significant advances with the rapid development of deep neural networks. Most current methods mainly focus on effectively fusing the low-level spatial details and high-level semantic cues. Other methods also propose to incorporate the boundary guidance to obtain boundary preserving segmentation. However, current methods treat the multi-level feature fusion and the boundary guidance as two separate tasks, resulting in sub-optimal solutions. Moreover, due to the large inter-class difference and small intra-class consistency within remote sensing images, current methods often fail to accurately aggregate the long-range contextual cues. These critical issues make current methods fail to achieve satisfactory segmentation predictions, which severely hinders downstream applications. To this end, we first propose a novel boundary guided multi-level feature fusion module to seamlessly incorporate the boundary guidance into the multi-level feature fusion operations. Meanwhile, in order to further enforce the boundary guidance effectively, we employ a geometric-similarity-based boundary loss function. In this way, under the explicit guidance of boundary constraint, the multi-level features are effectively combined. In addition, a channel-wise correlation guided spatial-semantic context aggregation module is presented to effectively aggregate the contextual cues. In this way, subtle but meaningful contextual cues about pixel-wise spatial context and channel-wise semantic correlation are effectively aggregated, leading to spatial-semantic context aggregation. Extensive qualitative and quantitative experimental results on ISPRS Vaihingen and GaoFen-2 datasets demonstrate the effectiveness of the proposed method.

Original language	English
Pages (from-to)	6373-6385
Number of pages	13
Journal	IEEE Transactions on Image Processing
Volume	32
DOI	https://doi.org/10.1109/TIP.2023.3326400
Publication status	Published - 2023

Cite this

@article{6964290059e341fb86b762106d697d70,

title = "Geometric Boundary Guided Feature Fusion and Spatial-Semantic Context Aggregation for Semantic Segmentation of Remote Sensing Images",

abstract = "Semantic segmentation of remote sensing images aims to achieve pixel-level semantic category assignment for input images. This task has achieved significant advances with the rapid development of deep neural network. Most current methods mainly focus on effectively fusing the low-level spatial details and high-level semantic cues. Other methods also propose to incorporate the boundary guidance to obtain boundary preserving segmentation. However, current methods treat the multi-level feature fusion and the boundary guidance as two separate tasks, resulting in sub-optimal solutions. Moreover, due to the large inter-class difference and small intra-class consistency within remote sensing images, current methods often fail to accurately aggregate the long-range contextual cues. These critical issues make current methods fail to achieve satisfactory segmentation predictions, which severely hinder downstream applications. To this end, we first propose a novel boundary guided multi-level feature fusion module to seamlessly incorporate the boundary guidance into the multi-level feature fusion operations. Meanwhile, in order to further enforce the boundary guidance effectively, we employ a geometric-similarity-based boundary loss function. In this way, under the explicit guidance of boundary constraint, the multi-level features are effectively combined. In addition, a channel-wise correlation guided spatial-semantic context aggregation module is presented to effectively aggregate the contextual cues. In this way, subtle but meaningful contextual cues about pixel-wise spatial context and channel-wise semantic correlation are effectively aggregated, leading to spatial-semantic context aggregation. Extensive qualitative and quantitative experimental results on ISPRS Vaihingen and GaoFen-2 datasets demonstrate the effectiveness of the proposed method.",

keywords = "Semantic segmentation, contextual cues, multi-level feature fusion",

author = "Yupei Wang and Haoran Zhang and Yongkang Hu and Xiaoxing Hu and Liang Chen and Shanqing Hu",

note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",

year = "2023",

doi = "10.1109/TIP.2023.3326400",

language = "English",

volume = "32",

pages = "6373--6385",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Geometric Boundary Guided Feature Fusion and Spatial-Semantic Context Aggregation for Semantic Segmentation of Remote Sensing Images

AU - Wang, Yupei

AU - Zhang, Haoran

AU - Hu, Yongkang

AU - Hu, Xiaoxing

AU - Chen, Liang

AU - Hu, Shanqing

PY - 2023

Y1 - 2023

N2 - Semantic segmentation of remote sensing images aims to achieve pixel-level semantic category assignment for input images. This task has achieved significant advances with the rapid development of deep neural network. Most current methods mainly focus on effectively fusing the low-level spatial details and high-level semantic cues. Other methods also propose to incorporate the boundary guidance to obtain boundary preserving segmentation. However, current methods treat the multi-level feature fusion and the boundary guidance as two separate tasks, resulting in sub-optimal solutions. Moreover, due to the large inter-class difference and small intra-class consistency within remote sensing images, current methods often fail to accurately aggregate the long-range contextual cues. These critical issues make current methods fail to achieve satisfactory segmentation predictions, which severely hinder downstream applications. To this end, we first propose a novel boundary guided multi-level feature fusion module to seamlessly incorporate the boundary guidance into the multi-level feature fusion operations. Meanwhile, in order to further enforce the boundary guidance effectively, we employ a geometric-similarity-based boundary loss function. In this way, under the explicit guidance of boundary constraint, the multi-level features are effectively combined. In addition, a channel-wise correlation guided spatial-semantic context aggregation module is presented to effectively aggregate the contextual cues. In this way, subtle but meaningful contextual cues about pixel-wise spatial context and channel-wise semantic correlation are effectively aggregated, leading to spatial-semantic context aggregation. Extensive qualitative and quantitative experimental results on ISPRS Vaihingen and GaoFen-2 datasets demonstrate the effectiveness of the proposed method.

KW - Semantic segmentation

KW - contextual cues

KW - multi-level feature fusion

UR - http://www.scopus.com/inward/record.url?scp=85178524179&partnerID=8YFLogxK

U2 - 10.1109/TIP.2023.3326400

DO - 10.1109/TIP.2023.3326400

M3 - Article

C2 - 37883288

AN - SCOPUS:85178524179

SN - 1057-7149

VL - 32

SP - 6373

EP - 6385

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

ER -
