A Local-Sparse-Information-Aggregation Transformer with Explicit Contour Guidance for SAR Ship Detection

Hao Shi; Bingqian Chai; Yupei Wang; Liang Chen

doi:10.3390/rs14205247

Hao Shi, Bingqian Chai, Yupei Wang^*, Liang Chen

^*Corresponding author for this work

School of Information and Electronics

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

Ship detection in synthetic aperture radar (SAR) images has witnessed rapid development in recent years, especially after the adoption of convolutional neural network (CNN)-based methods. Recently, the transformer, which uses self-attention and a feed-forward neural network in an encoder-decoder structure, has received much attention from researchers, due to its intrinsic characteristics of global-relation modeling between pixels and an enlarged global receptive field. However, when adapting transformers to SAR ship detection, one challenging issue cannot be ignored. Background clutter, such as a coast, an island, or a sea wave, makes it easy for previous object detectors to miss ships with blurred contours. Therefore, in this paper, we propose a local-sparse-information-aggregation transformer with explicit contour guidance for ship detection in SAR images. Based on the Swin Transformer architecture, in order to effectively aggregate sparse meaningful cues of small-scale ships, a deformable attention mechanism is incorporated to change the original self-attention mechanism. Moreover, a novel contour-guided shape-enhancement module is proposed to explicitly enforce the contour constraints on the one-dimensional transformer architecture. Experimental results show that our proposed method achieves superior performance on the challenging HRSID and SSDD datasets.

Original language	English
Article number	5247
Journal	Remote Sensing
Volume	14
Issue number	20
DOI	https://doi.org/10.3390/rs14205247
Publication status	Published - Oct 2022

Cite this

@article{44c7915dbdf348ee82aec11d38c478bf,

title = "A Local-Sparse-Information-Aggregation Transformer with Explicit Contour Guidance for SAR Ship Detection",

abstract = "Ship detection in synthetic aperture radar (SAR) images has witnessed rapid development in recent years, especially after the adoption of convolutional neural network (CNN)-based methods. Recently, a transformer using self-attention and a feed forward neural network with a encoder-decoder structure has received much attention from researchers, due to its intrinsic characteristics of global-relation modeling between pixels and an enlarged global receptive field. However, when adapting transformers to SAR ship detection, one challenging issue cannot be ignored. Background clutter, such as a coast, an island, or a sea wave, made previous object detectors easily miss ships with a blurred contour. Therefore, in this paper, we propose a local-sparse-information-aggregation transformer with explicit contour guidance for ship detection in SAR images. Based on the Swin Transformer architecture, in order to effectively aggregate sparse meaningful cues of small-scale ships, a deformable attention mechanism is incorporated to change the original self-attention mechanism. Moreover, a novel contour-guided shape-enhancement module is proposed to explicitly enforce the contour constraints on the one-dimensional transformer architecture. Experimental results show that our proposed method achieves superior performance on the challenging HRSID and SSDD datasets.",

keywords = "SAR ship detection, Swin Transformer, contour enhancement, deep learning, deformable attention",

author = "Hao Shi and Bingqian Chai and Yupei Wang and Liang Chen",

note = "Publisher Copyright: {\textcopyright} 2022 by the authors.",

year = "2022",

month = oct,

doi = "10.3390/rs14205247",

language = "English",

volume = "14",

journal = "Remote Sensing",

issn = "2072-4292",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "20",

}

TY - JOUR

T1 - A Local-Sparse-Information-Aggregation Transformer with Explicit Contour Guidance for SAR Ship Detection

AU - Shi, Hao

AU - Chai, Bingqian

AU - Wang, Yupei

AU - Chen, Liang

PY - 2022/10

Y1 - 2022/10

N2 - Ship detection in synthetic aperture radar (SAR) images has witnessed rapid development in recent years, especially after the adoption of convolutional neural network (CNN)-based methods. Recently, a transformer using self-attention and a feed forward neural network with a encoder-decoder structure has received much attention from researchers, due to its intrinsic characteristics of global-relation modeling between pixels and an enlarged global receptive field. However, when adapting transformers to SAR ship detection, one challenging issue cannot be ignored. Background clutter, such as a coast, an island, or a sea wave, made previous object detectors easily miss ships with a blurred contour. Therefore, in this paper, we propose a local-sparse-information-aggregation transformer with explicit contour guidance for ship detection in SAR images. Based on the Swin Transformer architecture, in order to effectively aggregate sparse meaningful cues of small-scale ships, a deformable attention mechanism is incorporated to change the original self-attention mechanism. Moreover, a novel contour-guided shape-enhancement module is proposed to explicitly enforce the contour constraints on the one-dimensional transformer architecture. Experimental results show that our proposed method achieves superior performance on the challenging HRSID and SSDD datasets.

AB - Ship detection in synthetic aperture radar (SAR) images has witnessed rapid development in recent years, especially after the adoption of convolutional neural network (CNN)-based methods. Recently, a transformer using self-attention and a feed forward neural network with a encoder-decoder structure has received much attention from researchers, due to its intrinsic characteristics of global-relation modeling between pixels and an enlarged global receptive field. However, when adapting transformers to SAR ship detection, one challenging issue cannot be ignored. Background clutter, such as a coast, an island, or a sea wave, made previous object detectors easily miss ships with a blurred contour. Therefore, in this paper, we propose a local-sparse-information-aggregation transformer with explicit contour guidance for ship detection in SAR images. Based on the Swin Transformer architecture, in order to effectively aggregate sparse meaningful cues of small-scale ships, a deformable attention mechanism is incorporated to change the original self-attention mechanism. Moreover, a novel contour-guided shape-enhancement module is proposed to explicitly enforce the contour constraints on the one-dimensional transformer architecture. Experimental results show that our proposed method achieves superior performance on the challenging HRSID and SSDD datasets.

KW - SAR ship detection

KW - Swin Transformer

KW - contour enhancement

KW - deep learning

KW - deformable attention

UR - http://www.scopus.com/inward/record.url?scp=85140990220&partnerID=8YFLogxK

U2 - 10.3390/rs14205247

DO - 10.3390/rs14205247

M3 - Article

AN - SCOPUS:85140990220

SN - 2072-4292

VL - 14

JO - Remote Sensing

JF - Remote Sensing

IS - 20

M1 - 5247

ER -
