Object-Centric Masked Image Modeling-Based Self-Supervised Pretraining for Remote Sensing Object Detection

Tong Zhang, Yin Zhuang*, He Chen, Liang Chen, Guanqun Wang, Peng Gao, Hao Dong

*Corresponding author for this work

Research output: Journal contribution, Article, Peer-reviewed

8 Citations (Scopus)

Abstract

Masked image modeling (MIM) has proven to be a highly effective pretext task for self-supervised pretraining (SSP): it helps the model capture a task-agnostic representation during pretraining, which in turn improves fine-tuning performance on various downstream tasks. However, because MIM masks a high ratio of the image at random, scene-level MIM-based SSP struggles to capture small-scale objects and local details in complex remote sensing scenes. When pretrained models that capture mostly scene-level information are then applied directly to object-level fine-tuning, there is an obvious representation learning misalignment between the pretraining and fine-tuning steps. Therefore, this article proposes a novel object-centric masked image modeling (OCMIM) strategy that lets the model better capture object-level information during pretraining and thereby improves object detection fine-tuning. First, to better learn object-level representations covering all scales and multiple categories in MIM-based SSP, a novel object-centric data generator is proposed that automatically sets up targeted pretraining data according to the objects themselves, providing data conditions specific to object detection model pretraining. Second, an attention-guided mask generator is designed to generate a guided mask for the MIM pretext task, leading the model to learn a more discriminative representation of highly attended object regions than a random masking strategy does. Finally, experiments on six remote sensing object detection benchmarks show that the proposed OCMIM-based SSP strategy is a better pretraining approach for remote sensing object detection than commonly used methods.
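
The contrast the abstract draws between random masking and attention-guided masking can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat list of per-patch attention scores, and the rule of masking the highest-scoring patches are all assumptions made here for illustration only. The abstract does not specify how the guided mask is derived from attention.

```python
import random


def random_mask(num_patches, mask_ratio, seed=None):
    """Baseline MIM masking: hide a uniformly random subset of patches."""
    num_masked = round(num_patches * mask_ratio)
    rng = random.Random(seed)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def attention_guided_mask(attention, mask_ratio):
    """Hypothetical guided masking: hide the most highly attended patches.

    `attention` is assumed to be one score per image patch. Masking the
    top-scoring patches is an illustrative rule, not the paper's method.
    """
    num_masked = round(len(attention) * mask_ratio)
    # Rank patch indices from highest to lowest attention score.
    ranked = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)
    masked = set(ranked[:num_masked])
    return [i in masked for i in range(len(attention))]


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.2, 0.8]  # made-up attention scores for four patches
    print(attention_guided_mask(scores, 0.5))  # [False, True, False, True]
    print(sum(random_mask(16, 0.75, seed=0)))  # 12 patches masked out of 16
```

Under this assumed rule, the masked positions concentrate on patches the attention map marks as object regions, whereas the random baseline spreads them over the whole scene regardless of content.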

Original language: English
Pages (from-to): 5013-5025
Number of pages: 13
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Volume: 16
Publication status: Published - 2023
