PT-RE: Prompt-Based Multimodal Transformer for Road Network Extraction From Remote Sensing Images

Yuxuan Han, Qingxiao Liu, Haiou Liu, Xiuzhong Hu, Boyang Wang*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

1 Citation (Scopus)

Abstract

Road network extraction from remote sensing images can provide precise map information for global positioning and planning. While existing transformer-based methods show promising performance in road network extraction, they produce misleading results at crossroads and generalize poorly. In this study, a prompt-based multimodal transformer for road network extraction (PT-RE) is proposed. In PT-RE, a Swin transformer serves as the backbone network to extract image features from remote sensing images. A fine-tuned prompt-based method is then employed to generate the road topology classification contexts. Prompt-based information generation and a cross-modal loss function are designed for the fine-tuning task. Compared with the original uni-modal loss function used in fine-tuning, the cross-modal method processes information from different modalities and improves generalization ability. Finally, the topology decoder uses a cross-attention architecture to predict topological relationships from the image information and the classification contexts. By drawing on different views and modalities, the framework improves crossroad detection accuracy compared with uni-modal approaches. The proposed topological road network extraction method demonstrates superior accuracy on the 20 U.S. Cities and SpaceNet datasets, showcasing its accuracy and generalization ability.
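As a rough illustration of the cross-attention step mentioned in the abstract, the sketch below fuses image feature tokens with classification-context tokens using single-head scaled dot-product attention. It is not the authors' implementation: the direction of attention (image tokens as queries, context tokens as keys and values), the token counts, the dimensions, and all names such as `cross_attention`, `image_tokens`, and `context_tokens` are assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(image_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative only).

    Assumption: queries come from image tokens, keys and values from
    classification-context tokens. The paper may arrange this differently.
    """
    q = image_tokens @ w_q            # (n_img, d_model)
    k = context_tokens @ w_k          # (n_ctx, d_model)
    v = context_tokens @ w_v          # (n_ctx, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_img, n_ctx)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes and random stand-ins for learned features and weights.
rng = np.random.default_rng(0)
n_img, n_ctx, d_img, d_ctx, d_model = 6, 3, 8, 5, 4
image_tokens = rng.normal(size=(n_img, d_img))
context_tokens = rng.normal(size=(n_ctx, d_ctx))
w_q = rng.normal(size=(d_img, d_model))
w_k = rng.normal(size=(d_ctx, d_model))
w_v = rng.normal(size=(d_ctx, d_model))

fused, weights = cross_attention(image_tokens, context_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` shows how strongly one image token attends to each context token; a full decoder would typically add multiple heads, residual connections, and learned projections.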

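Original language: English
Pages (from-to): 35832-35844
Number of pages: 13
Journal: IEEE Sensors Journal
Volume: 24
Issue number: 21
DOI: https://doi.org/10.1109/JSEN.2024.3428483
Publication status: Published - 2024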

Cite this

Han, Y., Liu, Q., Liu, H., Hu, X., & Wang, B. (2024). PT-RE: Prompt-Based Multimodal Transformer for Road Network Extraction From Remote Sensing Images. IEEE Sensors Journal, 24(21), 35832-35844. https://doi.org/10.1109/JSEN.2024.3428483