TY - JOUR
T1 - Learning to Match Ground Camera Image and UAV 3-D Model-Rendered Image Based on Siamese Network with Attention Mechanism
AU - Liu, Weiquan
AU - Wang, Cheng
AU - Bian, Xuesheng
AU - Chen, Shuting
AU - Yu, Shangshu
AU - Lin, Xiuhong
AU - Lai, Shang-Hong
AU - Weng, Dongdong
AU - Li, Jonathan
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2020/9
Y1 - 2020/9
N2 - Different imaging sensors or imaging mechanisms produce cross-domain images when sensing the same scene. The domain shift between such images, i.e., the image gap between different domains, is the major challenge in measuring the similarity of feature descriptors extracted from different domain images. Specifically, matching ground camera images and unmanned aerial vehicle (UAV) 3-D model-rendered images, two kinds of extremely challenging cross-domain images, is a way to indirectly establish the spatial relationship between 2-D and 3-D spaces. This provides a solution for the virtual-real registration of augmented reality (AR) in outdoor environments. However, both handcrafted descriptors and existing learning-based feature descriptors have limited matching performance on rendered images. In this letter, first, to learn robust and invariant 128-D local feature descriptors for ground camera and rendered images, we present a novel network structure, SiamAM-Net, which embeds autoencoders with an attention mechanism into a Siamese network. Then, to narrow the gap between the cross-domain images during the optimization of SiamAM-Net, we design an adaptive margin for the loss function. Finally, we match the ground camera and rendered images using the learned local feature descriptors and explore outdoor AR virtual-real registration. Experiments show that the local feature descriptors learned by SiamAM-Net are robust and achieve state-of-the-art retrieval performance on the cross-domain image data set of ground camera and rendered images. In addition, several outdoor AR applications also demonstrate the usefulness of the proposed outdoor AR virtual-real registration.
AB - Different imaging sensors or imaging mechanisms produce cross-domain images when sensing the same scene. The domain shift between such images, i.e., the image gap between different domains, is the major challenge in measuring the similarity of feature descriptors extracted from different domain images. Specifically, matching ground camera images and unmanned aerial vehicle (UAV) 3-D model-rendered images, two kinds of extremely challenging cross-domain images, is a way to indirectly establish the spatial relationship between 2-D and 3-D spaces. This provides a solution for the virtual-real registration of augmented reality (AR) in outdoor environments. However, both handcrafted descriptors and existing learning-based feature descriptors have limited matching performance on rendered images. In this letter, first, to learn robust and invariant 128-D local feature descriptors for ground camera and rendered images, we present a novel network structure, SiamAM-Net, which embeds autoencoders with an attention mechanism into a Siamese network. Then, to narrow the gap between the cross-domain images during the optimization of SiamAM-Net, we design an adaptive margin for the loss function. Finally, we match the ground camera and rendered images using the learned local feature descriptors and explore outdoor AR virtual-real registration. Experiments show that the local feature descriptors learned by SiamAM-Net are robust and achieve state-of-the-art retrieval performance on the cross-domain image data set of ground camera and rendered images. In addition, several outdoor AR applications also demonstrate the usefulness of the proposed outdoor AR virtual-real registration.
KW - Attention mechanism
KW - Siamese network
KW - augmented reality (AR)
KW - cross-domain image patch matching
KW - virtual-real registration
UR - http://www.scopus.com/inward/record.url?scp=85085373012&partnerID=8YFLogxK
U2 - 10.1109/LGRS.2019.2949351
DO - 10.1109/LGRS.2019.2949351
M3 - Article
AN - SCOPUS:85085373012
SN - 1545-598X
VL - 17
SP - 1608
EP - 1612
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
IS - 9
M1 - 8894486
ER -