Cap-CasMVSNet:Capsul-based Cascade Cost Volume for Multi-View Stereo Network

Hongmin Zhou*, Yuan Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep learning is now widely used in multi-view stereo, but there is still room for improvement. In this paper, we propose a capsule network-based multi-view stereo network, Cap-CasMVSNet. We first introduce a transformer-based filter to highlight the foreground of the features. Then, to address the fixed receptive field of conventional convolution kernels, we add a deformable convolution module to the network so that the convolution can adapt to geometric deformation. We use a capsule neuron to capture global semantic connections between high-level features. Finally, we achieve competitive results on the DTU dataset, demonstrating strong robustness.

Original language: English
Title of host publication: Proceedings - 2023 38th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 59-63
Number of pages: 5
ISBN (Electronic): 9798350303636
Publication status: Published - 2023
Event: 38th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2023 - Hefei, China
Duration: 27 Aug 2023 to 29 Aug 2023

Publication series

Name: Proceedings - 2023 38th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2023

Conference

Conference: 38th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2023
Country/Territory: China
City: Hefei
Period: 27/08/23 to 29/08/23

Keywords

  • capsule network
  • multi-view stereo
  • transformer
