Abstract
Existing 3D stereo networks built on 4D cost volumes are accurate but computationally expensive, whereas 2D stereo networks are efficient but less accurate. In this letter, we present a novel group fusion aggregation (GFA) that performs cost aggregation on 4D volumes with 2D convolutions to reduce computational cost. Within GFA, a group-wise disparity aggregation block (GDAB) and a group-wise channel fusion block (GCFB) are proposed to fuse the geometry and context information, respectively, of the different group cost volumes. Further, we employ a channel-dimension-first cost volume transformation and a disparity-dimension-first cost volume transformation to convert the 4D cost volumes into 3D tensors that serve as input to GDAB and GCFB. We evaluate our method on two popular public benchmark datasets. Experimental results from the official KITTI website show that our method achieves accuracy similar to that of 3D stereo networks (PSMNet, GCNet, GwcNet, etc.) at low computational cost. The ablation studies further demonstrate the validity and reasonableness of the proposed GFA.
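The abstract does not spell out how the two cost volume transformations are defined. As a rough illustration only, the sketch below assumes a per-image 4D cost volume of shape (groups, disparities, height, width) and flattens its first two axes in the two possible orders, so that the resulting 3D tensor can be processed by 2D convolutions. The function names, the axis layout, and the reading of "channel-dimension-first" and "disparity-dimension-first" as flattening orders are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def channel_first_flatten(volume):
    """Flatten (G, D, H, W) to (G*D, H, W) with the group/channel index varying slowest.

    Assumed layout: consecutive output channels hold all disparities of one group.
    """
    g, d, h, w = volume.shape
    return volume.reshape(g * d, h, w)

def disparity_first_flatten(volume):
    """Flatten (G, D, H, W) to (D*G, H, W) with the disparity index varying slowest.

    Assumed layout: consecutive output channels hold all groups of one disparity.
    """
    g, d, h, w = volume.shape
    # Swap the group and disparity axes before merging them.
    return volume.transpose(1, 0, 2, 3).reshape(d * g, h, w)

# Hypothetical sizes: 2 groups, 3 disparity levels, a 4x5 feature map.
cost_volume = np.random.rand(2, 3, 4, 5).astype(np.float32)
print(channel_first_flatten(cost_volume).shape)    # (6, 4, 5)
print(disparity_first_flatten(cost_volume).shape)  # (6, 4, 5)
```

Both outputs have the same shape and contain the same values. Only the channel ordering differs, which determines whether neighbouring channels seen by a 2D convolution belong to the same group or to the same disparity level.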
Original language | English |
---|---|
Pages (from-to) | 4251-4258 |
Number of pages | 8 |
Journal | IEEE Robotics and Automation Letters |
Volume | 8 |
Issue number | 7 |
Publication status | Published - 1 Jul 2023 |
Keywords
- Robot vision
- depth estimation
- stereo matching