GFANet: Group Fusion Aggregation Network for Real Time Stereo Matching

Yakai Zhang, Jinhui Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Existing 3D stereo networks built on 4D cost volumes are precise but computationally expensive, while 2D stereo networks are efficient but less accurate. In this letter, we present a novel group fusion aggregation (GFA) module that performs cost aggregation with 2D convolutions on 4D volumes to reduce computational cost. A group-wise disparity aggregation block (GDAB) and a group-wise channel fusion block (GCFB) are proposed to fuse the geometry and context information, respectively, of the different group cost volumes in GFA. Further, we employ a channel-dimension-first cost volume transformation and a disparity-dimension-first cost volume transformation to convert the 4D cost volumes into 3D tensors as input to GDAB and GCFB in GFA. We evaluate our method on two popular public benchmark datasets. Experimental results from the KITTI official website show that our method achieves accuracy comparable to that of other 3D stereo networks (PSMNet, GCNet, GwcNet, etc.) at low computational cost. Ablation studies further demonstrate the effectiveness and soundness of the proposed GFA.
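The abstract's core trick is converting a 4D group-wise cost volume into 3D tensors so that cheap 2D convolutions can replace 3D ones. The sketch below illustrates, with NumPy, what such channel-dimension-first and disparity-dimension-first transformations could look like; the tensor layout (G, D, H, W) and both function names are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

# Hypothetical group-wise 4D cost volume:
# G groups, D candidate disparities, H x W spatial resolution (assumed layout).
G, D, H, W = 4, 8, 5, 6
cost_volume = np.random.rand(G, D, H, W).astype(np.float32)

def channel_first_transform(cv):
    """Fold the disparity axis into the group axis -> 3D tensor (G*D, H, W).

    Slices from the same group stay adjacent in the channel dimension, so a
    2D convolution over the result mixes information across disparities
    within each group (a plausible input layout for a channel-fusion block).
    """
    g, d, h, w = cv.shape
    return cv.reshape(g * d, h, w)

def disparity_first_transform(cv):
    """Move the disparity axis first, then fold -> 3D tensor (D*G, H, W).

    Slices belonging to the same disparity become adjacent channels, so a
    2D convolution over the result aggregates the groups' matching costs at
    each disparity (a plausible input layout for a disparity-aggregation block).
    """
    d_first = cv.transpose(1, 0, 2, 3)          # (D, G, H, W)
    return d_first.reshape(-1, cv.shape[2], cv.shape[3])

cf = channel_first_transform(cost_volume)       # shape (G*D, H, W)
df = disparity_first_transform(cost_volume)     # shape (D*G, H, W)
print(cf.shape, df.shape)
```

Both outputs are plain 3D tensors of identical size; only the channel ordering differs, which is what lets the two blocks see the same costs grouped by either group or disparity.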

Original language: English
Pages (from-to): 4251-4258
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 8
Issue number: 7
DOIs
Publication status: Published - 1 Jul 2023

Keywords

  • Robot vision
  • depth estimation
  • stereo matching
