TY - JOUR
T1 - Learning event guided network for salient object detection
AU - Jiang, Xiurong
AU - Zhu, Lin
AU - Tian, Hui
N1 - Publisher Copyright: © 2021 Elsevier B.V.
PY - 2021/11
Y1 - 2021/11
N2 - Salient object detection (SOD) aims to mimic the attention mechanism of the human visual system. Because conventional image data carry limited information, image-based SOD remains very challenging in some complex scenes. Inspired by emerging event cameras, which provide asynchronous measurements of local temporal contrast over a large dynamic range, we propose extracting more effective information from the combination of the event stream and RGB images. In this paper, we construct an end-to-end joint network for salient object detection (ERSOD-Net), which simultaneously supervises the RGB image and the event data captured within the corresponding image exposure time. To fully exploit the temporal information in the event data, a Long Short-Term Memory (LSTM) module is used to process the events and learn salient object event surfaces. Moreover, a multi-level feature interaction is designed to fuse the two complementary branches, image and event, and to predict the saliency map. Finally, to demonstrate the effectiveness of our model, a real Event-RGB SOD dataset (ERSOD) is built with a DAVIS camera. Experiments on both benchmark and ERSOD datasets show that the proposed event-guided network greatly improves SOD performance across different evaluation metrics. The code and datasets will be released later at https://github.com/jxr326/ERSOD-Net.
AB - Salient object detection (SOD) aims to mimic the attention mechanism of the human visual system. Because conventional image data carry limited information, image-based SOD remains very challenging in some complex scenes. Inspired by emerging event cameras, which provide asynchronous measurements of local temporal contrast over a large dynamic range, we propose extracting more effective information from the combination of the event stream and RGB images. In this paper, we construct an end-to-end joint network for salient object detection (ERSOD-Net), which simultaneously supervises the RGB image and the event data captured within the corresponding image exposure time. To fully exploit the temporal information in the event data, a Long Short-Term Memory (LSTM) module is used to process the events and learn salient object event surfaces. Moreover, a multi-level feature interaction is designed to fuse the two complementary branches, image and event, and to predict the saliency map. Finally, to demonstrate the effectiveness of our model, a real Event-RGB SOD dataset (ERSOD) is built with a DAVIS camera. Experiments on both benchmark and ERSOD datasets show that the proposed event-guided network greatly improves SOD performance across different evaluation metrics. The code and datasets will be released later at https://github.com/jxr326/ERSOD-Net.
KW - Event-based vision
KW - Image saliency
KW - Long short-term memory
KW - Salient object detection
KW - Visual attention
UR - https://www.scopus.com/pages/publications/85115377924
U2 - 10.1016/j.patrec.2021.08.034
DO - 10.1016/j.patrec.2021.08.034
M3 - Article
AN - SCOPUS:85115377924
SN - 0167-8655
VL - 151
SP - 317
EP - 324
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -