Asynchronous Spatio-Temporal Memory Network for Continuous Event-Based Object Detection

Jianing Li, Jia Li*, Lin Zhu, Xijie Xiang, Tiejun Huang, Yonghong Tian*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

78 Citations (Scopus)

Abstract

Event cameras, which offer extremely high temporal resolution and high dynamic range, have brought a new perspective to common object detection challenges (e.g., motion blur and low light). However, how to learn a better spatio-temporal representation and exploit rich temporal cues from asynchronous events for object detection remains an open issue. To address this problem, we propose a novel asynchronous spatio-temporal memory network (ASTMNet) that directly consumes asynchronous events rather than first converting them into event images, and can therefore detect objects in a continuous manner. Technically, ASTMNet learns an asynchronous attention embedding from the continuous event stream by adopting an adaptive temporal sampling strategy and a temporal attention convolutional module. In addition, a spatio-temporal memory module is designed to exploit rich temporal cues via a lightweight yet efficient interleaved recurrent-convolutional architecture. Experiments show that our approach outperforms state-of-the-art methods that use feed-forward frame-based detectors on three datasets by a large margin (i.e., 7.6% on the KITTI Simulated Dataset, 10.8% on the Gen1 Automotive Dataset, and 10.5% on the 1Mpx Detection Dataset). The results demonstrate that event cameras can perform robust object detection even in cases where conventional cameras fail, e.g., fast motion and challenging lighting conditions.

Original language: English
Pages (from-to): 2975-2987
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Volume: 31
Publication status: Published - 2022
Externally published: Yes

Keywords

  • Object detection
  • deep neural networks
  • event cameras
  • event-based vision
  • neuromorphic engineering
