Continuous-Time Object Segmentation Using High Temporal Resolution Event Camera

Lin Zhu; Xianzhang Chen; Lizhi Wang; Xiao Wang; Yonghong Tian; Hua Huang

doi:10.1109/TPAMI.2024.3477591

Lin Zhu, Xianzhang Chen, Lizhi Wang, Xiao Wang, Yonghong Tian, Hua Huang^*

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Event cameras are novel bio-inspired sensors whose pixels operate independently and asynchronously, reporting intensity changes as events. Because events offer microsecond temporal resolution (no motion blur) and high dynamic range (robust to extreme lighting conditions), segmenting objects directly from sparse, asynchronous event streams holds considerable promise for a range of applications. However, unlike video object segmentation, where rich cues are available, segmenting complete objects from a sparse event stream is challenging. In this paper, we present the first framework for continuous-time object segmentation from event streams. Given the object mask at the initial time, the task is to segment the complete object at any subsequent time in the event stream. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM; a Cross-time Spatiotemporal Feature Modeling (CSFM) module, which is a transformer architecture with long-term and short-term matching modules; and a segmentation head. The historical events and masks (reference sets) are recurrently fed into the framework along with current-time events. The temporal embedding is updated as new events arrive, enabling the framework to process the event stream continuously. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on our datasets demonstrate the effectiveness of the proposed recurrent architecture.
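The idea of updating a temporal state as new events arrive, so that the stream can be queried at arbitrary times, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's method: the `(x, y, t, polarity)` event layout, the function names `events_to_frame`, `update_embedding`, and `run_stream`, and the fixed exponential-decay update, which merely stands in for a learned recurrent module such as the ResLSTM-based RTEE.

```python
import numpy as np


def events_to_frame(events, t_start, t_end, height, width):
    """Sum signed polarities of events with t_start <= t < t_end into a 2D frame.

    `events` is assumed to be an (N, 4) array of rows (x, y, t, polarity).
    """
    frame = np.zeros((height, width), dtype=np.float32)
    t = events[:, 2]
    sel = events[(t >= t_start) & (t < t_end)]
    xs = sel[:, 0].astype(np.int64)
    ys = sel[:, 1].astype(np.int64)
    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(frame, (ys, xs), sel[:, 3])
    return frame


def update_embedding(state, frame, decay=0.9):
    """Toy recurrent update: an exponential moving average (hypothetical stand-in)."""
    return decay * state + (1.0 - decay) * frame


def run_stream(events, query_times, height, width):
    """Return the running state at each query time, consuming events in between."""
    state = np.zeros((height, width), dtype=np.float32)
    prev_t = 0.0
    states = []
    for t in query_times:
        frame = events_to_frame(events, prev_t, t, height, width)
        state = update_embedding(state, frame)
        states.append(state.copy())
        prev_t = t
    return states


# Four toy events on a 2x3 sensor: rows are (x, y, t, polarity).
events = np.array(
    [[0, 0, 10, 1], [1, 0, 20, -1], [0, 0, 30, 1], [2, 1, 150, 1]],
    dtype=np.float64,
)
states = run_stream(events, query_times=[100, 200], height=2, width=3)
```

Because the state is carried across calls, the query times need not be evenly spaced; a real system would replace the moving average with learned modules and add mask prediction on top.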

Original language	English
Pages (from-to)	807-824
Number of pages	18
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	47
Issue number	2
DOIs	https://doi.org/10.1109/TPAMI.2024.3477591
Publication status	Published - 2025

Keywords

Event camera
event-based segmentation
temporal information extraction

Cite this

Zhu, L., Chen, X., Wang, L., Wang, X., Tian, Y., & Huang, H. (2025). Continuous-Time Object Segmentation Using High Temporal Resolution Event Camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(2), 807-824. https://doi.org/10.1109/TPAMI.2024.3477591

@article{66ed6c50690744ed9aea5f952dac0bc0,
title = "Continuous-Time Object Segmentation Using High Temporal Resolution Event Camera",

abstract = "Event cameras are novel bio-inspired sensors, where individual pixels operate independently and asynchronously, generating intensity changes as events. Leveraging the microsecond resolution (no motion blur) and high dynamic range (compatible with extreme light conditions) of events, there is considerable promise in directly segmenting objects from sparse and asynchronous event streams in various applications. However, different from the rich cues in video object segmentation, it is challenging to segment complete objects from the sparse event stream. In this paper, we present the first framework for continuous-time object segmentation from event stream. Given the object mask at the initial time, our task aims to segment the complete object at any subsequent time in event streams. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. The historical events and masks (reference sets) are recurrently fed into our framework along with current-time events. The temporal embedding is updated as new events are input, enabling our framework to continuously process the event stream. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on our datasets demonstrate the effectiveness of the proposed recurrent architecture.",

keywords = "Event camera, event-based segmentation, temporal information extraction",
author = "Lin Zhu and Xianzhang Chen and Lizhi Wang and Xiao Wang and Yonghong Tian and Hua Huang",
note = "Publisher Copyright: {\textcopyright} 1979-2012 IEEE.",
year = "2025",
doi = "10.1109/TPAMI.2024.3477591",
language = "English",
volume = "47",
pages = "807--824",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "2",
}

TY - JOUR
T1 - Continuous-Time Object Segmentation Using High Temporal Resolution Event Camera
AU - Zhu, Lin
AU - Chen, Xianzhang
AU - Wang, Lizhi
AU - Wang, Xiao
AU - Tian, Yonghong
AU - Huang, Hua
PY - 2025
Y1 - 2025

N2 - Event cameras are novel bio-inspired sensors, where individual pixels operate independently and asynchronously, generating intensity changes as events. Leveraging the microsecond resolution (no motion blur) and high dynamic range (compatible with extreme light conditions) of events, there is considerable promise in directly segmenting objects from sparse and asynchronous event streams in various applications. However, different from the rich cues in video object segmentation, it is challenging to segment complete objects from the sparse event stream. In this paper, we present the first framework for continuous-time object segmentation from event stream. Given the object mask at the initial time, our task aims to segment the complete object at any subsequent time in event streams. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. The historical events and masks (reference sets) are recurrently fed into our framework along with current-time events. The temporal embedding is updated as new events are input, enabling our framework to continuously process the event stream. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on our datasets demonstrate the effectiveness of the proposed recurrent architecture.

KW - Event camera
KW - event-based segmentation
KW - temporal information extraction
UR - http://www.scopus.com/inward/record.url?scp=85207711848&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2024.3477591
DO - 10.1109/TPAMI.2024.3477591
M3 - Article
AN - SCOPUS:85207711848
SN - 0162-8828
VL - 47
SP - 807
EP - 824
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 2
ER -
