
ISTASTrack: Bridging ANN and SNN via ISTA Adapter for RGB-Event Tracking

  • Siying Liu
  • Zikai Wang
  • Hanle Zheng
  • Yifan Hu
  • Xilin Wang
  • Qingkai Yang
  • Jibin Wu
  • Hao Guo*
  • Lei Deng*
  • *Corresponding author for this work
  • Tsinghua University
  • Taiyuan University of Technology
  • Beijing Institute of Technology
  • Hong Kong Polytechnic University

Research output: Contribution to journal › Article › peer-review

Abstract

RGB-Event tracking has become a promising direction in visual object tracking, leveraging the complementary strengths of RGB images and dynamic spike events for improved performance. However, existing artificial neural networks (ANNs) struggle to fully exploit the sparse and asynchronous nature of event streams. Hybrid architectures that combine ANNs and spiking neural networks (SNNs) have recently emerged as a promising solution in RGB-Event perception, yet effectively fusing features across heterogeneous paradigms remains a challenge. In this work, we propose ISTASTrack, the first transformer-based ANN-SNN hybrid tracker equipped with ISTA adapters for RGB-Event tracking. The two-branch model employs a vision transformer to extract spatial context from RGB inputs and a spiking transformer to capture spatio-temporal dynamics from event streams. To bridge the modality and paradigm gap between ANN and SNN features, we systematically design an ISTA adapter for bidirectional feature interaction between the two branches. The ISTA adapter is derived from sparse representation theory by unfolding the iterative shrinkage-thresholding algorithm. Additionally, we incorporate a temporal downsampling attention module within the adapter to align multi-step SNN features with single-step ANN features in the latent space. Experimental results on RGB-Event tracking benchmarks, including FE240hz, VisEvent, COESOT, and FELT, demonstrate that ISTASTrack achieves state-of-the-art performance while maintaining high energy efficiency. This work highlights the effectiveness and practicality of hybrid ANN-SNN designs for robust visual tracking. The code is publicly available at https://github.com/lsying009/ISTASTrack.git.
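
For readers unfamiliar with the iterative shrinkage-thresholding algorithm (ISTA) mentioned in the abstract, the following is a minimal sketch of the classical textbook algorithm for the sparse coding problem min_x 0.5*||D x - y||^2 + lam*||x||_1. It is not the ISTA adapter from the paper: the dictionary `D`, the signal `y`, the weight `lam`, and the function names are all illustrative assumptions. An "unfolded" network would typically treat a small, fixed number of these iterations as layers with learnable parameters, but the details of how the authors do this are not given in the abstract.

```python
import numpy as np


def soft_threshold(v, tau):
    """Element-wise shrinkage: the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_objective(D, y, x, lam):
    """Value of 0.5 * ||D x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((D @ x - y) ** 2) + lam * np.sum(np.abs(x))


def ista(D, y, lam=0.1, n_iter=200):
    """Classical ISTA for sparse coding (illustrative, not the paper's adapter)."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term: the squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)                    # gradient step on the data term
        x = soft_threshold(x - grad / L, lam / L)   # shrinkage step on the L1 term
    return x


# Hypothetical toy problem: recover a 3-sparse code from 20 measurements.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = D @ x_true

x_hat = ista(D, y, lam=0.1, n_iter=500)
print("objective at zero :", lasso_objective(D, y, np.zeros(50), 0.1))
print("objective at x_hat:", lasso_objective(D, y, x_hat, 0.1))
print("non-zeros in x_hat:", int(np.count_nonzero(x_hat)))
```

Each iteration alternates a gradient step on the reconstruction error with a soft-thresholding step that drives small coefficients to exactly zero, which is where the sparsity of the representation comes from.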

Original language: English
Pages (from-to): 5423-5438
Number of pages: 16
Journal: IEEE Transactions on Image Processing
Volume: 35
Publication status: Published - 2026
Externally published: Yes

Keywords

  • Hybrid neural networks
  • RGB-event fusion
  • multimodal object tracking
  • sparse representation
  • spiking neural networks
