TY - GEN
T1 - Multi-frame Temporal Modeling for Multi-object Tracking in Satellite Video
AU - Chai, Bingqian
AU - Qiao, Tingting
AU - Xie, Baorong
AU - Wang, Jue
AU - Liu, Wenchao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - With the advancement of remote sensing technology, video satellites play an increasingly important role in both military and civilian domains and are widely used in areas such as military guidance, intelligent transportation, and disaster response. Multi-object tracking (MOT) in satellite video is a crucial research topic in these fields and has made some progress. However, because objects are difficult to track in complex scenes and at low resolution, current MOT methods suffer from low accuracy in satellite video. In this work, we propose a joint-detection-and-tracking method called multi-frame temporal MOT (MFT-MOT). First, a heatmap-guided multi-frame temporal modeling method is introduced to extract features with temporal semantic information while suppressing background noise. Then, we use a weight-sharing encoder-decoder network to facilitate the interaction of temporal and spatial information and to capture inter-frame object motion tracks at low resolution. Experiments on the SAT-MTB dataset show the effectiveness of our method.
AB - With the advancement of remote sensing technology, video satellites play an increasingly important role in both military and civilian domains and are widely used in areas such as military guidance, intelligent transportation, and disaster response. Multi-object tracking (MOT) in satellite video is a crucial research topic in these fields and has made some progress. However, because objects are difficult to track in complex scenes and at low resolution, current MOT methods suffer from low accuracy in satellite video. In this work, we propose a joint-detection-and-tracking method called multi-frame temporal MOT (MFT-MOT). First, a heatmap-guided multi-frame temporal modeling method is introduced to extract features with temporal semantic information while suppressing background noise. Then, we use a weight-sharing encoder-decoder network to facilitate the interaction of temporal and spatial information and to capture inter-frame object motion tracks at low resolution. Experiments on the SAT-MTB dataset show the effectiveness of our method.
KW - joint detection and tracking
KW - multi-frame temporal model
KW - multi-object tracking
KW - satellite video
UR - http://www.scopus.com/inward/record.url?scp=86000027376&partnerID=8YFLogxK
U2 - 10.1109/ICSIDP62679.2024.10868570
DO - 10.1109/ICSIDP62679.2024.10868570
M3 - Conference contribution
AN - SCOPUS:86000027376
T3 - IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
BT - IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
Y2 - 22 November 2024 through 24 November 2024
ER -