Temporal Action Localization in the Deep Learning Era: A Survey

Binglu Wang, Yongqiang Zhao, Le Yang*, Teng Long, Xuelong Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

The temporal action localization research aims to discover action instances from untrimmed videos, representing a fundamental step in the field of intelligent video understanding. With the advent of deep learning, backbone networks have been instrumental in providing representative spatiotemporal features, while the end-to-end learning paradigm has enabled the development of high-quality models through data-driven training. Both supervised and weakly supervised learning approaches have contributed to the rapid progress of temporal action localization, resulting in a multitude of methods and a large body of literature, making a comprehensive survey a pressing necessity. This paper presents a thorough analysis of existing action localization works, offering a well-organized taxonomy that highlights the strengths and weaknesses of each strategy. In the realm of supervised learning, in addition to the anchor mechanism, we introduce a novel classification mechanism to categorize and summarize existing works. Similarly, for weakly supervised learning, we extend the traditional pre-classification and post-classification mechanisms by providing a fresh perspective on enhancement strategies. Furthermore, we shed light on the bottleneck of confidence estimation, a critical yet overlooked aspect of current works. By conducting detailed analyses, this survey serves as a valuable resource for researchers, providing beneficial guidance to newcomers and inspiring seasoned researchers alike.

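Original language: English
Pages (from-to): 2171-2190
Number of pages: 20
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 46
Issue number: 4
DOI: https://doi.org/10.1109/TPAMI.2023.3330794
Publication status: Published - 1 Apr 2024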

Cite this

Wang, B., Zhao, Y., Yang, L., Long, T., & Li, X. (2024). Temporal Action Localization in the Deep Learning Era: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(4), 2171-2190. https://doi.org/10.1109/TPAMI.2023.3330794