Task-Aware Encoder Control for Deep Video Compression

Xingtong Ge; Jixiang Luo; Xinjie Zhang; Tongda Xu; Guo Lu; Dailan He; Jing Geng; Yan Wang; Jun Zhang; Hongwei Qin

doi:10.1109/CVPR52733.2024.02460

Corresponding author: Jing Geng

School of Computer Science and Technology

Research output: Contribution to journal › Conference article › peer-review

3 Citations (Scopus)

Abstract

Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an innovative encoder controller for deep video compression for machines. This controller features a mode prediction and a Group of Pictures (GoP) selection module. Our approach centralizes control at the encoding stage, allowing for adaptable encoder adjustments across different tasks, such as detection and tracking, while maintaining compatibility with a standard pre-trained DVC decoder. Empirical evidence demonstrates that our method is applicable across multiple tasks with various existing pre-trained DVCs. Moreover, extensive experiments demonstrate that our method outperforms previous DVC by about 25% bitrate for different tasks, with only one pre-trained decoder.

Original language	English
Pages (from-to)	26036-26045
Number of pages	10
Journal	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs	https://doi.org/10.1109/CVPR52733.2024.02460
Publication status	Published - 2024
Event	2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration	16 Jun 2024 → 22 Jun 2024

Cite this

Ge, X., Luo, J., Zhang, X., Xu, T., Lu, G., He, D., Geng, J., Wang, Y., Zhang, J., & Qin, H. (2024). Task-Aware Encoder Control for Deep Video Compression. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 26036-26045. https://doi.org/10.1109/CVPR52733.2024.02460

@article{be90580c70cc49de9a83fffe8ec63834,

title = "Task-Aware Encoder Control for Deep Video Compression",

abstract = "Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an innovative encoder controller for deep video compression for machines. This controller features a mode prediction and a Group of Pictures (GoP) selection module. Our approach centralizes control at the encoding stage, allowing for adaptable encoder adjustments across different tasks, such as detection and tracking, while maintaining compatibility with a standard pre-trained DVC decoder. Empirical evidence demonstrates that our method is applicable across multiple tasks with various existing pre-trained DVCs. Moreover, extensive experiments demonstrate that our method outperforms previous DVC by about 25% bitrate for different tasks, with only one pre-trained decoder.",

author = "Xingtong Ge and Jixiang Luo and Xinjie Zhang and Tongda Xu and Guo Lu and Dailan He and Jing Geng and Yan Wang and Jun Zhang and Hongwei Qin",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 ; Conference date: 16-06-2024 Through 22-06-2024",

year = "2024",

doi = "10.1109/CVPR52733.2024.02460",

language = "English",

pages = "26036--26045",

journal = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

issn = "1063-6919",

publisher = "IEEE Computer Society",

}

TY - JOUR

T1 - Task-Aware Encoder Control for Deep Video Compression

AU - Ge, Xingtong

AU - Luo, Jixiang

AU - Zhang, Xinjie

AU - Xu, Tongda

AU - Lu, Guo

AU - He, Dailan

AU - Geng, Jing

AU - Wang, Yan

AU - Zhang, Jun

AU - Qin, Hongwei

PY - 2024

Y1 - 2024

N2 - Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an innovative encoder controller for deep video compression for machines. This controller features a mode prediction and a Group of Pictures (GoP) selection module. Our approach centralizes control at the encoding stage, allowing for adaptable encoder adjustments across different tasks, such as detection and tracking, while maintaining compatibility with a standard pre-trained DVC decoder. Empirical evidence demonstrates that our method is applicable across multiple tasks with various existing pre-trained DVCs. Moreover, extensive experiments demonstrate that our method outperforms previous DVC by about 25% bitrate for different tasks, with only one pre-trained decoder.

AB - Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an innovative encoder controller for deep video compression for machines. This controller features a mode prediction and a Group of Pictures (GoP) selection module. Our approach centralizes control at the encoding stage, allowing for adaptable encoder adjustments across different tasks, such as detection and tracking, while maintaining compatibility with a standard pre-trained DVC decoder. Empirical evidence demonstrates that our method is applicable across multiple tasks with various existing pre-trained DVCs. Moreover, extensive experiments demonstrate that our method outperforms previous DVC by about 25% bitrate for different tasks, with only one pre-trained decoder.

UR - http://www.scopus.com/inward/record.url?scp=85210032758&partnerID=8YFLogxK

U2 - 10.1109/CVPR52733.2024.02460

DO - 10.1109/CVPR52733.2024.02460

M3 - Conference article

AN - SCOPUS:85210032758

SN - 1063-6919

SP - 26036

EP - 26045

JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

Y2 - 16 June 2024 through 22 June 2024

ER -
