HRDNET: HIGH-RESOLUTION DETECTION NETWORK FOR SMALL OBJECTS

doi:10.1109/ICME51207.2021.9428241

Ziming Liu, Guangyu Gao^*, Lin Sun, Zhiyuan Fang

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

108 Citations (Scopus)

Abstract

Small object detection is a challenging yet practical vision task. With deep network-based methods, the contextual information of small objects may disappear as the network goes deeper. An intuitive way to alleviate this issue is to increase the input resolution; however, doing so aggravates the large variance in object scale and introduces a prohibitive computational cost. To leverage the benefits of high-resolution images without introducing new problems, we propose a High-Resolution Detection Network (HRDNet), which takes multiple-resolution inputs and processes them with multi-depth backbones. We further propose the Multi-Depth Image Pyramid Network (MD-IPN) and the Multi-Scale Feature Pyramid Network (MS-FPN). The MD-IPN retains multiple levels of positional information by using backbones of multiple depths. Specifically, the high-resolution input is fed into a shallow network to preserve more positional information and reduce computational cost, while the low-resolution input is fed into a deep network to extract more semantics. By extracting varied features from high to low resolutions, the MD-IPN improves small object detection while maintaining performance on medium and large objects. Additionally, the MS-FPN aligns and fuses the multi-scale feature groups generated by the MD-IPN to reduce information imbalance. Extensive experiments are conducted on COCO 2017 and on VisDrone 2019, a typical small-object dataset. Notably, HRDNet achieves state-of-the-art results on both datasets, with significant improvements on small objects.
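The pairing of resolution with backbone depth described in the abstract can be illustrated with a toy cost calculation. The sketch below is not the authors' implementation: the stream count, the halving of resolution per stream, the doubling of depth, the base depth of 18 layers, and the cost proxy (pixels times layers) are all assumptions chosen only to show why sending the high-resolution input through a shallow network may be cheaper than sending it through a deep one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """One image-pyramid stream (hypothetical structure, for illustration)."""
    scale: float  # input side length relative to the original image
    depth: int    # number of backbone layers assigned to this stream


def build_pyramid(num_streams: int, base_depth: int) -> list[Stream]:
    # Assumed schedule: each successive stream halves the input resolution
    # and doubles the backbone depth, so the highest resolution gets the
    # shallowest network and the lowest resolution gets the deepest one.
    return [
        Stream(scale=1 / 2**i, depth=base_depth * 2**i)
        for i in range(num_streams)
    ]


def relative_cost(stream: Stream) -> float:
    # Crude proxy: pixels processed times number of layers. It ignores
    # channel widths and downsampling inside the backbone.
    return stream.scale**2 * stream.depth


streams = build_pyramid(num_streams=3, base_depth=18)

# Total cost of the multi-depth pyramid under the proxy.
multi_depth_cost = sum(relative_cost(s) for s in streams)

# Baseline: the full-resolution image through the deepest backbone alone.
naive_cost = relative_cost(Stream(scale=1.0, depth=streams[-1].depth))

print(f"multi-depth pyramid: {multi_depth_cost}, deep high-res baseline: {naive_cost}")
```

Under these assumed numbers the three-stream pyramid costs less than running the deepest backbone on the full-resolution image, which is the trade-off the abstract points to. Real costs depend on the actual backbones and feature fusion, which this proxy does not model.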

Original language	English
Title of host publication	2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Publisher	IEEE Computer Society
ISBN (Electronic)	9781665438643
DOIs	https://doi.org/10.1109/ICME51207.2021.9428241
Publication status	Published - 2021
Event	2021 IEEE International Conference on Multimedia and Expo, ICME 2021 - Shenzhen, China Duration: 5 Jul 2021 → 9 Jul 2021

Publication series

Name	Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print)	1945-7871
ISSN (Electronic)	1945-788X

Conference

Conference	2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Country/Territory	China
City	Shenzhen
Period	5/07/21 → 9/07/21

Keywords

Deep Neural Network
High-resolution Images
Image Pyramid
Small Object Detection

Cite this

@inproceedings{a09d3723a7264a8a9d28eac01940ead3,
title = "HRDNET: HIGH-RESOLUTION DETECTION NETWORK FOR SMALL OBJECTS",
abstract = "Small object detection is a very challenging yet practical vision task. With deep network-based methods, the contextual information of small objects may disappear when the network goes deeper. An intuitive solution to alleviate this issue is to increase the input resolution, however, it will aggravate the large variant of object scale and introduce unbearable computation cost. To leverage the benefits of high-resolution images without bringing up new problems, we propose a High-Resolution Detection Network (HRDNet) which takes multiple resolution inputs with multi-depth backbones. Meanwhile, we propose the Multi-Depth Image Pyramid Network (MD-IPN) and Multi-Scale Feature Pyramid Network (MS-FPN). The MD-IPN maintains multiple position information using multiple depth backbones. Specifically, high-resolution input will be fed into a shallow network to reserve more positional information and reduce computational costs, while low-resolution input will be fed into a deep network to extract more semantics. By extracting various features from high to low resolutions, the MD-IPN can improve the performance of small object detection and maintain the performance of middle and large objects. Additionally, MS-FPN is introduced to align and fuse multi-scale feature groups generated by MD-IPN to reduce the information imbalance. Extensive experiments are conducted on the COCO2017 and the typical small object dataset, VisDrone 2019. Notably, our HRDNet achieves the state-of-the-art on these two datasets with significant improvements on small objects.",
keywords = "Deep Neural Network, High-resolution Images, Image Pyramid, Small Object Detection",
author = "Ziming Liu and Guangyu Gao and Lin Sun and Zhiyuan Fang",
note = "Publisher Copyright: {\textcopyright} 2021 IEEE; 2021 IEEE International Conference on Multimedia and Expo, ICME 2021 ; Conference date: 05-07-2021 Through 09-07-2021",
year = "2021",
doi = "10.1109/ICME51207.2021.9428241",
language = "English",
series = "Proceedings - IEEE International Conference on Multimedia and Expo",
publisher = "IEEE Computer Society",
booktitle = "2021 IEEE International Conference on Multimedia and Expo, ICME 2021",
address = "United States",
}

Liu, Z, Gao, G, Sun, L & Fang, Z 2021, HRDNET: HIGH-RESOLUTION DETECTION NETWORK FOR SMALL OBJECTS. in 2021 IEEE International Conference on Multimedia and Expo, ICME 2021. Proceedings - IEEE International Conference on Multimedia and Expo, IEEE Computer Society, 2021 IEEE International Conference on Multimedia and Expo, ICME 2021, Shenzhen, China, 5/07/21. https://doi.org/10.1109/ICME51207.2021.9428241

HRDNET: HIGH-RESOLUTION DETECTION NETWORK FOR SMALL OBJECTS. / Liu, Ziming; Gao, Guangyu; Sun, Lin et al.
2021 IEEE International Conference on Multimedia and Expo, ICME 2021. IEEE Computer Society, 2021. (Proceedings - IEEE International Conference on Multimedia and Expo).

TY - GEN
T1 - HRDNET
T2 - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
AU - Liu, Ziming
AU - Gao, Guangyu
AU - Sun, Lin
AU - Fang, Zhiyuan
PY - 2021
Y1 - 2021
AB - Small object detection is a very challenging yet practical vision task. With deep network-based methods, the contextual information of small objects may disappear when the network goes deeper. An intuitive solution to alleviate this issue is to increase the input resolution, however, it will aggravate the large variant of object scale and introduce unbearable computation cost. To leverage the benefits of high-resolution images without bringing up new problems, we propose a High-Resolution Detection Network (HRDNet) which takes multiple resolution inputs with multi-depth backbones. Meanwhile, we propose the Multi-Depth Image Pyramid Network (MD-IPN) and Multi-Scale Feature Pyramid Network (MS-FPN). The MD-IPN maintains multiple position information using multiple depth backbones. Specifically, high-resolution input will be fed into a shallow network to reserve more positional information and reduce computational costs, while low-resolution input will be fed into a deep network to extract more semantics. By extracting various features from high to low resolutions, the MD-IPN can improve the performance of small object detection and maintain the performance of middle and large objects. Additionally, MS-FPN is introduced to align and fuse multi-scale feature groups generated by MD-IPN to reduce the information imbalance. Extensive experiments are conducted on the COCO2017 and the typical small object dataset, VisDrone 2019. Notably, our HRDNet achieves the state-of-the-art on these two datasets with significant improvements on small objects.
KW - Deep Neural Network
KW - High-resolution Images
KW - Image Pyramid
KW - Small Object Detection
UR - http://www.scopus.com/inward/record.url?scp=85126429420&partnerID=8YFLogxK
U2 - 10.1109/ICME51207.2021.9428241
DO - 10.1109/ICME51207.2021.9428241
M3 - Conference contribution
AN - SCOPUS:85126429420
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
PB - IEEE Computer Society
Y2 - 5 July 2021 through 9 July 2021
ER -
