Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence

Mengqing Jiang; Yurong Jiang; Min Li; Bo Meng; Hong Song; Danni Ai; Jian Yang

doi:10.1145/3319921.3319950

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

This study proposes a novel inference adaptive thresholding based non-maximum suppression (NMS) algorithm, IAT-NMS, for deriving temporal cues between video sequences. The inference of temporal connectivity is first derived according to an overlap measure of the bounding boxes between adjacent frames. Frames with high-confidence object detections are taken as key frames to leverage the scores of neighboring detections and preserve potential detections of blurred objects with low scores. Then, bounding boxes within each frame are ranked by their confidence scores, and the overlap ratio between the highest-scoring bounding box and each of the remaining surrounding boxes is computed. This overlap measure is passed to a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of possibly severely overlapped objects. The proposed method is compared with state-of-the-art video object detection techniques. With the application of IAT-NMS, overlapping objects originally indistinguishable in the compared methods become detectable. Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study on performance limitations and sensitivity to parametric variations also finds that IAT-NMS demonstrates better detection capability than do the three compared algorithms, which fail to detect all targets or to distinguish them in the presence of multiple overlapping targets.
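
The per-frame step described in the abstract (rank boxes by score, measure overlap against the top box, decay the other scores with a Gaussian weight) can be illustrated with a minimal sketch. This is a generic Gaussian soft-suppression routine, not the authors' IAT-NMS: it omits the temporal linking and key-frame rescoring, and the function names, the `(x1, y1, x2, y2)` box layout, and the `sigma` and `score_floor` parameters are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gaussian_soft_suppress(boxes, scores, sigma=0.5, score_floor=0.001):
    """Softly suppress overlapping detections within one frame.

    Hypothetical helper: sigma and score_floor are illustrative values,
    not parameters reported in the paper.
    Returns a list of (box, score) pairs in the order they were selected.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the detection with the highest current score.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))

        # Decay every other score by a Gaussian of its overlap with the
        # selected box: heavy overlap gives a strong decay, no overlap
        # leaves the score unchanged.
        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_floor:
                rescored.append((box, new_score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    frame_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    frame_scores = [0.9, 0.8, 0.7]
    for box, score in gaussian_soft_suppress(frame_boxes, frame_scores):
        print(box, round(score, 3))
```

Unlike hard NMS, which would discard the second box outright once its overlap exceeds a fixed threshold, this sketch only lowers its score, so a genuinely distinct but heavily overlapped object can still survive a later score cut-off.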

Original language	English
Title of host publication	ACM International Conference Proceeding Series
Publisher	Association for Computing Machinery
Pages	21-27
Number of pages	7
ISBN (Print)	9781450361286
DOIs	https://doi.org/10.1145/3319921.3319950
Publication status	Published - 2019
Event	3rd International Conference on Innovation in Artificial Intelligence, ICIAI 2019 - Suzhou, China
Duration	15 Mar 2019 → 18 Mar 2019

Publication series

Name	ACM International Conference Proceeding Series
Volume	Part F148152

Conference

Conference	3rd International Conference on Innovation in Artificial Intelligence, ICIAI 2019
Country/Territory	China
City	Suzhou
Period	15/03/19 → 18/03/19

Keywords

Non-maximum suppression
Object detection
Video image

Cite this

Jiang, M., Jiang, Y., Li, M., Meng, B., Song, H., Ai, D., & Yang, J. (2019). Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence. In ACM International Conference Proceeding Series (pp. 21-27). (ACM International Conference Proceeding Series; Vol. Part F148152). Association for Computing Machinery. https://doi.org/10.1145/3319921.3319950

@inproceedings{26a14da8a533438dbc993950d08f82b0,

title = "Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence",

abstract = "This study proposes a novel inference adaptive thresholding based non-maximum suppression (NMS) (IAT-NMS) algorithm for deriving temporal cues between video sequences. The inference of temporal connectivity is first derived according to an overlapping measure of the bounding boxes between adjacent frames. Frames with high-confidence detection object are taken as key frames to leverage the scores of neighbor detections and preserve potential detections of blurred objects with low scores. Then, bounding boxes within each frame are ranked via their confidence scores and the overlapping ratio between the bounding box with the highest score against the remaining surrounding boxes is computed. This measure of overlapping is brought into a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of possible severely overlapped objects. The proposed method is compared with state-of-the-art video object detection techniques. With the application of IAT-NMS, overlapping objects originally undistinguishable in the compared methods become detectable. Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study on performance limitations and sensitivity to parametric variations also finds that IAT-NMS demonstrates better detection capability than does the three compared algorithms, which fail to detect all targets or distinguish in the presence of multiple overlapping targets.",

keywords = "Non-maximum suppression, Object detection, Video image",

author = "Mengqing Jiang and Yurong Jiang and Min Li and Bo Meng and Hong Song and Danni Ai and Jian Yang",

note = "Publisher Copyright: {\textcopyright} 2019 Association for Computing Machinery.; 3rd International Conference on Innovation in Artificial Intelligence, ICIAI 2019 ; Conference date: 15-03-2019 Through 18-03-2019",

year = "2019",

doi = "10.1145/3319921.3319950",

language = "English",

isbn = "9781450361286",

series = "ACM International Conference Proceeding Series",

publisher = "Association for Computing Machinery",

pages = "21--27",

booktitle = "ACM International Conference Proceeding Series",

}

Jiang, M, Jiang, Y, Li, M, Meng, B, Song, H, Ai, D & Yang, J 2019, Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence. in ACM International Conference Proceeding Series. ACM International Conference Proceeding Series, vol. Part F148152, Association for Computing Machinery, pp. 21-27, 3rd International Conference on Innovation in Artificial Intelligence, ICIAI 2019, Suzhou, China, 15/03/19. https://doi.org/10.1145/3319921.3319950

Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence. / Jiang, Mengqing; Jiang, Yurong; Li, Min et al.
ACM International Conference Proceeding Series. Association for Computing Machinery, 2019. p. 21-27 (ACM International Conference Proceeding Series; Vol. Part F148152).

TY - GEN

T1 - Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence

AU - Jiang, Mengqing

AU - Jiang, Yurong

AU - Li, Min

AU - Meng, Bo

AU - Song, Hong

AU - Ai, Danni

AU - Yang, Jian

PY - 2019

Y1 - 2019

N2 - This study proposes a novel inference adaptive thresholding based non-maximum suppression (NMS) (IAT-NMS) algorithm for deriving temporal cues between video sequences. The inference of temporal connectivity is first derived according to an overlapping measure of the bounding boxes between adjacent frames. Frames with high-confidence detection object are taken as key frames to leverage the scores of neighbor detections and preserve potential detections of blurred objects with low scores. Then, bounding boxes within each frame are ranked via their confidence scores and the overlapping ratio between the bounding box with the highest score against the remaining surrounding boxes is computed. This measure of overlapping is brought into a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of possible severely overlapped objects. The proposed method is compared with state-of-the-art video object detection techniques. With the application of IAT-NMS, overlapping objects originally undistinguishable in the compared methods become detectable. Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study on performance limitations and sensitivity to parametric variations also finds that IAT-NMS demonstrates better detection capability than does the three compared algorithms, which fail to detect all targets or distinguish in the presence of multiple overlapping targets.

AB - This study proposes a novel inference adaptive thresholding based non-maximum suppression (NMS) (IAT-NMS) algorithm for deriving temporal cues between video sequences. The inference of temporal connectivity is first derived according to an overlapping measure of the bounding boxes between adjacent frames. Frames with high-confidence detection object are taken as key frames to leverage the scores of neighbor detections and preserve potential detections of blurred objects with low scores. Then, bounding boxes within each frame are ranked via their confidence scores and the overlapping ratio between the bounding box with the highest score against the remaining surrounding boxes is computed. This measure of overlapping is brought into a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of possible severely overlapped objects. The proposed method is compared with state-of-the-art video object detection techniques. With the application of IAT-NMS, overlapping objects originally undistinguishable in the compared methods become detectable. Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study on performance limitations and sensitivity to parametric variations also finds that IAT-NMS demonstrates better detection capability than does the three compared algorithms, which fail to detect all targets or distinguish in the presence of multiple overlapping targets.

KW - Non-maximum suppression

KW - Object detection

KW - Video image

UR - http://www.scopus.com/inward/record.url?scp=85066489551&partnerID=8YFLogxK

U2 - 10.1145/3319921.3319950

DO - 10.1145/3319921.3319950

M3 - Conference contribution

AN - SCOPUS:85066489551

SN - 9781450361286

T3 - ACM International Conference Proceeding Series

SP - 21

EP - 27

BT - ACM International Conference Proceeding Series

PB - Association for Computing Machinery

T2 - 3rd International Conference on Innovation in Artificial Intelligence, ICIAI 2019

Y2 - 15 March 2019 through 18 March 2019

ER -
