MDRNet: a lightweight network for real-time semantic segmentation in street scenes

Yingpeng Dai; Junzheng Wang; Jiehao Li; Jing Li

doi:10.1108/AA-06-2021-0078

Yingpeng Dai, Junzheng Wang, Jiehao Li, Jing Li*

*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Purpose: This paper focuses on environmental perception for unmanned platforms in complex street scenes. An unmanned platform places strict requirements on both accuracy and inference speed, so balancing the two during the extraction of environmental information is a challenge. Design/methodology/approach: A novel multi-scale depth-wise residual (MDR) module is proposed. The module combines depth-wise separable convolution, dilated convolution and 1-dimensional (1-D) convolution to extract local and contextual information jointly while remaining small and shallow. Based on the MDR module, a network named multi-scale depth-wise residual network (MDRNet) is designed for fast semantic segmentation. The network extracts multi-scale information and maintains feature maps with high spatial resolution to cope with objects that appear at multiple scales. Findings: Experiments on the CamVid and Cityscapes data sets show that MDRNet produces competitive results in terms of both computational time and accuracy during inference. Specifically, the authors obtained 67.47% and 68.7% mean intersection over union (MIoU) on the CamVid and Cityscapes data sets, respectively, with only 0.84 million parameters and faster inference on a single GTX 1070Ti card. Originality/value: This research can provide a theoretical and engineering basis for environmental perception on unmanned platforms. In addition, it provides environmental information to support subsequent work.
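
To illustrate why the convolution types named in the abstract help keep a module small, the sketch below counts weights for a standard 3x3 convolution, a depth-wise separable 3x3 convolution, and a pair of 1-D convolutions (3x1 followed by 1x3). It is not the authors' MDR module: the abstract does not give the layer layout, so the channel width of 64, the bias-free layers and all function names here are assumptions made for illustration only.

```python
def conv2d_params(c_in, c_out, k_h, k_w, groups=1):
    """Weight count of a bias-free 2-D convolution.

    Dilation does not appear here because it changes the spacing of the
    kernel taps, not the number of weights.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k_h * k_w


def standard_3x3(channels):
    """Dense 3x3 convolution mapping `channels` to `channels`."""
    return conv2d_params(channels, channels, 3, 3)


def depthwise_separable_3x3(channels):
    """Per-channel 3x3 convolution followed by a 1x1 pointwise convolution."""
    depthwise = conv2d_params(channels, channels, 3, 3, groups=channels)
    pointwise = conv2d_params(channels, channels, 1, 1)
    return depthwise + pointwise


def factorized_1d_pair(channels):
    """A 3x1 convolution followed by a 1x3 convolution (both dense)."""
    vertical = conv2d_params(channels, channels, 3, 1)
    horizontal = conv2d_params(channels, channels, 1, 3)
    return vertical + horizontal


if __name__ == "__main__":
    width = 64  # hypothetical channel width, not taken from the paper
    for name, fn in [
        ("standard 3x3", standard_3x3),
        ("depth-wise separable 3x3", depthwise_separable_3x3),
        ("1-D pair (3x1 + 1x3)", factorized_1d_pair),
    ]:
        print(f"{name}: {fn(width)} weights")
```

Because dilation leaves the weight count unchanged, the same figures apply to dilated variants of each layer, which is one way a module can enlarge its receptive field without growing in size.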

Original language	English
Pages (from-to)	725-733
Number of pages	9
Journal	Assembly Automation
Volume	41
Issue number	6
DOIs	https://doi.org/10.1108/AA-06-2021-0078
Publication status	Published - 24 Nov 2021

Keywords

Environmental perception
Lightweight neural network
Real-time application
Unmanned platform

Cite this

@article{f70369882f7f4655a1c037b7d4d525a0,

title = "{MDRNet}: a lightweight network for real-time semantic segmentation in street scenes",

abstract = "Purpose: This paper focuses on environmental perception for unmanned platforms in complex street scenes. An unmanned platform places strict requirements on both accuracy and inference speed, so balancing the two during the extraction of environmental information is a challenge. Design/methodology/approach: A novel multi-scale depth-wise residual (MDR) module is proposed. The module combines depth-wise separable convolution, dilated convolution and 1-dimensional (1-D) convolution to extract local and contextual information jointly while remaining small and shallow. Based on the MDR module, a network named multi-scale depth-wise residual network (MDRNet) is designed for fast semantic segmentation. The network extracts multi-scale information and maintains feature maps with high spatial resolution to cope with objects that appear at multiple scales. Findings: Experiments on the CamVid and Cityscapes data sets show that MDRNet produces competitive results in terms of both computational time and accuracy during inference. Specifically, the authors obtained 67.47% and 68.7% mean intersection over union (MIoU) on the CamVid and Cityscapes data sets, respectively, with only 0.84 million parameters and faster inference on a single GTX 1070Ti card. Originality/value: This research can provide a theoretical and engineering basis for environmental perception on unmanned platforms. In addition, it provides environmental information to support subsequent work.",

keywords = "Environmental perception, Lightweight neural network, Real-time application, Unmanned platform",

author = "Yingpeng Dai and Junzheng Wang and Jiehao Li and Jing Li",

note = "Publisher Copyright: {\textcopyright} 2021, Emerald Publishing Limited.",

year = "2021",

month = nov,

day = "24",

doi = "10.1108/AA-06-2021-0078",

language = "English",

volume = "41",

pages = "725--733",

journal = "Assembly Automation",

issn = "0144-5154",

publisher = "Emerald Publishing",

number = "6",

}

TY - JOUR

T1 - MDRNet: a lightweight network for real-time semantic segmentation in street scenes

AU - Dai, Yingpeng

AU - Wang, Junzheng

AU - Li, Jiehao

AU - Li, Jing

PY - 2021/11/24

Y1 - 2021/11/24

N2 - Purpose: This paper focuses on environmental perception for unmanned platforms in complex street scenes. An unmanned platform places strict requirements on both accuracy and inference speed, so balancing the two during the extraction of environmental information is a challenge. Design/methodology/approach: A novel multi-scale depth-wise residual (MDR) module is proposed. The module combines depth-wise separable convolution, dilated convolution and 1-dimensional (1-D) convolution to extract local and contextual information jointly while remaining small and shallow. Based on the MDR module, a network named multi-scale depth-wise residual network (MDRNet) is designed for fast semantic segmentation. The network extracts multi-scale information and maintains feature maps with high spatial resolution to cope with objects that appear at multiple scales. Findings: Experiments on the CamVid and Cityscapes data sets show that MDRNet produces competitive results in terms of both computational time and accuracy during inference. Specifically, the authors obtained 67.47% and 68.7% mean intersection over union (MIoU) on the CamVid and Cityscapes data sets, respectively, with only 0.84 million parameters and faster inference on a single GTX 1070Ti card. Originality/value: This research can provide a theoretical and engineering basis for environmental perception on unmanned platforms. In addition, it provides environmental information to support subsequent work.

AB - Purpose: This paper focuses on environmental perception for unmanned platforms in complex street scenes. An unmanned platform places strict requirements on both accuracy and inference speed, so balancing the two during the extraction of environmental information is a challenge. Design/methodology/approach: A novel multi-scale depth-wise residual (MDR) module is proposed. The module combines depth-wise separable convolution, dilated convolution and 1-dimensional (1-D) convolution to extract local and contextual information jointly while remaining small and shallow. Based on the MDR module, a network named multi-scale depth-wise residual network (MDRNet) is designed for fast semantic segmentation. The network extracts multi-scale information and maintains feature maps with high spatial resolution to cope with objects that appear at multiple scales. Findings: Experiments on the CamVid and Cityscapes data sets show that MDRNet produces competitive results in terms of both computational time and accuracy during inference. Specifically, the authors obtained 67.47% and 68.7% mean intersection over union (MIoU) on the CamVid and Cityscapes data sets, respectively, with only 0.84 million parameters and faster inference on a single GTX 1070Ti card. Originality/value: This research can provide a theoretical and engineering basis for environmental perception on unmanned platforms. In addition, it provides environmental information to support subsequent work.

KW - Environmental perception

KW - Lightweight neural network

KW - Real-time application

KW - Unmanned platform

UR - http://www.scopus.com/inward/record.url?scp=85117614409&partnerID=8YFLogxK

U2 - 10.1108/AA-06-2021-0078

DO - 10.1108/AA-06-2021-0078

M3 - Article

AN - SCOPUS:85117614409

SN - 0144-5154

VL - 41

SP - 725

EP - 733

JO - Assembly Automation

JF - Assembly Automation

IS - 6

ER -
