Local Mixer with Prior Position for Cars’ Type Recognition

Bin Cao; Hongbin Ma; Ying Jin

doi:10.20965/jaciii.2022.p0922

Bin Cao, Hongbin Ma^*, Ying Jin

^*Corresponding author of this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Deep learning has attracted wide attention following its successful application to vision tasks such as image classification and object detection. Because deep learning is robust and general, automotive manufacturing, a crucial part of the national economy, needs it to make production lines more intelligent and efficient. However, some strong general-purpose deep learning models, such as ViT, TNT, and Swin Transformer, cannot meet the high-accuracy requirements of automotive manufacturing in a specific scene. On automotive production lines, engineers usually adopt smart designs that can provide prior knowledge for designing deep learning models. Specifically, the position of the target in an image is usually fixed. To take advantage of this prior position, this paper designs a local mixer with prior position to capture local features. Its main idea is to divide the whole feature map into window feature maps and concatenate them along the channel dimension, so that the convolution kernel parameters for each window feature map are independent of the others. In addition, an MLP is adopted as a global mixer to capture global features, and a pyramidal architecture with a CNN is adopted. Comprehensive results demonstrate the effectiveness of the proposed model on cars’ type recognition. In particular, the proposed model achieves 97.938% accuracy on our data set, surpassing some transformer-like models.
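The window-splitting idea described in the abstract can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `windows_to_channels` and `local_mixer`, the square non-overlapping windows, the 3x3 kernel size, and the zero padding are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def windows_to_channels(x, win):
    """Split a (C, H, W) feature map into non-overlapping win x win windows
    and stack the windows along the channel axis: (n_win * C, win, win).
    (Hypothetical helper; square windows are an assumption.)"""
    c, h, w = x.shape
    assert h % win == 0 and w % win == 0, "H and W must be divisible by win"
    gh, gw = h // win, w // win
    # (C, gh, win, gw, win) -> (gh, gw, C, win, win)
    x = x.reshape(c, gh, win, gw, win).transpose(1, 3, 0, 2, 4)
    return x.reshape(gh * gw * c, win, win)


def local_mixer(x, weights, win):
    """Apply an independent 3x3 convolution to each window position.

    weights has shape (n_win, C_out, C_in, 3, 3): one kernel set per window,
    so no parameters are shared between window positions.
    (Kernel size and zero padding are assumptions for illustration.)"""
    c = x.shape[0]
    stacked = windows_to_channels(x, win)          # (n_win * C, win, win)
    n_win = stacked.shape[0] // c
    groups = stacked.reshape(n_win, c, win, win)   # one channel group per window
    padded = np.pad(groups, ((0, 0), (0, 0), (1, 1), (1, 1)))
    # patches: (n_win, C, win, win, 3, 3)
    patches = np.lib.stride_tricks.sliding_window_view(
        padded, (3, 3), axis=(2, 3)
    )
    # window g is convolved only with weights[g]
    return np.einsum("gcijkl,gockl->goij", patches, weights)


rng = np.random.default_rng(0)
feature_map = rng.normal(size=(2, 4, 4))           # C=2, H=W=4
kernels = rng.normal(size=(4, 3, 2, 3, 3))         # 4 windows, C_out=3
mixed = local_mixer(feature_map, kernels, win=2)
print(mixed.shape)  # (4, 3, 2, 2)
```

Because each window position has its own kernel set, a model of this kind can learn position-specific filters, which is how a fixed target position could be exploited as a prior. In a deep learning framework the same effect is usually obtained with a grouped convolution over the channel-stacked tensor.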

Original language	English
Pages (from-to)	922-929
Number of pages	8
Journal	Journal of Advanced Computational Intelligence and Intelligent Informatics
Volume	26
Issue number	6
DOI	https://doi.org/10.20965/jaciii.2022.p0922
Publication status	Published - Nov 2022

Cite this

@article{b032d41bcbda4f359fbf848da46a9991,

title = "Local Mixer with Prior Position for Cars{\textquoteright} Type Recognition",

abstract = "Deep learning has attracted attention widely as the successful application of deep learning for vision tasks, such as image classification, object detection and so on. Due to the robustness and universality of deep learning, automotive manufacturing, a crucial part of national economy, needs deep learning to make production lines more intelligent and improve efficiency. However, some superior generally deep learning models, such as ViT, TNT, and Swin transformer, cannot meet automotive manufacturing requirements with high accuracy on a specific scene. As for automotive production lines, engineers usually adopt some smart designs, which can provide prior knowledge for designing deep learning models. Specifically, in an image, the position of target is usually fixed. Therefore, in order to take advantage of prior position, this paper designs a local mixer with prior position to capture local feature. Its main idea is that dividing the whole feature map into window feature maps and connecting window feature maps along channel dimension in order to make convolution kernel parameters for each window feature map are independent from others. Besides, MLP is adopted as global mixer to capture global feature and the pyramidal architecture with CNN is adopted. Comprehensive results demonstrate the effectiveness of proposed model on cars{\textquoteright} type recognition. In particular, the proposed model achieves 97.938\% accuracy on our data set, surpassing some transformer-like models.",

keywords = "CNN, MLP, cars{\textquoteright} type recognition, pyramidal architecture",

author = "Bin Cao and Hongbin Ma and Ying Jin",

note = "Publisher Copyright: {\textcopyright} Fuji Technology Press Ltd. Creative Commons CC BY-ND: This is an Open Access article distributed under the terms of the Creative Commons Attribution-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nd/4.0/)",

year = "2022",

month = nov,

doi = "10.20965/jaciii.2022.p0922",

language = "English",

volume = "26",

pages = "922--929",

journal = "Journal of Advanced Computational Intelligence and Intelligent Informatics",

issn = "1343-0130",

publisher = "Fuji Technology Press",

number = "6",

}

TY - JOUR

T1 - Local Mixer with Prior Position for Cars’ Type Recognition

AU - Cao, Bin

AU - Ma, Hongbin

AU - Jin, Ying

N1 - Publisher Copyright: © Fuji Technology Press Ltd. Creative Commons CC BY-ND: This is an Open Access article distributed under the terms of the Creative Commons Attribution-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nd/4.0/)

PY - 2022/11

Y1 - 2022/11

N2 - Deep learning has attracted attention widely as the successful application of deep learning for vision tasks, such as image classification, object detection and so on. Due to the robustness and universality of deep learning, automotive manufacturing, a crucial part of national economy, needs deep learning to make production lines more intelligent and improve efficiency. However, some superior generally deep learning models, such as ViT, TNT, and Swin transformer, cannot meet automotive manufacturing requirements with high accuracy on a specific scene. As for automotive production lines, engineers usually adopt some smart designs, which can provide prior knowledge for designing deep learning models. Specifically, in an image, the position of target is usually fixed. Therefore, in order to take advantage of prior position, this paper designs a local mixer with prior position to capture local feature. Its main idea is that dividing the whole feature map into window feature maps and connecting window feature maps along channel dimension in order to make convolution kernel parameters for each window feature map are independent from others. Besides, MLP is adopted as global mixer to capture global feature and the pyramidal architecture with CNN is adopted. Comprehensive results demonstrate the effectiveness of proposed model on cars’ type recognition. In particular, the proposed model achieves 97.938% accuracy on our data set, surpassing some transformer-like models.

AB - Deep learning has attracted attention widely as the successful application of deep learning for vision tasks, such as image classification, object detection and so on. Due to the robustness and universality of deep learning, automotive manufacturing, a crucial part of national economy, needs deep learning to make production lines more intelligent and improve efficiency. However, some superior generally deep learning models, such as ViT, TNT, and Swin transformer, cannot meet automotive manufacturing requirements with high accuracy on a specific scene. As for automotive production lines, engineers usually adopt some smart designs, which can provide prior knowledge for designing deep learning models. Specifically, in an image, the position of target is usually fixed. Therefore, in order to take advantage of prior position, this paper designs a local mixer with prior position to capture local feature. Its main idea is that dividing the whole feature map into window feature maps and connecting window feature maps along channel dimension in order to make convolution kernel parameters for each window feature map are independent from others. Besides, MLP is adopted as global mixer to capture global feature and the pyramidal architecture with CNN is adopted. Comprehensive results demonstrate the effectiveness of proposed model on cars’ type recognition. In particular, the proposed model achieves 97.938% accuracy on our data set, surpassing some transformer-like models.

KW - CNN

KW - MLP

KW - cars’ type recognition

KW - pyramidal architecture

UR - http://www.scopus.com/inward/record.url?scp=85146495253&partnerID=8YFLogxK

U2 - 10.20965/jaciii.2022.p0922

DO - 10.20965/jaciii.2022.p0922

M3 - Article

AN - SCOPUS:85146495253

SN - 1343-0130

VL - 26

SP - 922

EP - 929

JO - Journal of Advanced Computational Intelligence and Intelligent Informatics

JF - Journal of Advanced Computational Intelligence and Intelligent Informatics

IS - 6

ER -
