LPNet: Retina inspired neural network for object detection and recognition

Jie Cao; Chun Bao; Qun Hao; Yang Cheng; Chenglin Chen

doi:10.3390/electronics10222883

Jie Cao, Chun Bao, Qun Hao^*, Yang Cheng^*, Chenglin Chen

^*Corresponding author for this work

School of Optics and Photonics

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

The detection of rotated objects is a meaningful and challenging research problem. Although state-of-the-art deep learning models, especially convolutional neural networks (CNNs), have feature invariance, their architectures are not specifically designed for rotation invariance; they only slightly compensate for it through pooling layers. In this study, we propose a novel network, named LPNet, to solve the problem of object rotation. LPNet improves detection accuracy by incorporating a retina-like log-polar transformation. Furthermore, LPNet is a plug-and-play architecture for object detection and recognition. It consists of two parts, which we call the encoder and the decoder. The encoder extracts image features in log-polar coordinates, while the decoder eliminates image noise in Cartesian coordinates. Moreover, according to the movement of the center points, LPNet has stable and sliding modes. LPNet takes the single-shot multibox detector (SSD) network as the baseline network and the Visual Geometry Group network (VGG16) as the feature extraction backbone. The experimental results show that, compared with the conventional SSD network, the mean average precision (mAP) of LPNet increased by 3.4% for regular objects and by 17.6% for rotated objects.
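
The abstract does not spell out the log-polar transformation itself. As a rough illustration of why log-polar sampling is useful for rotated objects, the sketch below resamples a grayscale image around its center onto a (log-radius, angle) grid, so that a rotation about the center turns into a circular shift along the angle axis. This is not the authors' implementation: the function name `log_polar`, the grid sizes `n_rho` and `n_theta`, and the nearest-neighbour sampling are all assumptions made for this example.

```python
import numpy as np


def log_polar(image, n_rho=32, n_theta=64):
    """Resample a 2-D image onto a log-polar grid centred on the image.

    Hypothetical helper for illustration only (not LPNet's encoder).
    Rows index log-spaced radii, columns index equally spaced angles.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0      # geometric centre of the image
    max_r = min(cy, cx)                        # largest radius that stays inside

    # Radii spaced evenly in log(r), from 1 pixel out to max_r.
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))
    # Angles spaced evenly over a full turn.
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta

    # Cartesian sampling positions for every (rho, theta) pair.
    ys = cy + rho[:, None] * np.sin(theta)[None, :]
    xs = cx + rho[:, None] * np.cos(theta)[None, :]

    # Nearest-neighbour lookup (an assumption; interpolation is also common).
    yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
    xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
    return image[yi, xi]
```

For a square image with an odd side length, `log_polar(np.rot90(img))` should agree with `np.roll(log_polar(img), -(n_theta // 4), axis=1)` up to nearest-neighbour rounding: a quarter-turn of the input becomes a shift by a quarter of the angle axis, which is the kind of structure a convolutional feature extractor can handle more easily than an in-plane rotation.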

Original language	English
Article number	2883
Journal	Electronics (Switzerland)
Volume	10
Issue number	22
DOIs	https://doi.org/10.3390/electronics10222883
Publication status	Published - 1 Nov 2021

Keywords

Convolutional neural networks
LPNet
Log-polar
Object detection and recognition
Retina-like

Cite this

@article{dfb464fbfc3f419fad2bbd90c4c787b1,

title = "{LPNet}: Retina inspired neural network for object detection and recognition",

abstract = "The detection of rotated objects is a meaningful and challenging research problem. Although state-of-the-art deep learning models, especially convolutional neural networks (CNNs), have feature invariance, their architectures are not specifically designed for rotation invariance; they only slightly compensate for it through pooling layers. In this study, we propose a novel network, named LPNet, to solve the problem of object rotation. LPNet improves detection accuracy by incorporating a retina-like log-polar transformation. Furthermore, LPNet is a plug-and-play architecture for object detection and recognition. It consists of two parts, which we call the encoder and the decoder. The encoder extracts image features in log-polar coordinates, while the decoder eliminates image noise in Cartesian coordinates. Moreover, according to the movement of the center points, LPNet has stable and sliding modes. LPNet takes the single-shot multibox detector (SSD) network as the baseline network and the Visual Geometry Group network (VGG16) as the feature extraction backbone. The experimental results show that, compared with the conventional SSD network, the mean average precision (mAP) of LPNet increased by 3.4\% for regular objects and by 17.6\% for rotated objects.",

keywords = "Convolutional neural networks, LPNet, Log-polar, Object detection and recognition, Retina-like",

author = "Jie Cao and Chun Bao and Qun Hao and Yang Cheng and Chenglin Chen",

note = "Publisher Copyright: {\textcopyright} 2021 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2021",

month = nov,

day = "1",

doi = "10.3390/electronics10222883",

language = "English",

volume = "10",

journal = "Electronics (Switzerland)",

issn = "2079-9292",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "22",

}

TY - JOUR

T1 - LPNet

T2 - Retina inspired neural network for object detection and recognition

AU - Cao, Jie

AU - Bao, Chun

AU - Hao, Qun

AU - Cheng, Yang

AU - Chen, Chenglin

PY - 2021/11/1

Y1 - 2021/11/1

N2 - The detection of rotated objects is a meaningful and challenging research problem. Although state-of-the-art deep learning models, especially convolutional neural networks (CNNs), have feature invariance, their architectures are not specifically designed for rotation invariance; they only slightly compensate for it through pooling layers. In this study, we propose a novel network, named LPNet, to solve the problem of object rotation. LPNet improves detection accuracy by incorporating a retina-like log-polar transformation. Furthermore, LPNet is a plug-and-play architecture for object detection and recognition. It consists of two parts, which we call the encoder and the decoder. The encoder extracts image features in log-polar coordinates, while the decoder eliminates image noise in Cartesian coordinates. Moreover, according to the movement of the center points, LPNet has stable and sliding modes. LPNet takes the single-shot multibox detector (SSD) network as the baseline network and the Visual Geometry Group network (VGG16) as the feature extraction backbone. The experimental results show that, compared with the conventional SSD network, the mean average precision (mAP) of LPNet increased by 3.4% for regular objects and by 17.6% for rotated objects.

KW - Convolutional neural networks

KW - LPNet

KW - Log-polar

KW - Object detection and recognition

KW - Retina-like

UR - http://www.scopus.com/inward/record.url?scp=85119579342&partnerID=8YFLogxK

U2 - 10.3390/electronics10222883

DO - 10.3390/electronics10222883

M3 - Article

AN - SCOPUS:85119579342

SN - 2079-9292

VL - 10

JO - Electronics (Switzerland)

JF - Electronics (Switzerland)

IS - 22

M1 - 2883

ER -
