TY - GEN
T1 - CNN-Transformer-Based Modeling and Visual Measurement of Compound Eye Vision System
AU - Feng, Shangwu
AU - Yang, Li
AU - Li, Yuan
N1 - Publisher Copyright:
© 2023 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2023
Y1 - 2023
N2 - In some special environments, vision measurement systems need to be miniaturized and lightweight while retaining a large field of view. The compound eye vision system satisfies these requirements for small-range, close-up measurements. In this paper, an 8 × 8 compound eye array is built to establish a compound eye vision measurement system that obtains the image and the world coordinates of the target. The traditional approach is model-based vision system modeling, which depends heavily on the accuracy of the model; this is very difficult for a compound eye vision system with a high degree of nonlinearity. In this paper, a SubHarris-based feature point extraction method is designed. A new data structure is constructed from the extracted feature points to account for the fact that not all sub-eyes are imaged, and a neural network calibration method based on a CNN-Transformer is designed so that the model focuses on the regions that contain images. The results show that the MAE of the deep learning method improves by 19.7% relative to a basic neural network, and the length measurement error improves by 15.7% over the 30-80 mm range. The experiments also show that the designed or improved modules in the convolutional structure, SPP, RoIPool, VggBlock, CBAM, and the Transformer encoding block, all reduce the final error.
AB - In some special environments, vision measurement systems need to be miniaturized and lightweight while retaining a large field of view. The compound eye vision system satisfies these requirements for small-range, close-up measurements. In this paper, an 8 × 8 compound eye array is built to establish a compound eye vision measurement system that obtains the image and the world coordinates of the target. The traditional approach is model-based vision system modeling, which depends heavily on the accuracy of the model; this is very difficult for a compound eye vision system with a high degree of nonlinearity. In this paper, a SubHarris-based feature point extraction method is designed. A new data structure is constructed from the extracted feature points to account for the fact that not all sub-eyes are imaged, and a neural network calibration method based on a CNN-Transformer is designed so that the model focuses on the regions that contain images. The results show that the MAE of the deep learning method improves by 19.7% relative to a basic neural network, and the length measurement error improves by 15.7% over the 30-80 mm range. The experiments also show that the designed or improved modules in the convolutional structure, SPP, RoIPool, VggBlock, CBAM, and the Transformer encoding block, all reduce the final error.
KW - CNN-Transformer
KW - Compound Eye
KW - Visual System Model
UR - http://www.scopus.com/inward/record.url?scp=85175567591&partnerID=8YFLogxK
U2 - 10.23919/CCC58697.2023.10240669
DO - 10.23919/CCC58697.2023.10240669
M3 - Conference contribution
AN - SCOPUS:85175567591
T3 - Chinese Control Conference, CCC
SP - 7572
EP - 7577
BT - 2023 42nd Chinese Control Conference, CCC 2023
PB - IEEE Computer Society
T2 - 42nd Chinese Control Conference, CCC 2023
Y2 - 24 July 2023 through 26 July 2023
ER -