TY - GEN
T1 - CNN-Transformer-Based Modeling and Visual Measurement of Compound Eye Vision System
AU - Feng, Shangwu
AU - Yang, Li
AU - Li, Yuan
N1 - Publisher Copyright:
© 2023 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2023
Y1 - 2023
N2 - In some special environments, vision measurement systems need to be miniaturized and lightweight while retaining a large field of view. The compound eye vision system satisfies these requirements for small-range, close-up measurements. In this paper, an 8 × 8 compound eye array is built to establish a compound eye vision measurement system that obtains the image and the world coordinates of the target. The traditional approach is model-based vision system modeling, which depends heavily on the accuracy of the model; this is very difficult for a compound eye vision system with a high degree of nonlinearity. In this paper, a SubHarris-based feature point extraction method is designed. A new data structure is constructed from the extracted feature points to account for the fact that not all sub-eyes are imaged, and a neural network calibration method based on a CNN-Transformer is designed so that the model focuses on the regions that contain images. The results show that the MAE of the deep learning method improves by 19.7% relative to a basic neural network, and the length measurement error improves by 15.7% over the 30-80 mm range. The experiments also show that the designed or improved modules in the convolutional structure, SPP, RoIPool, VggBlock, CBAM, and the Transformer encoding block, all reduce the final error.
AB - In some special environments, vision measurement systems need to be miniaturized and lightweight while retaining a large field of view. The compound eye vision system satisfies these requirements for small-range, close-up measurements. In this paper, an 8 × 8 compound eye array is built to establish a compound eye vision measurement system that obtains the image and the world coordinates of the target. The traditional approach is model-based vision system modeling, which depends heavily on the accuracy of the model; this is very difficult for a compound eye vision system with a high degree of nonlinearity. In this paper, a SubHarris-based feature point extraction method is designed. A new data structure is constructed from the extracted feature points to account for the fact that not all sub-eyes are imaged, and a neural network calibration method based on a CNN-Transformer is designed so that the model focuses on the regions that contain images. The results show that the MAE of the deep learning method improves by 19.7% relative to a basic neural network, and the length measurement error improves by 15.7% over the 30-80 mm range. The experiments also show that the designed or improved modules in the convolutional structure, SPP, RoIPool, VggBlock, CBAM, and the Transformer encoding block, all reduce the final error.
KW - CNN-Transformer
KW - Compound Eye
KW - Visual System Model
UR - http://www.scopus.com/inward/record.url?scp=85175567591&partnerID=8YFLogxK
U2 - 10.23919/CCC58697.2023.10240669
DO - 10.23919/CCC58697.2023.10240669
M3 - Conference contribution
AN - SCOPUS:85175567591
T3 - Chinese Control Conference, CCC
SP - 7572
EP - 7577
BT - 2023 42nd Chinese Control Conference, CCC 2023
PB - IEEE Computer Society
T2 - 42nd Chinese Control Conference, CCC 2023
Y2 - 24 July 2023 through 26 July 2023
ER -