Camouflage Object Detection Based on Feature Fusion and Edge Detection (基于特征聚合与边缘检测的伪装目标检测)

Cheng Ding; Xueqiong Bai; Yong Lv; Yang Liu; Chunhui Niu; Xin Liu

doi:10.3788/gzxb20245308.0810002

Cheng Ding, Xueqiong Bai^*, Yong Lv^*, Yang Liu, Chunhui Niu, Xin Liu

^*Corresponding author for this work

Beijing Information Science & Technology University

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Camouflaged Object Detection (COD) has significant research and application value in many fields, and deep learning continues to push the performance of object detection algorithms to new heights. The main challenge in this field is designing a network that effectively integrates features from layers of different sizes and eliminates background noise while preserving detailed information. We propose the Feature Fusion and Edge Detection Net (F2-EDNet), a camouflaged object segmentation model based on feature fusion and edge detection. ConvNeXt serves as the backbone to extract multi-scale contextual features. The breadth and diversity of these features are then enhanced in two ways. First, the Feature Enhancement Module (FEM) refines and downsizes the multi-scale contextual features. Second, an auxiliary task fuses cross-layer features through the Cross-layer Guided Edge prediction Branch (CGEB), which extracts edge features and predicts edge information to increase feature diversity. In addition, the Multiscale Feature Aggregation Module (MFAM) improves feature fusion by capturing and fusing information about inter-layer differences between edge features and contextual features through multiscale attention and feature cascading. The model's predictions are subjected to deep supervision to obtain the final detection results. To validate its detection accuracy, the proposed model is compared qualitatively and quantitatively with eight camouflaged object detection models from the past three years on three publicly available datasets. A model efficiency analysis is also conducted against five open-source models. Finally, ablation experiments verify the effectiveness of each module and determine the optimal structure.
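The deep supervision step mentioned above can be illustrated with a small sketch. The abstract does not state the loss function, the number of supervised outputs, or how the edge branch is weighted, so everything here is an assumption made for illustration: binary cross-entropy as the loss, one loss term per side output, and a single auxiliary edge term scaled by a hypothetical `edge_weight`. It is not the authors' implementation.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of values in [0, 1]."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() never sees 0
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def deep_supervision_loss(side_preds, mask, edge_pred, edge, edge_weight=1.0):
    """Supervise every side output with the mask, plus one auxiliary edge term.

    side_preds  : list of flattened prediction maps, one per supervised stage
    mask        : flattened ground-truth object mask (0/1)
    edge_pred   : flattened output of the edge branch
    edge        : flattened ground-truth edge map (0/1)
    edge_weight : hypothetical weight for the auxiliary edge task (assumption)
    """
    seg_loss = sum(bce(p, mask) for p in side_preds)
    return seg_loss + edge_weight * bce(edge_pred, edge)


# Toy 2x2 image, flattened to four pixels.
mask = [1.0, 0.0, 1.0, 0.0]
edge = [1.0, 1.0, 0.0, 0.0]
side_preds = [[0.9, 0.1, 0.8, 0.2], [0.6, 0.4, 0.7, 0.3]]
edge_pred = [0.7, 0.6, 0.2, 0.1]
print(deep_supervision_loss(side_preds, mask, edge_pred, edge))
```

Summing a loss over intermediate outputs gives each stage a direct training signal; a real model would compute this on tensors in a deep learning framework rather than on Python lists.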
Quantitative experiments show that on the CAMO dataset, F2-EDNet achieves the best S-measure, F-measure, E-measure, and mean absolute error (MAE). On the COD10K dataset, the proposed algorithm is best on the structural similarity metric, while its mean precision and recall, E-measure, and MAE are second best. On NC4K, the proposed algorithm is best on all four metrics. The visualized detection results show that the proposed model's predictions are more accurate and refined than those of other methods. Although the proposed model has more parameters than the other models, its simple framework makes it faster than most of them, including models designed specifically to be lightweight. In terms of operation count, its computational complexity is significantly lower than that of a model that also uses multi-task learning. The model thus maintains high detection accuracy while keeping a reasonable balance between computing speed and operation count. Ablation experiments demonstrate that each module plays its expected role and that the model's performance has been optimized. Overall, the experimental results show that the proposed algorithm achieves the best detection accuracy: compared with the second-best models, it improves the S-measure, F-measure, MAE, and E-measure by an average of 1.41%, 1.74%, 0.14%, and 0.77%, respectively, across the three datasets. In performance testing, the model ran at 46 fps, balancing detection accuracy against execution efficiency and demonstrating practical application value.
In future work, the algorithm will be made more lightweight to further reduce computation and improve inference speed. In applications, the model could be useful, through transfer learning, in areas such as medical segmentation, defect detection, and transparent object segmentation.
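Two of the four metrics named in the abstract, MAE and F-measure, are simple enough to sketch in a few lines. The version below is a minimal illustration, not the paper's evaluation code. The fixed threshold of 0.5 and the weighting `beta_sq = 0.3` (a value commonly used in the segmentation literature) are assumptions, since the abstract does not specify them. S-measure and E-measure are more involved and are not covered here.

```python
def mae(pred, gt):
    """Mean absolute error between a prediction map and ground truth (flat lists in [0, 1])."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """F-beta score of the thresholded prediction against a binary ground truth.

    threshold and beta_sq are illustrative defaults (assumptions), not values
    taken from the paper.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    binary = [1 if p >= threshold else 0 for p in pred]
    tp = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 1)
    fp = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 0)
    fn = sum(1 for b, g in zip(binary, gt) if b == 0 and g == 1)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 prediction and mask, flattened to four pixels.
pred = [0.9, 0.2, 0.7, 0.4]
gt = [1, 0, 0, 1]
print("MAE:", mae(pred, gt))
print("F-measure:", f_measure(pred, gt))
```

Lower MAE is better, whereas higher F-measure is better, which is worth keeping in mind when reading the improvement percentages quoted above.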

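Translated title of the contribution	Camouflage Object Detection Based on Feature Fusion and Edge Detection
Original language	Chinese (Traditional)
Article number	0810002
Journal	Guangzi Xuebao/Acta Photonica Sinica
Volume	53
Issue number	8
DOI	https://doi.org/10.3788/gzxb20245308.0810002
Publication status	Published - Aug 2024
Externally published	Yes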

Keywords

Camouflaged image
Camouflaged object detection
Deep learning
Edge detection
Feature fusion

Access to document

10.3788/gzxb20245308.0810002

Other files and links

Link to publication in Scopus

Cite this

@article{b357d75dcc08467390da255083abdda1,

title = "基于特征聚合与边缘检测的伪装目标检测",

keywords = "Camouflaged image, Camouflaged object detection, Deep learning, Edge detection, Feature fusion",

author = "Cheng Ding and Xueqiong Bai and Yong Lv and Yang Liu and Chunhui Niu and Xin Liu",

year = "2024",

month = aug,

doi = "10.3788/gzxb20245308.0810002",

language = "Chinese (Traditional)",

volume = "53",

journal = "Guangzi Xuebao/Acta Photonica Sinica",

issn = "1004-4213",

publisher = "Chinese Optical Society",

number = "8",

}

TY - JOUR

T1 - 基于特征聚合与边缘检测的伪装目标检测

AU - Ding, Cheng

AU - Bai, Xueqiong

AU - Lv, Yong

AU - Liu, Yang

AU - Niu, Chunhui

AU - Liu, Xin

PY - 2024/8

Y1 - 2024/8

KW - Camouflaged image

KW - Camouflaged object detection

KW - Deep learning

KW - Edge detection

KW - Feature fusion

UR - http://www.scopus.com/inward/record.url?scp=85204900106&partnerID=8YFLogxK

U2 - 10.3788/gzxb20245308.0810002

DO - 10.3788/gzxb20245308.0810002

M3 - Article

AN - SCOPUS:85204900106

SN - 1004-4213

VL - 53

JO - Guangzi Xuebao/Acta Photonica Sinica

JF - Guangzi Xuebao/Acta Photonica Sinica

IS - 8

M1 - 0810002

ER -
