Dual-correlate optimized coarse-fine strategy for monocular laparoscopic videos feature matching via multilevel sequential coupling feature descriptor

Ziang Zhang; Hong Song; Jingfan Fan; Tianyu Fu; Qiang Li; Danni Ai; Deqaing Xiao; Jian Yang

doi:10.1016/j.compbiomed.2023.107890

Ziang Zhang, Hong Song^*, Jingfan Fan^*, Tianyu Fu, Qiang Li, Danni Ai, Deqaing Xiao, Jian Yang^*

^*Corresponding author for this work

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Feature matching of monocular laparoscopic videos is crucial for visualization enhancement in computer-assisted surgery, and the keys to conducting high-quality matches are accurate homography estimation, relative pose estimation, as well as sufficient matches and fast calculation. However, limited by various monocular laparoscopic imaging characteristics such as highlight noises, motion blur, texture interference and illumination variation, most existing feature matching methods face the challenges of producing high-quality matches efficiently and sufficiently. To overcome these limitations, this paper presents a novel sequential coupling feature descriptor to extract and express multilevel feature maps efficiently, and a dual-correlate optimized coarse-fine strategy to establish dense matches at the coarse level and adjust pixel-wise matches at the fine level. Firstly, a novel sequential coupling Swin Transformer layer is designed in the feature descriptor to learn and extract rich multilevel feature representations without increasing complexity. Then, a dual-correlate optimized coarse-fine strategy is proposed to match coarse feature sequences under low resolution, and the correlated fine feature sequences are optimized to refine pixel-wise matches based on coarse matching priors. Finally, the sequential coupling feature descriptor and dual-correlate optimization are merged into the Sequential Coupling Dual-Correlate Network (SeCo DC-Net) to produce high-quality matches. The evaluation is conducted on two public laparoscopic datasets, Scared and EndoSLAM, and the experimental results show that the proposed network outperforms state-of-the-art methods in homography estimation, relative pose estimation, reprojection error, number of matching pairs and inference runtime. The source code is publicly available at https://github.com/Iheckzza/FeatureMatching.

Original language	English
Article number	107890
Journal	Computers in Biology and Medicine
Volume	169
DOIs	https://doi.org/10.1016/j.compbiomed.2023.107890
Publication status	Published - Feb 2024

Keywords

Dual-correlate optimization
Feature description
Feature matching
Monocular laparoscopic videos
Sequential coupling
Vision transformer

Cite this

@article{fb19c4bbdb654a7ea524762cfb12b4e7,

title = "Dual-correlate optimized coarse-fine strategy for monocular laparoscopic videos feature matching via multilevel sequential coupling feature descriptor",

abstract = "Feature matching of monocular laparoscopic videos is crucial for visualization enhancement in computer-assisted surgery, and the keys to conducting high-quality matches are accurate homography estimation, relative pose estimation, as well as sufficient matches and fast calculation. However, limited by various monocular laparoscopic imaging characteristics such as highlight noises, motion blur, texture interference and illumination variation, most existing feature matching methods face the challenges of producing high-quality matches efficiently and sufficiently. To overcome these limitations, this paper presents a novel sequential coupling feature descriptor to extract and express multilevel feature maps efficiently, and a dual-correlate optimized coarse-fine strategy to establish dense matches at the coarse level and adjust pixel-wise matches at the fine level. Firstly, a novel sequential coupling Swin Transformer layer is designed in the feature descriptor to learn and extract rich multilevel feature representations without increasing complexity. Then, a dual-correlate optimized coarse-fine strategy is proposed to match coarse feature sequences under low resolution, and the correlated fine feature sequences are optimized to refine pixel-wise matches based on coarse matching priors. Finally, the sequential coupling feature descriptor and dual-correlate optimization are merged into the Sequential Coupling Dual-Correlate Network (SeCo DC-Net) to produce high-quality matches. The evaluation is conducted on two public laparoscopic datasets, Scared and EndoSLAM, and the experimental results show that the proposed network outperforms state-of-the-art methods in homography estimation, relative pose estimation, reprojection error, number of matching pairs and inference runtime. The source code is publicly available at https://github.com/Iheckzza/FeatureMatching.",

keywords = "Dual-correlate optimization, Feature description, Feature matching, Monocular laparoscopic videos, Sequential coupling, Vision transformer",

author = "Ziang Zhang and Hong Song and Jingfan Fan and Tianyu Fu and Qiang Li and Danni Ai and Deqaing Xiao and Jian Yang",

note = "Publisher Copyright: {\textcopyright} 2023",

year = "2024",

month = feb,

doi = "10.1016/j.compbiomed.2023.107890",

language = "English",

volume = "169",

journal = "Computers in Biology and Medicine",

issn = "0010-4825",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Dual-correlate optimized coarse-fine strategy for monocular laparoscopic videos feature matching via multilevel sequential coupling feature descriptor

AU - Zhang, Ziang

AU - Song, Hong

AU - Fan, Jingfan

AU - Fu, Tianyu

AU - Li, Qiang

AU - Ai, Danni

AU - Xiao, Deqaing

AU - Yang, Jian

PY - 2024/2

Y1 - 2024/2

N2 - Feature matching of monocular laparoscopic videos is crucial for visualization enhancement in computer-assisted surgery, and the keys to conducting high-quality matches are accurate homography estimation, relative pose estimation, as well as sufficient matches and fast calculation. However, limited by various monocular laparoscopic imaging characteristics such as highlight noises, motion blur, texture interference and illumination variation, most existing feature matching methods face the challenges of producing high-quality matches efficiently and sufficiently. To overcome these limitations, this paper presents a novel sequential coupling feature descriptor to extract and express multilevel feature maps efficiently, and a dual-correlate optimized coarse-fine strategy to establish dense matches at the coarse level and adjust pixel-wise matches at the fine level. Firstly, a novel sequential coupling Swin Transformer layer is designed in the feature descriptor to learn and extract rich multilevel feature representations without increasing complexity. Then, a dual-correlate optimized coarse-fine strategy is proposed to match coarse feature sequences under low resolution, and the correlated fine feature sequences are optimized to refine pixel-wise matches based on coarse matching priors. Finally, the sequential coupling feature descriptor and dual-correlate optimization are merged into the Sequential Coupling Dual-Correlate Network (SeCo DC-Net) to produce high-quality matches. The evaluation is conducted on two public laparoscopic datasets, Scared and EndoSLAM, and the experimental results show that the proposed network outperforms state-of-the-art methods in homography estimation, relative pose estimation, reprojection error, number of matching pairs and inference runtime. The source code is publicly available at https://github.com/Iheckzza/FeatureMatching.

KW - Dual-correlate optimization

KW - Feature description

KW - Feature matching

KW - Monocular laparoscopic videos

KW - Sequential coupling

KW - Vision transformer

UR - http://www.scopus.com/inward/record.url?scp=85181587821&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2023.107890

DO - 10.1016/j.compbiomed.2023.107890

M3 - Article

C2 - 38168646

AN - SCOPUS:85181587821

SN - 0010-4825

VL - 169

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

M1 - 107890

ER -
