Multi-scale deep learning and clustering-based tabletop object instance segmentation for robot manipulation

Zhihong Jiang; Yongrui Xue; Yan Zhao; Xiao Huang; Hui Li

doi:10.1177/17298806241278165

Zhihong Jiang, Yongrui Xue, Yan Zhao, Xiao Huang^*, Hui Li^*

^*Corresponding author for this work

School of Mechatronical Engineering

Research output: Contribution to journal › Article › peer-review

Abstract

3D object instance segmentation plays a vital role in various applications such as autonomous driving, robotics and virtual reality. However, tabletop scenes exhibit diverse object complexities and size variations. The challenge is to enhance the accuracy of segmenting these scenes for multiple object instances. This limitation directly impacts robots’ capabilities to effectively grasp and manipulate objects. In this paper, we propose a multi-scale deep learning and clustering-based approach for object instance segmentation in tabletop scenes. Our approach incorporates a multi-scale neighborhood feature sampling (MNFS) module specifically designed to extract local features, and a clustering algorithm to eliminate noise and preserve instance integrity. Furthermore, we combine the strength of both methods through ScoreNet and non-maximal suppression. We conducted extensive experiments on TO-Scene, the first large-scale dataset of 3D tabletop scenes, and observed an average mIoU improvement of approximately 4.07% compared to existing methods. This highlights the superior performance of our proposed method. In addition, we tested our algorithm on a real-scene robotics platform and showed that it has good performance and generalization capabilities to support future applications such as robot grasping.

Original language	English
Journal	International Journal of Advanced Robotic Systems
Volume	21
Issue number	5
DOIs	https://doi.org/10.1177/17298806241278165
Publication status	Published - 1 Sept 2024

Keywords

3D point cloud
clustering algorithm
deep learning
instance segmentation
robot grasping

Cite this

Jiang, Z., Xue, Y., Zhao, Y., Huang, X., & Li, H. (2024). Multi-scale deep learning and clustering-based tabletop object instance segmentation for robot manipulation. International Journal of Advanced Robotic Systems, 21(5). https://doi.org/10.1177/17298806241278165

@article{b02b9cfdb8e84e918f5f1da9a813bfe9,

title = "Multi-scale deep learning and clustering-based tabletop object instance segmentation for robot manipulation",

abstract = "3D object instance segmentation plays a vital role in various applications such as autonomous driving, robotics and virtual reality. However, tabletop scenes exhibit diverse object complexities and size variations. The challenge is to enhance the accuracy of segmenting these scenes for multiple object instances. This limitation directly impacts robots{\textquoteright} capabilities to effectively grasp and manipulate objects. In this paper, we propose a multi-scale deep learning and clustering-based approach for object instance segmentation in tabletop scenes. Our approach incorporates a multi-scale neighborhood feature sampling (MNFS) module specifically designed to extract local features, and a clustering algorithm to eliminate noise and preserve instance integrity. Furthermore, we combine the strength of both methods through ScoreNet and non-maximal suppression. We conducted extensive experiments on TO-Scene, the first large-scale dataset of 3D tabletop scenes, and observed an average mIoU improvement of approximately 4.07\% compared to existing methods. This highlights the superior performance of our proposed method. In addition, we tested our algorithm on a real-scene robotics platform and showed that it has good performance and generalization capabilities to support future applications such as robot grasping.",

keywords = "3D point cloud, clustering algorithm, deep learning, instance segmentation, robot grasping",

author = "Zhihong Jiang and Yongrui Xue and Yan Zhao and Xiao Huang and Hui Li",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2024.",

year = "2024",

month = sep,

day = "1",

doi = "10.1177/17298806241278165",

language = "English",

volume = "21",

journal = "International Journal of Advanced Robotic Systems",

issn = "1729-8806",

publisher = "SAGE Publications Inc.",

number = "5",

}

TY - JOUR

T1 - Multi-scale deep learning and clustering-based tabletop object instance segmentation for robot manipulation

AU - Jiang, Zhihong

AU - Xue, Yongrui

AU - Zhao, Yan

AU - Huang, Xiao

AU - Li, Hui

N1 - Publisher Copyright: © The Author(s) 2024.

PY - 2024/9/1

Y1 - 2024/9/1

N2 - 3D object instance segmentation plays a vital role in various applications such as autonomous driving, robotics and virtual reality. However, tabletop scenes exhibit diverse object complexities and size variations. The challenge is to enhance the accuracy of segmenting these scenes for multiple object instances. This limitation directly impacts robots’ capabilities to effectively grasp and manipulate objects. In this paper, we propose a multi-scale deep learning and clustering-based approach for object instance segmentation in tabletop scenes. Our approach incorporates a multi-scale neighborhood feature sampling (MNFS) module specifically designed to extract local features, and a clustering algorithm to eliminate noise and preserve instance integrity. Furthermore, we combine the strength of both methods through ScoreNet and non-maximal suppression. We conducted extensive experiments on TO-Scene, the first large-scale dataset of 3D tabletop scenes, and observed an average mIoU improvement of approximately 4.07% compared to existing methods. This highlights the superior performance of our proposed method. In addition, we tested our algorithm on a real-scene robotics platform and showed that it has good performance and generalization capabilities to support future applications such as robot grasping.

AB - 3D object instance segmentation plays a vital role in various applications such as autonomous driving, robotics and virtual reality. However, tabletop scenes exhibit diverse object complexities and size variations. The challenge is to enhance the accuracy of segmenting these scenes for multiple object instances. This limitation directly impacts robots’ capabilities to effectively grasp and manipulate objects. In this paper, we propose a multi-scale deep learning and clustering-based approach for object instance segmentation in tabletop scenes. Our approach incorporates a multi-scale neighborhood feature sampling (MNFS) module specifically designed to extract local features, and a clustering algorithm to eliminate noise and preserve instance integrity. Furthermore, we combine the strength of both methods through ScoreNet and non-maximal suppression. We conducted extensive experiments on TO-Scene, the first large-scale dataset of 3D tabletop scenes, and observed an average mIoU improvement of approximately 4.07% compared to existing methods. This highlights the superior performance of our proposed method. In addition, we tested our algorithm on a real-scene robotics platform and showed that it has good performance and generalization capabilities to support future applications such as robot grasping.

KW - 3D point cloud

KW - clustering algorithm

KW - deep learning

KW - instance segmentation

KW - robot grasping

UR - http://www.scopus.com/inward/record.url?scp=85204712119&partnerID=8YFLogxK

U2 - 10.1177/17298806241278165

DO - 10.1177/17298806241278165

M3 - Article

AN - SCOPUS:85204712119

SN - 1729-8806

VL - 21

JO - International Journal of Advanced Robotic Systems

JF - International Journal of Advanced Robotic Systems

IS - 5

ER -
