Abstract
Pushing and grasping (PG) are crucial skills for intelligent robots, enabling them to perform complex grasping tasks in various scenarios. Existing PG methods can be categorized into single-stage and multistage approaches: single-stage methods are faster but less accurate, while multistage methods offer high accuracy at the expense of time efficiency. To address this issue, a novel end-to-end PG method called the efficient PG network (EPGNet) is proposed in this article, which achieves high accuracy and efficiency simultaneously. To optimize performance with fewer parameters, EfficientNet-B0 is used as the backbone of EPGNet. In addition, a novel cross-fusion module is introduced to enhance network performance in robotic PG tasks. This module fuses local and global features, helping the network handle objects of varying sizes in different scenes. EPGNet consists of two branches that predict pushing and grasping actions, respectively. Both branches are trained simultaneously within a $Q$-learning framework, with training data collected through trial and error as the robot performs PG actions. To bridge the gap between simulation and reality, a dedicated PG dataset is proposed, and a YOLACT network is trained on it for object detection and segmentation. A comprehensive set of experiments is conducted in simulated environments and real-world scenarios. The results demonstrate that EPGNet outperforms single-stage methods and offers competitive performance compared to multistage methods, while using fewer parameters. A video is available at https://youtu.be/HNKJjQH0MPc.
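The abstract gives only a high-level description of the architecture, so the following is a minimal sketch of a two-branch, pixel-wise Q-network of the kind it describes. Everything beyond what the abstract states is an assumption for illustration: the 4-channel RGB-D heightmap input, the `CrossFusion` placeholder for the paper's cross-fusion module, the 1x1 Q-value heads, and all layer sizes are hypothetical, not the authors' implementation.

```python
# Hedged sketch of a two-branch pixel-wise Q-network in the spirit of EPGNet.
# Assumptions (not from the paper): the input is a 4-channel RGB-D heightmap,
# the cross-fusion module is approximated by mixing local features with a
# globally pooled context vector, and each branch outputs a per-pixel Q map.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import efficientnet_b0


class CrossFusion(nn.Module):
    """Placeholder for the paper's cross-fusion module (assumed design):
    concatenates local feature maps with broadcast global context."""

    def __init__(self, channels: int):
        super().__init__()
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = F.adaptive_avg_pool2d(x, 1).expand_as(x)  # global context
        return F.relu(self.mix(torch.cat([x, g], dim=1)))


class EPGNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = efficientnet_b0(weights=None)
        # Accept a 4-channel RGB-D heightmap instead of 3-channel RGB.
        backbone.features[0][0] = nn.Conv2d(4, 32, 3, stride=2, padding=1, bias=False)
        self.encoder = backbone.features  # 1280-channel feature map
        self.fusion = CrossFusion(1280)
        # One Q-value head per action type (push, grasp).
        self.push_head = nn.Conv2d(1280, 1, kernel_size=1)
        self.grasp_head = nn.Conv2d(1280, 1, kernel_size=1)

    def forward(self, heightmap: torch.Tensor):
        f = self.fusion(self.encoder(heightmap))
        size = heightmap.shape[-2:]
        # Upsample to pixel-wise Q maps over the workspace.
        q_push = F.interpolate(self.push_head(f), size=size, mode="bilinear", align_corners=False)
        q_grasp = F.interpolate(self.grasp_head(f), size=size, mode="bilinear", align_corners=False)
        return q_push, q_grasp


if __name__ == "__main__":
    net = EPGNetSketch()
    q_push, q_grasp = net(torch.randn(1, 4, 224, 224))
    print(q_push.shape, q_grasp.shape)  # torch.Size([1, 1, 224, 224]) each
```

In a self-supervised $Q$-learning loop of the kind the abstract describes, the pixel with the highest value across the two Q maps would select both the action type and its location, and the observed grasp or push outcome would supply the reward for the temporal-difference update; the exact reward shaping and training schedule used in the paper are not given in this record.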
Original language | English |
---|---|
Pages (from-to) | 1-14 |
Number of pages | 14 |
Journal | IEEE Transactions on Cybernetics |
DOI | |
Publication status | Accepted/In press - 2024 |