Abstract
Transparent objects are commonly found in everyday life and industrial production. Unlike opaque objects, transparent objects are not easily identifiable in RGB images and often require depth information to determine their position in a scene. However, due to environmental factors such as reflection and refraction, the depth information captured for transparent objects is often inaccurate. This makes it difficult for robots to grasp transparent objects, as incorrect depth information can leave the robot unable to predict a grasping pose, or cause it to predict one incorrectly. It is therefore necessary to complete the depth information for transparent objects. Previous depth completion methods for transparent objects often struggle to achieve both accuracy and real-time performance. To address this, we propose TCRNet, a transparent object depth completion network based on a cascade refinement structure that balances accuracy with real-time performance. First, the network incorporates a cascade refinement structure in the decoding stage that refines features multiple times, improving the accuracy of the completed depth. Second, an attention module is designed to adjust the extracted features, enabling the network to focus on depth features in transparent object regions. Finally, a transformer-based error module at the network's output stage predicts and corrects the error between the predicted depth image and the ground truth. TCRNet is trained and tested on three datasets: ClearGrasp, Omniverse Object, and TransCG, and outperforms previous methods on all three. Furthermore, TCRNet is combined with existing grasp detection methods to conduct grasping experiments on transparent objects using a real Baxter robot.
<italic>Note to Practitioners</italic>—With the development of RGB-D camera technology, RGB-D cameras are now widely used in scenarios such as industrial production, autonomous driving, and robotic grasping. However, when the camera faces transparent or highly reflective objects, the depth information it captures is often inaccurate, which can lead to downstream failures. It is therefore necessary to repair and complete such depth images to obtain an accurate understanding of the scene's depth. In recent years, with the advancement of deep learning, learning-based depth image processing and restoration techniques have been widely applied. In this paper, we propose a high-accuracy network for repairing the depth images of transparent objects, which can accurately restore and estimate their depth information in various scenarios. Experimental results further demonstrate that the proposed method generalizes well to previously unseen scenes.
| Original language | English |
|---|---|
| Pages (from-to) | 1-20 |
| Number of pages | 20 |
| Journal | IEEE Transactions on Automation Science and Engineering |
| DOIs | |
| Publication status | Accepted/In press - 2024 |
Keywords
- Cameras
- Depth completion
- Electronic mail
- Feature extraction
- Grasping
- Real-time systems
- Robots
- Transformers
- cascade refinement
- robotic grasp
- transparent objects