Abstract
In open-world robotic manipulation, language-guided model-free grasping has attracted increasing attention. However, existing approaches often overlook the geometric structure of target objects, which limits the effectiveness of downstream tasks such as manipulation and placement. To address this limitation, a novel method called Language-Guided Grasping via Primitive Fitting is proposed. The approach integrates language instructions with multimodal perception and uses structured geometric modeling to improve the semantic interpretability and downstream usability of the grasp. Specifically, the user-specified object is first localized in 2D images and depth data through multimodal understanding. Primitive fitting is then performed on the object's point cloud using basic geometric shapes (e.g., cuboids, ellipsoids, truncated cones) to extract its approximate size and structural features. Based on this geometric information, a grasp pose generation strategy guided by semantic geometry is defined, and modules for grasp feasibility filtering and task-oriented optimization are introduced to select the optimal grasp pose. The method is validated in complex real-world environments, achieving grasp success rates of 95% in structured scenes and 90% in cluttered scenes. Geometric fitting enhances post-grasp predictability and semantic consistency, enabling better generalization and planning.
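The primitive-fitting step can be illustrated with a minimal sketch for the cuboid case. This is not the authors' implementation: it assumes a simple PCA-based oriented box fit and a parallel-jaw gripper, and the names `fit_cuboid_pca`, `choose_closing_axis`, and `max_gripper_width` are hypothetical.

```python
import numpy as np


def fit_cuboid_pca(points):
    """Fit an oriented cuboid to an (N, 3) point cloud using PCA axes.

    Illustrative assumption: the object's principal axes are a reasonable
    proxy for the cuboid's edges.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Eigenvectors of the covariance matrix are the principal axes;
    # eigh returns them ordered by increasing eigenvalue (variance).
    _, eigvecs = np.linalg.eigh(np.cov(centered.T))
    axes = eigvecs.T  # each row is one unit axis
    projected = centered @ axes.T  # coordinates in the cuboid frame
    mins, maxs = projected.min(axis=0), projected.max(axis=0)
    extents = maxs - mins  # edge lengths along each axis
    # Move from the centroid to the geometric middle of the box.
    box_center = centroid + ((mins + maxs) / 2) @ axes
    return box_center, axes, extents


def choose_closing_axis(axes, extents, max_gripper_width):
    """Pick the shortest cuboid edge as the gripper closing direction.

    Returns (axis, width), or None if even the shortest edge is wider
    than the gripper opening (a crude feasibility filter).
    """
    i = int(np.argmin(extents))
    if extents[i] > max_gripper_width:
        return None
    return axes[i], float(extents[i])


if __name__ == "__main__":
    # Synthetic box of 0.20 x 0.10 x 0.04 m, rotated 30 degrees about z.
    g = np.linspace(-0.5, 0.5, 11)
    grid = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1).reshape(-1, 3)
    box = grid * np.array([0.20, 0.10, 0.04])
    t = np.deg2rad(30)
    rot = np.array([[np.cos(t), -np.sin(t), 0], [np.sin(t), np.cos(t), 0], [0, 0, 1]])
    cloud = box @ rot.T + np.array([0.5, 0.1, 0.3])
    center, axes, extents = fit_cuboid_pca(cloud)
    print("center:", center, "extents:", extents)
    print("grasp:", choose_closing_axis(axes, extents, max_gripper_width=0.08))
```

The fitted extents give the approximate object size, and the shortest edge serves as a simple stand-in for a geometry-guided choice of grasp direction. Ellipsoids and truncated cones would need their own fitting routines.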
| Original language | English |
|---|---|
| Journal | Advanced Intelligent Systems |
| DOIs | |
| Publication status | Accepted/In press - 2025 |
Keywords
- robot grasping
- robotics
- shape fitting