Abstract
In this article, we address the challenging problem of robot scene recognition (SR) under uncertain view and position in unknown domestic environments. Inspired by active vision, we propose an active scene recognition (ASR) method that integrates active view changing, based on a Markov decision process, with SR based on multiview images. We design a deep Q-learning-based action model that generates suitable movement actions, adjusting the robot's observation to acquire multiview images that benefit SR. To handle these scene images, we introduce a multiview SR model. This model includes a scene score model (SSM) that rates each image and a scene prediction module (SPM) that determines the SR result and stops actions automatically to keep SR efficient. To train the recognition model, we devise a method for generating multiview scene images, which creates ample training data from existing scene datasets without time-consuming manual image capture. We conducted comparative experiments and ablation studies in numerous simulated domestic environments to evaluate the ASR method extensively. The results indicate that our method surpasses current SR methods in accuracy and efficiency. Furthermore, SR experiments with a TurtleBot 4 robot in a real-world domestic environment validate the effectiveness of our method.
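The observe, score, predict, and move cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The action names, the callables `observe`, `q_function`, `score_view`, and `act`, the averaging used to fuse per-view scores, and the confidence threshold used as the stopping rule are all assumptions made for illustration; in the article, the SSM, the SPM, and the action model are learned components whose details are not given here.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical discrete action set; the article does not list its actions here.
ACTIONS = ("turn_left", "turn_right", "move_forward")


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice over Q-values, the usual deep Q-learning rule."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit


def fuse_scores(view_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-view class scores (an assumed stand-in for the SPM)."""
    n = len(view_scores)
    return {label: sum(s[label] for s in view_scores) / n for label in view_scores[0]}


def active_scene_recognition(
    observe: Callable[[], object],
    q_function: Callable[[object], Sequence[float]],
    score_view: Callable[[object], Dict[str, float]],
    act: Callable[[str], None],
    max_steps: int = 5,
    stop_threshold: float = 0.8,
    epsilon: float = 0.0,
    seed: int = 0,
) -> Tuple[str, int]:
    """Collect views until the fused confidence reaches the threshold."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    rng = random.Random(seed)
    view_scores: List[Dict[str, float]] = []
    label = ""
    for _ in range(max_steps):
        image = observe()
        view_scores.append(score_view(image))  # role of the scene score model
        fused = fuse_scores(view_scores)       # role of the scene prediction module
        label, confidence = max(fused.items(), key=lambda kv: kv[1])
        if confidence >= stop_threshold:       # assumed automatic stopping rule
            break
        act(ACTIONS[select_action(q_function(image), epsilon, rng)])
    return label, len(view_scores)


# Toy usage: two pre-scored views; any movement simply advances to the next view.
views = [{"kitchen": 0.5, "bedroom": 0.5}, {"kitchen": 0.9, "bedroom": 0.1}]
cursor = {"i": 0}
label, n_views = active_scene_recognition(
    observe=lambda: views[cursor["i"]],
    q_function=lambda image: [0.1, 0.7, 0.2],
    score_view=lambda image: image,
    act=lambda action: cursor.update(i=min(cursor["i"] + 1, len(views) - 1)),
    stop_threshold=0.65,
)
print(label, n_views)  # kitchen 2
```

In the toy run, the first view is ambiguous, so the loop moves once; the second view raises the fused kitchen score above the threshold and the loop stops without using the remaining step budget.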
| Original language | English |
|---|---|
| Pages (from-to) | 9591-9603 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Systems, Man, and Cybernetics: Systems |
| Volume | 55 |
| Issue number | 12 |
| Publication status | Published - 2025 |
| Externally published | Yes |
Keywords
- Data generation
- domestic robot
- multi-input recognition model
- robot active vision
- scene recognition (SR)