Abstract
Visual focus of attention recognition is usually based on head pose estimation. However, in real applications, accurate head pose estimation is difficult because of large pose variations, low-resolution images, and varying illumination. To address this problem, we propose a dynamic Bayesian network model that infers the visual focus of attention. The head pose is not computed explicitly; instead, it is represented by a similarity vector that gives the likelihoods of multiple face pose clusters. The model encodes the probabilistic relations among multiple foci of attention, multiple user locations, and faces captured by multiple cameras. Data were collected in a prototype ambient kitchen, and the results show that the model is effective.
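The inference idea can be illustrated with a minimal sketch, which is not the paper's actual model. It assumes a single camera, ignores user location, and treats the similarity vector as soft evidence over pose clusters. The names `forward_filter`, `emission`, `transition`, and `prior` are hypothetical, and the tables passed in would have to be learned or chosen by hand.

```python
def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("cannot normalize: values sum to zero")
    return [v / total for v in values]


def forward_filter(similarities, emission, transition, prior):
    """Recursive Bayesian filtering over discrete foci of attention.

    similarities: one similarity vector per frame, each of length K
                  (likelihoods of K face pose clusters).
    emission:     F x K table, assumed P(pose cluster k | focus f).
    transition:   F x F table, assumed P(focus j at t | focus i at t-1).
    prior:        length-F initial distribution over foci.
    Returns one posterior distribution over the F foci per frame.
    """
    belief = normalize(prior)
    n = len(belief)
    history = []
    for t, sim in enumerate(similarities):
        if t > 0:
            # Prediction step: propagate the belief through the transition model.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Soft-evidence likelihood (an assumption of this sketch):
        # weight each focus by how well its expected pose clusters
        # match the observed similarity vector.
        likelihood = [
            sum(e * s for e, s in zip(emission[f], sim)) for f in range(n)
        ]
        # Update step: combine prediction and evidence, then renormalize.
        belief = normalize([b * l for b, l in zip(belief, likelihood)])
        history.append(belief)
    return history
```

Calling `forward_filter` with a sequence of similarity vectors returns a posterior over the foci for each frame, and the most probable focus is the index of the largest entry. Under the same assumptions, several cameras could be handled by multiplying their per-focus likelihoods before the update step.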
Original language | English |
---|---|
Pages (from-to) | 140-146 |
Number of pages | 7 |
Journal | Tien Tzu Hsueh Pao/Acta Electronica Sinica |
Volume | 39 |
Issue number | 3 A |
Publication status | Published - Mar 2011 |
Externally published | Yes |
Keywords
- Dynamic Bayesian network
- The ambient kitchen
- Visual focus of attention recognition