Abstract
Visual focus of attention recognition is usually based on head pose estimation. In real applications, however, it is difficult to estimate the head pose accurately because of large pose variations, low-resolution images and varying illumination. To address this problem, we propose a dynamic Bayesian network model to infer the visual focus of attention. The head pose is not explicitly computed but is instead measured by a similarity vector that represents the likelihoods of multiple face pose clusters. The model encodes the probabilistic relations among multiple foci of attention, multiple user locations and faces captured by multiple cameras. Data were collected in a prototype ambient kitchen, and the results show that the model is effective.
Original language | English |
---|---|
Pages (from-to) | 140-146 |
Number of pages | 7 |
Journal | Tien Tzu Hsueh Pao/Acta Electronica Sinica |
Volume | 39 |
Issue | 3 A |
Publication status | Published - Mar 2011 |
Externally published | Yes |