TY - GEN
T1 - Cluster-Phys
T2 - 32nd ACM International Conference on Multimedia, MM 2024
AU - Qian, Wei
AU - Li, Kun
AU - Guo, Dan
AU - Hu, Bin
AU - Wang, Meng
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/10/28
Y1 - 2024/10/28
AB - Remote photoplethysmography (rPPG) measurement aims to estimate physiological signals by analyzing subtle skin color changes induced by heartbeats in facial videos. Existing methods primarily rely on the fundamental video frame features or vanilla facial ROI (region of interest) features. Recognizing the varying light absorption and reactions of different facial regions over time, we adopt a new perspective to conduct a more fine-grained exploration of the key clues present in different facial regions within each frame and across temporal frames. Concretely, we propose a novel clustering-driven remote physiological measurement framework called Cluster-Phys, which employs a facial ROI prototypical clustering module to adaptively cluster the representative facial ROI features as facial prototypes and then update facial prototypes with highly semantic correlated base ROI features. In this way, our approach can mine facial clues from a more compact and informative prototype level rather than the conventional video/ROI level. Furthermore, we also propose a spatial-temporal prototype interaction module to learn facial prototype correlation from both spatial (across prototypes) and temporal (within prototype) perspectives. Extensive experiments are conducted on both intra-dataset and cross-dataset tests. The results show that our Cluster-Phys achieves significant performance improvement with less computation consumption. The source code will be available at https://github.com/VUT-HFUT/ClusterPhys.
KW - facial videos
KW - physiological measurement
KW - prototypical clustering
KW - remote photoplethysmography
UR - http://www.scopus.com/inward/record.url?scp=85209824783&partnerID=8YFLogxK
U2 - 10.1145/3664647.3680670
DO - 10.1145/3664647.3680670
M3 - Conference contribution
AN - SCOPUS:85209824783
T3 - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
SP - 330
EP - 339
BT - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
Y2 - 28 October 2024 through 1 November 2024
ER -