TY - JOUR
T1 - Facial-Based Heterogeneous Graph Representation Learning for Autism Spectrum Disorder Detection
AU - Tao, Yongfeng
AU - Su, Tongxu
AU - Yang, Minqiang
AU - Ni, Minghui
AU - She, Yingying
AU - Zheng, Weihao
AU - Hu, Bin
N1 - Publisher Copyright: © 2010-2012 IEEE.
PY - 2026
Y1 - 2026
N2 - Autism spectrum disorder (ASD) is a neurodevelopmental disorder whose early manifestations may include atypical facial behaviors. Prior facial-based ASD studies often focus on localized static cues or short-range temporal variations, while under-exploring the joint modeling of temporal dynamics and non-local semantic consistency across facial states. In this work, we propose a graph representation learning framework consisting of a Graph Construction module (GC) and a heterogeneous graph neural network, HGMoE. The GC module builds a temporal–semantic heterogeneous graph from facial videos, where video frames are treated as nodes and heterogeneous edges encode both local temporal continuity and semantic similarity across frames. To better model heterogeneous relations and capture diverse graph signals, HGMoE integrates a Mixture-of-Experts (MoE) architecture with relation-aware edge attention. Specifically, the MoE dynamically routes messages to specialized experts, while the edge attention mechanism explicitly leverages edge types and initial semantic weights to modulate message passing across relations. Experiments on our recruited dataset achieve a classification accuracy of 97.4%, demonstrating the potential of the proposed framework for supporting facial-based ASD screening.
KW - Autism spectrum disorder
KW - facial expressions
KW - graph neural network
KW - heterogeneous graph
KW - mixture of experts
UR - https://www.scopus.com/pages/publications/105039098366
U2 - 10.1109/TAFFC.2026.3692785
DO - 10.1109/TAFFC.2026.3692785
M3 - Article
AN - SCOPUS:105039098366
SN - 1949-3045
JO - IEEE Transactions on Affective Computing
JF - IEEE Transactions on Affective Computing
ER -