TY - JOUR
T1 - An Optimal-Transport-Based Multimodal Big Data Clustering
AU - Yang, Zheng
AU - Shi, Chongyang
AU - Guan, Ying
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/2
Y1 - 2025/2
N2 - Multimodal clustering achieves outstanding performance in various applications by aggregating information from heterogeneous devices. However, previous methods rely on strong-notion distances to fuse crossmodal complementary knowledge, established on a fragile assumption about the existence of a ubiquitous non-negligible intersection between heterogeneous manifolds of modalities. Due to this unstable theoretical basis, previous methods are essentially challenged by limited performance on general multimodal data. To address this challenge, an optimal-transport-based multimodal clustering (OTMC) method is defined as the optimal transport (OT) from multimodal data distributions to clustering distributions, which leverages a weak-topology measure to capture complementary knowledge with clear discriminative structures. OTMC consists of a modality-specific OT delivering private structures and a modality-common OT delivering shared structures, which transports category structures scattered in manifolds of each modality and all modalities to common prototypes, respectively. Furthermore, variational solutions to OTMC are derived by matching the data-prototype joint distribution, which induces the multimodal OT clustering network, to capture discriminative structures. Finally, the experimental results from four real-world datasets demonstrate the superiority of OTMC, helped by never relying on the phantom of heterogeneous manifold intersections. In particular, OTMC obtains 92.15% ACC, 84.96% NMI, and 83.35% ARI on Handwritten, improving by 2.25%, 2.82%, and 3.28%, respectively.
KW - big data
KW - deep multimodal clustering
KW - heterogeneous manifold
KW - multimodal data
KW - optimal transport
KW - variational solution
UR - http://www.scopus.com/inward/record.url?scp=85218912963&partnerID=8YFLogxK
U2 - 10.3390/electronics14040666
DO - 10.3390/electronics14040666
M3 - Article
AN - SCOPUS:85218912963
SN - 2079-9292
VL - 14
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 4
M1 - 666
ER -