An Optimal-Transport-Based Multimodal Big Data Clustering

Zheng Yang*, Chongyang Shi, Ying Guan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal clustering achieves outstanding performance in various applications by aggregating information from heterogeneous devices. However, previous methods rely on strong-notion distances to fuse cross-modal complementary knowledge, which rests on a fragile assumption: that the heterogeneous manifolds of different modalities always share a non-negligible intersection. Because of this unstable theoretical basis, previous methods achieve only limited performance on general multimodal data. To address this challenge, an optimal-transport-based multimodal clustering (OTMC) method is proposed. It formulates clustering as the optimal transport (OT) from multimodal data distributions to clustering distributions and leverages a weak-topology measure to capture complementary knowledge with clear discriminative structures. OTMC consists of a modality-specific OT, which delivers private structures, and a modality-common OT, which delivers shared structures; these transport the category structures scattered in the manifold of each modality and in the manifolds of all modalities, respectively, to common prototypes. Furthermore, variational solutions to OTMC are derived by matching the data-prototype joint distribution, which induces the multimodal OT clustering network that captures discriminative structures. Finally, experimental results on four real-world datasets demonstrate the superiority of OTMC, which does not rely on the assumed intersection of heterogeneous manifolds. In particular, OTMC obtains 92.15% ACC, 84.96% NMI, and 83.35% ARI on Handwritten, improvements of 2.25%, 2.82%, and 3.28%, respectively.
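
The general idea of transporting data distributions onto cluster prototypes can be illustrated with a small sketch. This is not the variational OTMC network described in the abstract; it is a generic entropic OT (Sinkhorn) assignment, assuming uniform mass on samples and prototypes. The function names (`sq_cost`, `sinkhorn_plan`, `multimodal_ot_labels`), the regularisation strength `eps`, the choice of prototypes from sample indices, and the averaging of the transport plans are all assumptions made for illustration.

```python
import numpy as np

def sq_cost(x, protos):
    """Squared Euclidean cost between samples and prototypes, scaled to [0, 1]."""
    d = ((x[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d / d.max()

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropic OT plan from n samples to k prototypes (uniform marginals assumed)."""
    n, k = cost.shape
    a = np.full(n, 1.0 / n)   # mass on each sample
    b = np.full(k, 1.0 / k)   # mass on each prototype (balanced clusters assumed)
    kernel = np.exp(-cost / eps)
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (kernel @ v)      # rescale rows towards marginal a
        v = b / (kernel.T @ u)    # rescale columns to match marginal b
    return u[:, None] * kernel * v[None, :]

def multimodal_ot_labels(views, proto_idx, eps=0.1):
    """Fuse one OT plan per modality with one plan on the concatenated features.

    Prototypes are taken from the same sample indices in every view, so that
    prototype j refers to the same cluster in each plan (an illustrative choice).
    """
    # Modality-specific plans: each view is transported to its own prototypes.
    plans = [sinkhorn_plan(sq_cost(x, x[proto_idx]), eps) for x in views]
    # Modality-common plan: all views together, transported to joint prototypes.
    joint = np.concatenate(views, axis=1)
    plans.append(sinkhorn_plan(sq_cost(joint, joint[proto_idx]), eps))
    fused = np.mean(plans, axis=0)   # simple average, hypothetical fusion rule
    return fused.argmax(axis=1), fused

# Toy data: two views of 60 samples drawn around 3 well-separated centres.
rng = np.random.default_rng(0)
k, per = 3, 20
truth = np.repeat(np.arange(k), per)
centers1 = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
centers2 = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 4.0], [0.0, 4.0, 4.0]])
view1 = centers1[truth] + 0.1 * rng.standard_normal((k * per, 2))
view2 = centers2[truth] + 0.1 * rng.standard_normal((k * per, 3))

# One seed sample per cluster serves as the initial prototype.
proto_idx = np.array([0, per, 2 * per])
labels, fused = multimodal_ot_labels([view1, view2], proto_idx)
```

On this toy data `labels` can be compared with `truth` directly, because the prototypes are seeded from one known sample per cluster. A real pipeline would learn the prototypes and would not assume balanced clusters unless the data justify it.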

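Original language: English
Article number: 666
Journal: Electronics (Switzerland)
Volume: 14
Issue number: 4
DOI: https://doi.org/10.3390/electronics14040666
Publication status: Published - Feb 2025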

Cite this

Yang, Z., Shi, C., & Guan, Y. (2025). An Optimal-Transport-Based Multimodal Big Data Clustering. Electronics (Switzerland), 14(4), Article 666. https://doi.org/10.3390/electronics14040666