Enhancing Robustness and Generalization Capability for Multimodal Recommender Systems via Sharpness-Aware Minimization

  • Jinfeng Xu
  • Zheyu Chen
  • Jinze Li
  • Shuo Yang
  • Wei Wang
  • Xiping Hu
  • Raymond Chi Wing Wong
  • Edith C.H. Ngai*
  • *Corresponding author for this work
  • The University of Hong Kong
  • Hong Kong Polytechnic University
  • Shenzhen MSU-BIT University
  • Beijing Institute of Technology
  • Hong Kong University of Science and Technology

Research output: Contribution to journal, article, peer-reviewed

Abstract

Multimodal recommender systems utilize a variety of information types to model user preferences and item properties, aiding in the discovery of items that align with user interests. Rich multimodal information alleviates inherent challenges in recommendation systems, such as data sparsity and cold start problems. However, multimodal information further introduces challenges in terms of robustness and generalization capability. Regarding robustness, multimodal information magnifies the risks associated with information adjustment and inherent noise, posing severe challenges to the stability of recommendation models. For generalization capability, multimodal recommender systems are more complex and difficult to train, making it harder for models to handle data beyond the training set, posing significant challenges to model generalization capability. In this paper, we analyze the shortcomings of existing robustness and generalization capability enhancement strategies in the multimodal recommendation field. We propose a sharpness-aware minimization strategy focused on batch data (BSAM), which effectively enhances the robustness and generalization capability of multimodal recommender systems without requiring extensive hyper-parameter tuning. Furthermore, we introduce a mixed loss variant strategy (BSAM+), which accelerates convergence and achieves remarkable performance improvement. We provide rigorous theoretical proofs and conduct experiments with nine advanced models on five widely used datasets to validate the superiority of our strategies. Moreover, our strategies can be integrated with existing robust training and data augmentation strategies to achieve further improvement, providing a superior training paradigm for multimodal recommendations.
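For orientation, the following is a minimal sketch of one generic sharpness-aware minimization (SAM) update computed on a single mini-batch. It is not the BSAM or BSAM+ procedure from the paper, whose details the abstract does not give. The toy linear model, the squared-error loss, and the names `loss_and_grad`, `sam_step`, `lr`, and `rho` are all assumptions made for illustration.

```python
import math


def loss_and_grad(w, batch):
    """Mean squared error of a toy linear model on one mini-batch, plus its gradient."""
    n = len(batch)
    loss = 0.0
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += err * err / n
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / n
    return loss, grad


def sam_step(w, batch, lr=0.1, rho=0.05):
    """One generic SAM update (illustrative only, not the paper's BSAM)."""
    _, g = loss_and_grad(w, batch)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Already at a stationary point of the batch loss; nothing to perturb.
        return list(w)
    # Ascent: move to the approximate worst-case point inside an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent: apply the gradient taken at the perturbed point to the original weights.
    _, g_adv = loss_and_grad(w_adv, batch)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


if __name__ == "__main__":
    # Hypothetical two-sample mini-batch of (features, target) pairs.
    batch = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0)]
    w = [0.0, 0.0]
    for _ in range(100):
        w = sam_step(w, batch)
    print(loss_and_grad(w, batch)[0])
```

Each update costs two gradient evaluations on the same batch: one to find the perturbation direction and one to compute the descent direction at the perturbed weights. The radius `rho` controls how far the neighborhood around the current weights is probed for a higher loss.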

Original language: English
Pages (from-to): 6406-6419
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 37
Issue number: 11
DOI
Publication status: Published - 2025
Externally published
