Abstract
Multimodal Aspect-Based Sentiment Analysis (MABSA) is challenging in data-heterologous settings, where images provide only weak or noisy context for textual aspects. Existing methods based on unconditional fusion or generic MLLM captions often suffer from granularity mismatch, hallucination, and irrelevant visual noise. We propose MADSC (Multimodal Aspect-aware Description with Similarity and Calibration), which strengthens aspect-aware grounding by refining generic captions into aspect-centric descriptions. MADSC uses a dual-similarity estimator to align aspects with caption objects through CLIP-based semantic compatibility and box-mediated visual grounding, and employs confidence calibration to gate unreliable visual cues during decoding. Experiments on Twitter-2015 and Twitter-2017 demonstrate state-of-the-art results on MATE, MABSA, and JMASA, confirming the effectiveness of aspect-aware refinement and calibrated alignment.
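The abstract names two mechanisms: a dual-similarity score that combines semantic compatibility with box-mediated grounding, and a calibrated confidence that gates visual cues. The sketch below only illustrates that general idea and is not the authors' implementation. The weighted sum, the sigmoid gate, and the names `dual_similarity`, `calibrated_gate`, `gate_visual_feature`, `alpha`, `temperature`, and `threshold` are all assumptions made for this example. Plain Python lists stand in for the CLIP text and region embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dual_similarity(aspect_emb, object_text_emb, box_region_emb, alpha=0.5):
    """Combine two aspect-to-object similarities (hypothetical formulation).

    semantic:  aspect text vs. caption-object text embedding
    grounding: aspect text vs. embedding of the object's image box
    alpha is an assumed mixing weight, not a value from the paper.
    """
    semantic = cosine(aspect_emb, object_text_emb)
    grounding = cosine(aspect_emb, box_region_emb)
    return alpha * semantic + (1.0 - alpha) * grounding


def calibrated_gate(score, temperature=0.1, threshold=0.5):
    """Map a similarity score to a confidence in (0, 1) with a tempered sigmoid.

    temperature and threshold are assumed hyperparameters for illustration.
    """
    z = (score - threshold) / temperature
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp well within range
    return 1.0 / (1.0 + math.exp(-z))


def gate_visual_feature(visual_feature, score, **gate_kwargs):
    """Scale a visual feature by its confidence so unreliable cues are damped."""
    g = calibrated_gate(score, **gate_kwargs)
    return [g * x for x in visual_feature]
```

In this sketch, an aspect whose caption object matches both semantically and visually scores near 1 and passes its visual feature almost unchanged, while a weakly matched object is scaled toward zero before it can influence decoding.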
| Original language | English |
|---|---|
| Article number | 113712 |
| Journal | Pattern Recognition |
| Volume | 179 |
| DOI | |
| Publication status | Published - Nov 2026 |
| Externally published | Yes |
| Title | MADSC: Aspect-aware description and calibrated alignment for unified Multimodal Aspect-Based Sentiment Analysis |