A novel multimodal retrieval model based on ELM

Yu Zhang*, Ye Yuan, Yishu Wang, Guoren Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

In this paper, we propose a novel multimodal retrieval model based on the Extreme Learning Machine (ELM). We exploit two multimedia modalities, image and text, to achieve multimodal retrieval. First, we employ probabilistic Latent Semantic Analysis (pLSA) to model the generative processes of texts and images separately, thereby obtaining appropriate representations of the images and of the texts. ELM is then used to learn the correlation between the image representations and the text representations, so that multimodal retrieval is performed by the learned single-hidden-layer feedforward neural networks (SLFNs). Additionally, binary classifiers are trained to improve the accuracy of the multimodal retrieval model. The model can easily be extended to other modalities, and extensive experimental results demonstrate its effectiveness and efficiency.
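The regression step described above can be illustrated with a minimal ELM sketch: random hidden weights and biases are fixed, and only the output weights are solved in closed form by least squares. The code below assumes pLSA topic vectors have already been extracted for each modality; all variable names, dimensions, and data here are illustrative placeholders, not taken from the paper.

```python
import numpy as np

def train_elm(X, T, n_hidden=200, rng=None):
    """Train a single-hidden-layer feedforward network (SLFN) the ELM way:
    random input weights and biases, output weights via least squares."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))  # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                # random hidden biases (never trained)
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)     # output weights via pseudoinverse solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Project inputs through the trained SLFN."""
    return np.tanh(X @ W + b) @ beta

# Illustrative usage: map 50-dimensional image topic vectors to 40-dimensional
# text topic vectors for 1000 hypothetical training pairs.
img_topics = np.random.rand(1000, 50)   # placeholder pLSA representations of images
txt_topics = np.random.rand(1000, 40)   # placeholder pLSA representations of texts
W, b, beta = train_elm(img_topics, txt_topics, n_hidden=200, rng=0)
pred_txt = predict_elm(img_topics, W, b, beta)  # image queries projected into text space
```

Retrieval can then proceed by nearest-neighbor search between the projected vectors and the stored text representations; the binary classifiers mentioned in the abstract would act as an additional filtering stage, the details of which are given in the paper itself.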

Original language: English
Pages (from-to): 65-77
Number of pages: 13
Journal: Neurocomputing
Volume: 277
DOIs
Publication status: Published - 14 Feb 2018
Externally published: Yes

Keywords

  • Extreme Learning Machine
  • Modality
  • Multimedia
  • Multimodal
  • Probabilistic Latent Semantic Analysis
  • Regression
  • Single hidden-layer feedforward neural networks
