TY - JOUR
T1 - Unsupervised Cross-Modal Hashing via Semantic Text Mining
AU - Tu, Rong Cheng
AU - Mao, Xian Ling
AU - Lin, Qinghong
AU - Ji, Wenjin
AU - Qin, Weize
AU - Wei, Wei
AU - Huang, Heyan
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Cross-modal hashing has been widely used in multimedia retrieval tasks due to its fast retrieval speed and low storage cost. Recently, many deep unsupervised cross-modal hashing methods have been proposed to deal with unlabeled datasets. These methods usually construct an instance similarity matrix by fusing the image and text modality-specific similarity matrices as the guiding information to train the hashing networks. However, most of them directly use cosine similarities between the bag-of-words (BoW) vectors of text datapoints to define the text modality-specific similarity matrix, which fails to mine the semantic similarity information contained in the text-modality datapoints and leads to a poor-quality instance similarity matrix. To tackle the aforementioned problem, in this paper, we propose a novel method called Unsupervised Cross-modal Hashing via Semantic Text Mining (UCHSTM). Specifically, UCHSTM first mines the correlations between the words of text datapoints. Then, UCHSTM constructs the text modality-specific similarity matrix for the training instances based on the mined correlations between their words. Next, UCHSTM fuses the image and text modality-specific similarity matrices into the final instance similarity matrix to guide the training of the hashing model. Furthermore, during the training of the hashing networks, a novel self-redefined-similarity loss is proposed to correct some wrongly defined similarities in the constructed instance similarity matrix, thereby further enhancing the retrieval performance. Extensive experiments on two widely used datasets show that the proposed UCHSTM outperforms state-of-the-art baselines on cross-modal retrieval tasks.
AB - Cross-modal hashing has been widely used in multimedia retrieval tasks due to its fast retrieval speed and low storage cost. Recently, many deep unsupervised cross-modal hashing methods have been proposed to deal with unlabeled datasets. These methods usually construct an instance similarity matrix by fusing the image and text modality-specific similarity matrices as the guiding information to train the hashing networks. However, most of them directly use cosine similarities between the bag-of-words (BoW) vectors of text datapoints to define the text modality-specific similarity matrix, which fails to mine the semantic similarity information contained in the text-modality datapoints and leads to a poor-quality instance similarity matrix. To tackle the aforementioned problem, in this paper, we propose a novel method called Unsupervised Cross-modal Hashing via Semantic Text Mining (UCHSTM). Specifically, UCHSTM first mines the correlations between the words of text datapoints. Then, UCHSTM constructs the text modality-specific similarity matrix for the training instances based on the mined correlations between their words. Next, UCHSTM fuses the image and text modality-specific similarity matrices into the final instance similarity matrix to guide the training of the hashing model. Furthermore, during the training of the hashing networks, a novel self-redefined-similarity loss is proposed to correct some wrongly defined similarities in the constructed instance similarity matrix, thereby further enhancing the retrieval performance. Extensive experiments on two widely used datasets show that the proposed UCHSTM outperforms state-of-the-art baselines on cross-modal retrieval tasks.
KW - Cross-modal retrieval
KW - deep unsupervised hashing
KW - self-redefined-similarity loss
KW - semantic text mining
UR - http://www.scopus.com/inward/record.url?scp=85149407617&partnerID=8YFLogxK
U2 - 10.1109/TMM.2023.3243608
DO - 10.1109/TMM.2023.3243608
M3 - Article
AN - SCOPUS:85149407617
SN - 1520-9210
VL - 25
SP - 8946
EP - 8957
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -