Deep Heterogeneous Multi-Task Metric Learning for Visual Recognition and Retrieval

Shikang Gan, Yong Luo, Yonggang Wen, Tongliang Liu, Han Hu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Citations (Scopus)

Abstract

Estimating the distance between data instances is a fundamental problem in many artificial intelligence algorithms and is critical in diverse multimedia applications. A major challenge is finding an appropriate distance function when the labeled data for a given task are insufficient. Multi-task metric learning (MTML) can alleviate this data deficiency by learning distance metrics for multiple tasks jointly and sharing information among the tasks. Recently, heterogeneous MTML (HMTML) has attracted much attention because it can handle multiple tasks with varied data representations. A major drawback of current HMTML approaches is that they learn only linear transformations to connect different domains. This is suboptimal because the correlations between domains may be very complex and highly nonlinear. To overcome this drawback, we propose a deep heterogeneous MTML (DHMTML) method, in which a nonlinear mapping is learned for each task using a deep neural network. The correlations between domains are exploited by sharing some parameters at the top layers of the different networks. More importantly, an auto-encoder scheme and an adversarial learning mechanism are incorporated to help exploit the feature correlations within and between tasks, and task-specific properties are preserved by learning additional task-specific layers together with the common layers. Experiments demonstrate that the proposed method consistently outperforms single-task deep metric learning algorithms and other HMTML approaches on several benchmark datasets.
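
The layer-sharing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TaskNetwork`, the helper functions, the layer sizes, the tanh activation, and the Euclidean distance are all assumptions chosen for illustration, and training, the auto-encoder scheme, and the adversarial mechanism are omitted. It shows only that two tasks with different input dimensions can each have a task-specific nonlinear layer while reusing one common top layer, so that distances are computed in a space of the same dimension.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, bias) with small random weights; hypothetical initialisation."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def apply_layer(layer, x):
    """Affine map followed by tanh, which makes the overall mapping nonlinear."""
    weights, bias = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


class TaskNetwork:
    """One network per task: a task-specific layer plus a shared top layer."""

    def __init__(self, input_dim, hidden_dim, shared_layer, rng):
        # Task-specific parameters absorb the task's own input dimension.
        self.specific = make_layer(input_dim, hidden_dim, rng)
        # The top layer is the same object in every task network.
        self.shared = shared_layer

    def embed(self, x):
        return apply_layer(self.shared, apply_layer(self.specific, x))

    def distance(self, a, b):
        """Euclidean distance between the embeddings of a and b (assumed metric)."""
        ea, eb = self.embed(a), self.embed(b)
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(ea, eb)))


rng = random.Random(0)
shared_top = make_layer(8, 4, rng)            # common layer: 8 -> 4
task_a = TaskNetwork(16, 8, shared_top, rng)  # task A has 16-dim features
task_b = TaskNetwork(10, 8, shared_top, rng)  # task B has 10-dim features

x1 = [rng.random() for _ in range(16)]
x2 = [rng.random() for _ in range(16)]
print(task_a.distance(x1, x2))
```

Because `shared_top` is a single object, any update to it would affect both task networks, which is the sense in which information is shared here; how the method described in the abstract actually trains and regularises these layers is not specified by this sketch.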

Original language: English
Title of host publication: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 1837-1845
Number of pages: 9
ISBN (Electronic): 9781450379885
DOI: https://doi.org/10.1145/3394171.3413574
Publication status: Published - 12 Oct 2020
Event: 28th ACM International Conference on Multimedia, MM 2020 - Virtual, Online, United States
Duration: 12 Oct 2020 to 16 Oct 2020

Publication series

Name: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

Conference

Conference: 28th ACM International Conference on Multimedia, MM 2020
Country/Territory: United States
City: Virtual, Online
Period: 12/10/20 to 16/10/20

Keywords

  • deep neural networks
  • heterogeneous
  • metric learning
  • multi-task
  • visual applications

Cite this

Gan, S., Luo, Y., Wen, Y., Liu, T., & Hu, H. (2020). Deep Heterogeneous Multi-Task Metric Learning for Visual Recognition and Retrieval. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 1837-1845). (MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3413574