Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal

Lei Zhu; Zhaojing Luo; Wei Wang; Meihui Zhang; Gang Chen; Kaiping Zheng

doi:10.1145/3474085.3475175

Lei Zhu, Zhaojing Luo^*, Wei Wang, Meihui Zhang, Gang Chen, Kaiping Zheng

^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Deep learning has made a tremendous impact on various applications in multimedia, such as media interpretation and multimodal retrieval. However, deep learning models usually require a large amount of labeled data to achieve satisfactory performance. In multimedia analysis, domain adaptation studies the problem of cross-domain knowledge transfer from a label-rich source domain to a label-scarce target domain, thus potentially alleviating the annotation requirement for deep learning models. However, we find that contemporary domain adaptation methods for cross-domain image understanding perform poorly when the source domain is noisy. Weakly Supervised Domain Adaptation (WSDA) studies the domain adaptation problem under the scenario where source data can be noisy. Prior WSDA methods remove noisy source data and align the marginal distribution across domains without considering the fine-grained semantic structure in the embedding space, and therefore suffer from class misalignment, e.g., features of cats in the target domain might be mapped near features of dogs in the source domain. In this paper, we propose a novel method, termed Noise Tolerant Domain Adaptation (NTDA), for WSDA. Specifically, we adopt the cluster assumption and learn clusters discriminatively with class prototypes (centroids) in the embedding space. We propose to leverage the location information of the data points in the embedding space and model this information with a Gaussian mixture model to identify noisy source data. We then design a network that incorporates the Gaussian mixture noise model as a sub-module for unsupervised noise removal, and propose a novel cluster-level adversarial adaptation method based on the Generative Adversarial Network (GAN) framework that aligns unlabeled target data with the less noisy class prototypes to map the semantic structure across domains. Finally, we devise a simple and effective algorithm to train the network end to end.
We conduct extensive experiments to evaluate the effectiveness of our method on both general images and medical images from COVID-19 and e-commerce datasets. The results show that our method significantly outperforms state-of-the-art WSDA methods.
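
The noise-identification idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes, purely for illustration, that each source sample's "location information" is reduced to a scalar distance from its class prototype, and that a two-component one-dimensional Gaussian mixture fitted by EM separates clean samples (small distances) from noisy ones (large distances). The function names, the synthetic distances, and the 0.5 threshold are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def noise_posteriors(distances, iters=100):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns, for each sample, the posterior probability of the component
    with the larger mean. Assumption (not from the paper): that component
    models noisy samples lying far from their class prototype.
    """
    lo, hi = min(distances), max(distances)
    means = [lo, hi]                              # initialise at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in distances:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = max(p[0] + p[1], 1e-300)      # guard against underflow
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(distances)
            means[k] = sum(r[k] * x for r, x in zip(resp, distances)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, distances)) / nk
            variances[k] = max(var, 1e-6)         # keep variance strictly positive
    far = 0 if means[0] > means[1] else 1
    return [r[far] for r in resp]


# Synthetic distances (hypothetical data): 80 clean samples near their
# prototype and 20 mislabeled samples far from it.
rng = random.Random(0)
clean = [abs(rng.gauss(1.0, 0.2)) for _ in range(80)]
noisy = [abs(rng.gauss(4.0, 0.5)) for _ in range(20)]
posteriors = noise_posteriors(clean + noisy)
flagged = [i for i, p in enumerate(posteriors) if p > 0.5]
print(len(flagged), "samples flagged as likely noisy")
```

In a full pipeline, samples with a high posterior would be down-weighted or dropped before the class prototypes are updated. How NTDA actually uses the mixture model inside the network is described in the paper itself, not here.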

Original language	English
Title of host publication	MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	3024-3033
Number of pages	10
ISBN (Electronic)	9781450386517
DOI	https://doi.org/10.1145/3474085.3475175
Publication status	Published - 17 Oct 2021
Event	29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China; Duration: 20 Oct 2021 → 24 Oct 2021

Publication series

Name	MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Conference

Conference	29th ACM International Conference on Multimedia, MM 2021
Country/Territory	China
City	Virtual, Online
Period	20/10/21 → 24/10/21

Cite this

Zhu, L., Luo, Z., Wang, W., Zhang, M., Chen, G., & Zheng, K. (2021). Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 3024-3033). (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3474085.3475175

@inproceedings{861850d6985f4f9f8e24fe5c564f62d4,

title = "Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal",

keywords = "adversarial learning, representation learning, weakly supervised domain adaptation",

author = "Lei Zhu and Zhaojing Luo and Wei Wang and Meihui Zhang and Gang Chen and Kaiping Zheng",

note = "Publisher Copyright: {\textcopyright} 2021 Owner/Author.; 29th ACM International Conference on Multimedia, MM 2021 ; Conference date: 20-10-2021 Through 24-10-2021",

year = "2021",

month = oct,

day = "17",

doi = "10.1145/3474085.3475175",

language = "English",

series = "MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "3024--3033",

booktitle = "MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia",

}

Zhu, L, Luo, Z, Wang, W, Zhang, M, Chen, G & Zheng, K 2021, Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal. in MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 3024-3033, 29th ACM International Conference on Multimedia, MM 2021, Virtual, Online, China, 20/10/21. https://doi.org/10.1145/3474085.3475175

Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal. / Zhu, Lei; Luo, Zhaojing; Wang, Wei et al.
MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2021. p. 3024-3033 (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia).

TY - GEN

T1 - Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal

AU - Zhu, Lei

AU - Luo, Zhaojing

AU - Wang, Wei

AU - Zhang, Meihui

AU - Chen, Gang

AU - Zheng, Kaiping

PY - 2021/10/17

Y1 - 2021/10/17

KW - adversarial learning

KW - representation learning

KW - weakly supervised domain adaptation

UR - http://www.scopus.com/inward/record.url?scp=85119328147&partnerID=8YFLogxK

U2 - 10.1145/3474085.3475175

DO - 10.1145/3474085.3475175

M3 - Conference contribution

AN - SCOPUS:85119328147

T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

SP - 3024

EP - 3033

BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

T2 - 29th ACM International Conference on Multimedia, MM 2021

Y2 - 20 October 2021 through 24 October 2021

ER -

Zhu L, Luo Z, Wang W, Zhang M, Chen G, Zheng K. Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, Inc. 2021. p. 3024-3033. (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia). doi: 10.1145/3474085.3475175
