Abstract
Cross-domain word representation aims to learn high-quality semantic representations in an under-resourced domain by leveraging information from a resource-rich domain. However, most existing methods mainly transfer the semantics of words common to both domains, ignoring the semantic relations among domain-specific words. In this paper, we propose a domain structure-based transfer learning method that learns cross-domain representations by exploiting the relations among domain-specific words. To this end, we first construct a semantic graph that captures the latent domain structure from domain-specific co-occurrence information. Then, during domain adaptation, beyond domain alignment, we employ Laplacian Eigenmaps to ensure that the domain structure is consistently preserved in the learned embedding space. As a result, the learned cross-domain word representations not only capture the semantics shared across domains but also maintain the latent domain structure. We conducted extensive experiments on two tasks, sentiment analysis and query expansion; the results demonstrate the effectiveness of our method for tasks in under-resourced domains.
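The abstract names two computational ingredients: a semantic graph built from domain-specific co-occurrence counts, and a Laplacian Eigenmaps penalty that keeps that structure intact during adaptation. The paper's own implementation is not reproduced here; the sketch below is a minimal NumPy illustration of the general technique, with hypothetical names (`cooccurrence_graph`, `laplacian_penalty`, `structure_step`) and a toy corpus standing in for target-domain data.

```python
import numpy as np

def cooccurrence_graph(corpus, vocab, window=5):
    """Build a symmetric co-occurrence weight matrix W over the
    domain-specific vocabulary; W encodes the latent domain structure."""
    idx = {w: i for i, w in enumerate(vocab)}
    W = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            if w not in idx:
                continue
            for c in sent[max(0, i - window): i + window + 1]:
                if c in idx and c != w:
                    W[idx[w], idx[c]] += 1.0
    return W

def laplacian_penalty(E, W):
    """Laplacian Eigenmaps term tr(E^T L E) with L = D - W: small when
    strongly co-occurring words sit close together in embedding space."""
    L = np.diag(W.sum(axis=1)) - W
    return np.trace(E.T @ L @ E)

def structure_step(E, W, lr=0.01):
    """One gradient step on the structure term alone (gradient is 2 L E);
    in the full method this would be combined with a domain-alignment loss."""
    L = np.diag(W.sum(axis=1)) - W
    return E - lr * 2.0 * (L @ E)

# Toy usage with a hypothetical target-domain corpus.
vocab = ["battery", "screen", "charger", "plot"]
corpus = [["battery", "charger", "screen"], ["screen", "battery"], ["plot"]]
W = cooccurrence_graph(corpus, vocab, window=2)
E = np.random.randn(len(vocab), 8)      # 8-dim toy embeddings
print(laplacian_penalty(E, W))          # decreases after structure_step(E, W)
```

In the full method this structure term would act as a regularizer alongside the domain-alignment objective; the sketch steps it in isolation only to show its effect on the embeddings.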
| Original language | English |
| --- | --- |
| Pages (from-to) | 145-156 |
| Number of pages | 12 |
| Journal | Information Fusion |
| Volume | 76 |
| DOIs | |
| Publication status | Published - Dec 2021 |
Keywords
- Semantic structure
- Transfer learning
- Word representation