Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets

Yi Yang, Chen Zhang, Benyou Wang, Dawei Song*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Over-parameterized pre-trained language models (LMs) have shown appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we posit for the first time that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). To intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI, and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.
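The abstract does not state the formula for the domain-general score, so the following is only a minimal sketch of one way such a score could be computed. It assumes that each parameter already has a per-domain importance value, and that the score is the mean importance across domains minus a weighted variance across domains. The function names, the `penalty` weight, and the toy importance values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def domain_general_scores(importance_by_domain, penalty=1.0):
    """Score each parameter as mean importance minus penalty * variance.

    importance_by_domain maps a domain name to a list holding one
    importance value per parameter (same length for every domain).
    This scoring rule is an assumption made for illustration only.
    """
    per_domain = list(importance_by_domain.values())
    n_params = len(per_domain[0])
    scores = []
    for i in range(n_params):
        values = [imp[i] for imp in per_domain]
        # High mean importance raises the score; disagreement across
        # domains (high variance) lowers it.
        scores.append(mean(values) - penalty * pvariance(values))
    return scores


def keep_mask(scores, keep_ratio):
    """Return a boolean mask keeping the top-scoring fraction of parameters."""
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(scores))]


# Hypothetical importance values for three parameters in three domains.
toy_importance = {
    "domain_a": [0.9, 0.8, 0.1],
    "domain_b": [0.9, 0.1, 0.1],
    "domain_c": [0.9, 0.9, 0.1],
}
toy_mask = keep_mask(domain_general_scores(toy_importance, penalty=5.0), 0.67)
```

In this toy setting the second parameter is important on average but inconsistent across domains, so a large enough `penalty` ranks it below the third parameter, which is unimportant but stable. With `penalty=0.0` the ranking falls back to mean importance alone.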

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings
Editors: Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 144-156
Number of pages: 13
ISBN (Print): 9783031171192
Publication status: Published - 2022
Event: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022 - Guilin, China
Duration: 24 Sept 2022 to 25 Sept 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13551 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022
Country/Territory: China
City: Guilin
Period: 24/09/22 to 25/09/22

Keywords

  • Domain generalization
  • Lottery tickets hypothesis
  • Pre-trained language model
