Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets

Yi Yang; Chen Zhang; Benyou Wang; Dawei Song

doi:10.1007/978-3-031-17120-8_12

Yi Yang, Chen Zhang, Benyou Wang, Dawei Song^*

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

4 Citations (Scopus)

Abstract

Over-parameterized pre-trained language models (LMs) have shown an appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we for the first time posit that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). In order to intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI, and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.

Original language	English
Title of host publication	Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings
Editors	Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	144-156
Number of pages	13
ISBN (Print)	9783031171192
DOI	https://doi.org/10.1007/978-3-031-17120-8_12
Publication status	Published - 2022
Event	11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022 - Guilin, China; Duration: 24 Sept 2022 → 25 Sept 2022

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13551 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022
Country/Territory	China
City	Guilin
Period	24/09/22 → 25/09/22

Access to document

10.1007/978-3-031-17120-8_12

Other files and links

Link to publication in Scopus

Cite this

Yang, Y., Zhang, C., Wang, B., & Song, D. (2022). Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets. In W. Lu, S. Huang, Y. Hong, & X. Zhou (Eds.), Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings (pp. 144-156). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13551 LNAI). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17120-8_12

Yang, Yi ; Zhang, Chen ; Wang, Benyou et al. / Doge Tickets : Uncovering Domain-General Language Models by Playing Lottery Tickets. Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings. ed. / Wei Lu ; Shujian Huang ; Yu Hong ; Xiabing Zhou. Springer Science and Business Media Deutschland GmbH, 2022. pp. 144-156 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{738ceef76f254dc7a34b6afd30ed2dd8,

title = "Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets",

abstract = "Over-parameterized pre-trained language models (LMs) have shown an appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we for the first time posit that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). In order to intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI, and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.",

keywords = "Domain generalization, Lottery tickets hypothesis, Pre-trained language model",

author = "Yi Yang and Chen Zhang and Benyou Wang and Dawei Song",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022 ; Conference date: 24-09-2022 Through 25-09-2022",

year = "2022",

doi = "10.1007/978-3-031-17120-8_12",

language = "English",

isbn = "9783031171192",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "144--156",

editor = "Wei Lu and Shujian Huang and Yu Hong and Xiabing Zhou",

booktitle = "Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings",

address = "Germany",

}

Yang, Y, Zhang, C, Wang, B & Song, D 2022, Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets. in W Lu, S Huang, Y Hong & X Zhou (eds), Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13551 LNAI, Springer Science and Business Media Deutschland GmbH, pp. 144-156, 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022, Guilin, China, 24/09/22. https://doi.org/10.1007/978-3-031-17120-8_12

Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets. / Yang, Yi; Zhang, Chen; Wang, Benyou et al.
Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings. ed. / Wei Lu; Shujian Huang; Yu Hong; Xiabing Zhou. Springer Science and Business Media Deutschland GmbH, 2022. pp. 144-156 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13551 LNAI).

TY - GEN

T1 - Doge Tickets

T2 - 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022

AU - Yang, Yi

AU - Zhang, Chen

AU - Wang, Benyou

AU - Song, Dawei

PY - 2022

Y1 - 2022

N2 - Over-parameterized pre-trained language models (LMs) have shown an appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we for the first time posit that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). In order to intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI, and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.

KW - Domain generalization

KW - Lottery tickets hypothesis

KW - Pre-trained language model

UR - http://www.scopus.com/inward/record.url?scp=85140484409&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-17120-8_12

DO - 10.1007/978-3-031-17120-8_12

M3 - Conference contribution

AN - SCOPUS:85140484409

SN - 9783031171192

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 144

EP - 156

BT - Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings

A2 - Lu, Wei

A2 - Huang, Shujian

A2 - Hong, Yu

A2 - Zhou, Xiabing

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 24 September 2022 through 25 September 2022

ER -

Yang Y, Zhang C, Wang B, Song D. Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets. In Lu W, Huang S, Hong Y, Zhou X, editors, Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings. Springer Science and Business Media Deutschland GmbH. 2022. p. 144-156. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-17120-8_12
