Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li*

*Corresponding author for this work

Research output: Contribution to journal › Conference article › Peer-review

1 Citation (Scopus)

Abstract

Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder their wide application in practice. One promising way is to distill the reasoning ability of LLMs into small models via the chain-of-thought reasoning paths they generate. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework that distills LLMs with negative samples besides positive ones. The framework consists of three progressive steps, spanning the training and inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLMs.
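The abstract does not spell out how negative chains enter training. As a minimal sketch only (not the paper's actual three-step framework), one common way to use wrong-answer chains instead of discarding them is an unlikelihood-style penalty alongside the standard distillation loss on correct chains. The sketch below assumes a Hugging Face-style causal LM whose forward pass returns an object with `.logits`; the function name `distill_step`, the batch layout, and the `neg_weight` parameter are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distill_step(student, pos_batch, neg_batch, neg_weight=0.5):
    """One hypothetical training step mixing positive and negative samples.

    `student` is any causal LM returning an object with `.logits`;
    each batch dict holds `input_ids` and `labels` of shape
    (batch, seq_len), with -100 marking positions to ignore.
    """
    # --- Positive chains: standard next-token cross-entropy on the
    # LLM-generated correct reasoning paths (shift for causal LM). ---
    logits = student(pos_batch["input_ids"]).logits[:, :-1]
    labels = pos_batch["labels"][:, 1:]
    pos_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )

    # --- Negative chains: unlikelihood-style penalty. Rather than
    # dropping wrong-answer chains, penalize the student for assigning
    # their tokens probability: minimize -log(1 - p(token)). ---
    logits = student(neg_batch["input_ids"]).logits[:, :-1]
    labels = neg_batch["labels"][:, 1:]
    mask = labels != -100
    probs = logits.softmax(-1)
    tok_p = probs.gather(-1, labels.clamp(min=0).unsqueeze(-1)).squeeze(-1)
    neg_loss = -torch.log1p(-tok_p.clamp(max=1 - 1e-6))
    neg_loss = (neg_loss * mask).sum() / mask.sum().clamp(min=1)

    return pos_loss + neg_weight * neg_loss
```

Under this reading, positive and negative samples share one optimizer step, and `neg_weight` trades off imitation of correct chains against avoidance of incorrect ones; the paper's framework additionally covers the inference stage, which this training-only sketch does not attempt to model.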

Original language: English
Pages (from-to): 18591-18599
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 38
Issue number: 17
DOI
Publication status: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 - 27 Feb 2024
