Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li*

*Corresponding author for this work

Research output: Contribution to journal › Conference article › Peer-review

1 Citation (Scopus)

Abstract

Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder their wide application in practice. One promising way is to distill the reasoning ability of LLMs into small models via the chain-of-thought reasoning paths they generate. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework that distills LLMs with negative samples besides positive ones. The framework consists of three progressive steps, spanning the training and inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLMs.
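The abstract does not spell out how negative chains enter training. As a minimal sketch only (not the paper's actual three-step framework), one common way to use wrong-answer chains instead of discarding them is an unlikelihood-style penalty alongside the standard distillation loss on correct chains. The sketch below assumes a Hugging Face-style causal LM whose forward pass returns an object with `.logits`; the function name `distill_step`, the batch layout, and the `neg_weight` parameter are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distill_step(student, pos_batch, neg_batch, neg_weight=0.5):
    """One hypothetical training step mixing positive and negative samples.

    `student` is any causal LM returning an object with `.logits`;
    each batch dict holds `input_ids` and `labels` of shape
    (batch, seq_len), with -100 marking positions to ignore.
    """
    # --- Positive chains: standard next-token cross-entropy on the
    # LLM-generated correct reasoning paths (shift for causal LM). ---
    logits = student(pos_batch["input_ids"]).logits[:, :-1]
    labels = pos_batch["labels"][:, 1:]
    pos_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )

    # --- Negative chains: unlikelihood-style penalty. Rather than
    # dropping wrong-answer chains, penalize the student for assigning
    # their tokens probability: minimize -log(1 - p(token)). ---
    logits = student(neg_batch["input_ids"]).logits[:, :-1]
    labels = neg_batch["labels"][:, 1:]
    mask = labels != -100
    probs = logits.softmax(-1)
    tok_p = probs.gather(-1, labels.clamp(min=0).unsqueeze(-1)).squeeze(-1)
    neg_loss = -torch.log1p(-tok_p.clamp(max=1 - 1e-6))
    neg_loss = (neg_loss * mask).sum() / mask.sum().clamp(min=1)

    return pos_loss + neg_weight * neg_loss
```

Under this reading, positive and negative samples share one optimizer step, and `neg_weight` trades off imitation of correct chains against avoidance of incorrect ones; the paper's framework additionally covers the inference stage, which this training-only sketch does not attempt to model.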

Original language: English
Pages (from-to): 18591-18599
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 38
Issue number: 17
DOI
Publication status: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 - 27 Feb 2024
