Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Yiwei Li; Peiwen Yuan; Shaoxiong Feng; Boyuan Pan; Bin Sun; Xinglin Wang; Heda Wang; Kan Li

doi:10.1609/aaai.v38i17.29821

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li^*

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Conference article › peer-review

Abstract

Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLM.

Original language	English
Pages (from-to)	18591-18599
Number of pages	9
Journal	Proceedings of the AAAI Conference on Artificial Intelligence
Volume	38
Issue number	17
DOIs	https://doi.org/10.1609/aaai.v38i17.29821
Publication status	Published - 25 Mar 2024
Event	38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada Duration: 20 Feb 2024 → 27 Feb 2024

Access to Document

10.1609/aaai.v38i17.29821

Cite this

@article{26b280dd8f8b4bf0ae273935dc2047c1,

title = "Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data",

abstract = "Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLM.",

author = "Yiwei Li and Peiwen Yuan and Shaoxiong Feng and Boyuan Pan and Bin Sun and Xinglin Wang and Heda Wang and Kan Li",

note = "Publisher Copyright: {\textcopyright} 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 38th AAAI Conference on Artificial Intelligence, AAAI 2024 ; Conference date: 20-02-2024 Through 27-02-2024",

year = "2024",

month = mar,

day = "25",

doi = "10.1609/aaai.v38i17.29821",

language = "English",

volume = "38",

pages = "18591--18599",

journal = "Proceedings of the AAAI Conference on Artificial Intelligence",

issn = "2159-5399",

publisher = "Association for the Advancement of Artificial Intelligence",

number = "17",

}

TY - JOUR

T1 - Turning Dust into Gold

T2 - 38th AAAI Conference on Artificial Intelligence, AAAI 2024

AU - Li, Yiwei

AU - Yuan, Peiwen

AU - Feng, Shaoxiong

AU - Pan, Boyuan

AU - Sun, Bin

AU - Wang, Xinglin

AU - Wang, Heda

AU - Li, Kan

PY - 2024/3/25

Y1 - 2024/3/25

N2 - Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLM.

AB - Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLM.

UR - http://www.scopus.com/inward/record.url?scp=85189643705&partnerID=8YFLogxK

U2 - 10.1609/aaai.v38i17.29821

DO - 10.1609/aaai.v38i17.29821

M3 - Conference article

AN - SCOPUS:85189643705

SN - 2159-5399

VL - 38

SP - 18591

EP - 18599

JO - Proceedings of the AAAI Conference on Artificial Intelligence

JF - Proceedings of the AAAI Conference on Artificial Intelligence

IS - 17

Y2 - 20 February 2024 through 27 February 2024

ER -

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Abstract

Access to Document

Other files and links

Fingerprint

Cite this