TY - JOUR
T1 - Bridging the gap between data distribution and model
T2 - Dynamic data distribution optimization for improving critique capabilities of large language models
AU - Xu, Chen
AU - Lan, Tian
AU - Lv, Zhenyu
AU - Dong, Qunxi
AU - Zhang, Jieshuo
AU - Huang, Heyan
AU - Yang, Minqiang
AU - Hu, Bin
N1 - Publisher Copyright:
© 2025 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license. http://creativecommons.org/licenses/by-nc-nd/4.0/
PY - 2026/3/5
Y1 - 2026/3/5
N2 - Critique ability, defined as the capacity to identify and rectify flaws in text generation, is crucial for the applications of Large Language Models (LLMs). As a meta-cognitive capability, enhancing the critique ability of LLMs poses significant challenges. Recent studies have proposed improving this ability through fine-tuning on critique datasets. However, the static data distribution of existing datasets often leads to a mismatch between the training data and the diverse optimization needs of target models, thereby hindering their effectiveness. To address this issue, we introduce a novel Dynamic Iterative Data Distribution Optimization Method (DIDD) that dynamically adjusts training data distributions to align with the specific optimization requirements of target models. Specifically, DIDD detects the vulnerable data distribution of target optimization models by conducting meta-critique on a synthesized test set. The detected vulnerable data distributions are then leveraged to construct a training dataset that aligns more closely with the target model, improving the effectiveness of the training dataset. Extensive experimental results across four benchmarks demonstrate that our proposed DIDD effectively alleviates the mismatch between the training dataset and target optimization models.
AB - Critique ability, defined as the capacity to identify and rectify flaws in text generation, is crucial for the applications of Large Language Models (LLMs). As a meta-cognitive capability, enhancing the critique ability of LLMs poses significant challenges. Recent studies have proposed improving this ability through fine-tuning on critique datasets. However, the static data distribution of existing datasets often leads to a mismatch between the training data and the diverse optimization needs of target models, thereby hindering their effectiveness. To address this issue, we introduce a novel Dynamic Iterative Data Distribution Optimization Method (DIDD) that dynamically adjusts training data distributions to align with the specific optimization requirements of target models. Specifically, DIDD detects the vulnerable data distribution of target optimization models by conducting meta-critique on a synthesized test set. The detected vulnerable data distributions are then leveraged to construct a training dataset that aligns more closely with the target model, improving the effectiveness of the training dataset. Extensive experimental results across four benchmarks demonstrate that our proposed DIDD effectively alleviates the mismatch between the training dataset and target optimization models.
KW - Automatic evaluation
KW - Critique ability
KW - Large language models
UR - https://www.scopus.com/pages/publications/105024197320
U2 - 10.1016/j.eswa.2025.129878
DO - 10.1016/j.eswa.2025.129878
M3 - Article
AN - SCOPUS:105024197320
SN - 0957-4174
VL - 300
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 129878
ER -