TY - JOUR
T1 - Selecting the best rather than ranking correctly
T2 - A multi-metrics ranker for summarization
AU - Zhao, Jianfei
AU - Zhang, Feng
AU - Sun, Xin
AU - Feng, Chong
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2025/6/1
Y1 - 2025/6/1
N2 - Abstractive summarization models generally face the challenge of underutilizing the search space, meaning that a larger beam size tends to decrease the model's performance. Most works address this issue by proposing novel models with improved results, but the potential of existing models is often overlooked. In this work, we aim to enhance established models by proposing a ranking model that selects the best candidate in the search paths to improve performance directly. Ranking candidates is a widely used method for performance improvement, but the primary goal is to identify the best candidate rather than to achieve perfect ranking accuracy. To achieve this, we adjust the ranking granularity based on candidate similarity and distribute ranking margins according to evaluated scores. This method aligns the ranking objective more closely with the primary goal and reduces CUDA memory usage during training by 30%. Furthermore, we design our model as a Mixture-of-Experts system, where each expert specializes in a specific criterion, enabling the provision of diverse ranking services. We evaluate our method on three widely used datasets: CNN/DM, XSUM, and NYT. Experimental results demonstrate that our method achieves higher Top-1 accuracy compared to other ranking models and effectively enhances existing summarization models. Further analyses demonstrate that our method possesses strong generalization capabilities, allowing it to perform ranking tasks on unlearned metrics and untrained datasets.
AB - Abstractive summarization models generally face the challenge of underutilizing the search space, meaning that a larger beam size tends to decrease the model's performance. Most works address this issue by proposing novel models with improved results, but the potential of existing models is often overlooked. In this work, we aim to enhance established models by proposing a ranking model that selects the best candidate in the search paths to improve performance directly. Ranking candidates is a widely used method for performance improvement, but the primary goal is to identify the best candidate rather than to achieve perfect ranking accuracy. To achieve this, we adjust the ranking granularity based on candidate similarity and distribute ranking margins according to evaluated scores. This method aligns the ranking objective more closely with the primary goal and reduces CUDA memory usage during training by 30%. Furthermore, we design our model as a Mixture-of-Experts system, where each expert specializes in a specific criterion, enabling the provision of diverse ranking services. We evaluate our method on three widely used datasets: CNN/DM, XSUM, and NYT. Experimental results demonstrate that our method achieves higher Top-1 accuracy compared to other ranking models and effectively enhances existing summarization models. Further analyses demonstrate that our method possesses strong generalization capabilities, allowing it to perform ranking tasks on unlearned metrics and untrained datasets.
KW - Abstractive summarization
KW - Contrastive learning
KW - Mixture of experts
KW - Ranking model
UR - http://www.scopus.com/inward/record.url?scp=86000741806&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2025.127144
DO - 10.1016/j.eswa.2025.127144
M3 - Article
AN - SCOPUS:86000741806
SN - 0957-4174
VL - 276
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 127144
ER -