TY - GEN
T1 - Automated Software Entity Matching Between Successive Versions
AU - Liu, Bo
AU - Liu, Hui
AU - Niu, Nan
AU - Zhang, Yuxia
AU - Li, Guangjie
AU - Jiang, Yanjie
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Version control systems are widely used to manage the evolution of software applications. However, such version control systems take source code as lines of plain text, and thus they cannot present the evolution of software entities embedded in the source code. To this end, a few approaches have been proposed to match software entities before and after a given commit, known as software entity matching algorithms. However, the accuracy of such algorithms requires further improvement. In this paper, we propose an automated iterative algorithm (called ReMapper) to match software entities between two successive versions. The key insight of ReMapper is that the qualified name, the implementation, and the references of a software entity together can distinguish it from others. It matches software entities iteratively because the mapping depends on the reference-based similarity whereas the reference-based similarity depends on the mapping of entities as well. We evaluated ReMapper on a benchmark consisting of 215 commits from 21 real-world projects. Our evaluation results suggest that ReMapper substantially outperformed the state of the art, reducing the number of mistakes (false positives plus false negatives) substantially by 85.8%. We also evaluated to what extent it may improve the automated refactoring discovery (mining) that relies heavily on automated entity matching. Our evaluation results suggest that it substantially improved the state of the art in refactoring discovery, improving recall by 6.9% and reducing the number of false positives by 72.6%.
AB - Version control systems are widely used to manage the evolution of software applications. However, such version control systems take source code as lines of plain text, and thus they cannot present the evolution of software entities embedded in the source code. To this end, a few approaches have been proposed to match software entities before and after a given commit, known as software entity matching algorithms. However, the accuracy of such algorithms requires further improvement. In this paper, we propose an automated iterative algorithm (called ReMapper) to match software entities between two successive versions. The key insight of ReMapper is that the qualified name, the implementation, and the references of a software entity together can distinguish it from others. It matches software entities iteratively because the mapping depends on the reference-based similarity whereas the reference-based similarity depends on the mapping of entities as well. We evaluated ReMapper on a benchmark consisting of 215 commits from 21 real-world projects. Our evaluation results suggest that ReMapper substantially outperformed the state of the art, reducing the number of mistakes (false positives plus false negatives) substantially by 85.8%. We also evaluated to what extent it may improve the automated refactoring discovery (mining) that relies heavily on automated entity matching. Our evaluation results suggest that it substantially improved the state of the art in refactoring discovery, improving recall by 6.9% and reducing the number of false positives by 72.6%.
KW - Entity Matching
KW - Entity Tracking
KW - Software Evolution
KW - Software Refactoring
UR - http://www.scopus.com/inward/record.url?scp=85178998374&partnerID=8YFLogxK
U2 - 10.1109/ASE56229.2023.00132
DO - 10.1109/ASE56229.2023.00132
M3 - Conference contribution
AN - SCOPUS:85178998374
T3 - Proceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
SP - 1615
EP - 1627
BT - Proceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
Y2 - 11 September 2023 through 15 September 2023
ER -