Automated Software Entity Matching Between Successive Versions

Bo Liu, Hui Liu*, Nan Niu, Yuxia Zhang, Guangjie Li, Yanjie Jiang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Citation (Scopus)

Abstract

Version control systems are widely used to manage the evolution of software applications. However, such version control systems take source code as lines of plain text, and thus they cannot present the evolution of software entities embedded in the source code. To this end, a few approaches have been proposed to match software entities before and after a given commit, known as software entity matching algorithms. However, the accuracy of such algorithms requires further improvement. In this paper, we propose an automated iterative algorithm (called ReMapper) to match software entities between two successive versions. The key insight of ReMapper is that the qualified name, the implementation, and the references of a software entity together can distinguish it from others. It matches software entities iteratively because the mapping depends on the reference-based similarity whereas the reference-based similarity depends on the mapping of entities as well. We evaluated ReMapper on a benchmark consisting of 215 commits from 21 real-world projects. Our evaluation results suggest that ReMapper substantially outperformed the state of the art, reducing the number of mistakes (false positives plus false negatives) substantially by 85.8%. We also evaluated to what extent it may improve the automated refactoring discovery (mining) that relies heavily on automated entity matching. Our evaluation results suggest that it substantially improved the state of the art in refactoring discovery, improving recall by 6.9% and reducing the number of false positives by 72.6%.

Original languageEnglish
Title of host publicationProceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1615-1627
Number of pages13
ISBN (Electronic)9798350329964
DOIs
Publication statusPublished - 2023
Event38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023 - Echternach, Luxembourg
Duration: 11 Sept 202315 Sept 2023

Publication series

NameProceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023

Conference

Conference38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
Country/TerritoryLuxembourg
CityEchternach
Period11/09/2315/09/23

Keywords

  • Entity Matching
  • Entity Tracking
  • Software Evolution
  • Software Refactoring

Fingerprint

Dive into the research topics of 'Automated Software Entity Matching Between Successive Versions'. Together they form a unique fingerprint.

Cite this