Test-based clone detection: An initial try on semantically equivalent methods

Guangjie Li; Hui Liu; Yanjie Jiang; Jiahao Jin

doi:10.1109/ACCESS.2018.2883699

Guangjie Li, Hui Liu*, Yanjie Jiang, Jiahao Jin

*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Most code clone detection approaches identify clones via static source code analysis. Such approaches are effective and efficient in detecting lexically similar clones. However, they are less effective in detecting semantic clones, which are similar in functionality but different in implementation. As an initial attempt to detect semantic clones, in this paper we propose a test-based approach to detecting methods that are semantically equivalent to API methods. For a given method m, we generate its test cases automatically and search for semantically equivalent API methods by running the generated test cases. If two methods generate the same output on each of the test cases, they are taken as semantically equivalent methods. One of the weaknesses of test-based clone detection is that it is often time-consuming. To reduce the time complexity, we take the following measures. First, we focus on methods instead of arbitrary fragments. Second, for a given method, we only compare it against API methods whose signatures are highly similar to that of the given method. We evaluate the proposed approach on 10 well-known applications. Evaluation results suggest that it is efficient and accurate, and its precision is up to 98%.
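The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input generator, the parameter-count signature filter, and every function name (`generate_inputs`, `observe`, `similar_signature`, `find_equivalents`, `larger`, `api_max`, `api_min`, `api_abs`) are hypothetical stand-ins chosen for illustration. The sketch first discards candidate "API" methods whose signatures do not match, then reports the candidates that agree with the given method on every generated test input.

```python
import inspect
import random


def generate_inputs(n, seed=0):
    """Hypothetical stand-in for automatic test generation.

    Returns a few fixed boundary pairs plus n random integer pairs.
    """
    rng = random.Random(seed)
    fixed = [(0, 0), (1, 2), (2, 1), (-5, 5)]
    return fixed + [(rng.randint(-100, 100), rng.randint(-100, 100)) for _ in range(n)]


def observe(func, args):
    """Run func on args, recording its return value or the exception type."""
    try:
        return ("ok", func(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)


def similar_signature(a, b):
    """Crude filter (an assumption): same number of parameters."""
    return len(inspect.signature(a).parameters) == len(inspect.signature(b).parameters)


def find_equivalents(method, candidates, inputs):
    """Names of candidates that agree with method on every test input."""
    matches = []
    for api in candidates:
        if not similar_signature(method, api):
            continue  # skip candidates without running any test
        if all(observe(method, args) == observe(api, args) for args in inputs):
            matches.append(api.__name__)
    return matches


# Method under analysis: a hand-written reimplementation of "max".
def larger(a, b):
    if a >= b:
        return a
    return b


# Hypothetical "API" methods to compare against.
def api_max(a, b):
    return max(a, b)


def api_min(a, b):
    return min(a, b)


def api_abs(x):
    return abs(x)


CANDIDATES = [api_max, api_min, api_abs]
result = find_equivalents(larger, CANDIDATES, generate_inputs(50))
print(result)  # ['api_max']
```

Here `api_abs` is rejected by the signature filter before any test runs, and `api_min` is rejected as soon as a test input such as `(1, 2)` produces a different output. Agreement on all generated inputs is only evidence of equivalence, not proof, so the quality of the generated test cases bounds the precision of such a check.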

Original language	English
Article number	8550632
Pages (from-to)	77643-77655
Number of pages	13
Journal	IEEE Access
Volume	6
DOIs	https://doi.org/10.1109/ACCESS.2018.2883699
Publication status	Published - 2018

Keywords

Clone detection
lexical similarity
semantic equivalence
test-driven

Access to Document

10.1109/ACCESS.2018.2883699

Cite this

Li, G., Liu, H., Jiang, Y., & Jin, J. (2018). Test-based clone detection: An initial try on semantically equivalent methods. IEEE Access, 6, 77643-77655. Article 8550632. https://doi.org/10.1109/ACCESS.2018.2883699

@article{9fb490207e024f5a83ee46468bddd534,
title = "Test-based clone detection: An initial try on semantically equivalent methods",
abstract = "Most code clone detection approaches identify clones via static source code analysis. Such approaches are effective and efficient in detecting lexically similar clones. However, they are less effective in detecting semantic clones, which are similar in functionality but different in implementation. As an initial attempt to detect semantic clones, in this paper we propose a test-based approach to detecting methods that are semantically equivalent to API methods. For a given method m, we generate its test cases automatically and search for semantically equivalent API methods by running the generated test cases. If two methods generate the same output on each of the test cases, they are taken as semantically equivalent methods. One of the weaknesses of test-based clone detection is that it is often time-consuming. To reduce the time complexity, we take the following measures. First, we focus on methods instead of arbitrary fragments. Second, for a given method, we only compare it against API methods whose signatures are highly similar to that of the given method. We evaluate the proposed approach on 10 well-known applications. Evaluation results suggest that it is efficient and accurate, and its precision is up to 98%.",
keywords = "Clone detection, lexical similarity, semantic equivalence, test-driven",
author = "Guangjie Li and Hui Liu and Yanjie Jiang and Jiahao Jin",
note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",
year = "2018",
doi = "10.1109/ACCESS.2018.2883699",
language = "English",
volume = "6",
pages = "77643--77655",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - JOUR
T1 - Test-based clone detection
T2 - An initial try on semantically equivalent methods
AU - Li, Guangjie
AU - Liu, Hui
AU - Jiang, Yanjie
AU - Jin, Jiahao
PY - 2018
Y1 - 2018
N2 - Most code clone detection approaches identify clones via static source code analysis. Such approaches are effective and efficient in detecting lexically similar clones. However, they are less effective in detecting semantic clones, which are similar in functionality but different in implementation. As an initial attempt to detect semantic clones, in this paper we propose a test-based approach to detecting methods that are semantically equivalent to API methods. For a given method m, we generate its test cases automatically and search for semantically equivalent API methods by running the generated test cases. If two methods generate the same output on each of the test cases, they are taken as semantically equivalent methods. One of the weaknesses of test-based clone detection is that it is often time-consuming. To reduce the time complexity, we take the following measures. First, we focus on methods instead of arbitrary fragments. Second, for a given method, we only compare it against API methods whose signatures are highly similar to that of the given method. We evaluate the proposed approach on 10 well-known applications. Evaluation results suggest that it is efficient and accurate, and its precision is up to 98%.
AB - Most code clone detection approaches identify clones via static source code analysis. Such approaches are effective and efficient in detecting lexically similar clones. However, they are less effective in detecting semantic clones, which are similar in functionality but different in implementation. As an initial attempt to detect semantic clones, in this paper we propose a test-based approach to detecting methods that are semantically equivalent to API methods. For a given method m, we generate its test cases automatically and search for semantically equivalent API methods by running the generated test cases. If two methods generate the same output on each of the test cases, they are taken as semantically equivalent methods. One of the weaknesses of test-based clone detection is that it is often time-consuming. To reduce the time complexity, we take the following measures. First, we focus on methods instead of arbitrary fragments. Second, for a given method, we only compare it against API methods whose signatures are highly similar to that of the given method. We evaluate the proposed approach on 10 well-known applications. Evaluation results suggest that it is efficient and accurate, and its precision is up to 98%.
KW - Clone detection
KW - lexical similarity
KW - semantic equivalence
KW - test-driven
UR - http://www.scopus.com/inward/record.url?scp=85057845886&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2018.2883699
DO - 10.1109/ACCESS.2018.2883699
M3 - Article
AN - SCOPUS:85057845886
SN - 2169-3536
VL - 6
SP - 77643
EP - 77655
JO - IEEE Access
JF - IEEE Access
M1 - 8550632
ER -
