JLeaks: A Featured Resource Leak Repository Collected From Hundreds of Open-Source Java Projects

Tianyang Liu, Weixing Ji*, Xiaohui Dong, Wuhuang Yao, Yizhuo Wang, Hui Liu, Haiyang Peng, Yuxuan Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

High-quality defect repositories are vital in defect detection, localization, and repair. However, existing repositories collected from open-source projects are either small-scale or inadequately labeled and packed. This paper systematically summarizes the programming APIs of system resources (i.e., file, socket, and thread) in Java. Additionally, this paper demonstrates the exceptions that may cause resource leaks in the chained and nested streaming operations. A semi-automatic toolchain is built to improve the efficiency of defect extraction, including automatic building for large legacy Java projects. Accordingly, 1,094 resource leaks were collected from 321 open-source projects on GitHub. This repository, named JLeaks, was built by round-by-round filtering and cross-validation, involving the review of approximately 3,185 commits from hundreds of projects. JLeaks is currently the largest resource leak repository, and each defect in JLeaks is well-labeled and packed, including causes, locations, patches, source files, and compiled bytecode files for 254 defects. We have conducted a detailed analysis of JLeaks for defect distribution, root causes, and fix approaches. We compare JLeaks with two well-known resource leak repositories, and the results show that JLeaks is more informative and complete, with high availability, uniqueness, and consistency. Additionally, we show the usability of JLeaks in two application scenarios. Future studies can leverage our repository to encourage better design and implementation of defect-related algorithms and tools.
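As a rough illustration of the kind of leak the abstract refers to in chained and nested streaming operations, the sketch below wraps one stream in another and lets the wrapper's constructor fail. It is not taken from the paper or from JLeaks; the class `NestedStreamLeakDemo`, the helper `TrackedStream`, and the methods `readLeaky` and `readSafe` are hypothetical names introduced only for this example, and an in-memory stream stands in for a real file or socket.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

class NestedStreamLeakDemo {

    /** Hypothetical stand-in for a file or socket stream; records whether close() ran. */
    static final class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(String data) {
            super(data.getBytes());
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /**
     * Leaky pattern: the wrappers are chained in one expression. If the
     * InputStreamReader constructor throws (unsupported charset name), the
     * try/finally below is never entered and `raw` is never closed.
     */
    static String readLeaky(InputStream raw, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(raw, charsetName));
        try {
            return reader.readLine();
        } finally {
            reader.close();
        }
    }

    /**
     * Safer pattern: the underlying stream is declared as its own resource,
     * so try-with-resources closes it even when building the wrapper fails.
     */
    static String readSafe(InputStream raw, String charsetName) throws IOException {
        try (InputStream in = raw;
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, charsetName))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) {
        TrackedStream leaky = new TrackedStream("hello\n");
        try {
            readLeaky(leaky, "no-such-charset");
        } catch (IOException expected) {
            System.out.println("readLeaky closed the stream: " + leaky.closed);
        }

        TrackedStream safe = new TrackedStream("hello\n");
        try {
            readSafe(safe, "no-such-charset");
        } catch (IOException expected) {
            System.out.println("readSafe closed the stream: " + safe.closed);
        }
    }
}
```

Declaring each resource separately in the try-with-resources header is one common fix for this pattern; whether it matches the fix approaches catalogued in JLeaks is not stated in this record.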

Original language: English
Title of host publication: Proceedings - 2024 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2024
Publisher: IEEE Computer Society
Pages: 1723-1735
Number of pages: 13
ISBN (Electronic): 9798400702174
Publication status: Published - 2024
Event: 44th ACM/IEEE International Conference on Software Engineering, ICSE 2024 - Lisbon, Portugal
Duration: 14 Apr 2024 to 20 Apr 2024

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 44th ACM/IEEE International Conference on Software Engineering, ICSE 2024
Country/Territory: Portugal
City: Lisbon
Period: 14/04/24 to 20/04/24

Keywords

  • Defect repository
  • Java language
  • Open-source projects
  • Resource leak
