TY - GEN
T1 - A First Look at Conventional Commits Classification
AU - Zeng, Qunhong
AU - Zhang, Yuxia
AU - Qiu, Zhiqing
AU - Liu, Hui
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Modern distributed software development relies on commits to control system versions. Commit classification plays a vital role in both industry and academia. The widely-used commit classification framework was proposed in 1976 by Swanson and includes three base classes: perfective, corrective, and adaptive. With the increasing complexity of software development, the industry has shifted towards a more fine-grained commit category, i.e., adopting Conventional Commits Specification (CCS) for delicacy management. The new commit framework requires developers to classify commits into ten distinct categories, such as 'feat', 'fix', and 'docs'. However, existing studies mainly focus on the three-category classification, leaving the definition and application of the fine-grained commit categories as knowledge gaps. This paper reports a preliminary study on this mechanism from its application status and problems. We also explore ways to address these identified problems. We find that a growing number of projects on GitHub are adopting CCS. By qualitatively analyzing 194 issues from GitHub and 100 questions from Stack Overflow about the CCS application, we categorized four main challenges developers encountered when using CCS. The most common one is CCS-type confusion. To address these challenges, we propose a clear definition of CCS types based on existing variants. Further, we designed an approach to automatically classify commits into CCS types, and the evaluation results demonstrate a promising performance. Our work facilitates a deeper comprehension of the present fine-grained commit categorization and holds the potential to alleviate application challenges significantly.
KW - Commit Classification
KW - Conventional Commits
KW - Large Language Model
UR - https://www.scopus.com/pages/publications/105010331989
U2 - 10.1109/ICSE55347.2025.00011
DO - 10.1109/ICSE55347.2025.00011
M3 - Conference contribution
AN - SCOPUS:105010331989
T3 - Proceedings - International Conference on Software Engineering
SP - 2277
EP - 2289
BT - Proceedings - 2025 IEEE/ACM 47th International Conference on Software Engineering, ICSE 2025
PB - IEEE Computer Society
T2 - 47th IEEE/ACM International Conference on Software Engineering, ICSE 2025
Y2 - 27 April 2025 through 3 May 2025
ER -