TY - JOUR
T1 - Obfuscation-resilient detection of Android third-party libraries using multi-scale code dependency fusion
AU - Zhang, Zhao
AU - Luo, Senlin
AU - Lu, Yongxin
AU - Pan, Limin
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2025/5
Y1 - 2025/5
AB - Third-Party Library (TPL) detection is a crucial aspect of Android application security assessment, but it faces significant challenges due to code obfuscation. Existing methods often rely on single-scale features, such as class dependencies or instruction opcodes. This reliance can overlook critical dependencies, leading to incomplete library representation and reduced detection recall. Furthermore, the high similarity between a TPL and its adjacent versions causes overlaps in the feature space, reducing the accuracy of version identification. To address these limitations, we propose LibMD, a multi-scale code dependency fusion approach for TPL detection in Android apps. LibMD enhances library code representation by combining class reference syntax augmentation, cross-scale function mapping, and control flow reconstruction of basic blocks. It also extracts metadata dependencies and constructs a library dependency graph that integrates app-code similarity with multiple libraries. By applying Bayes’ theorem to compute posterior probabilities, LibMD effectively evaluates the likelihood of TPL integration and improves the precision of library version identification. Experimental results demonstrate that LibMD outperforms state-of-the-art methods across diverse datasets, achieving robust TPL detection and accurate version identification, even under various obfuscation techniques.
KW - Code obfuscation
KW - Identification of adjacent versions
KW - Library dependency graph
KW - Third-party library
UR - http://www.scopus.com/inward/record.url?scp=85213943176&partnerID=8YFLogxK
U2 - 10.1016/j.inffus.2024.102908
DO - 10.1016/j.inffus.2024.102908
M3 - Article
AN - SCOPUS:85213943176
SN - 1566-2535
VL - 117
JO - Information Fusion
JF - Information Fusion
M1 - 102908
ER -