TY - GEN
T1 - Design and Implementation of Binary Function Similarity Analysis System Based on Deep Learning
AU - Sun, Borui
AU - Xu, Yong
AU - Song, Jinyi
AU - Tan, Xiao
AU - Wen, Xuan
AU - Zhao, Yifei
N1 - Publisher Copyright: © 2025 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2025
Y1 - 2025
AB - Binary function similarity analysis is vital for tasks like malware detection, vulnerability identification, and software maintenance. Traditional methods, which rely on handcrafted features or rigid structures, often fail to handle diverse architectures, compiler optimizations, and obfuscation. This paper presents a deep learning-based approach that learns semantic vector representations of binary functions directly from assembly instructions. A two-phase embedding model transforms instructions into context-aware vectors, and a self-attention network highlights key instructions and structural patterns. Using a Siamese architecture and contrastive learning, the system maps similar functions closer together, improving accuracy and scalability. Experimental results on a large, varied dataset show consistently high performance under diverse conditions, demonstrating the method's robustness and potential for practical applications. The method's robustness against code variations makes it applicable to aerospace systems, such as verifying avionics firmware integrity or detecting tampered flight control modules.
KW - Aerospace engineering
KW - Binary function similarity
KW - Malicious software
KW - Network security
KW - Self-attention neural network
UR - https://www.scopus.com/pages/publications/105020298764
U2 - 10.23919/CCC64809.2025.11179192
DO - 10.23919/CCC64809.2025.11179192
M3 - Conference contribution
AN - SCOPUS:105020298764
T3 - Chinese Control Conference, CCC
SP - 8513
EP - 8517
BT - Proceedings of the 44th Chinese Control Conference, CCC 2025
A2 - Sun, Jian
A2 - Yin, Hongpeng
PB - IEEE Computer Society
T2 - 44th Chinese Control Conference, CCC 2025
Y2 - 28 July 2025 through 30 July 2025
ER -