TY - JOUR
T1 - An underwater acoustic target recognition model based on heterogeneous spectral attention feature fusion
AU - Gao, Wei
AU - Chen, Desheng
AU - Liu, Yining
AU - Zhang, Xianda
AU - Zhang, Junhui
N1 - Publisher Copyright: © 2026 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
PY - 2026/9
Y1 - 2026/9
AB - Because of the complexity of the marine environment, underwater acoustic target recognition (UATR) faces challenges such as difficult feature extraction and low robustness on low signal-to-noise ratio (SNR) data. To address these issues, a UATR method based on heterogeneous spectral attention feature fusion is proposed. A multi-feature representation strategy is adopted in which the low-frequency narrowband line-spectrum features of underwater acoustic signals are captured with three time-frequency representations, the short-time Fourier transform (STFT), the Mel-spectrogram (Mel), and the constant-Q transform (CQT), yielding a more complete signal representation space. A multi-stage, hierarchical deep feature modeling framework is designed, comprising a feature enhancement module, a feature fusion module, and a recognition module. The Time-Frequency Transformer (TF-Transformer) and convolutional neural network (CNN) block (TCB) module, which combines TF-Transformers and CNNs, is used to jointly model global and local dependencies in underwater signals, while a cross-attention mechanism adaptively fuses the heterogeneous spectral features. Experimental results show that the proposed method achieves recognition accuracies of 98.26% and 95.58% on the ShipsEar and DeepShip datasets, respectively, under low-SNR conditions, demonstrating its recognition performance and robustness across diverse datasets and noise environments.
KW - Attention mechanism
KW - Feature enhancement
KW - Feature fusion
KW - Underwater acoustic target recognition (UATR)
UR - https://www.scopus.com/pages/publications/105034624201
U2 - 10.1016/j.sigpro.2026.110601
DO - 10.1016/j.sigpro.2026.110601
M3 - Article
AN - SCOPUS:105034624201
SN - 0165-1684
VL - 246
JO - Signal Processing
JF - Signal Processing
M1 - 110601
ER -