TY - GEN
T1 - High-Fidelity Single-Pixel Imaging Using Multi-Head Attention Mechanism
AU - Lu, Hui
AU - Zhan, Xinrui
AU - Bian, Liheng
N1 - Publisher Copyright:
© 2024 SPIE.
PY - 2024
Y1 - 2024
N2 - Single-pixel imaging has gained prominence for its wide working wavelength range and high sensitivity. Deep learning-based single-pixel imaging shows superiority in real-time reconstruction, particularly with limited resources. In this work, we report a novel encoder-decoder method for single-pixel imaging, which aims to enhance imaging quality at extremely low measurement counts. First, we encode the high-dimensional target information into one-dimensional measurements using globally optimized modulation patterns, implemented by a fully connected or convolutional layer. Second, we integrate a U-Net neural network with an advanced multi-head self-attention mechanism and a pyramid pooling module to decode the measurements and reconstruct high-fidelity images. Under this strategy, the skip connections within the U-Net structure enhance the preservation of fine image features, while the multi-head self-attention mechanism and pyramid pooling module effectively capture contextual dependencies among the low-dimensional measurements, thereby extracting significant image features and improving reconstruction quality. Simulation results on the STL-10 dataset validate the effectiveness of the reported technique. At a resolution of 96 × 96 pixels and an ultra-low sampling rate of 1%, we consistently achieved higher image fidelity than traditional single-pixel reconstruction methods for both grayscale and color images.
KW - multi-head self-attention
KW - pyramid pooling modules
KW - single-pixel imaging
KW - ultra-low sampling rate
UR - http://www.scopus.com/inward/record.url?scp=85214558559&partnerID=8YFLogxK
U2 - 10.1117/12.3036102
DO - 10.1117/12.3036102
M3 - Conference contribution
AN - SCOPUS:85214558559
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Optoelectronic Imaging and Multimedia Technology XI
A2 - Suo, Jinli
A2 - Zheng, Zhenrong
PB - SPIE
T2 - Optoelectronic Imaging and Multimedia Technology XI 2024
Y2 - 13 October 2024 through 15 October 2024
ER -