TY - JOUR
T1 - Measurements-to-Tokens
T2 - A pure Transformer for image-free recognition at arbitrary sampling ratios
AU - Mi, Jia Shuai
AU - Jiang, Hu
AU - Wei, Yu Xiao
AU - Geng, Wen Bin
AU - Xu, Wen Biao
AU - Zhang, Hui Juan
AU - Yu, Yuan Jin
N1 - Publisher Copyright: © 2026 Elsevier Ltd
PY - 2026/11
Y1 - 2026/11
AB - Derived from single-pixel imaging, image-free sensing enables efficient semantic interpretation directly from compressed measurements. However, existing deep learning methods are often constrained by fixed sampling ratios, and their architectural designs typically overlook the intrinsic physical modulation mechanisms of the sensing process. In this work, we introduce Measurements-to-Tokens (M2T), a unified framework leveraging a pure Transformer architecture for image-free recognition at arbitrary sampling ratios. Specifically, M2T adapts to varying sampling ratios by employing arbitrarily cropped long-range observation sequences. By treating the sensing process as a sequence modeling task, we explicitly integrate the physical correlation between intensity measurements and their corresponding modulation patterns into semantic tokens. This design enables the network to naturally process inputs of variable lengths, effectively decoupling the model architecture from specific sampling ratios. Extensive analysis demonstrates that M2T achieves state-of-the-art recognition accuracy and adapts to arbitrary sampling ratios. At a 1% sampling ratio, it reaches 96.51% average accuracy on MNIST and outperforms competing methods by 4.15 percentage points on average across two datasets, while remaining robust to noise.
KW - Arbitrary sampling ratios
KW - Image-free recognition
KW - Single-pixel imaging
KW - Transformer
UR - https://www.scopus.com/pages/publications/105039020000
U2 - 10.1016/j.optlastec.2026.115558
DO - 10.1016/j.optlastec.2026.115558
M3 - Article
AN - SCOPUS:105039020000
SN - 0030-3992
VL - 203
JO - Optics and Laser Technology
JF - Optics and Laser Technology
M1 - 115558
ER -