Abstract
Decoding speech from neural recordings is of critical importance for both applications and scientific research. However, the task remains challenging with non-invasive recordings. Previous research has shown that leveraging wav2vec vectors significantly improves speech perception decoding, opening up potential for applications. To explore this problem further, we propose a novel multimodal method that uses functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). Our method uses separate encoders for fMRI and MEG; the features extracted from both modalities are then integrated and aligned with wav2vec vectors extracted from the speech. The multimodal method reaches an average top-10 accuracy of 72.6% with a negative sample size of 128. Performance evaluated with various metrics improves consistently across subjects, demonstrating the effectiveness of the proposed data fusion method. We also investigated the source of the performance gain by testing the correlation between the encoders' hidden outputs and features extracted from the speech at different levels. The results show that the MEG encoder learns more low-level information and the fMRI encoder learns more high-level information, which indicates that the complementary characteristics of the two modalities lead to the improvement. This work shows the potential of multimodal methods for speech decoding.
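To make the reported metric concrete, here is a minimal sketch of how a top-10 accuracy with 128 negative samples could be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not specify the similarity measure or how negatives are drawn, so cosine similarity and random sampling from the other speech segments are assumed here, and all function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_hit(pred, target, negatives, k=10):
    """True if `target` ranks among the k candidates most similar to `pred`.

    The candidate set is the true target plus the negative samples.
    """
    target_score = cosine(pred, target)
    # Rank of the target = 1 + number of negatives scoring strictly higher.
    better = sum(1 for neg in negatives if cosine(pred, neg) > target_score)
    return better < k


def top_k_accuracy(preds, targets, n_negatives=128, k=10, seed=0):
    """Fraction of segments whose true target lands in the top k.

    ASSUMPTION: negatives for segment i are sampled at random from the
    other targets; the abstract does not describe the sampling scheme.
    """
    rng = random.Random(seed)
    hits = 0
    for i, pred in enumerate(preds):
        others = [t for j, t in enumerate(targets) if j != i]
        negatives = rng.sample(others, n_negatives)
        hits += top_k_hit(pred, targets[i], negatives, k)
    return hits / len(preds)
```

Here `preds` would hold the fused brain-derived vectors and `targets` the corresponding speech vectors (for example, wav2vec features), one pair per speech segment. With 129 candidates per segment, a decoder with no information would land in the top 10 about 10/129 of the time, roughly 7.8%, which gives a baseline for reading the 72.6% figure.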
| Original language | English |
|---|---|
| Journal | Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing |
| Publication status | Published - 2025 |
| Externally published | Yes |
| Event | 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India; Duration: 6 Apr 2025 → 11 Apr 2025 |
Keywords
- brain-computer interface
- data fusion
- fMRI
- MEG
- speech perception decoding
Fingerprint
Dive into the research topics of 'A Novel Multimodal Method for Decoding Speech Perception from Brain Activities'. Together they form a unique fingerprint.