Congruent audiovisual speech enhances auditory attention decoding with EEG

Zhen Fu, Xihong Wu, Jing Chen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Objective. The auditory attention decoding (AAD) approach can determine the identity of the attended speaker during an auditory selective attention task by analyzing electroencephalography (EEG) recordings. AAD has the potential to guide the design of speech-enhancement algorithms in hearing aids, i.e. to identify the speech stream of the listener's interest so that the hearing aid can amplify the target speech and attenuate other distracting sounds, consequently improving speech understanding and communication and reducing cognitive load. The present work investigated whether additional visual input (i.e. lipreading) enhances AAD performance for normal-hearing listeners.

Approach. In a two-talker scenario, auditory stimuli consisting of audiobooks narrated by two speakers were presented while multi-channel EEG was recorded and participants selectively attended to one speaker while ignoring the other. The speakers' mouth movements were recorded during narration to provide the visual stimuli. Stimulus conditions included audio-only, visual input congruent with either the attended or the unattended speaker, and visual input incongruent with either speaker. AAD was performed separately for each condition to evaluate the effect of the additional visual input.

Main results. Relative to the audio-only condition, AAD performance improved with visual input only when that input was congruent with the attended speech stream; the improvement in decoding accuracy was about 14 percentage points. Cortical envelope-tracking activity in both auditory and visual cortex was stronger for the congruent audiovisual speech condition than for the other conditions. In addition, AAD was more robust in the congruent audiovisual condition, achieving higher accuracy than the audio-only condition with fewer channels and shorter trial durations.

Significance. The present work complements previous studies and further demonstrates the feasibility of AAD-guided hearing-aid design for everyday face-to-face conversations. It also provides guidance for designing a low-density EEG setup for the AAD approach.
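The AAD approach summarized in the abstract is commonly implemented as stimulus reconstruction: a linear decoder is trained to map multichannel EEG back onto a speech envelope, and the speaker whose envelope correlates best with the reconstruction is labeled as attended. The following is a minimal NumPy sketch of that general scheme with synthetic data, not the paper's actual pipeline; the function names, the ridge regularizer, and the toy mixing model are all illustrative assumptions.

```python
import numpy as np

def train_decoder(eeg, envelope, reg=1e-3):
    """Ridge-regularized least-squares decoder mapping EEG (T, C) to a
    speech envelope (T,). Returns one weight per channel (C,)."""
    XtX = eeg.T @ eeg + reg * np.eye(eeg.shape[1])
    return np.linalg.solve(XtX, eeg.T @ envelope)

def decode_attention(eeg, env_a, env_b, weights):
    """Reconstruct the envelope from EEG and pick the speaker (0 or 1)
    whose envelope correlates more strongly with the reconstruction."""
    recon = eeg @ weights
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return 0 if r_a > r_b else 1

# Toy demo: EEG is simulated as a noisy linear projection of the
# attended envelope (an assumed forward model, for illustration only).
rng = np.random.default_rng(0)
T, C = 2000, 16
env_att = rng.standard_normal(T)    # attended speech envelope
env_unatt = rng.standard_normal(T)  # unattended speech envelope
mixing = rng.standard_normal(C)
eeg = np.outer(env_att, mixing) + 0.5 * rng.standard_normal((T, C))

w = train_decoder(eeg, env_att)
print(decode_attention(eeg, env_att, env_unatt, w))  # 0 = attended speaker
```

In practice the decoder is trained with cross-validation on held-out trials, typically uses a lagged (spatio-temporal) EEG representation rather than a single weight per channel, and the decision is made per trial window; shortening that window or reducing the channel count is exactly the robustness trade-off the abstract reports.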

Original language: English
Article number: 066033
Journal: Journal of Neural Engineering
Volume: 16
Issue number: 6
DOIs
Publication status: Published - 6 Nov 2019
Externally published: Yes
