Correlation between audio-visual enhancement of speech in different noise environments and SNR: A combined behavioral and electrophysiological study

B. Liu; Y. Lin; X. Gao; J. Dang

doi:10.1016/j.neuroscience.2013.05.007

B. Liu^*, Y. Lin, X. Gao, J. Dang

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 citations (Scopus)

Abstract

In the present study, we investigated the multisensory gain as the difference in speech recognition accuracy between the audio-visual (AV) and auditory-only (A) conditions, and the multisensory gain as the difference between the event-related potentials (ERPs) evoked under the AV condition and the sum of the ERPs evoked under the A and visual-only (V) conditions, in different noise environments. Videos of a female speaker articulating Chinese monosyllabic words, accompanied by different levels of pink noise, were used as the stimulus materials. The selected signal-to-noise ratios (SNRs) were -16, -12, -8, -4 and 0 dB. Under the A, V and AV conditions, the accuracy of speech recognition was measured and the ERPs evoked under each condition were analyzed. The behavioral results showed that the maximum gain, measured as the difference in speech recognition accuracy between the AV and A conditions, occurred at the -12 dB SNR. The ERP results showed that the multisensory gain, measured as the difference between the ERPs evoked under the AV condition and the sum of the ERPs evoked under the A and V conditions, was significantly higher at the -12 dB SNR than at the other SNRs in the 130-200 ms time window over the frontal to central region. The multisensory gains in audio-visual speech recognition at different SNRs were not completely consistent with the principle of inverse effectiveness, but conformed to cross-modal stochastic resonance.
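The two gain measures described in the abstract can be sketched as simple differences. The snippet below is only an illustration under stated assumptions: the function names, the plain-list waveform representation, and the windowed-mean summary are hypothetical and are not taken from the paper, which does not describe its analysis code here.

```python
from typing import List, Sequence


def behavioral_gain(acc_av: float, acc_a: float) -> float:
    """Behavioral multisensory gain: AV accuracy minus A accuracy."""
    return acc_av - acc_a


def erp_gain(
    erp_av: Sequence[float],
    erp_a: Sequence[float],
    erp_v: Sequence[float],
) -> List[float]:
    """Sample-wise ERP gain: AV - (A + V).

    Assumes the three waveforms share one time axis (hypothetical layout).
    """
    if not (len(erp_av) == len(erp_a) == len(erp_v)):
        raise ValueError("ERP waveforms must have the same number of samples")
    return [av - (a + v) for av, a, v in zip(erp_av, erp_a, erp_v)]


def mean_in_window(
    wave: Sequence[float],
    times_ms: Sequence[float],
    start_ms: float = 130.0,
    end_ms: float = 200.0,
) -> float:
    """Mean of `wave` over samples whose time lies in [start_ms, end_ms].

    The 130-200 ms default mirrors the window named in the abstract;
    summarizing the window by its mean is an assumption of this sketch.
    """
    selected = [x for x, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return sum(selected) / len(selected)
```

Computing `behavioral_gain` once per SNR level and comparing the results would show where the gain peaks; the abstract reports that peak at -12 dB.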

Original language	English
Pages (from-to)	145-151
Number of pages	7
Journal	Neuroscience
Volume	247
DOI	https://doi.org/10.1016/j.neuroscience.2013.05.007
Publication status	Published - 5 Sep 2013
Externally published	Yes

Cite this

@article{29e6fe3f154e414cab40cea1362a474e,

title = "Correlation between audio-visual enhancement of speech in different noise environments and SNR: A combined behavioral and electrophysiological study",

abstract = "In the present study, we investigated the multisensory gain as the difference of speech recognition accuracies between the audio-visual (AV) and auditory-only (A) conditions, and the multisensory gain as the difference between the event-related potentials (ERPs) evoked under the AV condition and the sum of the ERPs evoked under the A and visual-only (V) conditions in different noise environments. Videos of a female speaker articulating the Chinese monosyllable words accompanied with different levels of pink noise were used as the stimulus materials. The selected signal-to-noise ratios (SNRs) were -16, -12, -8, -4 and 0 dB. Under the A, V and AV conditions the accuracy of the speech recognition was measured and the ERPs evoked under different conditions were analyzed, respectively. The behavioral results showed that the maximum gain as the difference of speech recognition accuracies between the AV and A conditions was at the -12 dB SNR. The ERP results showed that the multisensory gain as the difference between the ERPs evoked under the AV condition and the sum of ERPs evoked under the A and V conditions at the -12 dB SNR was significantly higher than those at the other SNRs in the time window of 130-200 ms in the area from frontal to central region. The multisensory gains in audio-visual speech recognition at different SNRs were not completely accordant with the principle of inverse effectiveness, but confirmed to cross-modal stochastic resonance.",

keywords = "Audio-visual speech recognition, Cross-modal stochastic resonance, ERPs, Multisensory gain, SNR",

author = "B. Liu and Y. Lin and X. Gao and J. Dang",

year = "2013",

month = sep,

day = "5",

doi = "10.1016/j.neuroscience.2013.05.007",

language = "English",

volume = "247",

pages = "145--151",

journal = "Neuroscience",

issn = "0306-4522",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Correlation between audio-visual enhancement of speech in different noise environments and SNR

T2 - A combined behavioral and electrophysiological study

AU - Liu, B.

AU - Lin, Y.

AU - Gao, X.

AU - Dang, J.

PY - 2013/9/5

Y1 - 2013/9/5

N2 - In the present study, we investigated the multisensory gain as the difference of speech recognition accuracies between the audio-visual (AV) and auditory-only (A) conditions, and the multisensory gain as the difference between the event-related potentials (ERPs) evoked under the AV condition and the sum of the ERPs evoked under the A and visual-only (V) conditions in different noise environments. Videos of a female speaker articulating the Chinese monosyllable words accompanied with different levels of pink noise were used as the stimulus materials. The selected signal-to-noise ratios (SNRs) were -16, -12, -8, -4 and 0 dB. Under the A, V and AV conditions the accuracy of the speech recognition was measured and the ERPs evoked under different conditions were analyzed, respectively. The behavioral results showed that the maximum gain as the difference of speech recognition accuracies between the AV and A conditions was at the -12 dB SNR. The ERP results showed that the multisensory gain as the difference between the ERPs evoked under the AV condition and the sum of ERPs evoked under the A and V conditions at the -12 dB SNR was significantly higher than those at the other SNRs in the time window of 130-200 ms in the area from frontal to central region. The multisensory gains in audio-visual speech recognition at different SNRs were not completely accordant with the principle of inverse effectiveness, but confirmed to cross-modal stochastic resonance.

KW - Audio-visual speech recognition

KW - Cross-modal stochastic resonance

KW - ERPs

KW - Multisensory gain

KW - SNR

UR - http://www.scopus.com/inward/record.url?scp=84879474611&partnerID=8YFLogxK

U2 - 10.1016/j.neuroscience.2013.05.007

DO - 10.1016/j.neuroscience.2013.05.007

M3 - Article

C2 - 23673276

AN - SCOPUS:84879474611

SN - 0306-4522

VL - 247

SP - 145

EP - 151

JO - Neuroscience

JF - Neuroscience

ER -
