Towards improving statistical model based voice activity detection

  • Ming Tu*
  • Xiang Xie
  • Yishan Jiao
  • *Corresponding author for this work
  • Beijing Institute of Technology

Research output: Contribution to journal, Conference article, Peer-reviewed

Abstract

Statistical model-based voice activity detection (VAD) is widely used in speech research and applications. In this paper, we aim to improve the performance of statistical model-based VAD with a new feature extraction method. Our main contribution is to use Mel-frequency subband coefficients with a power-law nonlinearity, instead of Discrete Fourier Transform (DFT) coefficients, as the feature for statistical model-based VAD. The proposed feature is then modeled by a Gaussian distribution. The performance of this method is compared comprehensively with that of existing methods. We also test the power-law nonlinearity on existing methods. Experimental results show that the proposed subband coefficients substantially improve the performance of statistical model-based VAD, and that applying the power-law nonlinearity to DFT coefficients also brings some improvement.
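The feature pipeline named in the abstract (Mel-frequency subband energies, a power-law nonlinearity, then a Gaussian model) can be illustrated with a minimal sketch. The abstract does not give the filter count, the exponent, the frame sizes, or the decision rule, so every such value below is an assumption chosen for illustration. In particular, the scoring step fits a per-band Gaussian to leading noise-only frames and reports how unlikely each frame is under that noise model; this is a simplified stand-in, not the likelihood ratio test used in the paper.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    top_mel = hz_to_mel(sample_rate / 2.0)
    edges = mel_to_hz(np.linspace(0.0, top_mel, n_filters + 2))
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def power_law_mel_features(signal, sample_rate, frame_len=400, hop=160,
                           n_fft=512, n_filters=20, exponent=1.0 / 15.0):
    """Per-frame Mel subband energies compressed by a power law.

    All default values are illustrative assumptions, not the paper's settings.
    """
    fb = mel_filterbank(n_filters, n_fft, sample_rate)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_filters))
    for t in range(n_frames):
        frame = signal[t * hop:t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n_fft)) ** 2
        feats[t] = (fb @ power) ** exponent  # power-law nonlinearity
    return feats


def noise_model_scores(feats, n_noise_frames):
    """Mean negative log-likelihood of each frame under a per-band Gaussian
    fitted to the first n_noise_frames frames (assumed to be noise only).
    Higher scores mean the frame looks less like the noise model."""
    mu = feats[:n_noise_frames].mean(axis=0)
    var = feats[:n_noise_frames].var(axis=0) + 1e-12
    nll = 0.5 * (np.log(2.0 * np.pi * var) + (feats - mu) ** 2 / var)
    return nll.mean(axis=1)


# Synthetic check: 1 s of weak white noise, with a 440 Hz tone standing in
# for speech during the second half.
rng = np.random.default_rng(0)
sr = 16000
signal = 0.01 * rng.standard_normal(sr)
tone_t = np.arange(sr // 2) / sr
signal[sr // 2:] += 0.5 * np.sin(2.0 * np.pi * 440.0 * tone_t)

feats = power_law_mel_features(signal, sr)
scores = noise_model_scores(feats, n_noise_frames=30)
print("mean score, noise frames:", scores[:40].mean())
print("mean score, tone frames: ", scores[60:].mean())
```

Thresholding `scores` would give a crude speech/non-speech decision. A fuller statistical model-based detector would also model the speech hypothesis and compare the two likelihoods, which this sketch leaves out.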
