An improved fully convolutional network based on post-processing with global variance equalization and noise-aware training for speech enhancement

Wenlong Li; Kaoru Hirota; Yaping Dai; Zhiyang Jia

doi:10.20965/JACIII.2021.P0130

Wenlong Li, Kaoru Hirota, Yaping Dai, Zhiyang Jia^*

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

An improved fully convolutional network based on post-processing with global variance (GV) equalization and noise-aware training (PN-FCN) is proposed for speech enhancement. It aims to reduce the complexity of the speech enhancement system and to address two problems: over-smoothed spectrograms of the enhanced speech and poor generalization capability. The PN-FCN is fed noisy speech samples augmented with an estimate of the noise, so that it can use additional online noise information to better predict the clean speech. In addition, the PN-FCN uses global variance information, which improves the subjective score in a voice conversion task. Finally, because the proposed framework adopts an FCN, its number of parameters is one-seventh that of a deep neural network (DNN). Experimental results on the Valentini-Botinhao dataset demonstrate that the proposed framework improves both the denoising effect and the model training speed.
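The two ideas named in the abstract, augmenting the input with a noise estimate and equalizing the global variance of the output, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract gives no formulas, so the noise estimate (mean of the first `n_init` frames, assumed speech-free), the dimension-independent scaling factor `beta`, and the function names `noise_aware_input` and `gv_equalize` are all assumptions made for this example.

```python
import numpy as np


def noise_aware_input(noisy, n_init=6):
    """Append a per-utterance noise estimate to every input frame.

    noisy: array of shape (frames, bins), e.g. a log-power spectrogram.
    Assumption: the first n_init frames contain noise only, so their
    mean serves as a rough noise estimate for the whole utterance.
    """
    noise_est = noisy[:n_init].mean(axis=0)           # shape (bins,)
    tiled = np.broadcast_to(noise_est, noisy.shape)   # shape (frames, bins)
    return np.concatenate([noisy, tiled], axis=1)     # shape (frames, 2 * bins)


def gv_equalize(enhanced, gv_ref):
    """Rescale enhanced features so their global variance equals gv_ref.

    Assumption: one scalar factor is applied to all dimensions, with the
    features scaled about their global mean. gv_ref would typically be
    the global variance of clean reference features.
    """
    mean = enhanced.mean()
    beta = np.sqrt(gv_ref / enhanced.var())           # equalization factor
    return mean + beta * (enhanced - mean)
```

In this sketch the network would be trained on the output of `noise_aware_input`, and `gv_equalize` would be applied to its predictions to counteract over-smoothing, which shows up as a global variance lower than that of clean speech.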

Original language	English
Pages (from-to)	130-137
Number of pages	8
Journal	Journal of Advanced Computational Intelligence and Intelligent Informatics
Volume	25
Issue number	1
DOIs	https://doi.org/10.20965/JACIII.2021.P0130
Publication status	Published - 20 Jan 2021

Keywords

Fully convolutional network
Noise-aware training
Post-processing with global variance equalization
Speech enhancement

Cite this

Li, W., Hirota, K., Dai, Y., & Jia, Z. (2021). An improved fully convolutional network based on post-processing with global variance equalization and noise-aware training for speech enhancement. Journal of Advanced Computational Intelligence and Intelligent Informatics, 25(1), 130-137. https://doi.org/10.20965/JACIII.2021.P0130

@article{70d219c0d8744839a70c52bad4a05f8f,

title = "An improved fully convolutional network based on post-processing with global variance equalization and noise-aware training for speech enhancement",

abstract = "An improved fully convolutional network based on post-processing with global variance (GV) equalization and noise-aware training (PN-FCN) is proposed for speech enhancement. It aims to reduce the complexity of the speech enhancement system and to address two problems: over-smoothed spectrograms of the enhanced speech and poor generalization capability. The PN-FCN is fed noisy speech samples augmented with an estimate of the noise, so that it can use additional online noise information to better predict the clean speech. In addition, the PN-FCN uses global variance information, which improves the subjective score in a voice conversion task. Finally, because the proposed framework adopts an FCN, its number of parameters is one-seventh that of a deep neural network (DNN). Experimental results on the Valentini-Botinhao dataset demonstrate that the proposed framework improves both the denoising effect and the model training speed.",

keywords = "Fully convolutional network, Noise-aware training, Post-processing with global variance equalization, Speech enhancement",

author = "Wenlong Li and Kaoru Hirota and Yaping Dai and Zhiyang Jia",

year = "2021",

month = jan,

day = "20",

doi = "10.20965/JACIII.2021.P0130",

language = "English",

volume = "25",

pages = "130--137",

journal = "Journal of Advanced Computational Intelligence and Intelligent Informatics",

issn = "1343-0130",

publisher = "Fuji Technology Press",

number = "1",

}

An improved fully convolutional network based on post-processing with global variance equalization and noise-aware training for speech enhancement. / Li, Wenlong; Hirota, Kaoru; Dai, Yaping et al.
In: Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 25, No. 1, 20.01.2021, p. 130-137.

TY - JOUR

T1 - An improved fully convolutional network based on post-processing with global variance equalization and noise-aware training for speech enhancement

AU - Li, Wenlong

AU - Hirota, Kaoru

AU - Dai, Yaping

AU - Jia, Zhiyang

PY - 2021/1/20

Y1 - 2021/1/20

N2 - An improved fully convolutional network based on post-processing with global variance (GV) equalization and noise-aware training (PN-FCN) is proposed for speech enhancement. It aims to reduce the complexity of the speech enhancement system and to address two problems: over-smoothed spectrograms of the enhanced speech and poor generalization capability. The PN-FCN is fed noisy speech samples augmented with an estimate of the noise, so that it can use additional online noise information to better predict the clean speech. In addition, the PN-FCN uses global variance information, which improves the subjective score in a voice conversion task. Finally, because the proposed framework adopts an FCN, its number of parameters is one-seventh that of a deep neural network (DNN). Experimental results on the Valentini-Botinhao dataset demonstrate that the proposed framework improves both the denoising effect and the model training speed.

KW - Fully convolutional network

KW - Noise-aware training

KW - Post-processing with global variance equalization

KW - Speech enhancement

UR - http://www.scopus.com/inward/record.url?scp=85100507691&partnerID=8YFLogxK

U2 - 10.20965/JACIII.2021.P0130

DO - 10.20965/JACIII.2021.P0130

M3 - Article

AN - SCOPUS:85100507691

SN - 1343-0130

VL - 25

SP - 130

EP - 137

JO - Journal of Advanced Computational Intelligence and Intelligent Informatics

JF - Journal of Advanced Computational Intelligence and Intelligent Informatics

IS - 1

ER -
