An improvement of the degradation of speaker recognition in continuous cold speech for home assistant

Haojun Ai, Yifeng Wang, Yuhong Yang*, Quanxin Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Home assistants with speech user interfaces have become popular in recent years because of their convenience. With speaker recognition (SR) technology in this application, personalized services (e.g., playing music, making to-do lists) for different family members become a reality. However, SR accuracy may decline sharply when a family member has a cold, owing to restrictions on hardware and response time. In this paper, we propose a dual model updating strategy based on cold detection to maintain all speaker voice models. In this method, time-domain and frequency-domain features are combined to detect continuous cold speech, and the corresponding models are then selected to determine the speaker's identity according to the detection result. To continuously track SR performance based on mobile phone usage data, a new mobile phone-based speech dataset (PBSD), which contains voice, phone model, and the user’s state of physical wellness, has been constructed. The relationship between SR accuracy and users’ state of physical wellness is also analyzed based on a GMM-UBM framework. Finally, to evaluate the proposed method, experiments on the SR accuracy of 10 speakers in both cold-suffering and healthy states were conducted. The results demonstrate that the cold detection-based model updating strategy effectively improves SR accuracy, especially under cold-suffering conditions.
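The model-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of the cold detector or the model parameters, so the detector is reduced here to a boolean input, each GMM-UBM speaker model is replaced by a single diagonal Gaussian, and all names (log_likelihood, identify, the "healthy"/"cold" keys, the toy speakers and numbers) are hypothetical.

```python
import math


def log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian (mean, var).

    Stand-in for a GMM score; the paper itself uses a GMM-UBM framework.
    """
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def identify(frames, models, is_cold):
    """Return the speaker whose state-matched model scores highest.

    models: {speaker: {"healthy": (mean, var), "cold": (mean, var)}}
    is_cold: result of a cold detector (assumed to exist; not modelled here).
    """
    state = "cold" if is_cold else "healthy"
    return max(models, key=lambda spk: log_likelihood(frames, models[spk][state]))


# Toy two-dimensional "features"; real systems would use e.g. MFCC vectors.
models = {
    "alice": {"healthy": ([0.0, 0.0], [1.0, 1.0]), "cold": ([1.0, 1.0], [1.0, 1.0])},
    "bob": {"healthy": ([3.0, 3.0], [1.0, 1.0]), "cold": ([-2.0, -2.0], [1.0, 1.0])},
}

# Frames resembling bob's cold-state model.
frames = [[-1.9, -2.1], [-2.2, -1.8]]

with_detection = identify(frames, models, is_cold=True)
without_detection = identify(frames, models, is_cold=False)
```

In this toy setup the frames lie close to bob's cold-state model, so scoring against the cold models identifies bob, while scoring the same frames against the healthy models picks alice. This mirrors, in a very simplified way, the accuracy drop on cold speech that the abstract reports.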

Original language: English
Title of host publication: Cyberspace Safety and Security - 11th International Symposium, CSS 2019, Proceedings
Editors: Jaideep Vaidya, Xiao Zhang, Jin Li
Publisher: Springer
Pages: 363-373
Number of pages: 11
ISBN (Print): 9783030373368
Publication status: Published - 2019
Event: 11th International Symposium on Cyberspace Safety and Security, CSS 2019 - Guangzhou, China
Duration: 1 Dec 2019 → 3 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11982 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Symposium on Cyberspace Safety and Security, CSS 2019
Country/Territory: China
City: Guangzhou
Period: 1/12/19 → 3/12/19

Keywords

  • Cold
  • Database
  • GMM
  • MFCC
  • Speaker recognition
