An Improved LSTM for Language Identification

Qingran Zhan, Liqiang Zhang, Hui Deng, Xiang Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

In this paper, we propose a novel framework that combines the phonetic temporal neural model (PTN) with an improved LSTM (IM-LSTM). The improvement is an up-down connection from time step t to t+1 in the LSTM structure, which aims to capture latent information from the previous time step. The updated structure better discriminates the frame-level phonetic information produced by the PTN. On the AP16-OLR language identification dataset, our final model achieves relative improvements over the standard PTN of 5.04%, 2.19%, and 2.73% in EER and 6.55%, 5.81%, and 2.23% in Cavg under the 1s, 3s, and full-length utterance conditions, respectively. The proposed framework outperforms the standard PTN and other proposed models, particularly in the 1s condition. This shows the efficacy and flexibility of the proposed method.
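
The abstract reports relative figures without stating how they are computed. A minimal sketch, assuming the usual convention for error metrics such as EER and Cavg (lower is better, and the change is measured against the baseline value), is given below. The function name and the input numbers are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the percentage by which `improved` is lower than `baseline`.

    Assumed convention: (baseline - improved) / baseline * 100.
    The source abstract does not confirm this exact formula.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical error rates for illustration only, NOT results from the paper.
    baseline_eer = 10.0
    improved_eer = 9.5
    print(f"relative reduction: {relative_reduction(baseline_eer, improved_eer):.2f}%")
```

Under this convention, a relative figure says nothing about the absolute error rate, so the same percentage can correspond to very different absolute gains across the 1s, 3s, and full-length conditions.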

Original language: English
Title of host publication: 2018 14th IEEE International Conference on Signal Processing Proceedings, ICSP 2018
Editors: Yuan Baozong, Ruan Qiuqi, Zhao Yao, An Gaoyun
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 609-612
Number of pages: 4
ISBN (Electronic): 9781538646724
Publication status: Published - 2 Feb 2019
Event: 14th IEEE International Conference on Signal Processing, ICSP 2018 - Beijing, China
Duration: 12 Aug 2018 to 16 Aug 2018

Publication series

Name: International Conference on Signal Processing Proceedings, ICSP
Volume: 2018-August

Conference

Conference: 14th IEEE International Conference on Signal Processing, ICSP 2018
Country/Territory: China
City: Beijing
Period: 12/08/18 to 16/08/18

Keywords

  • LSTM
  • Language identification
  • Temporal information
