Named entity recognition based on bilingual co-training

Yegang Li, Heyan Huang*, Xingjian Zhao, Shumin Shi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Named entity recognition (NER) is an important task in natural language processing (NLP). In this paper we present a semi-supervised approach to extracting bilingual named entities. The approach starts from a bilingual corpus in which the named entities are extracted independently for each language. A bilingual co-training algorithm is then used to improve the named entity annotation quality, and an iterative process is applied to extract named entity pairs with a higher bilingual conformity ratio. This leads to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the F-measure of the annotations improves from 87.17 to 88.28 for Chinese NEs and from 80.37 to 81.76 for English NEs.
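The selection step described in the abstract can be illustrated with a small sketch. The abstract does not define the bilingual conformity ratio, so the definition used here (the share of entities on both sides that take part in a type-consistent aligned pair) is an assumption for illustration only, as are the names Entity, conformity_ratio, and select_confident, the one-to-one alignment links, and the 0.8 threshold. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # entity type, e.g. "PER", "LOC", "ORG"


def conformity_ratio(zh, en, links):
    """Hypothetical conformity measure (an assumption, not from the paper).

    zh, en: entities tagged independently in each language.
    links:  one-to-one (zh_index, en_index) alignments between them.
    Returns the fraction of all entities that sit in an aligned pair
    whose two sides carry the same entity type.
    """
    total = len(zh) + len(en)
    if total == 0:
        return 0.0
    # Each type-consistent link accounts for two entities, one per language.
    matched = sum(2 for i, j in links if zh[i].label == en[j].label)
    return matched / total


def select_confident(pairs, threshold=0.8):
    """Keep sentence pairs whose annotations agree across languages.

    In a co-training loop, the kept pairs would be added to the training
    data of both monolingual taggers before the next iteration.
    """
    return [p for p in pairs if conformity_ratio(*p) >= threshold]


# Toy data: two aligned sentence pairs, each with its own tagger output.
agree = (
    [Entity("北京", "LOC"), Entity("李明", "PER")],
    [Entity("Beijing", "LOC"), Entity("Li Ming", "PER")],
    [(0, 0), (1, 1)],
)
disagree = (
    [Entity("北京", "LOC"), Entity("李明", "PER")],
    [Entity("Beijing", "LOC"), Entity("Li Ming", "ORG")],
    [(0, 0), (1, 1)],
)

kept = select_confident([agree, disagree])
```

In this toy run only the first sentence pair passes the threshold, since the second pair has a type disagreement on one of its two aligned entities. A full loop would retrain both taggers on the kept pairs and repeat until no new pairs are selected.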

Original language: English
Title of host publication: Chinese Lexical Semantics - 14th Workshop, CLSW 2013, Revised Selected Papers
Pages: 480-489
Number of pages: 10
Publication status: Published - 2013
Event: 14th Workshop on Chinese Lexical Semantics, CLSW 2013 - Zhengzhou, China
Duration: 10 May 2013 to 12 May 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8229 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th Workshop on Chinese Lexical Semantics, CLSW 2013
Country/Territory: China
City: Zhengzhou
Period: 10/05/13 to 12/05/13

Keywords

  • bilingual co-training
  • named entity recognition
  • natural language processing
