TY - GEN
T1 - Named entity recognition based on bilingual co-training
AU - Li, Yegang
AU - Huang, Heyan
AU - Zhao, Xingjian
AU - Shi, Shumin
PY - 2013
Y1 - 2013
N2 - Named entity recognition (NER) is an important task in natural language processing (NLP). In this paper we present a semi-supervised approach to extracting bilingual named entities, starting from a bilingual corpus in which the named entities are extracted independently for each language. A bilingual co-training algorithm is then used to improve the named entity annotation quality, and an iterative process is applied to extract named entity pairs with a higher bilingual conformity ratio. This leads to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the F-measure of Chinese NE annotation improves from 87.17 to 88.28, and that of English NE annotation from 80.37 to 81.76.
AB - Named entity recognition (NER) is an important task in natural language processing (NLP). In this paper we present a semi-supervised approach to extracting bilingual named entities, starting from a bilingual corpus in which the named entities are extracted independently for each language. A bilingual co-training algorithm is then used to improve the named entity annotation quality, and an iterative process is applied to extract named entity pairs with a higher bilingual conformity ratio. This leads to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the F-measure of Chinese NE annotation improves from 87.17 to 88.28, and that of English NE annotation from 80.37 to 81.76.
KW - bilingual co-training
KW - named entity recognition
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=84893393509&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-45185-0_50
DO - 10.1007/978-3-642-45185-0_50
M3 - Conference contribution
AN - SCOPUS:84893393509
SN - 9783642451843
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 480
EP - 489
BT - Chinese Lexical Semantics - 14th Workshop, CLSW 2013, Revised Selected Papers
T2 - 14th Workshop on Chinese Lexical Semantics, CLSW 2013
Y2 - 10 May 2013 through 12 May 2013
ER -