Title recognition of maximal-length noun phrase based on bilingual co-training

Ye Gang Li, He Yan Huang*, Shu Min Shi, Ping Jian, Chao Su

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This article addresses the weak cross-domain performance of bilingual maximal-length noun phrase recognition. It proposes a bilingual noun phrase recognition algorithm based on semi-supervised learning. The approach makes full use of both English and Chinese features in a unified framework, treating the corpora of the two languages as different views of one dataset. Instances with the highest confidence scores are selected, merged, and added to the labeled data set to train the classifier. Experimental results on test sets show the effectiveness of the proposed approach, which outperforms the baseline by 4.52% in the cross-domain setting and by 3.08% in the similar-domain setting.
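The two-view selection loop described in the abstract can be illustrated with a minimal, generic co-training sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy `CentroidClassifier`, the numeric feature tuples, the labels "NP" and "O", the margin-based confidence score, and the merge rule (keep the label of the more confident view). The abstract does not specify the authors' classifier, features, label projection step, or merging strategy.

```python
class CentroidClassifier:
    """Hypothetical toy classifier: nearest class centroid over numeric vectors."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_with_confidence(self, x):
        # Euclidean distance from x to every class centroid.
        dists = {
            c: sum((a - b) ** 2 for a, b in zip(x, m)) ** 0.5
            for c, m in self.centroids.items()
        }
        ranked = sorted(dists.items(), key=lambda kv: kv[1])
        best_label, best = ranked[0]
        second = ranked[1][1] if len(ranked) > 1 else best
        total = best + second
        # Assumed confidence: 0.5 when the two nearest classes tie, 1.0 on an exact match.
        conf = 0.5 if total == 0 else second / total
        return best_label, conf


def train_views(labeled):
    """Train one classifier per view (English features, Chinese features)."""
    labels = [y for _, _, y in labeled]
    en_clf = CentroidClassifier().fit([en for en, _, _ in labeled], labels)
    zh_clf = CentroidClassifier().fit([zh for _, zh, _ in labeled], labels)
    return en_clf, zh_clf


def co_train(labeled, unlabeled, rounds=5, per_round=2):
    """labeled: (en_features, zh_features, label) triples; unlabeled: (en, zh) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        en_clf, zh_clf = train_views(labeled)
        scored = []
        for idx, (en, zh) in enumerate(pool):
            en_label, en_conf = en_clf.predict_with_confidence(en)
            zh_label, zh_conf = zh_clf.predict_with_confidence(zh)
            # Assumed merge rule: trust whichever view is more confident.
            if en_conf >= zh_conf:
                scored.append((en_conf, idx, en_label))
            else:
                scored.append((zh_conf, idx, zh_label))
        # Move the most confident instances from the pool into the labeled set.
        scored.sort(key=lambda t: t[0], reverse=True)
        chosen = scored[:per_round]
        chosen_idx = {idx for _, idx, _ in chosen}
        labeled.extend((pool[idx][0], pool[idx][1], label) for _, idx, label in chosen)
        pool = [pair for i, pair in enumerate(pool) if i not in chosen_idx]
    en_clf, zh_clf = train_views(labeled)
    return en_clf, zh_clf, labeled


if __name__ == "__main__":
    seed = [((0.0,), (0.0,), "O"), ((1.0,), (1.0,), "NP")]
    pool = [((0.1,), (0.2,)), ((0.9,), (0.8,)), ((0.5,), (0.95,))]
    _, _, grown = co_train(seed, pool)
    for en, zh, label in grown:
        print(en, zh, label)
```

In this toy run, the third pool item is ambiguous in the English view (0.5 lies midway between the two seed centroids) but clear in the Chinese view, so the second view supplies its label. That is the kind of complementary evidence a two-view setup is meant to exploit.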

Original language: English
Pages (from-to): 1615-1625
Number of pages: 11
Journal: Ruan Jian Xue Bao/Journal of Software
Volume: 26
Issue number: 7
Publication status: Published - 1 Jul 2015

Keywords

  • Bilingual co-training
  • Label projection
  • Maximal-length noun phrase
  • Phrase identification
  • Semi-supervised learning
