Title recognition of maximal-length noun phrase based on bilingual co-training

Ye Gang Li; He Yan Huang; Shu Min Shi; Ping Jian; Chao Su

doi:10.13328/j.cnki.jos.004630

Ye Gang Li, He Yan Huang*, Shu Min Shi, Ping Jian, Chao Su

*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This article addresses the weak cross-domain performance of bilingual maximal-length noun phrase recognition. A bilingual noun phrase recognition algorithm based on semi-supervised learning is proposed. The approach makes full use of both English and Chinese features in a unified framework, treating the corpora of the two languages as different views of one dataset. Instances with the highest confidence scores are selected and merged, and then added to the labeled data set to train the classifier. Experimental results on the test sets show the effectiveness of the proposed approach, which outperforms the baseline by 4.52% in the cross-domain setting and by 3.08% in the similar-domain setting.
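The select-and-merge loop described in the abstract can be illustrated with a generic two-view co-training sketch. This is not the authors' implementation: the one-dimensional "English" and "Chinese" feature scores, the nearest-centroid classifier, the margin-based confidence, the rule of taking the label from the more confident view, and the names `train_centroids`, `predict`, and `co_train` are all assumptions made for illustration. The label projection step listed in the keywords is not modeled.

```python
from statistics import mean


def train_centroids(xs, ys):
    """Fit a toy 1-D nearest-centroid classifier: one mean value per label."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label)
            for label in set(ys)}


def predict(centroids, x):
    """Return (label, confidence) for x.

    Confidence is the distance margin between the nearest and the
    second-nearest centroid, so at least two labels are required.
    """
    ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(x - centroids[runner_up]) - abs(x - centroids[best])
    return best, margin


def co_train(labeled, unlabeled, rounds=3, per_round=2):
    """Generic co-training over two views of the same instances.

    labeled:   list of ((en_feature, zh_feature), label)
    unlabeled: list of (en_feature, zh_feature)
    Returns the labeled set grown with the most confident predictions.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        labels = [y for _, y in labeled]
        en_model = train_centroids([f[0] for f, _ in labeled], labels)
        zh_model = train_centroids([f[1] for f, _ in labeled], labels)
        scored = []
        for feats in pool:
            en_label, en_conf = predict(en_model, feats[0])
            zh_label, zh_conf = predict(zh_model, feats[1])
            # Assumed merge rule: trust whichever view is more confident.
            if en_conf >= zh_conf:
                scored.append((en_conf, feats, en_label))
            else:
                scored.append((zh_conf, feats, zh_label))
        # Move the highest-confidence instances into the labeled set.
        scored.sort(key=lambda item: item[0], reverse=True)
        for _, feats, label in scored[:per_round]:
            labeled.append((feats, label))
            pool.remove(feats)
    return labeled


# Hypothetical data: "NP" marks a noun-phrase instance, "O" anything else.
seed = [((0.9, 0.8), "NP"), ((0.1, 0.2), "O")]
pool = [(0.95, 0.55), (0.45, 0.05), (0.15, 0.1), (0.8, 0.9)]
grown = co_train(seed, pool, rounds=2, per_round=2)
```

In this toy run, each round retrains both single-view classifiers on the grown labeled set, which is the property that lets one language's confident predictions supply training data for the other. A real system would replace the centroid classifier with sequence labelers over actual English and Chinese features.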

Original language	English
Pages (from-to)	1615-1625
Number of pages	11
Journal	Ruan Jian Xue Bao/Journal of Software
Volume	26
Issue number	7
DOIs	https://doi.org/10.13328/j.cnki.jos.004630
Publication status	Published - 1 Jul 2015

Keywords

Bilingual co-training
Label projection
Maximal-length noun phrase
Phrase identification
Semi-supervised learning

Access to Document

10.13328/j.cnki.jos.004630

Cite this

Li, Y. G., Huang, H. Y., Shi, S. M., Jian, P., & Su, C. (2015). Title recognition of maximal-length noun phrase based on bilingual co-training. Ruan Jian Xue Bao/Journal of Software, 26(7), 1615-1625. https://doi.org/10.13328/j.cnki.jos.004630

@article{286a8936fc214be0a21369d3809aa9bc,
  title = "Title recognition of maximal-length noun phrase based on bilingual co-training",
  abstract = "This article addresses the weak cross-domain performance of bilingual maximal-length noun phrase recognition. A bilingual noun phrase recognition algorithm based on semi-supervised learning is proposed. The approach makes full use of both English and Chinese features in a unified framework, treating the corpora of the two languages as different views of one dataset. Instances with the highest confidence scores are selected and merged, and then added to the labeled data set to train the classifier. Experimental results on the test sets show the effectiveness of the proposed approach, which outperforms the baseline by 4.52% in the cross-domain setting and by 3.08% in the similar-domain setting.",
  keywords = "Bilingual co-training, Label projection, Maximal-length noun phrase, Phrase identification, Semi-supervised learning",
  author = "Li, {Ye Gang} and Huang, {He Yan} and Shi, {Shu Min} and Jian, Ping and Su, Chao",
  year = "2015",
  month = jul,
  day = "1",
  doi = "10.13328/j.cnki.jos.004630",
  language = "English",
  volume = "26",
  pages = "1615--1625",
  journal = "Ruan Jian Xue Bao/Journal of Software",
  issn = "1000-9825",
  publisher = "Chinese Academy of Sciences",
  number = "7",
}

TY - JOUR
T1 - Title recognition of maximal-length noun phrase based on bilingual co-training
AU - Li, Ye Gang
AU - Huang, He Yan
AU - Shi, Shu Min
AU - Jian, Ping
AU - Su, Chao
PY - 2015/7/1
Y1 - 2015/7/1
AB - This article addresses the weak cross-domain performance of bilingual maximal-length noun phrase recognition. A bilingual noun phrase recognition algorithm based on semi-supervised learning is proposed. The approach makes full use of both English and Chinese features in a unified framework, treating the corpora of the two languages as different views of one dataset. Instances with the highest confidence scores are selected and merged, and then added to the labeled data set to train the classifier. Experimental results on the test sets show the effectiveness of the proposed approach, which outperforms the baseline by 4.52% in the cross-domain setting and by 3.08% in the similar-domain setting.
KW - Bilingual co-training
KW - Label projection
KW - Maximal-length noun phrase
KW - Phrase identification
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=84938393054&partnerID=8YFLogxK
U2 - 10.13328/j.cnki.jos.004630
DO - 10.13328/j.cnki.jos.004630
M3 - Article
AN - SCOPUS:84938393054
SN - 1000-9825
VL - 26
SP - 1615
EP - 1625
JO - Ruan Jian Xue Bao/Journal of Software
JF - Ruan Jian Xue Bao/Journal of Software
IS - 7
ER -
