Automatic identifying of maximal length noun phrase

Yegang Li, Heyan Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Automatic recognition of the maximal-length noun phrase (MNP) aids shallow parsing. In this paper, automatic labeling of Chinese MNPs is treated as a sequential labeling task, and a Support Vector Machine (SVM) model is employed. We propose a 2-phase hybrid approach that first identifies base chunks and then identifies MNPs. Furthermore, the base chunk features can be exploited to improve the performance of MNP recognition. In addition, both left-to-right and right-to-left sequential labeling are applied, and their outputs are combined by bidirectional sequence labeling merging. The data set used in the experiments is selected from the Penn Chinese Treebank 5.0 corpus and split into training, development, and test sets in the proportion 4:4:1. Experimental results show an F1-measure of 90.13%.
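To make the sequential-labeling framing and the F1-measure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple B/I/O tag scheme (B begins an MNP, I continues it, O is outside) and exact-match scoring of phrase spans; the abstract states neither, and the function names `bio_to_spans` and `span_f1` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect the phrase spans encoded by a B/I/O tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:  # a new phrase closes the previous one
                spans.add((start, i))
            start = i
        elif tag == "I":
            if start is None:  # tolerate an I with no preceding B
                start = i
        else:  # "O" closes any open phrase
            if start is not None:
                spans.add((start, i))
            start = None
    if start is not None:  # phrase running to the end of the sentence
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[str], pred: List[str]) -> float:
    """F1 over exactly matching spans (assumed scoring convention)."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


gold = ["B", "I", "I", "O", "B", "I"]
pred = ["B", "I", "O", "O", "B", "I"]
score = span_f1(gold, pred)
```

Here the predicted sequence recovers one of the two gold phrases exactly and truncates the other, so precision and recall are both 1/2. Under this convention a phrase with a wrong boundary earns no partial credit, which is the usual reason span-level scores sit below token-level accuracy.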

Original language: English
Title of host publication: Proceedings - 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems, IEEE CCIS 2012
Pages: 1445-1448
Number of pages: 4
Publication status: Published - 13 Nov 2013
Event: 2012 2nd IEEE International Conference on Cloud Computing and Intelligence Systems, IEEE CCIS 2012 - Hangzhou, China
Duration: 30 Oct 2012 → 1 Nov 2012

Publication series

Name: Proceedings - 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems, IEEE CCIS 2012
Volume: 3

Conference

Conference: 2012 2nd IEEE International Conference on Cloud Computing and Intelligence Systems, IEEE CCIS 2012
Country/Territory: China
City: Hangzhou
Period: 30/10/12 → 1/11/12

Keywords

  • 2-phase
  • MNP
  • base chunk feature
  • bidirectional sequence labeling merging
