Time expression recognition using a constituent-based tagging scheme

Xiaoshi Zhong, Erik Cambria

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

22 Citations (Scopus)

Abstract

We find from four datasets that time expressions are formed by loose structure and that the words used to express time information can differentiate time expressions from common text. These findings drive us to design a learning method named TOMN to model time expressions. TOMN defines a time-related tagging scheme, the TOMN scheme, with four tags, namely T, O, M, and N, indicating the constituents of a time expression: Time token, Modifier, Numeral, and the words Outside the time expression. In modeling, TOMN assigns each word a TOMN tag under conditional random fields with minimal features. Essentially, our constituent-based TOMN scheme overcomes the problem of inconsistent tag assignment caused by conventional position-based tagging schemes (e.g., the BIO and BILOU schemes). Experiments show that TOMN is equally or more effective than state-of-the-art methods on various datasets, and much more robust in cross-dataset settings. Moreover, our analysis can explain many empirical observations in other works about time expression recognition and named entity recognition.
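The contrast between constituent-based and position-based tags can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract says TOMN assigns tags with conditional random fields, whereas this sketch swaps in a tiny hand-written lexicon (`TIME_TOKENS`, `MODIFIERS`) and a hypothetical rule that groups consecutive non-O words containing at least one T into a time expression. All function names and example sentences are assumptions made for illustration.

```python
# Toy illustration only: hand-written lexicons stand in for the CRF
# that the abstract describes. All names here are hypothetical.
TIME_TOKENS = {"september", "monday", "week", "year"}
MODIFIERS = {"last", "next", "early"}


def tomn_tags(words):
    """Tag each word by what kind of constituent it is (T, M, N, or O)."""
    tags = []
    for word in words:
        w = word.lower()
        if w in TIME_TOKENS:
            tags.append("T")   # Time token
        elif w in MODIFIERS:
            tags.append("M")   # Modifier
        elif w.isdigit():
            tags.append("N")   # Numeral
        else:
            tags.append("O")   # Outside a time expression
    return tags


def bio_tags(tomn):
    """Derive position-based BIO tags from the same segmentation."""
    tags, prev = [], "O"
    for t in tomn:
        if t == "O":
            tags.append("O")
        elif prev == "O":
            tags.append("B")   # first word of an expression
        else:
            tags.append("I")   # any later word of an expression
        prev = t
    return tags


def extract(words, tags):
    """Assumed rule: a run of non-O words with at least one T is a time expression."""
    spans, run = [], []
    # A trailing O sentinel flushes a run that ends the sentence.
    for word, tag in list(zip(words, tags)) + [("", "O")]:
        if tag != "O":
            run.append((word, tag))
            continue
        if any(t == "T" for _, t in run):
            spans.append(" ".join(w for w, _ in run))
        run = []
    return spans


s1 = "He left in September 2006".split()
s2 = "He left last September".split()
for sentence in (s1, s2):
    tomn = tomn_tags(sentence)
    print(sentence, tomn, bio_tags(tomn), extract(sentence, tomn))
```

In this sketch "September" receives the tag T in both sentences, while the position-based scheme labels it B in the first sentence and I in the second. That is the kind of inconsistent tag assignment the abstract attributes to position-based schemes.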

Original language: English
Title of host publication: The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
Publisher: Association for Computing Machinery, Inc
Pages: 983-992
Number of pages: 10
ISBN (electronic): 9781450356398
Publication status: Published - 10 Apr 2018
Externally published: Yes
Event: 27th International World Wide Web, WWW 2018 - Lyon, France
Duration: 23 Apr 2018 to 27 Apr 2018

Publication series

Name: The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018

Conference

Conference: 27th International World Wide Web, WWW 2018
Country/Territory: France
City: Lyon
Period: 23/04/18 to 27/04/18
