Time expression analysis and recognition using syntactic token types and general heuristic rules

Xiaoshi Zhong, Aixin Sun, Erik Cambria

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

58 citations (Scopus)

Abstract

Extracting time expressions from free text is a fundamental task for many applications. We analyze time expressions from four different datasets and find that only a small group of words are used to express time information and that the words in time expressions demonstrate similar syntactic behaviour. Based on the findings, we propose a type-based approach named SynTime for time expression recognition. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group time-related token regular expressions. On the types we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies time tokens from raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments to time expressions. As a lightweight rule-based tagger, SynTime runs in real time, and can be easily expanded by simply adding keywords for the text from different domains and different text types. Experiments on benchmark datasets and tweets data show that SynTime outperforms state-of-the-art methods.
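
The three-step recognition procedure described in the abstract (identify time tokens, expand over neighbouring modifiers and numerals, merge segments) can be illustrated with a small Python sketch. This is not the authors' implementation: the keyword patterns, the function names `token_type` and `recognize`, and the simple "adjacent or overlapping" merge rule are all assumptions made for illustration, and the actual tagger uses far larger token lists and more detailed heuristics.

```python
import re

# Hypothetical, heavily reduced keyword patterns for the three token types.
TIME_TOKEN = re.compile(
    r"(?:mon|tues|wednes|thurs|fri|satur|sun)day|today|tomorrow|yesterday"
    r"|(?:day|week|month|year)s?",
    re.I,
)
MODIFIER = re.compile(r"last|next|this|past|early|late|ago", re.I)
NUMERAL = re.compile(r"\d+|one|two|three|four|five|six|seven|eight|nine|ten", re.I)


def token_type(token):
    """Return 'T' (time token), 'M' (modifier), 'N' (numeral), or None."""
    if TIME_TOKEN.fullmatch(token):
        return "T"
    if MODIFIER.fullmatch(token):
        return "M"
    if NUMERAL.fullmatch(token):
        return "N"
    return None


def recognize(text):
    """Return the time expressions found in text, in order of appearance."""
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: locate each time token, then grow a segment to the left
    # and right while the neighbouring tokens are modifiers or numerals.
    segments = []
    for i, kind in enumerate(types):
        if kind != "T":
            continue
        start = end = i
        while start > 0 and types[start - 1] in ("M", "N"):
            start -= 1
        while end + 1 < len(types) and types[end + 1] in ("M", "N"):
            end += 1
        segments.append((start, end))

    # Step 3: merge segments that overlap or sit directly next to each other.
    merged = []
    for start, end in segments:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    return [" ".join(tokens[s : e + 1]) for s, e in merged]
```

Under these assumptions, `recognize("We met last Monday and will meet again in two weeks.")` yields `["last Monday", "two weeks"]`. Extending coverage to a new domain would amount to adding keywords to the three patterns, which mirrors the expandability claim in the abstract.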

Original language: English
Title of host publication: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Publisher: Association for Computational Linguistics (ACL)
Pages: 420-429
Number of pages: 10
ISBN (electronic): 9781945626753
Publication status: Published - 2017
Externally published
Event: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017 - Vancouver, Canada
Duration: 30 Jul 2017 to 4 Aug 2017

Publication series

Name: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Volume: 1

Conference

Conference: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Country/Territory: Canada
City: Vancouver
Period: 30/07/17 to 4/08/17
