中文文本蕴含气象灾害事件信息多模型融合抽取方法

Translated title of the contribution: Multi-model Fusion Extraction Method for Chinese Text Implicative Meteorological Disasters Event Information

Duanmu Hu, Wu Yuan, Fangqu Niu, Wen Yuan*, Aiai Han

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

With global warming, extreme weather events and major meteorological disasters are becoming more frequent worldwide. Studying the relationship between climate change and the frequency of meteorological disasters is important for disaster prevention and mitigation. Because a huge amount of spatial and temporal information on meteorological disasters is available in literature and web data, this paper proposes a method for automatically extracting spatiotemporal events of meteorological disasters based on natural language processing technology. Specifically: (1) A coarse-to-fine method was proposed to build a training corpus of meteorological disaster annotations based on professional literature. First, a unified meteorological disaster knowledge system oriented to textual events was constructed to address the ambiguity and incompatibility of different literature materials. Then a coarse annotation method based on chapter structure was constructed, and fine-grained annotated-corpus screening methods were developed, one based on the Labeled LDA model for long texts (modern texts) and one based on TF-IDF and N-gram models for short texts (literary texts), solving the problem of rapid corpus construction. (2) A method for automatic classification of spatiotemporal events of meteorological disasters based on the BERT-CNN model, which integrates contextual semantic features and local semantic features at multiple granularities, was developed for the integrated processing of short and long texts. (3) Using this method, spatiotemporal events of meteorological disasters were automatically extracted from the literature and web data, with macro F1 values of 89.09% and 80.06%, respectively. The spatiotemporal distributions of major meteorological disaster events were highly correlated with professional statistics. (4) Based on these results, the spatiotemporal evolution of disasters in various historical periods in China was reconstructed. The overall volume of disaster data increased gradually from period to period, with heavy rainfall disasters, floods, and droughts being the main types of disasters in China. The method enables both the automatic extraction of long-text events from the web and the automatic detection of short-text events from the literature, providing a new technique for applying text data to meteorological disaster research and monitoring.
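
The abstract reports classification quality as macro F1. As a reading aid, the following is a minimal sketch of the standard macro F1 definition (the unweighted mean of per-class F1 scores). It is not the authors' evaluation code, and the function name `macro_f1` and the disaster labels in the example are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only (not data from the paper).
y_true = ["flood", "flood", "drought", "rainstorm", "drought", "rainstorm"]
y_pred = ["flood", "drought", "drought", "rainstorm", "drought", "flood"]
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because every class contributes equally regardless of how many samples it has, macro F1 penalizes poor performance on rare disaster types more visibly than accuracy or micro-averaged F1 would.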

Original language: Chinese (Traditional)
Pages (from-to): 2342-2355
Number of pages: 14
Journal: Journal of Geo-Information Science
Volume: 24
Issue number: 12
Publication status: Published - 25 Dec 2022
