Automated Construction and Mining of Text-Based Modern Chinese Character Databases: A Case Study of Fujian

Xueyan Jian, Wen Yuan*, Wu Yuan, Xinqi Gao, Rong Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Historical figures are crucial for understanding historical processes and social changes. However, existing databases of historical figures primarily focus on ancient Chinese individuals and organize textual information simplistically, without structured processing. Therefore, this study proposes an automatic method for constructing a spatio-temporal database of modern Chinese figures. The character state transition matrix reveals the spatio-temporal evolution of historical figures, while the random walk algorithm identifies their primary migration patterns. Using historical figures from Fujian Province (1840–2009) as a case study, the results demonstrate that this method effectively constructs the spatio-temporal chain of figures, encompassing time, space, and events. The character state transition matrix indicates that the frequency of state changes fluctuated from 1840 to 2009, first increasing and then decreasing. By applying keyword extraction and the random walk method, this study finds that the state transitions and their causes align with historical trends. The four-dimensional analytical framework of “character-time-space-event” established in this study holds significant value for the field of digital humanities.
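The abstract names two techniques, a state transition matrix and a random walk, without giving details. The following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes that a figure's "state" is simply the place where the figure is located, that each spatio-temporal chain is an ordered list of places, and that a walker restarts when it reaches a place with no recorded departures. The function names and the sample chains are hypothetical.

```python
import random
from collections import Counter


def transition_matrix(chains):
    """Count place-to-place moves and row-normalize them into probabilities."""
    counts = {}
    for chain in chains:
        for src, dst in zip(chain, chain[1:]):
            counts.setdefault(src, Counter())[dst] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


def random_walk_visits(matrix, start, steps, seed=0):
    """Walk the matrix from `start`, counting how often each move is taken."""
    rng = random.Random(seed)
    edges = Counter()
    here = start
    for _ in range(steps):
        row = matrix.get(here)
        if not row:
            # Assumption: no recorded departures, so restart from `start`.
            here = start
            continue
        nxt = rng.choices(list(row), weights=list(row.values()))[0]
        edges[(here, nxt)] += 1
        here = nxt
    return edges


# Hypothetical chains: ordered places for three made-up figures.
chains = [
    ["Fuzhou", "Beijing", "Shanghai"],
    ["Fuzhou", "Beijing", "Fuzhou"],
    ["Xiamen", "Fuzhou", "Beijing"],
]
matrix = transition_matrix(chains)
edges = random_walk_visits(matrix, "Fuzhou", steps=1000)
```

Under these assumptions, `edges.most_common()` ranks the moves the walker takes most often, which is one simple way to surface dominant migration routes from a transition matrix.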

Original language: English
Article number: 324
Journal: Information (Switzerland)
Volume: 16
Issue number: 4
Publication status: Published - Apr 2025
Externally published: Yes

Keywords

  • character information
  • data mining
  • digital humanities
  • spatio-temporal data
