The prediction of hepatitis E through ensemble learning

Tu Peng, Xiaoya Chen, Ming Wan, Lizhu Jin, Xiaofeng Wang, Xuejie Du, Hui Ge*, Xu Yang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

According to the World Health Organization, about 20 million people are infected with hepatitis E virus (HEV) every year, and in 2015 HEV infection caused 44,000 deaths worldwide. Food, water and climate are key factors affecting hepatitis E outbreaks. This paper presents an ensemble learning model for hepatitis E prediction, built by studying the correlation between historical epidemic cases of hepatitis E and environmental factors (water quality and meteorological data). The environmental factors comprise many features; those most relevant to HEV are selected and fed into an ensemble learning model composed of Gradient Boosting Decision Tree (GBDT) and Random Forest for training and prediction. Three indicators, root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), are used to evaluate the ensemble learning model against the classical time series prediction model. The results show that the ensemble learning model predicts better than the classical model, and that prediction can be further improved by exploiting water quality and meteorological factors (radiation, air pressure, precipitation).
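
For readers unfamiliar with the three evaluation indicators named in the abstract, the following is a minimal sketch of their standard definitions in plain Python. The function names and the sample values are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' GBDT and Random Forest ensemble.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero (division by zero).
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical case counts for illustration only (not data from the paper).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

All three are error measures, so lower values indicate a better fit. RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the size of the observed value.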

Original language: English
Article number: 159
Pages (from-to): 1-18
Number of pages: 18
Journal: International Journal of Environmental Research and Public Health
Volume: 18
Issue number: 1
Publication status: Published - 1 Jan 2020

Keywords

  • Ensemble learning
  • Hepatitis E
  • Prediction
