Optimizing multi-barrier drinking water treatment through a data-driven process simulator based on machine learning

  • Hongjiao Pang
  • Yawen Ben
  • Yanju Zhang
  • Shen Qu*
  • Chengzhi Hu*

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The growing variability and complexity of drinking water sources highlight the urgent need for data-driven tools to support sustainable and resource-efficient design of treatment systems. This study develops a machine-learning-based process simulator tailored for multi-barrier drinking water treatment. The simulator integrates three global predictive models (LightGBM, XGBoost, and CatBoost) to forecast effluent quality from influent characteristics across multiple treatment configurations, including ozone-biological activated carbon (ozone-BAC) processes positioned either before or after sand filtration. Trained and evaluated on operational data from four full-scale water treatment plants, CatBoost demonstrated the highest overall predictive accuracy for CODMn, turbidity, pH, temperature, and residual chlorine, with consistently strong agreement between predicted and observed values across all parameters. The simulator also achieved robust performance in bacterial classification, with a weighted average precision of 0.84. SHAP (Shapley Additive Explanations) analysis revealed distinct treatment efficiencies: placing ozone-BAC after sand filtration enhanced CODMn removal, while placing it before sand filtration favoured turbidity and bacteria removal. Real-world validation at two full-scale water treatment plants confirmed the simulator's ability to guide treatment optimization under poor influent conditions (e.g., CODMn > 7 mg/L), maintaining effluent CODMn below the regulatory threshold (<3 mg/L). This research demonstrates the potential of integrating machine learning and process simulation to promote cleaner production, enhance treatment efficiency, and support sustainable drinking water management decisions.
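
The abstract reports a weighted average precision of 0.84 for bacterial classification but does not say how the average was formed. As a minimal sketch, assuming the common support-weighted definition (each class's precision weighted by its share of the true samples), the metric can be computed as below. The function name, the class labels, and the sample data are hypothetical illustrations and are not taken from the study.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Support-weighted average of per-class precision (assumed definition)."""
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    score = 0.0
    for label, n_true in support.items():
        # Precision is taken as 0 for a class that is never predicted.
        if predicted[label]:
            precision = correct[label] / predicted[label]
        else:
            precision = 0.0
        score += precision * n_true / total
    return score


# Hypothetical labels for illustration only; not data from the study.
y_true = ["pass", "pass", "pass", "fail"]
y_pred = ["pass", "pass", "fail", "fail"]
result = weighted_precision(y_true, y_pred)
```

Weighting by class support keeps a rare class from dominating the score, which matters when compliant samples far outnumber non-compliant ones; whether the study's data are imbalanced in this way is not stated in the abstract.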

Original language: English
Article number: 146987
Journal: Journal of Cleaner Production
Volume: 533
Publication status: Published - 20 Nov 2025
Externally published: Yes

Keywords

  • Drinking water treatment
  • Global process simulator
  • Machine learning
  • Multi-barrier water treatment
  • Ozone-biological activated carbon (ozone-BAC)
  • Predictive modelling
