TY - JOUR
T1 - Optimizing multi-barrier drinking water treatment through a data-driven process simulator based on machine learning
AU - Pang, Hongjiao
AU - Ben, Yawen
AU - Zhang, Yanju
AU - Qu, Shen
AU - Hu, Chengzhi
N1 - Publisher Copyright:
© 2025 Elsevier Ltd.
PY - 2025/11/20
Y1 - 2025/11/20
N2 - The growing variability and complexity of drinking water sources highlight the urgent need for data-driven tools to support sustainable and resource-efficient design of treatment systems. This study develops a machine-learning-based process simulator tailored for multi-barrier drinking water treatment. The simulator integrates global predictive models (LightGBM, XGBoost, and CatBoost) to forecast effluent quality from influent characteristics across multiple treatment configurations, including ozone-biological activated carbon (ozone-BAC) processes positioned either before or after sand filtration. Among these models, CatBoost demonstrated the highest overall predictive accuracy for CODMn, turbidity, pH, temperature, and residual chlorine using operational data from four full-scale water treatment plants, showing consistently strong agreement between predicted and observed values across all parameters. The simulator also achieved robust performance in bacterial classification, with a weighted average precision of 0.84. SHAP (Shapley Additive Explanations) analysis revealed distinct treatment efficiencies: placing ozone-BAC after sand filtration enhanced CODMn removal, while placing it before sand filtration favoured turbidity and bacteria removal. Real-world validation at two full-scale water treatment plants confirmed the simulator's ability to guide treatment optimization under poor influent conditions (e.g., CODMn > 7 mg/L), maintaining effluent quality below regulatory thresholds (<3 mg/L). This research demonstrates the potential of integrating machine learning and process simulation to promote cleaner production, enhance treatment efficiency, and support sustainable drinking water management decisions.
AB - The growing variability and complexity of drinking water sources highlight the urgent need for data-driven tools to support sustainable and resource-efficient design of treatment systems. This study develops a machine-learning-based process simulator tailored for multi-barrier drinking water treatment. The simulator integrates global predictive models (LightGBM, XGBoost, and CatBoost) to forecast effluent quality from influent characteristics across multiple treatment configurations, including ozone-biological activated carbon (ozone-BAC) processes positioned either before or after sand filtration. Among these models, CatBoost demonstrated the highest overall predictive accuracy for CODMn, turbidity, pH, temperature, and residual chlorine using operational data from four full-scale water treatment plants, showing consistently strong agreement between predicted and observed values across all parameters. The simulator also achieved robust performance in bacterial classification, with a weighted average precision of 0.84. SHAP (Shapley Additive Explanations) analysis revealed distinct treatment efficiencies: placing ozone-BAC after sand filtration enhanced CODMn removal, while placing it before sand filtration favoured turbidity and bacteria removal. Real-world validation at two full-scale water treatment plants confirmed the simulator's ability to guide treatment optimization under poor influent conditions (e.g., CODMn > 7 mg/L), maintaining effluent quality below regulatory thresholds (<3 mg/L). This research demonstrates the potential of integrating machine learning and process simulation to promote cleaner production, enhance treatment efficiency, and support sustainable drinking water management decisions.
KW - Drinking water treatment
KW - Global process simulator
KW - Machine learning
KW - Multi-barrier water treatment
KW - Ozone-biological activated carbon (ozone-BAC)
KW - Predictive modelling
UR - https://www.scopus.com/pages/publications/105021082268
U2 - 10.1016/j.jclepro.2025.146987
DO - 10.1016/j.jclepro.2025.146987
M3 - Article
AN - SCOPUS:105021082268
SN - 0959-6526
VL - 533
JO - Journal of Cleaner Production
JF - Journal of Cleaner Production
M1 - 146987
ER -