The browser you are using is not supported by this website. All versions of Internet Explorer are no longer supported, either by us or Microsoft (read more here: https://www.microsoft.com/en-us/microsoft-365/windows/end-of-ie-support).

Please use a modern browser to fully experience our website, such as the newest versions of Edge, Chrome, Firefox or Safari etc.

Mapping Groundwater Contamination with Machine Learning in Iran

Cover of the journal "Environmental Science and Pollution Research"

CMES researcher Hossein Hashemi has co-authored a new article published in the journal Environmental Science and Pollution Research.

The article, "A machine learning framework for spatio‑temporal vulnerability mapping of groundwaters to nitrate in a data scarce region in Lenjanat Plain, Iran", was authored by Reza Jalali (Shahid Chamran University of Ahvaz), Parvaneh Tishehzan (Shahid Chamran University of Ahvaz) and Hossein Hashemi (CMES). 

Abstract

The temporal aspect of groundwater vulnerability to contaminants such as nitrate is often overlooked, assuming vulnerability has a static nature. This study bridges this gap by employing machine learning with Detecting Breakpoints and Estimating Segments in Trend (DBEST) algorithm to reveal the underlying relationship between nitrate, water table, vegetation cover, and precipitation time series, that are related to agricultural activities and groundwater demand in a semi-arid region. The contamination probability of Lenjanat Plain has been mapped by comparing random forest (RF), support vector machine (SVM), and K-nearest-neighbors (KNN) models, fed with 32 input variables (dem-derived factors, physiography, distance and density maps, time series data). Also, imbalanced learning and feature selection techniques were investigated as supplementary methods, adding up to four scenarios. Results showed that the RF model, integrated with forward sequential feature selection (SFS) and SMOTE-Tomek resampling method, outperformed the other models (F1-score: 0.94, MCC: 0.83). The SFS techniques outperformed other feature selection methods in enhancing the accuracy of the models with the cost of computational expenses, and the cost-sensitive function proved more efficient in tackling imbalanced data issues than the other investigated methods. The DBEST method identified significant breakpoints within each time series dataset, revealing a clear association between agricultural practices along the Zayandehrood River and substantial nitrate contamination within the Lenjanat region. Additionally, the groundwater vulnerability maps created using the candid RF model and an ensemble of the best RF, SVM, and KNN models predicted mid to high levels of vulnerability in the central parts and the downhills in the southwest.

Keywords: sustainable groundwater management; random forest; DBEST; resampling; imbalaced learning; breakpoint analysis

Read the article