Earth Science Frontiers, 2026, Vol. 33, Issue (1): 63-79. DOI: 10.13745/j.esf.sf.2025.10.33

Distribution prediction of natural low-quality groundwater in the plains of Henan Province based on machine learning

YU Furong1, LI Rui1, LI Zhiping1,2,*, WU Lin1, LIU Zhongpei1

  1. North China University of Water Resources and Electric Power, Zhengzhou 450046, China
  2. Henan Vocational College of Water Conservancy and Environment, Zhengzhou 450008, China
  • Received: 2025-05-20 Revised: 2025-09-29 Online: 2026-01-25 Published: 2025-11-10

Abstract:

Groundwater is a crucial drinking water source for over two billion people worldwide, and its quality is intrinsically linked to human health and ecosystem integrity. Geogenic groundwater contamination (GGC), characterized by excessive levels of arsenic (As), fluoride (F), and iodine (I), originates from natural geological processes. The distribution of GGC, influenced by geological structures, hydrogeochemistry, and anthropogenic activities, exhibits regional patterns with local complexities. Research into its formation mechanisms and control strategies is therefore critical for ensuring water security. Taking Henan Province as a case study, this research used Gibbs diagrams and related methods to analyze groundwater hydrochemical characteristics and their controlling factors, thereby identifying the origins of GGC. The correlation between GGC and the distribution of endemic diseases was then investigated, machine learning models were introduced to predict the spatial distribution of GGC, and health risk control zones were proposed. The results indicate that: (1) GGC in the study area is concentrated in the Eastern Henan Plain and the regions along the Yellow River, with contamination levels in phreatic water significantly higher than those in confined water; (2) weakly alkaline and reducing environments are the key hydrogeochemical conditions for GGC formation, where rock weathering and dissolution combined with intense evaporation govern the enrichment of the characteristic ions; (3) GGC distribution is spatially correlated with endemic disease areas; (4) arsenic, fluoride, and iodine in groundwater all exhibit significant spatial aggregation. Notably, the high-high (HH) clusters, which indicate areas where high arsenic, fluoride, and iodine co-occur, agree strongly with the high-risk zones predicted by the machine learning models. Based on these findings, scientifically delineating protection zones in key regions such as Puyang, Xinxiang, Zhoukou, Kaifeng, and Shangqiu cities is of practical importance for ensuring local residents’ drinking water safety.
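
The abstract names Gibbs diagrams as the tool for identifying controlling factors. As a minimal sketch of how the two standard Gibbs ratios can be computed for one water sample, assuming concentrations in mg/L, the following may help; the function name `gibbs_ratios` and the sample values are hypothetical and are not taken from the paper.

```python
def gibbs_ratios(na, ca, cl, hco3):
    """Return the two Gibbs diagram ratios for one sample.

    Inputs are concentrations in mg/L (an assumption of this sketch):
      cation ratio = Na+ / (Na+ + Ca2+)
      anion ratio  = Cl- / (Cl- + HCO3-)
    Each ratio is conventionally plotted against TDS on a log axis.
    """
    if na + ca <= 0 or cl + hco3 <= 0:
        raise ValueError("ion sums must be positive")
    cation_ratio = na / (na + ca)
    anion_ratio = cl / (cl + hco3)
    return cation_ratio, anion_ratio


# Hypothetical sample, for illustration only (not data from the study).
cation, anion = gibbs_ratios(na=120.0, ca=40.0, cl=60.0, hco3=240.0)
print(f"Na/(Na+Ca) = {cation:.2f}, Cl/(Cl+HCO3) = {anion:.2f}")
```

On a Gibbs diagram, samples with moderate TDS and low ratios are usually read as rock-weathering dominated, while high TDS with high ratios points to evaporation dominance, which is how result (2) above would typically be supported.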

Key words: arsenic, fluorine and iodine, groundwater chemistry, health risk, endemic disease, Henan Province
