Earth Science Frontiers ›› 2025, Vol. 32 ›› Issue (4): 95-107. DOI: 10.13745/j.esf.sf.2025.4.55

Metallogenic prediction of lead-zinc ore based on sample expansion in Yadu-Mangdong of Northwestern Guizhou

XU Kai1,2,3,4, XU Chengyang1, WU Chonglong1,2,3,4, CAI Jingyun1, KONG Chunfang1,2,3,4,*

  1. School of Computer, China University of Geosciences (Wuhan), Wuhan 430074, China
  2. Engineering Technology Innovation Center of Mineral Resources Explorations in Bedrock Zones, Ministry of Natural Resources, Guiyang 550081, China
  3. Guizhou Key Laboratory for Strategic Mineral Intelligent Exploration, Guiyang 550081, China
  4. Hubei Key Laboratory of Intelligent Geo-Information Processing, Wuhan 430074, China
  • Received: 2025-01-16 Revised: 2025-04-23 Online: 2025-07-25 Published: 2025-08-04

Abstract:

Northwestern Guizhou is rich in lead-zinc mineral resources, but the deep burial of the ore bodies makes prospecting difficult. Data-driven mineral prospectivity prediction using machine learning (ML) is becoming a powerful tool for exploring deep, concealed lead-zinc deposits. However, ML-based prospectivity prediction faces several common issues, particularly insufficient training samples and class imbalance caused by the scarcity of mineralized samples. To address these problems, this paper proposes a K-means clustering-improved conditional tabular generative adversarial network (KC-CTGAN) method for mineralized sample augmentation. Specifically, after K-means clustering, the density of each cluster is first assessed from the Euclidean distances between its samples, and more samples are generated in the sparse clusters to increase their density, thereby expanding the mineralized sample set. Then, the generative adversarial network (GAN) generates new, highly abstract category labels and uses them for conditional generation, thus improving the quality of the augmented samples. Finally, the augmented positive samples and randomly undersampled negative samples are used to construct a sufficiently large and balanced labeled dataset for training a Category Boosting (CatBoost) classifier, and a mineral prospectivity prediction model based on KC-CTGAN-CatBoost is established. The performance of the proposed model was verified through comparative tests using metrics such as accuracy, recall, precision, and F1-score. Experimental results demonstrate that, compared with the prediction model constructed without KC-CTGAN-based sample augmentation, the proposed model achieves improvements of 8.7%, 7.4%, 10.2%, and 8.8% in accuracy, recall, precision, and F1-score, respectively, demonstrating the effectiveness of the KC-CTGAN augmentation method in enhancing the performance of the mineral prospectivity prediction model. The prediction results will provide more precise target areas for the exploration of deep-seated concealed lead-zinc ore bodies.
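
The density-guided step described above can be illustrated with a minimal sketch. The abstract does not specify the exact density measure or how the augmentation budget is divided, so the following are assumptions for illustration only: sparsity is taken as the mean pairwise Euclidean distance within a cluster, and the number of new samples per cluster is proportional to that sparsity. The names `mean_pairwise_distance`, `allocate_quotas`, and the toy clusters are hypothetical. Cluster assignments are assumed to come from a prior K-means run, and the CTGAN generation itself is not shown.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of points in one cluster."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0  # a cluster with fewer than two points has no pairwise distance
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def allocate_quotas(clusters, total_new):
    """Split total_new synthetic samples across clusters.

    Assumption: sparser clusters (larger mean pairwise distance) receive
    proportionally more new samples. This is an illustrative rule, not
    necessarily the one used in the paper.
    """
    sparsity = {cid: mean_pairwise_distance(pts) for cid, pts in clusters.items()}
    total = sum(sparsity.values())
    if total == 0:
        # Degenerate case: fall back to an equal split.
        raw = {cid: total_new / len(clusters) for cid in clusters}
    else:
        raw = {cid: total_new * s / total for cid, s in sparsity.items()}

    # Round down, then hand out the remainder by largest fractional part
    # so the quotas sum exactly to total_new.
    quotas = {cid: int(r) for cid, r in raw.items()}
    leftover = total_new - sum(quotas.values())
    by_fraction = sorted(raw, key=lambda c: raw[c] - quotas[c], reverse=True)
    for cid in by_fraction[:leftover]:
        quotas[cid] += 1
    return quotas


# Toy mineralized samples in a 2-D feature space, already grouped by cluster.
clusters = {
    "dense": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "sparse": [(0.0, 0.0), (0.0, 3.0), (4.0, 0.0)],
}
quotas = allocate_quotas(clusters, total_new=10)
print(quotas)
```

In this toy case the sparse cluster receives most of the budget. In a full pipeline, each quota would be passed as the number of samples to request from the conditional generator for that cluster.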

Key words: sample augmentation, conditional tabular generative adversarial network, lead-zinc ore, mineralization prediction
