地学前缘 (Earth Science Frontiers), 2025, Vol. 32, Issue (4): 95-107. DOI: 10.13745/j.esf.sf.2025.4.55

• Intelligent Mineral Prediction •

基于样本扩充的黔西北垭都-蟒硐矿区铅锌矿成矿预测研究

徐凯1,2,3,4, 徐城阳1, 吴冲龙1,2,3,4, 蔡婧云1, 孔春芳1,2,3,4,*

    1.中国地质大学(武汉) 计算机学院, 湖北 武汉 430074
    2.自然资源部基岩区矿产资源勘查工程技术创新中心, 贵州 贵阳 550081
    3.贵州省战略矿产智慧勘查全省重点实验室, 贵州 贵阳 550081
    4.智能地学信息处理湖北省重点实验室, 湖北 武汉 430074
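  • Received: 2025-01-16 Revised: 2025-04-23 Online: 2025-07-25 Published: 2025-08-04
  • Corresponding author: *KONG Chunfang (born 1973), female, Ph.D., associate professor, mainly engaged in applications of remote sensing and geographic information systems. E-mail: kongcf@cug.edu.cn
  • About the author: XU Kai (born 1972), male, Ph.D., associate professor, mainly engaged in teaching and research on data mining and knowledge discovery, big-data-based intelligent prospecting, quantitative remote sensing, and geoscience information engineering. E-mail: xukai@cug.edu.cn
  • Funding:
    Major Collaborative Innovation Project of the Guizhou Prospecting Breakthrough Strategic Action (黔科合战略找矿[2022]ZD004); Natural Science Foundation of Hubei Province (2021CFB506); Open Fund of the Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP-2023-B08); Guizhou Provincial Basic Research Program (Natural Science) (QKHJC-ZK[2023]G 194); Guizhou Provincial Major Scientific and Technological Achievement Transformation Project (黔科合[2022]重点003); Project of the Bureau of Geology and Mineral Exploration and Development of Guizhou Province (黔地矿科合[2021]3); Project of the Bureau of Geology and Mineral Exploration and Development of Guizhou Province (黔地矿科合[2023]1); General Survey Project of Advantageous Mineral Resources in the Bijie Pilot Zone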

Metallogenic prediction of lead-zinc ore based on sample expansion in Yadu-Mangdong of Northwestern Guizhou

XU Kai1,2,3,4, XU Chengyang1, WU Chonglong1,2,3,4, CAI Jingyun1, KONG Chunfang1,2,3,4,*

    1. School of Computer, China University of Geosciences (Wuhan), Wuhan 430074, China
    2. Engineering Technology Innovation Center of Mineral Resources Explorations in Bedrock Zones, Ministry of Natural Resources, Guiyang 550081, China
    3. Guizhou Key Laboratory for Strategic Mineral Intelligent Exploration, Guiyang 550081, China
    4. Hubei Key Laboratory of Intelligent Geo-Information Processing, Wuhan 430074, China
  • Received:2025-01-16 Revised:2025-04-23 Online:2025-07-25 Published:2025-08-04

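Abstract (Chinese version):

Northwestern Guizhou is rich in lead-zinc resources, but prospecting is difficult because the ore bodies are deeply buried. Data-driven metallogenic prediction using machine learning is becoming a powerful tool for prospecting and exploring deep concealed lead-zinc deposits. However, machine-learning-based prospecting prediction faces some common problems, in particular insufficient and imbalanced training samples caused by the small number of mineralized samples. To address them, this paper proposes a method for expanding ore-bearing samples in which K-means clustering improves the Conditional Tabular Generative Adversarial Network (CTGAN). Specifically, the density of each cluster is first judged from the Euclidean distances between its samples after K-means clustering, and more samples are added to sparse clusters to raise their density, which expands the ore-bearing sample set. Then, the adversarial network generates new, highly abstract category labels, which are used for conditional generation to improve the quality of the expanded samples. Finally, the expanded positive samples and randomly undersampled negative samples form a sufficiently large and balanced labeled sample set, which is used to train and validate a Category Boosting (CatBoost) classifier and to build the KC-CTGAN-CatBoost metallogenic prediction model. Experimental results show that, compared with the metallogenic prediction model built on a dataset without KC-CTGAN expansion, accuracy, recall, precision, and F1-score improve by 8.7%, 7.4%, 10.2%, and 8.8%, respectively, which demonstrates the effectiveness of the KC-CTGAN sample expansion method and the improved performance of the prediction model. The prediction results will provide more precise target areas for prospecting and exploring deep concealed lead-zinc ore bodies.

Keywords (Chinese version): sample expansion, conditional tabular generative adversarial network, lead-zinc ore, metallogenic prediction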
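
The density-driven expansion step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact allocation rule, so the version below makes several assumptions: cluster labels are already available (for example from K-means), sparsity is measured as the mean pairwise Euclidean distance within a cluster, and the number of new samples per cluster is proportional to that sparsity. The function names `mean_pairwise_distance` and `allocate_synthetic_samples` are hypothetical, and the CTGAN generation itself is not shown.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all distinct pairs of points (0.0 if fewer than two)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def allocate_synthetic_samples(samples, cluster_labels, total_new):
    """Split total_new synthetic samples across clusters in proportion to sparsity.

    Assumption (not from the paper): sparser clusters, i.e. those with a larger
    mean pairwise distance, receive proportionally more new samples.
    """
    # Group the samples by their (precomputed) cluster label.
    clusters = {}
    for point, label in zip(samples, cluster_labels):
        clusters.setdefault(label, []).append(point)

    sparsity = {label: mean_pairwise_distance(pts) for label, pts in clusters.items()}
    total = sum(sparsity.values())
    if total == 0:
        # All clusters are single points or identical points: split evenly.
        weights = {label: 1 / len(clusters) for label in clusters}
    else:
        weights = {label: s / total for label, s in sparsity.items()}

    # Round down first, then give the leftover samples to the sparsest clusters.
    counts = {label: int(weights[label] * total_new) for label in clusters}
    leftover = total_new - sum(counts.values())
    by_sparsity = sorted(clusters, key=lambda lab: sparsity[lab], reverse=True)
    for label in by_sparsity[:leftover]:
        counts[label] += 1
    return counts


# Toy data: cluster 0 is tight, cluster 1 has the same shape but is four times wider.
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 14), (14, 10), (14, 14)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
plan = allocate_synthetic_samples(samples, labels, total_new=10)
```

In this toy case the wider cluster receives most of the ten new samples. In a full pipeline, each per-cluster count would then be requested from the conditional generator using the cluster label as the condition.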

Abstract:

Northwestern Guizhou is rich in lead-zinc resources, but the ore bodies are deeply buried, which makes prospecting difficult. Data-driven mineral prospectivity prediction using machine learning (ML) is becoming a powerful tool for exploring deep concealed lead-zinc deposits. However, ML-based prospectivity prediction faces several common problems, in particular insufficient training samples and class imbalance caused by the scarcity of mineralized samples. To address these problems, this paper proposes a mineralized-sample augmentation method in which K-means clustering improves the conditional tabular generative adversarial network (CTGAN), referred to as KC-CTGAN. Specifically, after K-means clustering, the density of each cluster is first judged from the Euclidean distances between its samples, and more samples are generated in sparse clusters to increase their density, thereby expanding the mineralized sample set. Then, the adversarial network generates new, highly abstract category labels and uses them for conditional generation, which improves the quality of the augmented samples. Finally, the augmented positive samples and randomly undersampled negative samples are combined into a sufficiently large and balanced labeled dataset, which is used to train and validate a Category Boosting (CatBoost) classifier and to establish a KC-CTGAN-CatBoost mineral prospectivity prediction model. The performance of the proposed model was verified through comparative tests using accuracy, recall, precision, and F1-score. Experimental results show that, compared with the prediction model built on a dataset without KC-CTGAN sample augmentation, the proposed model improves accuracy, recall, precision, and F1-score by 8.7%, 7.4%, 10.2%, and 8.8%, respectively, demonstrating the effectiveness of the KC-CTGAN augmentation method and the improved performance of the prediction model. The prediction results will provide more precise target areas for the exploration of deep concealed lead-zinc ore bodies.

Key words: sample augmentation, conditional tabular generative adversarial network, lead-zinc ore, mineralization prediction
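
For readers less familiar with the four evaluation metrics named in the abstract, the sketch below computes them from binary labels using their standard definitions. The label convention (1 = mineralized, 0 = non-mineralized), the function name `classification_metrics`, and the toy predictions are assumptions for illustration only and do not come from the paper's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Recall measures how many mineralized samples are found, while precision measures how many predicted targets are actually mineralized; F1-score is the harmonic mean of the two.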

CLC Number: