Earth Science Frontiers, 2025, Vol. 32, Issue (4): 108-121. DOI: 10.13745/j.esf.sf.2025.4.73

• Intelligent Mineral Prediction •

Quantitative prediction method of gold deposits in Gannan area under unbalanced sample conditions

XIE Miao1, LIU Bingli2,3,*, LI Yunhe2,3, WANG Zhengyao2,3, CAO Changjie2,3, WU Yixiao4

  1. Institute of Geophysical and Geochemical Exploration, Chinese Academy of Geological Sciences, Langfang 065000, China
  2. Geomathematics Key Laboratory of Sichuan Province, Chengdu University of Technology, Chengdu 610059, China
  3. College of Mathematics and Sciences, Chengdu University of Technology, Chengdu 610000, China
  4. School of Earth Sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China
  • Received: 2025-01-15 Revised: 2025-04-29 Online: 2025-07-25 Published: 2025-08-04
  • Corresponding author: *LIU Bingli (born 1981), male, associate professor, long engaged in research on mathematical geology. E-mail: liubingli-82@163.com
  • About the author: XIE Miao (born 1999), female, doctoral student in geochemistry. E-mail: xiemiao0825@163.com
  • Funding:
    National Key Research and Development Program of China (2023YFC2906403); National Key Research and Development Program of China (2022YFC2905002); Natural Science Foundation of Sichuan Province (2024NSFSC0009); commissioned project of the China Geological Survey (DD20243233); commissioned industry project of Zijin Mining Group (4502-FW-2024-00055)

Abstract:

Deep learning models are widely used in mineral prospectivity mapping (MPM) because of their strong ability to extract features from data. However, supervised deep learning methods often suffer from insufficient training samples and an imbalance between positive and negative samples; in particular, the rarity of mineralization events tends to weaken model robustness and generalization ability. To address this problem, this study applies three data augmentation methods: (1) sliding-window augmentation, which takes known positive and negative samples as centers and slides the window multiple times to generate augmented samples; (2) generative adversarial network (GAN) augmentation, in which the network is trained on real samples and the trained generator produces new samples; and (3) Wasserstein generative adversarial network with gradient penalty (WGAN-GP) augmentation, in which the WGAN-GP is likewise trained on real samples and its trained generator is used for augmentation. All three methods expand the sample size while preserving the geological significance of the samples as far as possible. To validate the augmentation, the study uses the Fréchet inception distance (FID) between real and generated samples together with an evaluation based on a convolutional neural network (CNN). The results show that the CNN model trained on the WGAN-GP-augmented dataset has stronger generalization ability, and the resulting gold prospectivity map of the Gannan area provides important insights for future mineral resource exploration.

Key words: sample imbalance, data augmentation, convolutional neural network, quantitative prediction
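
The sliding-window augmentation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the window size, the number of shifts, or the data layout, so the function name `sliding_window_patches`, the parameters `size` and `max_shift`, and the toy 7 x 7 grid are all hypothetical choices made here for illustration; this is not the authors' implementation.

```python
from typing import List, Tuple

Grid = List[List[float]]
Patch = List[List[float]]


def sliding_window_patches(
    grid: Grid, center: Tuple[int, int], size: int, max_shift: int
) -> List[Tuple[Tuple[int, int], Patch]]:
    """Cut size x size windows around a known sample cell.

    The window is shifted by every (dr, dc) offset in [-max_shift, max_shift].
    Hypothetical helper: names and defaults are assumptions, not from the paper.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the window has a central cell")
    half = size // 2
    if max_shift > half:
        # A larger shift would push the known sample cell out of the window.
        raise ValueError("max_shift must not exceed size // 2")

    n_rows, n_cols = len(grid), len(grid[0])
    r0, c0 = center
    patches = []
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            top, left = r0 + dr - half, c0 + dc - half
            # Skip windows that would extend beyond the grid boundary.
            if top < 0 or left < 0 or top + size > n_rows or left + size > n_cols:
                continue
            patch = [row[left:left + size] for row in grid[top:top + size]]
            patches.append(((dr, dc), patch))
    return patches


# Toy evidence layer: the cell value encodes its position (row * 10 + col).
grid = [[float(r * 10 + c) for c in range(7)] for r in range(7)]

# One known sample at row 3, column 3 yields nine shifted 3 x 3 patches.
patches = sliding_window_patches(grid, center=(3, 3), size=3, max_shift=1)
print(len(patches))  # 9
```

Each patch inherits the label of the sample it was cut around, and because the shift is capped at half the window size, every patch still contains the known sample cell. Samples near the grid edge produce fewer patches, since out-of-bounds windows are skipped rather than padded.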

CLC number:
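
For reference, the two quantities named in the abstract, the WGAN-GP critic loss and the FID, are commonly written as below. The notation is an assumption made here and is not taken from the paper: $P_r$ and $P_g$ denote the real and generated sample distributions, $D$ the critic, $\lambda$ the penalty weight, and $(\mu, \Sigma)$ the mean and covariance of the feature vectors of each sample set. The abstract does not state the value of $\lambda$ or the feature extractor used.

```latex
% WGAN-GP critic loss: Wasserstein term plus gradient penalty
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]

% Frechet inception distance between real (r) and generated (g) feature statistics
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
    + \operatorname{Tr}\!\Big(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\Big)
```

A lower FID indicates that the generated samples are statistically closer to the real ones in feature space, which is how the metric is typically used to compare augmentation methods.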