Quantitative prediction method of gold deposits in Gannan area under unbalanced sample conditions

doi:10.13745/j.esf.sf.2025.4.73

Abstract

Deep learning models are widely used in mineral prospectivity mapping (MPM) because they extract features from data effectively. Supervised deep learning methods, however, often suffer from insufficient training samples and from imbalance between positive and negative samples, and the inherent rarity of mineralization events further weakens model robustness and generalization. To address these issues, this study applies three data augmentation methods: (1) sliding-window augmentation, which takes known positive and negative samples as centers and slides the window several times to generate augmented samples; (2) generative adversarial network (GAN) augmentation, in which a GAN is trained on real samples and its trained generator produces new samples; and (3) Wasserstein generative adversarial network with gradient penalty (WGAN-GP) augmentation, in which a WGAN-GP is likewise trained on real samples and its trained generator is used for augmentation. All three methods expand the sample size while preserving the geological significance of the samples as far as possible. The augmentation is evaluated with the Fréchet Inception Distance (FID) between real and generated samples and with a convolutional neural network (CNN) classifier. The results show that the CNN trained on the WGAN-GP-augmented dataset has superior generalization ability. The resulting gold prospectivity map for the Gannan area provides guidance for future mineral exploration.
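The sliding-window idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sliding_window_samples`, the window size, the shift offsets, and the toy single-layer grid are all assumptions made for illustration, and each augmented window is assumed to inherit the label of the sample at its center.

```python
from typing import List, Tuple

Grid = List[List[float]]  # one evidence layer stored as rows of cell values


def sliding_window_samples(
    grid: Grid,
    center: Tuple[int, int],
    size: int = 4,
    shifts: Tuple[int, ...] = (-1, 0, 1),
) -> List[Grid]:
    """Crop size x size windows around `center`, one per (dr, dc) shift.

    Windows that would extend past the grid edge are skipped.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    half = size // 2
    windows = []
    for dr in shifts:
        for dc in shifts:
            top = center[0] + dr - half
            left = center[1] + dc - half
            if top < 0 or left < 0 or top + size > n_rows or left + size > n_cols:
                continue  # window would leave the study area
            windows.append([row[left:left + size] for row in grid[top:top + size]])
    return windows


# Toy 10 x 10 layer whose cell value encodes its position (row * 10 + col).
toy_layer = [[r * 10 + c for c in range(10)] for r in range(10)]
augmented = sliding_window_samples(toy_layer, center=(5, 5))
print(len(augmented))  # 9 windows: one per (dr, dc) shift
```

In a real workflow the same offsets would be applied to every evidence layer at once, so that each augmented sample stays a consistent multi-channel patch.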

Key words: sample imbalance, data augmentation, convolutional neural network, quantitative prediction

CLC Number:

P628
TP183

XIE Miao, LIU Bingli, LI Yunhe, WANG Zhengyao, CAO Changjie, WU Yixiao. Quantitative prediction method of gold deposits in Gannan area under unbalanced sample conditions[J]. Earth Science Frontiers, 2025, 32(4): 108-121.

Figures/Tables 15

Layer	Input	Output	Kernel size
Conv2d_1	[m,9,40,40]	[m,32,40,40]	5×5
BatchNorm2d	[m,32,40,40]	[m,32,40,40]
ReLU	[m,32,40,40]	[m,32,40,40]
MaxPool_1	[m,32,40,40]	[m,32,20,20]	2×2
Conv2d_2	[m,32,20,20]	[m,64,20,20]	3×3
BatchNorm2d	[m,64,20,20]	[m,64,20,20]
ReLU	[m,64,20,20]	[m,64,20,20]
MaxPool_2	[m,64,20,20]	[m,64,10,10]	2×2
Conv2d_3	[m,64,10,10]	[m,128,10,10]	3×3
BatchNorm2d	[m,128,10,10]	[m,128,10,10]
ReLU	[m,128,10,10]	[m,128,10,10]
Linear_1	[m,12800]	[m,512]
Linear_2	[m,512]	[m,2]
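The shapes in the layer table can be checked with a short calculation. The table gives kernel sizes but not stride or padding, so the sketch below assumes stride 1 with "same" padding for the convolutions (padding 2 for 5×5, padding 1 for 3×3) and stride 2 for the 2×2 pooling layers; the helper names `conv_out` and `trace_shapes` are hypothetical. Under these assumptions Conv2d_1 keeps the 40 × 40 resolution that the following BatchNorm2d and MaxPool_1 input rows require. BatchNorm2d and ReLU do not change the shape and are omitted.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(in_channels: int = 9, size: int = 40):
    """Return (layer name, channels, height/width) after each shape-changing layer."""
    # (name, out_channels, kernel, stride, padding); stride and padding are assumed.
    layers = [
        ("Conv2d_1", 32, 5, 1, 2),
        ("MaxPool_1", None, 2, 2, 0),
        ("Conv2d_2", 64, 3, 1, 1),
        ("MaxPool_2", None, 2, 2, 0),
        ("Conv2d_3", 128, 3, 1, 1),
    ]
    shapes = []
    channels = in_channels
    for name, out_channels, kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        if out_channels is not None:  # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, channels, size))
    return shapes


shapes = trace_shapes()
for name, channels, size in shapes:
    print(f"{name}: [m,{channels},{size},{size}]")

_, channels, size = shapes[-1]
flattened = channels * size * size  # input width of Linear_1
print(flattened)  # 12800
```

The flattened size 128 × 10 × 10 = 12800 matches the Linear_1 input listed in the table.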

Sample class	Generator learning-rate decay	Discriminator learning-rate decay	Batch size	Iterations	Optimizer
Positive samples	0.99	0.98	32	2500	Adam
Negative samples	0.99	0.98	32	2500	Adam

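Sample class	Penalty coefficient	Generator learning-rate decay	Discriminator learning-rate decay	Batch size	Iterations	Optimizer
Positive samples	7.5	0.975	0.97	32	2500	Adam
Negative samples	6.5	0.975	0.97	32	2500	Adam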
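
For context, the penalty coefficient in the table above plays the role of λ in the standard WGAN-GP critic loss. The formula below is the commonly published formulation, given here on the assumption that the study follows it; the symbols P_r (real-sample distribution), P_g (generator distribution), and D (critic) are notation introduced for this sketch and are not taken from the tables.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad \hat{x} = \epsilon x + (1 - \epsilon)\,\tilde{x}, \quad \epsilon \sim U[0, 1]
```

Under this reading, λ = 7.5 is used when training on positive samples and λ = 6.5 when training on negative samples. A larger λ enforces the unit-gradient-norm constraint on the critic more strictly.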