Machine-readable expression of unstructured geological information and intelligent prediction of mineralization associated anomaly areas in Pangxidong District, Guangdong, China

Earth Science Frontiers, 2024, Vol. 31, Issue (4): 47-57. DOI: 10.13745/j.esf.sf.2024.5.5

• Big Data Algorithms and Graphical Big Data •

WANG Kunyi¹,²,³, ZHOU Yongzhang¹,²,³,*

1. School of Earth Sciences and Engineering, Sun Yat-sen University, Zhuhai 519082, China
2. Center for Earth Environment & Resources, Sun Yat-sen University, Guangzhou 510275, China
3. Guangdong Provincial Key Lab of Geological Processes and Mineral Resources, Guangzhou 510275, China

Received: 2024-02-21  Revised: 2024-03-04  Online: 2024-07-25  Published: 2024-07-10
Corresponding author: * ZHOU Yongzhang (born 1963), male, Ph.D., professor and doctoral supervisor, working mainly on geochemistry, big data, and mathematical geosciences. E-mail: zhouyz@mail.sysu.edu.cn
About the author: WANG Kunyi (born 1995), male, Ph.D. candidate, working mainly on mathematical geosciences and ore deposit geochemistry. E-mail: wangky28@mail2.sysu.edu.cn
Funding:
National Key Research and Development Program of China (2022YFF0801201); Key Program of the National Natural Science Foundation of China (U1911202); Key-Area Research and Development Program of Guangdong Province (2020B1111370001)

Abstract:

The application of big data mining and machine learning algorithms to mineralization prediction has become an important research trend, but unstructured geological data cannot be mined directly; they must first be converted into machine-readable expressions. In this study of the Pangxidong ore district in western Guangdong Province, unstructured geological information such as stratigraphy, lithology, and faults is converted into a machine-readable format, and two machine learning algorithms, One-Class Support Vector Machine and Auto-Encoder Network, are applied to mine stream sediment geochemical data together with comprehensive geological information on faults, stratigraphy, etc., in order to extract the features of mineralization-associated anomalies and ultimately delineate the anomaly areas as prospecting targets. By combining the One-Hot Encoder with the weight variable method of spatially weighted principal component analysis, the unstructured information on stratigraphy, lithology, and faults is transformed into structured data while preserving as much of its geological content as possible. Because the numbers of known ore and non-ore locations in the study area are severely imbalanced, the two anomaly detection algorithms, One-Class Support Vector Machine and Auto-Encoder Network, are used to address the data imbalance problem. The prediction results generated from the integrated multi-source geological data are relatively consistent with the observed spatial distribution of Pb-Zn deposits and the actual geological structure of the study area, indicating that the two algorithms can effectively identify areas with prospecting potential and help locate potential ore deposits. Compared with traditional geochemical prospecting methods, the approach can process and integrate multi-source ore-related geological information. It is applicable to prospective areas where no deposits have yet been discovered, thereby improving the efficiency of prospecting and the likelihood of finding ore deposits.

Key words: big data mining, machine-readable expression, One-Hot Encoder, One-Class Support Vector Machine, Auto-Encoder Network, Pangxidong ore district, Qinzhou-Hangzhou metallogenic belt
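As a rough illustration of how a one-class SVM can score anomalous cells without labeled non-ore samples, the following sketch fits scikit-learn's `OneClassSVM` to synthetic data. The feature matrix, the `nu` and `gamma` settings, the helper name `ocsvm_anomaly_scores`, and the choice to fit on all cells are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def ocsvm_anomaly_scores(features, nu=0.05):
    """Return one anomaly score per row; larger means more anomalous."""
    # Standardize so every variable contributes on a comparable scale.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0
    scaled = (features - mean) / std
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(scaled)
    # decision_function is positive inside the learned support region,
    # so its negative serves as an anomaly score.
    return -model.decision_function(scaled)


# Synthetic stand-in for gridded evidence layers (hypothetical data).
rng = np.random.default_rng(0)
background = rng.normal(0.0, 1.0, size=(500, 4))
directions = rng.normal(size=(10, 4))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
anomalous = 8.0 * directions  # isolated cells far from the background
cells = np.vstack([background, anomalous])

scores = ocsvm_anomaly_scores(cells)
```

Because only the bulk of the data is modeled, no balanced set of ore and non-ore labels is needed; known deposits would enter only afterwards, when the score map is evaluated.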

WANG Kunyi, ZHOU Yongzhang. Machine-readable expression of unstructured geological information and intelligent prediction of mineralization associated anomaly areas in Pangxidong District, Guangdong, China[J]. Earth Science Frontiers, 2024, 31(4): 47-57.

Figures and tables (10)

Fig.1 Simplified regional geological map of the Pangxidong ore district. Modified after [8].

Fig.2 Schematic diagram of Support Vector Data Description (SVDD). Modified after [17].

Fig.3 Structural diagram of Auto-Encoder Network. Modified after [22].

Table 1 Machine-readable expression data for stratigraphy, lithology, and faults in the Pangxidong area (partial results)

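X	Y	Tangpeng Group	Maozifeng Formation	Tianziling Formation	Donggangling Formation	Xindu Formation	Laohutou Formation	Neoproterozoic migmatite	Early Yanshanian intrusive rocks	Late Yanshanian intrusive rocks	Spatial weight variable w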
442 920	2 410 830	0	0	0	0	1	0	0	0	0	0.83
434 920	2 424 830	1	0	0	0	0	0	0	0	0	0.79
439 170	2 408 580	0	0	0	0	0	1	0	0	0	0.77
430 670	2 433 830	0	0	0	0	0	0	0	1	0	0.88
418 170	2 400 080	0	1	0	0	0	0	0	0	0	0.86
402 670	2 425 580	0	0	0	0	0	0	0	0	1	0.92
431 670	2 398 580	0	0	0	1	0	0	0	0	0	0.93
424 670	2 399 080	0	0	1	0	0	0	0	0	0	0.94
401 670	2 425 830	0	0	0	0	0	0	0	0	1	0.96
400 170	2 434 080	0	0	0	0	0	0	1	0	0	0.98
445 170	2 404 080	0	1	0	0	0	0	0	0	0	0.99
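The layout of Table 1 can be sketched as follows: each grid cell's geological unit becomes a row of 0/1 indicators, and the spatial weight is appended as a final column. The helper name `one_hot_rows` is hypothetical, the unit names are English renderings of the Table 1 column headers, the three sample cells are copied from Table 1, and the weights are carried over from the table rather than recomputed.

```python
UNITS = [
    "Tangpeng Group", "Maozifeng Formation", "Tianziling Formation",
    "Donggangling Formation", "Xindu Formation", "Laohutou Formation",
    "Neoproterozoic migmatite", "Early Yanshanian intrusive rocks",
    "Late Yanshanian intrusive rocks",
]


def one_hot_rows(cells, units=UNITS):
    """Turn (x, y, unit, weight) tuples into rows shaped like Table 1."""
    index = {name: i for i, name in enumerate(units)}
    rows = []
    for x, y, unit, weight in cells:
        indicator = [0] * len(units)
        indicator[index[unit]] = 1  # exactly one unit is active per cell
        rows.append([x, y, *indicator, weight])
    return rows


# Three cells taken from Table 1.
cells = [
    (442920, 2410830, "Xindu Formation", 0.83),
    (434920, 2424830, "Tangpeng Group", 0.79),
    (430670, 2433830, "Early Yanshanian intrusive rocks", 0.88),
]
encoded = one_hot_rows(cells)
```

One indicator column per unit avoids imposing an artificial ordering on the categories, which a single integer code would do.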

Fig.4 Relationship between the buffer distance of NW-trending faults in the Pangxidong area and the t-value for known Pb-Zn deposits in the study area
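One common way to obtain such a t-value is the Studentized contrast of the weights-of-evidence method described in [31] and [32]; whether the paper computes it exactly this way is an assumption here. The sketch below uses a hypothetical helper `studentized_contrast` and invented cell counts to show how a buffer distance could be chosen by maximizing t.

```python
import math


def studentized_contrast(n_cells, n_deposits, n_buffer, n_buffer_deposits):
    """Weights-of-evidence contrast C and its Studentized value t = C / s(C)."""
    b_d = n_buffer_deposits                # inside buffer, deposit
    b_nd = n_buffer - n_buffer_deposits    # inside buffer, no deposit
    nb_d = n_deposits - n_buffer_deposits  # outside buffer, deposit
    nb_nd = n_cells - n_buffer - nb_d      # outside buffer, no deposit
    if min(b_d, b_nd, nb_d, nb_nd) <= 0:
        raise ValueError("all four overlap counts must be positive")
    n_no_deposit = n_cells - n_deposits
    w_plus = math.log((b_d / n_deposits) / (b_nd / n_no_deposit))
    w_minus = math.log((nb_d / n_deposits) / (nb_nd / n_no_deposit))
    contrast = w_plus - w_minus
    # s(C) combines the variances of W+ and W-.
    std = math.sqrt(1 / b_d + 1 / b_nd + 1 / nb_d + 1 / nb_nd)
    return contrast, contrast / std


# Hypothetical counts: buffer distance (m) -> (cells in buffer, deposits in buffer).
buffers = {500: (800, 5), 1000: (2000, 12), 1500: (3500, 14)}
t_values = {
    distance: studentized_contrast(10_000, 20, n_buf, n_dep)[1]
    for distance, (n_buf, n_dep) in buffers.items()
}
best_distance = max(t_values, key=t_values.get)
```

The distance with the largest t marks where the spatial association between the faults and the known deposits is statistically strongest.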

Fig.5 Weight coefficient map from spatially weighted principal component analysis of NW-trending faults in the Pangxidong area

Fig.6 Cluster analysis results for IDW-interpolated geochemical data of 16 elements in the Pangxidong area
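For readers unfamiliar with inverse distance weighting (IDW), the following minimal sketch interpolates scattered sample values onto grid points. The power of 2, the helper name `idw_interpolate`, and the sample values are assumptions for illustration; the paper's actual interpolation settings are not given on this page.

```python
import numpy as np


def idw_interpolate(sample_xy, sample_values, grid_xy, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at each grid point."""
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    # Pairwise distances with shape (n_grid, n_samples).
    dist = np.linalg.norm(grid_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    # eps keeps the weight finite when a grid point coincides with a sample.
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights @ sample_values) / weights.sum(axis=1)


# Two hypothetical stream sediment samples and three grid points.
samples = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]
grid = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
estimates = idw_interpolate(samples, values, grid)
```

Each estimate is a weighted mean of the samples, so interpolated values always stay within the range of the observed data.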

Fig.7 Mineralization anomaly map by One-Class Support Vector Machine

Fig.8 Mineralization anomaly map by Auto-Encoder Network
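Autoencoder-based anomaly detection generally scores each sample by how poorly the network reconstructs it. The sketch below trains a tiny single-hidden-layer autoencoder with plain NumPy on synthetic data; the architecture, learning rate, epoch count, and data are all assumptions for illustration and do not reproduce the network used in the paper.

```python
import numpy as np


def train_autoencoder(X, n_hidden=2, lr=0.01, epochs=3000, seed=0):
    """Fit a tanh encoder and linear decoder by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)   # encoder
        err = H @ W2 + b2 - X      # reconstruction minus input
        # Backpropagate the mean squared reconstruction error.
        dH = (err @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * H.T @ err / n
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * X.T @ dH / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2


def reconstruction_error(X, params):
    """Mean squared reconstruction error per sample (the anomaly score)."""
    W1, b1, W2, b2 = params
    X_hat = np.tanh(X @ W1 + b1) @ W2 + b2
    return ((X_hat - X) ** 2).mean(axis=1)


# Hypothetical data: background follows a 2-D structure, anomalies break it.
rng = np.random.default_rng(1)
mixing = rng.normal(size=(2, 5))
background = rng.normal(size=(300, 2)) @ mixing + 0.1 * rng.normal(size=(300, 5))
anomalous = rng.normal(size=(10, 2)) @ mixing + 3.0 * rng.normal(size=(10, 5))
X = np.vstack([background, anomalous])
X = (X - X.mean(axis=0)) / X.std(axis=0)

errors = reconstruction_error(X, train_autoencoder(X))
```

Since background samples dominate training, the network learns their shared structure, and samples that deviate from it tend to receive higher reconstruction errors.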

Fig.9 Receiver operating characteristic curves obtained by One-Class Support Vector Machine (a) and Auto-Encoder Network (b)
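An ROC curve of this kind can be built by ranking cells by anomaly score and tracking how many known deposit cells are captured as the threshold is lowered. The sketch below is a minimal version that assumes no tied scores; the helper names, scores, and labels are invented for illustration.

```python
import numpy as np


def roc_curve_points(scores, labels):
    """Return (fpr, tpr) arrays; labels are 1 for deposit cells, 0 otherwise."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # highest score first
    ranked = labels[order]
    true_pos = np.cumsum(ranked)
    false_pos = np.cumsum(1 - ranked)
    tpr = np.concatenate([[0.0], true_pos / true_pos[-1]])
    fpr = np.concatenate([[0.0], false_pos / false_pos[-1]])
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Hypothetical anomaly scores with three known deposit cells.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
fpr, tpr = roc_curve_points(scores, labels)
auc = area_under_curve(fpr, tpr)
```

The area equals the probability that a randomly chosen deposit cell scores higher than a randomly chosen non-deposit cell, which makes it a convenient single number for comparing the two models.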

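References (34)

[1]	ZHOU Yongzhang, ZHANG Liangjun, ZHANG Aoduo, et al. Big data mining and machine learning in geoscience[M]. Guangzhou: Sun Yat-sen University Press, 2018. (in Chinese)
[2]	ZHAI Mingguo, YANG Shufeng, CHEN Ninghua, et al. The era of big data: challenges and opportunities for geology[J]. Bulletin of Chinese Academy of Sciences, 2018, 33(8): 825-831. (in Chinese)
[3]	CHENG Qiuming. What is mathematical geoscience and what are its frontiers?[J]. Earth Science Frontiers, 2021, 28(3): 6-25. (in Chinese)
[4]	ZUO Renguang. Exploration geochemical data mining and weak anomaly identification[J]. Earth Science Frontiers, 2019, 26(4): 67-75. (in Chinese)
[5]	LIU Yanpeng, ZHU Lixin, ZHOU Yongzhang. Convolutional neural networks and their application to mineral prospecting prediction: a case study of the Zhaojikou Pb-Zn deposit, Anhui Province[J]. Acta Petrologica Sinica, 2018, 34(11): 3217-3224. (in Chinese)
[6]	ZHOU Yongzhang, WANG Jun, ZUO Renguang, et al. Machine learning, deep learning, and implementation languages in geology[J]. Acta Petrologica Sinica, 2018, 34(11): 3173-3178. (in Chinese)
[7]	ZHOU Yongzhang, LI Xingyuan, ZHENG Yi, et al. Metallogenic geological setting and metallogenic regularities of the Qinzhou-Hangzhou juncture belt[J]. Acta Petrologica Sinica, 2017, 33(3): 667-681. (in Chinese)
[8]	ZHOU Yongzhang, ZHANG Guohuan, WU Yongqing, et al. Report on the mineral prospect survey of the Pangxidong area, Guangdong (Wendi, Shijiao, Tangpeng, and Hechun sheets, 1:50000)[R]. Beijing: China Geological Survey, 2016. (in Chinese)
[9]	ZHOU Yongzhang, ZENG Changyu, LI Hongzhong, et al. Geological evolution and prospecting directions of the Qinzhou Bay-Hangzhou Bay tectonic juncture belt (southern segment)[J]. Geological Bulletin of China, 2012, 31(2/3): 486-491. (in Chinese)
[10]	No. 704 Geological Brigade, Guangdong Bureau of Geology and Mineral Resources. Regional geological survey report of the People's Republic of China, 1:50000 Tangpeng sheet[R]. Zhanjiang: No. 704 Geological Brigade, Guangdong Bureau of Geology and Mineral Resources, 1987. (in Chinese)
[11]	No. 704 Geological Brigade, Guangdong Bureau of Geology and Mineral Resources. Regional geological survey report of the People's Republic of China, 1:50000 Hechun sheet[R]. Zhanjiang: No. 704 Geological Brigade, Guangdong Bureau of Geology and Mineral Resources, 1994. (in Chinese)
[12]	Guangdong Bureau of Geology and Mineral Resources. Regional geology of Guangdong Province[R]. Beijing: Geological Publishing House, 1988, 941. (in Chinese)
[13]	ZHAN Mingguo, PENG Songbai, CAI Minghai, et al. Metallogenic geological setting of Au, Ag, Cu, Pb, and Zn and optimization of prospecting targets in important metallogenic belts of the Yunkai area[M]. Haikou: Hainan Publishing House, 2006. (in Chinese)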
[14]	DAVY M, GODSILL S. Detection of abrupt spectral changes using support vector machines: an application to audio signal segmentation[C]// Proceedings of IEEE international conference on acoustics speech and signal processing. Orlando: IEEE, 2002: 1313-1316.
[15]	LECOMTE S, LENGELLE R, RICHARD C, et al. Abnormal events detection using unsupervised One-Class SVM - Application to audio surveillance and evaluation[C]// Proceedings of 2011 8th IEEE international conference on advanced video and signal based surveillance (AVSS). Klagenfurt: IEEE, 2011: 124-129.
[16]	SCHÖLKOPF B, PLATT J C, SHAWE-TAYLOR J C, et al. Estimating the support of a high-dimensional distribution[J]. Neural Computation, 2001, 13(7): 1443-1471.
[17]	WU Dinghai, ZHANG Peilin, REN Guoquan, et al. Review of one-class classification methods based on support vectors[J]. Computer Engineering, 2011, 37(5): 187-189. (in Chinese)
[18]	TAX D M J, DUIN R P W. Support vector data description[J]. Machine Learning, 2004, 54(1): 45-66.
[19]	VAPNIK V N. The nature of statistical learning theory[M]. New York: Springer, 1995.
[20]	RUMELHART D E, HINTON G E, WILLIAMS R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323: 533-536.
[21]	AN J, CHO S. Variational autoencoder based anomaly detection using reconstruction probability[J]. Special Lecture on IE, 2015, 2(1): 1-18.
[22]	VALENTINE A P, TRAMPERT J. Data space reduction, quality assessment and searching of seismograms: autoencoder networks for waveform data[J]. Geophysical Journal International, 2012, 189(2): 1183-1202.
[23]	RIFAI S, VINCENT P, MULLER X, et al. Contractive auto-encoders: explicit invariance during feature extraction[C]// Proceedings of the 28th international conference on machine learning. Bellevue: ACM, 2011: 833-840.
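[24]	WU Chonglong, LIU Gang, ZHANG Xialin, et al. Discussion of several issues concerning geological science big data and its utilization[J]. Chinese Science Bulletin, 2016, 61(16): 1797-1807. (in Chinese)
[25]	WANG Chengbin, MA Xiaogang, CHEN Jianguo. Application of data preprocessing techniques to geoscience big data[J]. Acta Petrologica Sinica, 2018, 34(2): 303-313. (in Chinese)
[26]	ZHANG Xueying, YE Peng, WANG Shu, et al. Geological entity recognition method based on deep belief networks[J]. Acta Petrologica Sinica, 2018, 34(2): 343-351. (in Chinese)
[27]	ZHANG Xueying, ZHANG Chunju, WANG Chen, et al. Geological semantic information annotation and corpus construction for Chinese text[J]. Geological Journal of China Universities, 2023, 29(3): 429-438. (in Chinese)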
[28]	CHENG Q M, BONHAM-CARTER G, WANG W L, et al. A spatially weighted principal component analysis for multi-element geochemical data for mapping locations of felsic intrusions in the Gejiu mineral district of Yunnan, China[J]. Computers and Geosciences, 2011, 37(5): 662-669.
[29]	XIAO F, CHEN J G, ZHANG Z Y, et al. Singularity mapping and spatially weighted principal component analysis to identify geochemical anomalies associated with Ag and Pb-Zn polymetallic mineralization in Northwest Zhejiang, China[J]. Journal of Geochemical Exploration, 2012, 122: 90-100.
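[30]	XIAO Fan, CHEN Jianguo, HOU Weisheng, et al. Identification and extraction of Ag-Au mineralization-associated geochemical anomaly information in the Pangxidong area, southern Qinzhou-Hangzhou juncture belt[J]. Acta Petrologica Sinica, 2017, 33(3): 779-790. (in Chinese)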
[31]	AGTERBERG F P, BONHAM-CARTER G F, WRIGHT D F. Statistical pattern integration for mineral exploration[M]// Computer applications in resource estimation. Amsterdam: Elsevier, 1990: 1-21.
[32]	BONHAM-CARTER G F. Geographic information systems for geoscientists: modelling with GIS[M]. Amsterdam: Elsevier, 1994.
[33]	YU X T, XIAO F, ZHOU Y Z, et al. Application of hierarchical clustering, singularity mapping, and Kohonen neural network to identify Ag-Au-Pb-Zn polymetallic mineralization associated geochemical anomaly in Pangxidong district[J]. Journal of Geochemical Exploration, 2019, 203: 87-95.
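[34]	YU Xiaotong, XIAO Fan, ZHOU Yongzhang, et al. Mining and extraction of Ag-Au geochemical anomaly information in the Pangxidong area, western Guangdong[J]. Geology and Exploration, 2019, 55(1): 77-86. (in Chinese)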
