Earth Science Frontiers, 2024, Vol. 31, Issue (4): 47-57. DOI: 10.13745/j.esf.sf.2024.5.5

• Big Data Algorithms and Graphical Big Data •

Machine-readable expression of unstructured geological information and intelligent prediction of mineralization associated anomaly areas in Pangxidong District, Guangdong, China

WANG Kunyi1,2,3, ZHOU Yongzhang1,2,3,*

  1. School of Earth Sciences and Engineering, Sun Yat-sen University, Zhuhai 519082, China
  2. Center for Earth Environment & Resources, Sun Yat-sen University, Guangzhou 510275, China
  3. Guangdong Provincial Key Lab of Geological Processes and Mineral Resources, Guangzhou 510275, China
  • Received: 2024-02-21 Revised: 2024-03-04 Published: 2024-07-25 Online: 2024-07-10
  • Corresponding author: * ZHOU Yongzhang (b. 1963), male, Ph.D., professor and doctoral supervisor, works mainly on geochemistry, big data, and mathematical geoscience. E-mail: zhouyz@mail.sysu.edu.cn
  • About the author: WANG Kunyi (b. 1995), male, Ph.D. candidate, works mainly on mathematical geoscience and ore deposit geochemistry. E-mail: wangky28@mail2.sysu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2022YFF0801201); Key Program of the National Natural Science Foundation of China (U1911202); Key-Area Research and Development Program of Guangdong Province (2020B1111370001)

Abstract:

Big data mining and machine learning are increasingly applied to mineralization prediction, but unstructured geological data cannot be mined directly; they must first be converted into a machine-readable form. In this study of the Pangxidong ore district in western Guangdong Province, unstructured geological information such as stratigraphy, lithology, and faults is converted into a machine-readable format. Two machine learning algorithms, One-Class Support Vector Machine and Auto-Encoder Network, are then applied to mine stream sediment geochemical data together with integrated geological information on faults, stratigraphy, and related features, in order to extract the features of mineralization anomalies and delineate anomaly areas as prospecting targets. Combining One-Hot Encoder with the weight variable method of spatially weighted principal component analysis converts the unstructured information on stratigraphy, lithology, and faults into structured form while preserving as much of its geological content as possible. Because the numbers of known ore and non-ore locations in the study area are severely imbalanced, One-Class Support Vector Machine and Auto-Encoder Network are used as anomaly detection algorithms to address the data imbalance. The predictions generated from the integrated multi-source geological data are broadly consistent with the observed spatial distribution of Pb-Zn deposits and the geological structure of the study area, indicating that the two algorithms can effectively identify areas with prospecting potential. Compared with traditional exploration geochemistry methods, the approach can process and integrate multi-source geological information related to ore formation. It is applicable to prospective areas where no deposits have yet been discovered, which may improve both the efficiency of prospecting and the likelihood of finding ore deposits.

Key words: big data mining, machine-readable expression, One-Hot Encoder, One-Class Support Vector Machine, Auto-Encoder Network, Pangxidong ore district, Qinzhou-Hangzhou metallogenic belt
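
As an illustration of what one-hot encoding does to a categorical geological attribute such as lithology, the following is a minimal Python sketch. The lithology labels, Pb values, and the `one_hot_encode` helper are hypothetical and are not taken from the study. The sketch does not reproduce the paper's workflow: the weight variable step and the anomaly detection algorithms are left out.

```python
from typing import List, Sequence, Tuple


def one_hot_encode(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is set per sample
        rows.append(row)
    return categories, rows


# Hypothetical grid cells: a lithology label plus one Pb concentration (ppm).
cells = [
    {"lithology": "granite", "pb_ppm": 35.0},
    {"lithology": "schist", "pb_ppm": 120.0},
    {"lithology": "granite", "pb_ppm": 48.0},
    {"lithology": "limestone", "pb_ppm": 22.0},
]

categories, encoded = one_hot_encode([c["lithology"] for c in cells])

# Append the numeric geochemical value so each cell becomes one numeric vector.
features = [row + [c["pb_ppm"]] for row, c in zip(encoded, cells)]

for cell, vector in zip(cells, features):
    print(cell["lithology"], vector)
```

Each text label becomes a numeric vector with no implied ordering between rock types, so it can sit alongside geochemical measurements in a single feature matrix.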

CLC Number: