Earth Science Frontiers, 2025, Vol. 32, Issue (4): 199-212. DOI: 10.13745/j.esf.sf.2025.4.54

• Intelligent Geological Mapping

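  • About the author: XIAO Fan (b. 1985), male, Ph.D., associate professor and doctoral supervisor, mainly engaged in teaching and research on mineral prospecting and exploration and on mathematical geology. E-mail: xiaofan3@mail.sysu.edu.cn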
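  • Supported by:
    National Key Research and Development Program of China (2022YFF0801201); Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2021ZT09H399)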

Lithological mapping of intermediate-acid intrusive rocks in the Eastern Tianshan Gobi-desert covered area using machine learning for multisource data fusion

XIAO Fan1,2,3, YANG Huaqing1, TANG Ao1, HUANG Xuancai1, WANG Cuicui4

  1. School of Earth Sciences and Engineering, Sun Yat-sen University, Zhuhai 519082, China
  2. Guangdong Provincial Key Laboratory of Geological Process and Mineral Resource Exploration, Zhuhai 519082, China
  3. Southern Laboratory of Ocean Science and Engineering, Zhuhai 519082, China
  4. Ürümqi Comprehensive Survey Center on Natural Resources, China Geological Survey, Ürümqi 830057, China
  • Received: 2024-08-05; Revised: 2025-02-19; Online: 2025-07-25; Published: 2025-08-04

Abstract:

The Eastern Tianshan region is an important metallogenic belt with a complex tectonic evolution and extensive exposures of intermediate-acidic intrusive rocks, most of which formed during the Late Paleozoic. These rocks are closely related to regional tectonic evolution and to the formation of magmatic-hydrothermal metal deposits, so they are of great significance for understanding the regional tectonic setting and ore-forming patterns. However, because of the overlying cover, geological mapping of the intermediate-acidic intrusive rocks in the covered areas is incomplete or entirely missing, which has hindered understanding of the regional tectonics and ore-forming patterns. In recent years, indirect lithological mapping that integrates multisource survey data, such as geophysical, geochemical, and remote sensing data, using big data analytical techniques has offered an effective way to address this problem. Machine learning algorithms have proven to be powerful tools for data fusion and are well suited to the classification and discrimination of complex, nonlinear geological data. This study therefore proposes using machine learning to integrate gravity, aeromagnetic, geochemical, and remote sensing imagery data for rapid, cost-effective, and more accurate lithological mapping of intermediate-acidic intrusive rocks in the Eastern Tianshan region. The exposed intermediate-acidic intrusive rocks in the study area are labeled by class and used as the target variable, while Bouguer gravity, aeromagnetic, stream sediment geochemical, and Landsat multiband satellite imagery data serve as predictor variables. The synthetic minority oversampling technique (SMOTE) is used to address the imbalanced distribution of lithological samples. Random forest (RF) and artificial neural network (ANN) algorithms are applied, with hyperparameters tuned by grid search to obtain the optimal prediction models. These models are then used to predict the spatial distribution and lithology of concealed intermediate-acidic intrusive rocks in the covered areas, and the RF and ANN results are compared and discussed. Accuracy, recall, and F1 score all indicate that the RF model outperforms the ANN model, so the RF predictions are adopted as the final lithological map of intermediate-acidic intrusive rocks in the covered areas of the Eastern Tianshan region. The controls that the spatial distribution of these rocks exerts on regional tectonics and mineralization are then discussed. Compared with traditional geological mapping methods, the machine learning-based lithological mapping approach, which integrates multiple data sources, offers advantages including increased depth, high recognition efficiency, and lower costs, making it an effective method for comprehensively exploring potential geological features and patterns.
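The oversampling and scoring steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the two-feature toy samples, the class names, and the helper functions `smote` and `recall_f1` are hypothetical, and a real workflow would more likely rely on library implementations of SMOTE, random forest, and grid search.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest neighbours in the same class."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest same-class neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def recall_f1(y_true, y_pred, label):
    """Recall and F1 score for one class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return recall, f1


# Hypothetical toy samples: (gravity, magnetic) feature pairs per rock class.
samples = {
    "granite": [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1), (1.3, 2.0), (1.0, 1.8)],
    "diorite": [(3.0, 0.5), (3.2, 0.7), (2.9, 0.4)],
}

# Oversample every class up to the size of the largest one.
target = max(len(points) for points in samples.values())
balanced = {
    name: points + smote(points, target - len(points), k=2)
    for name, points in samples.items()
}

# Score a hypothetical set of predictions for the minority class.
y_true = ["granite", "granite", "diorite", "diorite"]
y_pred = ["granite", "diorite", "diorite", "diorite"]
recall, f1 = recall_f1(y_true, y_pred, "diorite")
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the feature range already observed for that class. Reporting recall and F1 per class alongside overall accuracy matters here because accuracy alone can look good on imbalanced data even when a rare rock type is mostly misclassified.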

Key words: machine learning, multisource data, lithological identification, random forest, artificial neural network
