Earth Science Frontiers, 2025, Vol. 32, Issue (4): 122-139. DOI: 10.13745/j.esf.sf.2025.4.66

• Intelligent Mineral Prediction •

Metallogenic prediction based on ensemble learning models and the Bayesian optimization algorithm

孔春芳1,2,3,4, 田倩1, 刘健5, 蔡国荣1,5, 赵杰1, 徐凯1,2,3,4,*

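  1. School of Computer, China University of Geosciences (Wuhan), Wuhan 430074, Hubei, China
  2. Engineering Technology Innovation Center of Mineral Resource Explorations in Bedrock Zones, Ministry of Natural Resources, Guiyang 550081, Guizhou, China
  3. Guizhou Key Laboratory for Strategic Mineral Intelligent Exploration, Guiyang 550081, Guizhou, China
  4. Hubei Key Laboratory of Intelligent Geo-Information Processing, Wuhan 430074, Hubei, China
  5. No. 103 Geological Team, Guizhou Bureau of Geology and Mineral Exploration and Development, Tongren 554300, Guizhou, China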
  • Received: 2025-05-12  Revised: 2025-05-20  Online: 2025-07-25  Published: 2025-08-04
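  • Corresponding author: *XU Kai (born 1972), male, Ph.D., associate professor. His teaching and research cover data mining and knowledge discovery, big-data-based intelligent mineral prospecting, quantitative remote sensing, and geoscience information engineering. E-mail: xukai@cug.edu.cn
  • About the author: KONG Chunfang (born 1973), female, Ph.D., associate professor. Her teaching and research cover applications of remote sensing and geographic information systems. E-mail: kongcf@cug.edu.cn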
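  • Funding:
    Major Collaborative Innovation Project of the Guizhou Provincial Strategic Action for Prospecting Breakthroughs (黔科合战略找矿[2022]ZD004); Natural Science Foundation of Hubei Province (2021CFB506); Open Fund of the Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP-2023-B08); Guizhou Provincial Basic Research Program (Natural Science) (QKHJC-ZK[2023]G 194); Guizhou Provincial Major Scientific and Technological Achievement Transformation Project (黔科合[2022]重点003); Project of the Guizhou Bureau of Geology and Mineral Exploration and Development (黔地矿科合[2021]3); Project of the Guizhou Bureau of Geology and Mineral Exploration and Development (黔地矿科合[2023]1); Scientific Research Project of the Tongren Science and Technology Bureau (铜市科研[2024]97号)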

Metallogenic prediction based on ensemble learning models and Bayesian Optimization Algorithm

KONG Chunfang1,2,3,4, TIAN Qian1, LIU Jian5, CAI Guorong1,5, ZHAO Jie1, XU Kai1,2,3,4,*

  1. School of Computer, China University of Geosciences (Wuhan), Wuhan 430074, China
  2. Engineering Technology Innovation Center of Mineral Resource Explorations in Bedrock Zones, Ministry of Natural Resources, Guiyang 550081, China
  3. Guizhou Key Laboratory for Strategic Mineral Intelligent Exploration, Guiyang 550081, China
  4. Hubei Key Laboratory of Intelligent Geo-Information Processing, Wuhan 430074, China
  5. Geology Team 103, Bureau of Geology and Mineral Exploration and Development, Tongren 554300, China
  • Received: 2025-05-12  Revised: 2025-05-20  Online: 2025-07-25  Published: 2025-08-04

Abstract:

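The world has entered an era of exploration for concealed ore bodies, and new prospecting and prediction methods are urgently needed. Data-driven metallogenic prediction models built with ensemble learning are becoming powerful tools for exploring deep concealed mineral resources. However, ensemble-learning-based metallogenic prediction models share some common problems, in particular hyperparameter tuning, which is very time-consuming and requires tedious computation and considerable expert experience. This paper proposes an ensemble learning model based on multi-source geoscience knowledge and the Bayesian optimization algorithm to address these problems. Specifically, first, a manganese metallogenic prediction database is constructed from multi-source geoscience knowledge; second, metallogenic prediction models for manganese ore in northeastern Guizhou are built on the Adaptive Boosting (AdaBoost) and Random Forest (RF) models; then, Bayesian Optimization (BO), assisted by 5-fold cross-validation, is used to find the most suitable hyperparameter combinations for the BO-AdaBoost and BO-RF models; finally, model performance is checked using accuracy, precision, recall, F1-score, the kappa coefficient, and AUC, together with existing results. The experiments show that the AUC values of both the BO-AdaBoost and BO-RF models improved significantly, indicating that BO is a powerful optimization tool; the optimization results provide a reference for setting the hyperparameters of ensemble learning models. The experiments also show that the BO-AdaBoost model (92.8%) has higher prediction accuracy and geological generalization ability than the BO-RF model (89.9%) and has great potential for metallogenic prediction. The prediction map based on the BO-AdaBoost model provides important clues for exploring concealed manganese deposits in northeastern Guizhou and can guide future mineral exploration and development.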

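Key words: ensemble learning, Adaptive Boosting model, Random Forest, Bayesian optimization algorithm, concealed manganese ore, metallogenic prediction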
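
The tuning step summarized in the abstract (Bayesian optimization guided by 5-fold cross-validation) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state which surrogate model, acquisition function, search space, or software was used. The Gaussian-process surrogate, the expected-improvement criterion, the `BOUNDS` search space, the helper names (`cv_auc`, `to_unit`, `expected_improvement`, `bayes_opt`), and the synthetic data from `make_classification` are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for evidence-layer features and deposit / non-deposit
# labels. This is NOT the paper's manganese database.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)

# Assumed search space: n_estimators in [10, 80], log10(learning_rate) in [-2, 0].
BOUNDS = np.array([[10.0, 80.0], [-2.0, 0.0]])
LOW, HIGH = BOUNDS[:, 0], BOUNDS[:, 1]
CV = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def cv_auc(params):
    """Mean 5-fold cross-validated AUC for one hyperparameter setting."""
    model = AdaBoostClassifier(n_estimators=int(round(params[0])),
                               learning_rate=10.0 ** params[1],
                               random_state=0)
    return cross_val_score(model, X, y, cv=CV, scoring="roc_auc").mean()


def to_unit(points):
    """Rescale points to the unit square so both axes get equal weight."""
    return (points - LOW) / (HIGH - LOW)


def expected_improvement(candidates, gp, best):
    """Expected improvement (for maximization) under the GP posterior."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)


def bayes_opt(n_init=4, n_iter=6, seed=0):
    """Random initial design, then n_iter surrogate-guided evaluations."""
    rng = np.random.default_rng(seed)
    tried = rng.uniform(LOW, HIGH, size=(n_init, 2))
    scores = np.array([cv_auc(p) for p in tried])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4,
                                      normalize_y=True, random_state=seed)
        gp.fit(to_unit(tried), scores)
        # Maximize the acquisition function over a random candidate pool.
        candidates = rng.uniform(LOW, HIGH, size=(500, 2))
        ei = expected_improvement(to_unit(candidates), gp, scores.max())
        chosen = candidates[np.argmax(ei)]
        tried = np.vstack([tried, chosen])
        scores = np.append(scores, cv_auc(chosen))
    return tried, scores


tried, scores = bayes_opt()
best = int(np.argmax(scores))
best_params = {"n_estimators": int(round(tried[best, 0])),
               "learning_rate": 10.0 ** tried[best, 1]}
best_auc = scores[best]
print(best_params, round(best_auc, 3))
```

Swapping `AdaBoostClassifier` for `RandomForestClassifier` (with a search space suited to its own hyperparameters) would give the analogous sketch for the RF model. The benefit over a grid search is that each new setting is chosen using all previous cross-validation scores, so fewer model fits are usually needed.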

Abstract:

Exploration for hidden ore bodies is increasingly important and demands innovative prospecting methods. Data-driven metallogenic prediction models using ensemble learning are becoming powerful tools for deep mineral exploration. However, such models face challenges, particularly hyperparameter tuning, which is time-consuming and requires extensive computation and expert experience. To address this, we propose a framework integrating multi-source geological knowledge with Bayesian Optimization (BO) for ensemble learning. Specifically, a manganese (Mn) metallogenic prediction database integrating multi-source geological knowledge is first constructed. Metallogenic prediction models for Mn ore in northeastern Guizhou are then established using Adaptive Boosting (AdaBoost) and Random Forest (RF). The hyperparameters of these base models are optimized using BO with 5-fold cross-validation, resulting in the optimized BO-AdaBoost and BO-RF models. Model performance is evaluated using accuracy, precision, recall, F1-score, the kappa coefficient, and AUC. Results show significant improvements in AUC for both BO-optimized models compared to their non-optimized counterparts, demonstrating BO’s effectiveness for ensemble learning hyperparameter tuning. Furthermore, the BO-AdaBoost model achieves higher prediction accuracy (92.8%) and better generalization performance than the BO-RF model (89.9%), highlighting its strong potential for metallogenic prediction. The prospectivity map generated by the BO-AdaBoost model provides critical guidance for exploring deeply hidden Mn deposits in northeastern Guizhou and can direct future mineral exploration and development.

Key words: ensemble learning, Adaptive Boosting (AdaBoost), Random Forest (RF), Bayesian Optimization (BO), hidden manganese ore, metallogenic prediction
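
For readers unfamiliar with the evaluation metrics listed in the abstract, the sketch below shows how each one can be computed for a binary deposit / non-deposit classifier. It is a minimal illustration under stated assumptions: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are hypothetical and do not come from the paper.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall, F1, Cohen's kappa and AUC for 0/1 labels.

    Assumes both classes are present in y_true.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half (the Mann-Whitney formulation).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa, "auc": auc}


# Toy example: 4 known deposits (1) and 4 barren cells (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
metrics = binary_metrics(labels, scores)
print(metrics)
```

Note that accuracy, precision, recall, F1, and kappa depend on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives, which is one reason AUC is often used to compare models before a threshold is fixed.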

CLC number: