Earth Science Frontiers ›› 2025, Vol. 32 ›› Issue (4): 155-164. DOI: 10.13745/j.esf.sf.2025.4.77

• Intelligent Mineral Prediction •

Intelligent search technology for Jiaodong gold deposits based on large models and GraphRAG

LI Bowen1,2, WANG Yongzhi1,3,*, DING Zhengjiang4,5, WANG Bin4,5, WEN Shibo3, DONG Yuhao1, JI Zheng3

  1. Integrated Information Mineral Prediction Research Institute, Jilin University, Changchun 130061, China
  2. Changchun Gold Research Institute Co., Ltd., Changchun 130012, China
  3. College of Geoexploration Science and Technology, Jilin University, Changchun 130061, China
  4. Ministry of Natural Resources Technology Innovation Center for Deep Gold Resources Exploration and Mining, Weihai 264209, China
  5. No.6 Geological Team of Shandong Provincial Bureau of Geology and Mineral Resources, Weihai 264209, China
  • Received: 2025-01-24 Revised: 2025-04-20 Online: 2025-07-25 Published: 2025-08-04
  • Corresponding author: *WANG Yongzhi (born 1974), male, Ph.D., professor, working mainly on the theory and application of Earth science big-data analysis and mining and of intelligent mineral resource prediction. E-mail: wangyongzhi@jlu.edu.cn
  • Author biography: LI Bowen (born 2000), male, master's student in Earth Exploration and Information Technology, working on data- and AI-driven intelligent prediction of mineral resources. E-mail: 630376038@qq.com
  • Funding:
    National Key Research and Development Program of China (2023YFC2906903); National Key Research and Development Program of China (2021YFC2901801); National Key Research and Development Program of China (2023YFC2907105); Key Program of the National Natural Science Foundation of China (42230810); Science and Technology Support Project of the Ministry of Natural Resources (ZKKJ202419); Science and Technology Research Project of the Shandong Provincial Bureau of Geology and Mineral Resources (KY202502)

Abstract:

The Jiaodong gold district is an important concentration area of gold resources in eastern China. Its geological information is complex and its body of knowledge is large, so traditional information retrieval struggles to meet the higher-order demands of semantic understanding and knowledge reasoning in mineral exploration. To improve the efficiency of geological knowledge services, this study builds an intelligent search and question-answering system for the Jiaodong gold deposit domain based on GraphRAG (graph-enhanced retrieval-augmented generation). The corpus consists of papers on Jiaodong gold deposits retrieved from CNKI. OCR and large language models (LLMs) are used for text parsing and semantic standardization, yielding an ontology that covers core concepts such as mineralization types, ore-controlling structures, and mineral assemblages. Prompt-engineered LLMs automatically extract entities and relations to construct a structured knowledge graph, which is stored in the Neo4j graph database. Semantic embedding is then combined with community clustering algorithms to build a knowledge index network that supports natural-language question answering, semantic expansion, and knowledge provenance tracing. Evaluation results show that the system outperforms traditional RAG methods and general-purpose models such as ChatGPT-4o in answer accuracy, context precision, and knowledge interpretability, and that it offers better domain adaptability and reasoning ability. The results provide a new technical pathway for intelligent information services in the gold deposit domain and theoretical support for exploring applications of knowledge-graph-enhanced language models in geoscience knowledge management.

Key words: GraphRAG, knowledge graph, large language model, Jiaodong gold deposit, knowledge question answering
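The retrieval idea summarized in the abstract (a knowledge graph of extracted triples, grouped into communities, queried with provenance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the triples, entity names, and source labels are hypothetical placeholders; connected components stand in for the unspecified community clustering algorithm; and plain substring matching stands in for semantic embedding. Neo4j, OCR, and the LLM call are omitted entirely.

```python
from collections import defaultdict

# Toy triples (head, relation, tail, source). All names are placeholders,
# not facts extracted from the paper's corpus.
TRIPLES = [
    ("Deposit A", "controlled_by", "Fault F1", "paper_1"),
    ("Deposit A", "has_mineral", "pyrite", "paper_1"),
    ("Deposit B", "controlled_by", "Fault F1", "paper_2"),
    ("Deposit C", "has_mineral", "galena", "paper_3"),
]


def build_adjacency(triples):
    """Map each entity to the set of entities it shares a triple with."""
    adj = defaultdict(set)
    for head, _rel, tail, _src in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def communities(adj):
    """Group entities by connected component (stand-in for community clustering)."""
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        groups.append(group)
    return groups


def retrieve(question, triples):
    """Return triples whose community contains an entity named in the question."""
    adj = build_adjacency(triples)
    q = question.lower()
    hits = [g for g in communities(adj) if any(e.lower() in q for e in g)]
    wanted = set().union(*hits) if hits else set()
    return [t for t in triples if t[0] in wanted]


def format_context(triples):
    """Render triples as prompt lines that keep the source for provenance."""
    return "\n".join(f"{h} --{r}--> {t} [source: {s}]" for h, r, t, s in triples)


context = format_context(retrieve("Which fault controls Deposit B?", TRIPLES))
print(context)
```

In a full pipeline the printed context would be placed in the prompt sent to a language model, and the source tags would let the answer be traced back to the originating papers.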

CLC number: