Earth Science Frontiers (地学前缘) ›› 2019, Vol. 26 ›› Issue (4): 6-12. DOI: 10.13745/j.esf.sf.2019.4.28

• Review of Geological Big Data •

Big data pioneers new ways of geoscience research: identifying relevant relationships to enhance research feasibility

LUO Jianmin (罗建民), ZHANG Qi (张旗)

  1. Geological Survey of Gansu Province, Lanzhou 730000, China
  2. Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing 100029, China
  • Received: 2018-12-10 Revised: 2019-05-21 Online: 2019-07-25 Published: 2019-07-25
  • About the author: LUO Jianmin (b. 1958), male, professor-level senior engineer, works mainly on regional geology, mineral resource surveys, and metallogenic prediction.
  • Supported by:
    Gansu Provincial Bureau of Geology and Mineral Resources research project on comprehensive-information metallogenic prediction in the West Qinling area of Gansu Province (甘地发[2014]158号)

Abstract: Humanity has entered the era of big data, and the ideas and methods of big data research are attracting wide attention in geoscience. In our view, the object of big data research is data, its tool is the computer, its method is to identify correlations among data, and its hallmark is to make decisions that favor the high-probability outcome. Big data is thus a way of thinking and working that mines large volumes of data, identifies the correlations among them, and uses those correlations to study problems and reach sound decisions. We propose that big data is the application of the inductive method to scientific research, and that high-performance computers and big data computing techniques have raised the inductive method to a new level. From an in-depth discussion of statistics and machine learning algorithms, we conclude that big data will change how people understand and perceive nature, change the ideas and methods of scientific research, and change the long-standing habit of conducting research by searching for causal relationships. Big data will open an entirely new route for scientific research, one that bypasses complex causal relationships and obtains results directly. As data grow explosively, high-performance computers become commonplace, and computing technology advances rapidly, statistical analysis methods will largely overcome the limits imposed by data volume. Statistical prediction models, with their reliable results, their good ability to explain conditions and outcomes, and, when combined with machine learning algorithms, their advantages in handling semi-structured and unstructured data, will carry geological science to a new level of quantitative research.

Key words: big data, statistical analysis, machine learning, data mining, new approach to geoscience research
