Earth Science Frontiers ›› 2024, Vol. 31 ›› Issue (4): 112-118. DOI: 10.13745/j.esf.sf.2024.5.1

• Deep Learning and Image Recognition •

Mineral image recognition based on progressive deep learning across different granularity levels

WAN Chengzhou1, JI Xiaohui1,*, YANG Mei2, HE Mingyue2, ZHANG Zhaochong3, ZENG Shan1, WANG Yuzhu1

  1. School of Information Engineering, China University of Geosciences (Beijing), Beijing 100083, China
  2. National Mineral Rock and Fossil Specimens Resource Center at MOST, China University of Geosciences (Beijing), Beijing 100083, China
  3. School of Earth Sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China
  • Received: 2023-08-26 Revised: 2024-02-28 Online: 2024-07-25 Published: 2024-07-10
  • Corresponding author: * JI Xiaohui (b. 1977), female, Ph.D., associate professor, works mainly on applications of artificial intelligence. E-mail: xhji@cugb.edu.cn
  • About the first author: WAN Chengzhou (b. 1996), male, master's student, works mainly on deep learning and mineral image recognition. E-mail: czwan@email.cugb.edu.cn
  • Funding:
    Subproject of the National Mineral Rock and Fossil Specimens Resource Center under the National Science and Technology Resource Sharing Service Platform (NCSTI-RMF20230107)

Abstract:

In recent years, as deep learning has been applied across the geosciences, mineral image recognition has become increasingly important. Deep learning has already been applied to mineral image recognition with some success, but recognition accuracy on large-scale mineral datasets still needs to improve. Different minerals may differ only subtly in morphology, texture, and color, which makes the task a good fit for fine-grained recognition algorithms, yet fine-grained methods have rarely been used for mineral identification. This paper therefore proposes a fine-grained mineral identification method based on the Next-ViT model, which classifies mineral images precisely by introducing progressive multi-granularity training on jigsaw images. Next-ViT, which combines the strengths of the Transformer architecture and convolutional neural networks, serves as the feature extractor and extracts rich image features. A random jigsaw generator then creates mineral jigsaw images at different granularity levels, carrying information that ranges from fine detail to the whole image. Training follows a progressive multi-granularity strategy: in the early stages the model focuses mainly on fine-grained features, learning the detail in the jigsaw images to tell minerals apart; as training proceeds, it gradually shifts its attention to features at coarser granularity levels and learns more abstract, global information. In this way the model makes full use of information at different granularity levels, which improves identification accuracy. Experiments show that the model reaches an accuracy of 86.5% on a dataset of 36 common minerals, which indicates that fine-grained recognition methods are effective for mineral identification.

Key words: mineral identification, deep learning, Next-ViT, fine-grained identification, progressive multi-granularity training
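
The random jigsaw generator mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption made for illustration: the function names `jigsaw_generator` and `jigsaws_for_stages`, the grid sizes in `GRANULARITY_SCHEDULE`, and the representation of an image as a nested list of pixels. The sketch only shows the general idea of cutting an image into an n x n grid and shuffling the tiles, with larger n giving finer granularity; it is not the authors' code and leaves out the Next-ViT model and the training loop.

```python
import random


def jigsaw_generator(image, n, rng):
    """Split a 2-D grid of pixels into n x n tiles and shuffle the tiles.

    `image` is a list of rows, each row a list of pixel values.
    Illustrative sketch only; not the implementation used in the paper.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image height and width must be divisible by n")
    tile_h, tile_w = height // n, width // n

    # order[dst] is the source tile copied into destination slot dst
    # (tiles are numbered in row-major order).
    order = list(range(n * n))
    rng.shuffle(order)

    out = [[None] * width for _ in range(height)]
    for dst, src in enumerate(order):
        dst_r, dst_c = divmod(dst, n)
        src_r, src_c = divmod(src, n)
        for dy in range(tile_h):
            src_row = image[src_r * tile_h + dy]
            out_row = out[dst_r * tile_h + dy]
            for dx in range(tile_w):
                out_row[dst_c * tile_w + dx] = src_row[src_c * tile_w + dx]
    return out


# Hypothetical fine-to-coarse schedule: one grid size per training stage.
# n = 1 leaves the image intact, i.e. the coarsest (whole-image) level.
GRANULARITY_SCHEDULE = (4, 2, 1)


def jigsaws_for_stages(image, schedule=GRANULARITY_SCHEDULE, seed=0):
    """Return one shuffled jigsaw image per stage, finest granularity first."""
    rng = random.Random(seed)
    return [jigsaw_generator(image, n, rng) for n in schedule]
```

In a training setup along these lines, the image for each stage would be fed to the feature extractor in schedule order, so that early stages see small shuffled tiles (local detail only) and the last stage sees the unshuffled image. The tile contents are never altered, only their positions, so every stage keeps the same pixels and the same class label.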

CLC number: