Mineral image recognition based on progressive deep learning across different granularity levels

doi:10.13745/j.esf.sf.2024.5.1

Earth Science Frontiers, 2024, Vol. 31, Issue (4): 112-118. DOI: 10.13745/j.esf.sf.2024.5.1

WAN Chengzhou¹, JI Xiaohui¹,*, YANG Mei², HE Mingyue², ZHANG Zhaochong³, ZENG Shan¹, WANG Yuzhu¹

1. School of Information Engineering, China University of Geosciences (Beijing), Beijing 100083, China
2. National Mineral Rock and Fossil Specimens Resource Center at MOST, China University of Geosciences (Beijing), Beijing 100083, China
3. School of Earth Sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China

Received: 2023-08-26  Revised: 2024-02-28  Online: 2024-07-25  Published: 2024-07-10
Corresponding author: * JI Xiaohui (b. 1977), female, Ph.D., associate professor, mainly engaged in research on applications of artificial intelligence. E-mail: xhji@cugb.edu.cn
About the author: WAN Chengzhou (b. 1996), male, master's student, mainly engaged in research on deep learning and mineral image recognition. E-mail: czwan@email.cugb.edu.cn
Funding:
Subproject of the National Mineral Rock and Fossil Specimens Resource Center, under the National Science and Technology Resource Sharing Service Platform (NCSTI-RMF20230107)

Abstract:

In recent years, as deep learning has been applied across the geosciences, mineral image recognition has become increasingly important. Previous studies have applied deep learning to mineral image recognition with some success, but recognition accuracy on large-scale mineral datasets still needs improvement. Different minerals may differ only subtly in morphology, texture, and color, which is the setting that fine-grained recognition algorithms are designed for, yet fine-grained methods have rarely been used for mineral identification. This paper therefore proposes a fine-grained mineral identification method based on the Next-ViT model, which introduces progressive multi-granularity training with jigsaw puzzles to classify mineral images precisely. First, Next-ViT, which combines the advantages of the Transformer architecture and convolutional neural networks, serves as the feature extractor and captures rich image features. Next, a random jigsaw generator creates mineral puzzles at different granularity levels, carrying information that ranges from fine detail to the whole image. Training follows a progressive multi-granularity strategy: in the early stages the model focuses mainly on fine-grained features, learning the detailed information in the puzzles to distinguish between minerals; as training proceeds, it gradually shifts its attention to coarser granularity levels and learns more abstract, global information. In this way the model makes full use of information at different granularity levels, which improves identification accuracy. Experimental results show that the model achieves an accuracy of 86.5% on a dataset of 36 common minerals, indicating that fine-grained recognition methods are effective for mineral identification.

Key words: mineral identification, deep learning, Next-ViT, fine-grained identification, progressive multi-granularity-level training
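
The random jigsaw generator and the fine-to-coarse training order described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `jigsaw_shuffle`, the schedule `[8, 4, 2, 1]`, and the toy list-of-lists image are assumptions made for illustration, and the abstract does not state the patch counts actually used.

```python
import random


def jigsaw_shuffle(image, n, rng=None):
    """Split a 2-D image (list of equal-length rows) into an n x n grid of
    patches, shuffle the patches, and stitch them back together."""
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image sides must be divisible by n")
    ph, pw = height // n, width // n  # patch height and width

    # Cut the image into n*n patches, stored in row-major order.
    patches = [
        [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
        for r in range(n)
        for c in range(n)
    ]
    rng.shuffle(patches)

    # Reassemble: patch k is placed in grid cell (k // n, k % n).
    out = []
    for r in range(n):
        for y in range(ph):
            row = []
            for c in range(n):
                row.extend(patches[r * n + c][y])
            out.append(row)
    return out


# Hypothetical schedule (an assumption, not from the source):
# finest puzzles first, the intact image (n = 1) last.
GRANULARITY_SCHEDULE = [8, 4, 2, 1]

if __name__ == "__main__":
    image = [[y * 8 + x for x in range(8)] for y in range(8)]  # toy 8x8 "image"
    rng = random.Random(0)
    for n in GRANULARITY_SCHEDULE:
        puzzle = jigsaw_shuffle(image, n, rng)
        # Shuffling only moves pixels around; the set of pixel values is unchanged.
        assert sorted(sum(puzzle, [])) == sorted(sum(image, []))
        print(f"n={n}: first row of puzzle = {puzzle[0]}")
```

With a larger `n` the patches are smaller, so a classifier trained on the puzzle can rely only on local detail; with `n = 1` the image is returned unchanged and global structure is available again.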

CLC number:

TP391.4
P57

WAN Chengzhou, JI Xiaohui, YANG Mei, HE Mingyue, ZHANG Zhaochong, ZENG Shan, WANG Yuzhu. Mineral image recognition based on progressive deep learning across different granularity levels[J]. Earth Science Frontiers, 2024, 31(4): 112-118.

Figures and tables (7)

Table 1 Sample numbers of the 36 common minerals in the studied dataset

No.	Mineral	Samples	No.	Mineral	Samples	No.	Mineral	Samples
1	Agate	3 225	13	Elbaite	5 439	25	Pyrite	8 769
2	Albite	1 775	14	Epidote	3 720	26	Quartz	34 883
3	Almandine	2 018	15	Fluorite	26 336	27	Rhodochrosite	4 276
4	Anglesite	1 797	16	Galena	6 188	28	Ruby	820
5	Azurite	7 924	17	Native gold	4 545	29	Sapphire	996
6	Beryl	8 957	18	Halite	756	30	Schorl	2 099
7	Cassiterite	3 205	19	Hematite	5 728	31	Sphalerite	6 354
8	Chalcopyrite	3 253	20	Magnetite	2 445	32	Stibnite	2 475
9	Cinnabar	1 605	21	Malachite	6 796	33	Sulfur	1 890
10	Native copper	5 288	22	Marcasite	1 608	34	Topaz	3 577
11	Andradite	755	23	Opal	3 197	35	Torbernite	1 100
12	Diopside	1 586	24	Orpiment	720	36	Wulfenite	7 583
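
The counts in Table 1 are strongly imbalanced. A small sketch can make this concrete by summing the table and comparing the largest and smallest classes. The helper name `summarize` and the returned field names are illustrative, not from the source; the counts are copied from Table 1 in table order.

```python
# Sample counts from Table 1, listed in table order (No. 1 to No. 36).
COUNTS = [
    3225, 1775, 2018, 1797, 7924, 8957, 3205, 3253, 1605, 5288, 755, 1586,
    5439, 3720, 26336, 6188, 4545, 756, 5728, 2445, 6796, 1608, 3197, 720,
    8769, 34883, 4276, 820, 996, 2099, 6354, 2475, 1890, 3577, 1100, 7583,
]


def summarize(counts):
    """Return the total sample count and simple class-imbalance figures."""
    total = sum(counts)
    largest = max(range(len(counts)), key=counts.__getitem__)
    smallest = min(range(len(counts)), key=counts.__getitem__)
    return {
        "total": total,
        "largest_no": largest + 1,    # 1-based row number in Table 1
        "smallest_no": smallest + 1,  # 1-based row number in Table 1
        "imbalance_ratio": counts[largest] / counts[smallest],
        "largest_share": counts[largest] / total,
    }


if __name__ == "__main__":
    for key, value in summarize(COUNTS).items():
        print(key, value)
```

Under this tally the dataset holds 183,688 samples, and the largest class (No. 26, 34,883 samples) is roughly 48 times the size of the smallest (No. 24, 720 samples).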

Fig.1 Representative mineral images in the dataset (from left to right: the first 6 minerals in Table 1)

Fig.2 PMG-Next-ViT architecture used for mineral identification in this paper

Fig.3 NCB (left) and NTB (right) architectures

Table 2 Top-1 accuracy of various models for the 36 minerals

Method	Accuracy (top-1)
EfficientNet-b4 [10]	78.3%
ResNet50	76.4%
PMG-ResNet50	82.7%
Next-ViT-Small	78.2%
PMG-Next-ViT-Small	86.5%
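
To read Table 2 in terms of what progressive multi-granularity (PMG) training adds to each backbone, the gains can be worked out directly from the reported figures. The sketch below is only arithmetic on the table values; the helper names `pmg_gain` and `relative_error_reduction` are illustrative.

```python
# Top-1 accuracy (%) copied from Table 2.
TOP1 = {
    "EfficientNet-b4": 78.3,
    "ResNet50": 76.4,
    "PMG-ResNet50": 82.7,
    "Next-ViT-Small": 78.2,
    "PMG-Next-ViT-Small": 86.5,
}


def pmg_gain(backbone):
    """Percentage-point gain of the PMG variant over its plain backbone."""
    return round(TOP1[f"PMG-{backbone}"] - TOP1[backbone], 1)


def relative_error_reduction(backbone):
    """Fraction of the backbone's top-1 error removed by the PMG variant."""
    base_err = 100.0 - TOP1[backbone]
    pmg_err = 100.0 - TOP1[f"PMG-{backbone}"]
    return (base_err - pmg_err) / base_err


if __name__ == "__main__":
    for backbone in ("ResNet50", "Next-ViT-Small"):
        print(backbone, pmg_gain(backbone),
              f"{relative_error_reduction(backbone):.1%}")
```

The gain is 6.3 percentage points for ResNet50 and 8.3 for Next-ViT-Small, which corresponds to removing roughly 27% and 38% of the respective baseline top-1 error.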

Fig.4 Top-1 accuracy of Next-ViT-Small and PMG-Next-ViT-Small for the 36 minerals. The numbers on the horizontal axis correspond to the minerals listed in Table 1.

Fig.5 Confusion matrices for Next-ViT-Small (left) and PMG-Next-ViT-Small (right). The numbers on the horizontal axis correspond to the minerals listed in Table 1; the color axis represents accuracy.
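
For readers unfamiliar with how per-class top-1 accuracy (Fig. 4) relates to a confusion matrix (Fig. 5), the following sketch shows one common way to derive both from predictions. It assumes rows index the true class and columns the predicted class; the function names and the toy labels are hypothetical and do not come from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix with rows = true class, columns = predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_accuracy(matrix):
    """Diagonal of the row-normalized matrix: correct / total per true class."""
    accuracies = []
    for i, row in enumerate(matrix):
        total = sum(row)
        accuracies.append(row[i] / total if total else float("nan"))
    return accuracies


def top1_accuracy(matrix):
    """Overall fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    # Toy labels for three classes (illustrative only).
    y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2]
    y_pred = [0, 0, 1, 1, 1, 2, 2, 0, 2]
    cm = confusion_matrix(y_true, y_pred, 3)
    print(cm)
    print(per_class_accuracy(cm))
    print(top1_accuracy(cm))
```

Because the overall top-1 figure weights every sample equally, a large class can dominate it; the per-class values show whether small classes are recognized as well as large ones.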

References (23)

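[1]	HAO H Z, GU Q, HU X M. Research progress and prospects of machine-learning-based intelligent mineral identification methods[J]. Earth Science, 2021, 46(9): 3091-3106 (in Chinese).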
[2]	LOU W, ZHANG D X, BAYLESS R C. Review of mineral recognition and its future[J]. Applied Geochemistry, 2020, 122: 104727.
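[3]	XU S T, ZHOU Y Z. Experimental study on intelligent identification of ore minerals under the microscope based on deep learning[J]. Acta Petrologica Sinica, 2018, 34(11): 3244-3252 (in Chinese).
[4]	ZHOU Y Z, ZUO R G, LIU G, et al. A decade of leapfrog development in mathematical geosciences: big data and artificial intelligence algorithms are changing geology[J]. Bulletin of Mineralogy, Petrology and Geochemistry, 2021, 40(3): 556-573 (in Chinese).
[5]	ZHOU Y Z, ZHANG L J, ZHANG A D, et al. Big data mining and machine learning in geoscience[M]. Guangzhou: Sun Yat-sen University Press, 2018 (in Chinese).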
[6]	BAYKAN N A, YILMAZ N, KANSUN G, et al. Case study in effects of color spaces for mineral identification[J]. Scientific Research and Essays, 2010, 5(11): 1243-1253.
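[7]	GUO Y J, ZHOU Z, LIN H X, et al. Research on intelligent mineral identification methods based on deep learning[J]. Earth Science Frontiers, 2020, 27(5): 39-47 (in Chinese).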
[8]	AGRAWAL N, GOVIL H. A deep residual convolutional neural network for mineral classification[J]. Advances in Space Research, 2023, 71(8): 3186-3202.
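[9]	PENG W H, BAI L, SHANG S W, et al. Intelligent identification of common minerals based on an improved InceptionV3 model[J]. Geological Bulletin of China, 2019, 38(12): 2059-2066 (in Chinese).
[10]	YANG B, MA Y J, NI R P, et al. Intelligent mineral image recognition based on multi-scale densely connected networks[J]. Journal of Yunnan University (Natural Sciences Edition), 2022, 44(6): 1118-1126 (in Chinese).
[11]	YANG B, NI R P, GAO H, et al. Automatic mineral feature extraction and intelligent mineral identification model based on multi-resolution images[J]. Nonferrous Metals Engineering, 2022, 12(5): 84-93 (in Chinese).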
[12]	ZENG X, XIAO Y C, JI X H, et al. Mineral identification based on deep learning that combines image and Mohs hardness[J]. Minerals, 2021, 11(5): 506.
[13]	Mineral database[EB/OL]. [2024-04-24]. https://www.mindat.org/.
[14]	WEI X S, SONG Y Z, MAC AODHA O, et al. Fine-grained image analysis with deep learning: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(12): 8927-8948.
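[15]	MA Y, ZHI M, YIN Y J, et al. A survey of CNN and Transformer applications in fine-grained image recognition[J]. Computer Engineering and Applications, 2022, 58(19): 53-63 (in Chinese).
[16]	LI X X, JI X H, LI B. Deep learning methods for fine-grained image classification[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(10): 1830-1842 (in Chinese).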
[17]	LIN T Y, ROYCHOWDHURY A, MAJI S. Bilinear CNN models for fine-grained visual recognition[C]// Proceedings of the 2015 IEEE international conference on computer vision (ICCV), Santiago, Chile. New York: IEEE, 2015: 1449-1457.
[18]	ZHUANG P Q, WANG Y L, QIAO Y. Learning attentive pairwise interaction for fine-grained classification[C]// Proceedings of the AAAI conference on artificial intelligence, New York, USA. Washington: AAAI Press, 2020, 34(7): 13130-13137.
[19]	ZHENG H L, FU J L, ZHA Z J, et al. Learning deep bilinear transformation for fine-grained image representation[C]// Proceedings of the 33rd international conference on neural information processing systems (NeurIPS). Vancouver: Curran Associates Inc., 2019: 4277-4286.
[20]	DU R Y, CHANG D L, BHUNIA A K, et al. Fine-grained visual classification via progressive multi-granularity training of jigsaw patches[C]//VEDALDI A, BISCHOF H, BROX T, et al. Proceedings of European conference on computer vision. Cham: Springer, 2020: 153-168.
[21]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. (2021-01-03)[2023-08-15]. https://arxiv.org/abs/2010.11929.
[22]	LI J S, XIA X, LI W, et al. Next-ViT: next generation vision transformer for efficient deployment in realistic industrial scenarios[EB/OL]. (2022-08-16)[2024-04-24]. http://arxiv.org/abs/2207.05501v4.
[23]	DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of 2009 IEEE conference on computer vision and pattern recognition (CVPR), Miami, FL, USA. New York: IEEE, 2009: 248-255.
