Earth Science Frontiers, 2024, Vol. 31, Issue (4): 87-94. DOI: 10.13745/j.esf.sf.2024.5.6

Mineral identification based on data augmentation and ensemble learning

WANG Lin1, JI Xiaohui1,*, YANG Mei2, HE Mingyue2, ZHANG Zhaochong3, ZENG Shan1, WANG Yuzhu1

    1. School of Information Engineering, China University of Geosciences (Beijing), Beijing 100083, China
    2. National Mineral Rock and Fossil Specimens Resource Center from MOST, China University of Geosciences (Beijing), Beijing 100083, China
    3. School of Earth Sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China
  • Received: 2023-08-30 Revised: 2024-02-27 Online: 2024-07-25 Published: 2024-07-10

Abstract:

Mineral identification is a crucial aspect of the geosciences and is of great importance to resource exploration, rock classification, and geological monitoring. Traditional methods, however, are inefficient because they often rely on human experience and subjective judgment. In recent years, deep learning-based image classification has been used for accurate and rapid mineral identification. Although these studies have achieved some success, the number of identifiable mineral types is limited and the identification accuracy needs further improvement. This paper addresses the uneven distribution of image samples in a dataset of 36 common minerals. A deep convolutional generative adversarial network (DCGAN) is first used to generate images for data augmentation, focusing on the 11 minerals with low sample counts; the best set of generated images, selected by comparison, is used to expand the dataset. Next, to obtain a more reliable and precise identification model, ResNet, RegNet, EfficientNet, and Vision Transformer models that perform well on ImageNet are transferred to the mineral dataset. By combining the trained base models, 11 ensemble models are obtained, and 24 identification results are produced with them using two voting methods, average soft voting and weighted soft voting. These results are then compared to select the one with the highest accuracy. The experimental results demonstrate that data augmentation with DCGAN improved accuracy by 3.12% on average across all models. Among the ensemble models, weighted soft voting performed better and achieved the highest accuracy, 87.47%, on the augmented dataset.
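The two voting schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an ensemble is any subset of two or more of the four base models (which happens to give 11 ensembles) and that each model outputs a class-probability vector per image. The function names, the toy probabilities, and the weights are hypothetical placeholders; the abstract does not say how the weights were chosen.

```python
from itertools import combinations

BASE_MODELS = ["ResNet", "RegNet", "EfficientNet", "ViT"]


def ensembles(models, min_size=2):
    """Return every subset of at least `min_size` base models (assumed scheme)."""
    return [
        combo
        for size in range(min_size, len(models) + 1)
        for combo in combinations(models, size)
    ]


def soft_vote(probs, weights=None):
    """Combine per-model class probabilities for one image.

    probs:   list of probability vectors, one per model.
    weights: optional per-model weights; None means plain averaging.
    Returns (predicted class index, combined probability vector).
    """
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    n_classes = len(probs[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probs)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=combined.__getitem__)
    return predicted, combined


# Hypothetical outputs of two models for a single image with three classes.
model_a = [0.6, 0.4, 0.0]
model_b = [0.3, 0.7, 0.0]

n_ensembles = len(ensembles(BASE_MODELS))  # 11 under the subset assumption
avg_class, _ = soft_vote([model_a, model_b])
weighted_class, _ = soft_vote([model_a, model_b], weights=[3.0, 1.0])
```

In this toy case, average voting follows the second model's confident prediction, while the weighted vote, which trusts the first model three times as much, switches to the first model's class. This is the kind of disagreement that makes the two schemes worth comparing.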

Key words: mineral identification, deep convolutional generative adversarial networks, data augmentation, ensemble learning
