OreFormer: Ore Sorting Transformer Based on ConvNet and Visual Attention

Yang Liu, Xueyi Wang, Zelin Zhang, Fang Deng*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Intelligent ore sorting stands as a pivotal technology in contemporary mining and production. To establish a more efficient general framework for mineral image recognition, this paper proposes to combine the inductive biases of convolutional neural networks (i.e., locality and translation invariance) with the non-local qualities of the transformer architecture (i.e., globality and long-range dependencies) by introducing the self-attention mechanism, leading to a new series of models called OreFormer. In fine-grained coal sorting, OreFormer demonstrated superior performance on gas coal, coking coal, and anthracite, exceeding that of purely convolutional networks and vision transformers. At the same time, the OreFormer variant models achieved a favorable tradeoff between classification performance and efficiency, i.e., they simultaneously maintain higher classification accuracy, lower computational complexity, and smaller model size. In addition, OreFormer exhibited excellent discriminative and feature representation capability, accurately distinguishing mineral particles with only minor apparent differences between categories.
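
The abstract does not spell out the block design, but the stated idea (convolutional locality combined with transformer-style self-attention for long-range dependencies) can be illustrated with a minimal, hypothetical hybrid block. The layer choices, channel sizes, and the sequential conv-then-attention ordering below are assumptions for illustration only, not the published OreFormer architecture.

```python
# Hypothetical sketch of a hybrid ConvNet + self-attention block, loosely
# following the idea described in the abstract: a convolutional branch for
# local features and a self-attention branch for global context. All layer
# choices and dimensions are assumptions, not the published OreFormer design.
import torch
import torch.nn as nn


class HybridConvAttentionBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local feature extraction: 3x3 convolution (locality, translation invariance).
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.GELU(),
        )
        # Global context: multi-head self-attention over flattened spatial tokens.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        x = x + self.conv(x)                   # residual local branch
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (batch, h*w, channels)
        q = self.norm(tokens)
        attn_out, _ = self.attn(q, q, q)       # long-range dependencies
        tokens = tokens + attn_out             # residual global branch
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    block = HybridConvAttentionBlock(channels=64)
    out = block(torch.randn(2, 64, 32, 32))
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```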

Original language: English
Pages (from-to): 521-538
Number of pages: 18
Journal: Natural Resources Research
Volume: 33
Issue number: 2
DOIs
Publication status: Published - Apr 2024

Keywords

  • Fine-grained multi-type and multi-class
  • Locality
  • Long-range dependencies
  • Ore sorting
  • Self-attention
