OreFormer: Ore Sorting Transformer Based on ConvNet and Visual Attention

Yang Liu; Xueyi Wang; Zelin Zhang; Fang Deng

doi:10.1007/s11053-023-10298-x

Yang Liu, Xueyi Wang, Zelin Zhang, Fang Deng^*

^*Corresponding author for this work

School of Automation

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Intelligent ore sorting is a pivotal technology in contemporary mining and production. To establish a more efficient general framework for mineral image recognition, this paper combines the inductive biases of convolutional neural networks (i.e., locality and translation invariance) with the non-local qualities of the transformer architecture (i.e., globality and long-range dependencies) and introduces the self-attention mechanism, yielding a new series of models called OreFormer. In fine-grained coal sorting, OreFormer outperformed both pure ConvNets and vision transformers across gas coal, coking coal, and anthracite. At the same time, OreFormer variants achieved a favorable tradeoff between classification performance and efficiency (i.e., they simultaneously maintain higher classification accuracy, lower computational complexity, and smaller model size). In addition, OreFormer showed excellent discriminative and feature representation capability, accurately distinguishing mineral particles with only minor apparent differences between categories.

Original language	English
Pages (from-to)	521-538
Number of pages	18
Journal	Natural Resources Research
Volume	33
Issue number	2
DOIs	https://doi.org/10.1007/s11053-023-10298-x
Publication status	Published - Apr 2024

Keywords

Fine-grained multi-type and multi-class
Locality
Long-range dependencies
Ore sorting
Self-attention

Access to Document

10.1007/s11053-023-10298-x

Cite this

@article{c71acda53fad463b9ed061388a1d44f0,
  title = "OreFormer: Ore Sorting Transformer Based on ConvNet and Visual Attention",
  abstract = "Intelligent ore sorting stands as a pivotal technology in contemporary mining and production. To establish a more efficient general framework for mineral image recognition, this paper proposes to combine the inductive biases of convolutional neural networks (i.e., locality and translation invariance) with the non-local qualities of transformer architecture (i.e., globality and long-range dependencies) and introduce the self-attention mechanism, which leads to a new series model, called OreFormer. In fine-grained coal sorting, OreFormer demonstrated preferred performance across gas coal, coking coal, and anthracite, which are higher than the purely ConvNet or vision transformer. At the same time, OreFormer variant models achieved a preferred tradeoff between classification performance and efficiency (i.e., they can simultaneously maintain higher classification accuracy), less computational complexity, and smaller model size. In addition, OreFormer had excellent discriminative and feature representation capability, which can distinguish accurately mineral particles with minor apparent differences between categories.",
  keywords = "Fine-grained multi-type and multi-class, Locality, Long-range dependencies, Ore sorting, Self-attention",
  author = "Yang Liu and Xueyi Wang and Zelin Zhang and Fang Deng",
  note = "Publisher Copyright: {\textcopyright} International Association for Mathematical Geosciences 2024.",
  year = "2024",
  month = apr,
  doi = "10.1007/s11053-023-10298-x",
  language = "English",
  volume = "33",
  pages = "521--538",
  journal = "Natural Resources Research",
  issn = "1520-7439",
  publisher = "Springer Netherlands",
  number = "2",
}

TY - JOUR

T1 - OreFormer

T2 - Ore Sorting Transformer Based on ConvNet and Visual Attention

AU - Liu, Yang

AU - Wang, Xueyi

AU - Zhang, Zelin

AU - Deng, Fang

N1 - Publisher Copyright: © International Association for Mathematical Geosciences 2024.

PY - 2024/4

Y1 - 2024/4

N2 - Intelligent ore sorting stands as a pivotal technology in contemporary mining and production. To establish a more efficient general framework for mineral image recognition, this paper proposes to combine the inductive biases of convolutional neural networks (i.e., locality and translation invariance) with the non-local qualities of transformer architecture (i.e., globality and long-range dependencies) and introduce the self-attention mechanism, which leads to a new series model, called OreFormer. In fine-grained coal sorting, OreFormer demonstrated preferred performance across gas coal, coking coal, and anthracite, which are higher than the purely ConvNet or vision transformer. At the same time, OreFormer variant models achieved a preferred tradeoff between classification performance and efficiency (i.e., they can simultaneously maintain higher classification accuracy), less computational complexity, and smaller model size. In addition, OreFormer had excellent discriminative and feature representation capability, which can distinguish accurately mineral particles with minor apparent differences between categories.

KW - Fine-grained multi-type and multi-class

KW - Locality

KW - Long-range dependencies

KW - Ore sorting

KW - Self-attention

UR - http://www.scopus.com/inward/record.url?scp=85182188227&partnerID=8YFLogxK

U2 - 10.1007/s11053-023-10298-x

DO - 10.1007/s11053-023-10298-x

M3 - Article

AN - SCOPUS:85182188227

SN - 1520-7439

VL - 33

SP - 521

EP - 538

JO - Natural Resources Research

JF - Natural Resources Research

IS - 2

ER -
