TY - GEN
T1 - PickerOptimizer
T2 - 17th International Symposium on Bioinformatics Research and Applications, ISBRA 2021
AU - Li, Hongjia
AU - Chen, Ge
AU - Gao, Shan
AU - Li, Jintao
AU - Zhang, Fa
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Cryo-electron microscopy single particle analysis requires tens of thousands of particle projections for the structural determination of macromolecules. To free researchers from laborious particle picking work, a number of fully automatic and semi-automatic particle picking approaches have been proposed. However, due to the presence of carbon and different types of high-contrast contamination, these approaches tend to select a non-negligible number of false-positive particles, which affects the subsequent 3D reconstruction. To overcome this limitation, we present a deep learning-based particle pruning algorithm, PickerOptimizer, to separate erroneously picked particles from the correct ones. PickerOptimizer trains a convolutional neural network using transfer learning, so that the pre-trained model maintains strong generalization ability and can be quickly adapted to the characteristics of a new dataset. We build the first cryo-EM dataset for image classification pre-training, which contains particles, carbon regions and high-contrast contamination from 14 different EMPIAR entries. PickerOptimizer works by fine-tuning the pre-trained model with only a few manually labeled samples from new datasets. Experiments on several public datasets show that PickerOptimizer is an efficient approach for particle post-processing, achieving F1 scores above 90%. Moreover, the method identifies false-positive particles more accurately than other pruning strategies. A case study further shows that PickerOptimizer can improve conventional particle pickers and complement deep-learning-based ones. The source code, pre-trained models and datasets are available at https://github.com/LiHongjia-ict/PickerOptimizer/.
AB - Cryo-electron microscopy single particle analysis requires tens of thousands of particle projections for the structural determination of macromolecules. To free researchers from laborious particle picking work, a number of fully automatic and semi-automatic particle picking approaches have been proposed. However, due to the presence of carbon and different types of high-contrast contamination, these approaches tend to select a non-negligible number of false-positive particles, which affects the subsequent 3D reconstruction. To overcome this limitation, we present a deep learning-based particle pruning algorithm, PickerOptimizer, to separate erroneously picked particles from the correct ones. PickerOptimizer trains a convolutional neural network using transfer learning, so that the pre-trained model maintains strong generalization ability and can be quickly adapted to the characteristics of a new dataset. We build the first cryo-EM dataset for image classification pre-training, which contains particles, carbon regions and high-contrast contamination from 14 different EMPIAR entries. PickerOptimizer works by fine-tuning the pre-trained model with only a few manually labeled samples from new datasets. Experiments on several public datasets show that PickerOptimizer is an efficient approach for particle post-processing, achieving F1 scores above 90%. Moreover, the method identifies false-positive particles more accurately than other pruning strategies. A case study further shows that PickerOptimizer can improve conventional particle pickers and complement deep-learning-based ones. The source code, pre-trained models and datasets are available at https://github.com/LiHongjia-ict/PickerOptimizer/.
KW - Cryo-electron microscopy
KW - Deep learning
KW - Particle pruning
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85120625160&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-91415-8_46
DO - 10.1007/978-3-030-91415-8_46
M3 - Conference contribution
AN - SCOPUS:85120625160
SN - 9783030914141
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 549
EP - 560
BT - Bioinformatics Research and Applications - 17th International Symposium, ISBRA 2021, Proceedings
A2 - Wei, Yanjie
A2 - Li, Min
A2 - Skums, Pavel
A2 - Cai, Zhipeng
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 26 November 2021 through 28 November 2021
ER -