TY - JOUR
T1 - Improve the Resolution and Parallel Performance of the Three-Dimensional Refine Algorithm in RELION Using CUDA and MPI
AU - Zhang, Jingrong
AU - Wang, Zihao
AU - Liu, Zhiyong
AU - Zhang, Fa
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2021/3/1
Y1 - 2021/3/1
N2 - In cryo-electron microscopy, RELION is a powerful tool for high-resolution reconstruction. Due to the complicated imaging procedure and the heterogeneity of particles, some of the selected particle images carry more disturbing information than others. However, the current RELION treats all particle images equally. In our work, we extend RELION's model with one scalar parameter that scores the contribution of each particle according to the error between the experimental particle and the corresponding reprojection. These scores down-weight potentially poor particles, thereby accelerating convergence. In addition, RELION currently has no sophisticated memory management system, so fragmentation on the GPU increases with iterations and eventually crashes the program. We designed a stack-based memory management system to guarantee the stability of RELION and to optimize memory usage. To further reduce memory usage, we developed a customized compressed data structure for the memory-demanding weight array. Moreover, to speed up the GPU version of RELION, we propose two highly efficient parallel algorithms, one for weight calculation and one for weight selection. Experiments show that, compared with RELION, the optimized three-dimensional refine algorithm speeds up convergence, the memory system avoids memory fragmentation, and a better speed-up ratio is obtained.
AB - In cryo-electron microscopy, RELION is a powerful tool for high-resolution reconstruction. Due to the complicated imaging procedure and the heterogeneity of particles, some of the selected particle images carry more disturbing information than others. However, the current RELION treats all particle images equally. In our work, we extend RELION's model with one scalar parameter that scores the contribution of each particle according to the error between the experimental particle and the corresponding reprojection. These scores down-weight potentially poor particles, thereby accelerating convergence. In addition, RELION currently has no sophisticated memory management system, so fragmentation on the GPU increases with iterations and eventually crashes the program. We designed a stack-based memory management system to guarantee the stability of RELION and to optimize memory usage. To further reduce memory usage, we developed a customized compressed data structure for the memory-demanding weight array. Moreover, to speed up the GPU version of RELION, we propose two highly efficient parallel algorithms, one for weight calculation and one for weight selection. Experiments show that, compared with RELION, the optimized three-dimensional refine algorithm speeds up convergence, the memory system avoids memory fragmentation, and a better speed-up ratio is obtained.
KW - CUDA
KW - RELION
KW - Top-k selection
KW - cryoEM
KW - statistical method
UR - http://www.scopus.com/inward/record.url?scp=85104118005&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2019.2929171
DO - 10.1109/TCBB.2019.2929171
M3 - Article
C2 - 31329127
AN - SCOPUS:85104118005
SN - 1545-5963
VL - 18
SP - 583
EP - 595
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 2
M1 - 8766887
ER -