TY - JOUR
T1 - Algorithm analysis and efficient parallelization of the single particle reconstruction software package
T2 - EMAN
AU - Fan, Liya
AU - Zhang, Fa
AU - Wang, Gongming
AU - Liu, Zhiyong
PY - 2010/12
Y1 - 2010/12
AB - Single particle reconstruction is one of the most important techniques for determining the three-dimensional structures of macromolecules. It has attracted growing attention in recent years because of its distinctive features. Its application, however, is severely constrained by extremely long processing times and a lack of efficient parallel implementations. This study optimizes and parallelizes EMAN, one of the most widely used software packages for single particle reconstruction. By analyzing the algorithms of its major components, the authors find that the key problem is achieving ideal load balancing with low communication costs. A self-adaptive dynamic scheduling algorithm is introduced to solve this problem; it applies not only to EMAN but also to other similar scheduling problems with independent tasks. Experiments show that, after optimization, the serial execution time of the authors' implementation is 11.50% lower than that of EMAN. With the self-adaptive scheduling algorithm, the implementation also achieves much higher speedups than EMAN: speedups of the most time-consuming component, classification, are close to linear, and parallel efficiency on 16 CPU cores is 29.8% higher than that of EMAN's implementation. The implementation can therefore make full use of available computing resources and dramatically reduce the processing time of single particle reconstruction.
KW - Bioinformation
KW - EMAN
KW - Parallel computing
KW - Scheduling algorithm
KW - Single particle reconstruction
UR - http://www.scopus.com/inward/record.url?scp=78650951358&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:78650951358
SN - 1000-1239
VL - 47
SP - 2165
EP - 2176
JO - Jisuanji Yanjiu yu Fazhan/Computer Research and Development
JF - Jisuanji Yanjiu yu Fazhan/Computer Research and Development
IS - 12
ER -