TY - JOUR
T1 - ELM ∗
T2 - distributed extreme learning machine with MapReduce
AU - Xin, Junchang
AU - Wang, Zhiqiong
AU - Chen, Chen
AU - Ding, Linlin
AU - Wang, Guoren
AU - Zhao, Yuhai
N1 - Publisher Copyright: © 2013, Springer Science+Business Media New York.
PY - 2014/9/1
Y1 - 2014/9/1
N2 - Extreme Learning Machine (ELM) has been widely used in many fields such as text classification, image recognition and bioinformatics, as it provides good generalization performance at an extremely fast learning speed. However, as the data volume in real-world applications grows ever larger, the traditional centralized ELM cannot learn such massive data efficiently. Therefore, in this paper, we propose a novel Distributed Extreme Learning Machine based on the MapReduce framework, named ELM ∗, which overcomes the weakness of traditional ELM in learning from huge datasets. Firstly, after analyzing the properties of traditional ELM, we find that the most expensive part of computing the Moore-Penrose generalized inverse in the output weight vector calculation is the matrix multiplication. Then, as matrix multiplication is decomposable, a Distributed Extreme Learning Machine (ELM ∗) based on the MapReduce framework can be developed, which first calculates the matrix multiplication in parallel with MapReduce and then calculates the corresponding output weight vector with centralized computing. Massive data can therefore be learned effectively. Finally, we conduct extensive experiments on synthetic data to verify the effectiveness and efficiency of our proposed ELM ∗ in learning massive data with various experimental settings.
AB - Extreme Learning Machine (ELM) has been widely used in many fields such as text classification, image recognition and bioinformatics, as it provides good generalization performance at an extremely fast learning speed. However, as the data volume in real-world applications grows ever larger, the traditional centralized ELM cannot learn such massive data efficiently. Therefore, in this paper, we propose a novel Distributed Extreme Learning Machine based on the MapReduce framework, named ELM ∗, which overcomes the weakness of traditional ELM in learning from huge datasets. Firstly, after analyzing the properties of traditional ELM, we find that the most expensive part of computing the Moore-Penrose generalized inverse in the output weight vector calculation is the matrix multiplication. Then, as matrix multiplication is decomposable, a Distributed Extreme Learning Machine (ELM ∗) based on the MapReduce framework can be developed, which first calculates the matrix multiplication in parallel with MapReduce and then calculates the corresponding output weight vector with centralized computing. Massive data can therefore be learned effectively. Finally, we conduct extensive experiments on synthetic data to verify the effectiveness and efficiency of our proposed ELM ∗ in learning massive data with various experimental settings.
KW - Cloud computing
KW - Extreme learning machine
KW - MapReduce
KW - Massive data processing
UR - https://www.scopus.com/pages/publications/84927804633
U2 - 10.1007/s11280-013-0236-2
DO - 10.1007/s11280-013-0236-2
M3 - Article
AN - SCOPUS:84927804633
SN - 1386-145X
VL - 17
SP - 1189
EP - 1204
JO - World Wide Web
JF - World Wide Web
IS - 5
ER -