TY - GEN
T1 - Cached Embedding with Random Selection
T2 - 12th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2020
AU - Yang, Yaofei
AU - Zhang, Hua Ping
AU - Wu, Linfang
AU - Liu, Xin
AU - Zhang, Yangsen
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Embedding is widely used in most natural language processing tasks, e.g., neural machine translation, text classification, text abstraction, and sentiment analysis. Word-based embedding is faster, while character-based embedding performs better. In this paper, we explore a way to combine these two embeddings to bridge the gap between word-based and character-based embedding in speed and performance. In the experiments and analysis of Hybrid Embedding, we found that it is difficult to make these two different embeddings generate the same embedding vector, but we still obtained a comparable result. Based on the results of this analysis, we explore a form of character-based embedding called Cached Embedding that can achieve almost the same performance and reduce the extra training time by almost half compared to character-based embedding.
KW - Cached Embedding
KW - Char-aware embedding
KW - Linguist
KW - Natural language processing
KW - Time reduction
KW - Training speed
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=85082295056&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-41964-6_5
DO - 10.1007/978-3-030-41964-6_5
M3 - Conference contribution
AN - SCOPUS:85082295056
SN - 9783030419639
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 51
EP - 62
BT - Intelligent Information and Database Systems - 12th Asian Conference, ACIIDS 2020, Proceedings
A2 - Nguyen, Ngoc Thanh
A2 - Trawinski, Bogdan
A2 - Jearanaitanakij, Kietikul
A2 - Chittayasothorn, Suphamit
A2 - Selamat, Ali
PB - Springer
Y2 - 23 March 2020 through 26 March 2020
ER -