TY - GEN
T1 - Attentive Dual Embedding for Understanding Medical Concepts in Electronic Health Records
AU - Peng, Xueping
AU - Long, Guodong
AU - Pan, Shirui
AU - Jiang, Jing
AU - Niu, Zhendong
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Electronic health records contain a wealth of information on a patient's healthcare over many visits, such as diagnoses, treatments, drugs administered, and so on. The untapped potential of these data in healthcare analytics is vast. However, given that much of medical information is a cause-and-effect science, new embedding methods are required to ensure the learning representations reflect the comprehensive interplays between medical concepts and their relationships over time. Unlike one-hot encoding, a distributed representation should preserve these complex interactions as high-quality inputs for machine learning-based healthcare analytics tasks. Therefore, we propose a novel attentive dual embedding method called MC2Vec. MC2Vec captures the proximity relationships between medical concepts through a two-step optimization framework that recursively refines the embedding for superior output. The framework comprises a Skip-gram model to generate the initial embedding and an attentive CBOW model to fine-tune the embedding with temporal information gleaned from sequences of patient visits. Experiments with two public datasets demonstrate that MC2Vec produces embeddings of higher quality than five state-of-the-art methods.
AB - Electronic health records contain a wealth of information on a patient's healthcare over many visits, such as diagnoses, treatments, drugs administered, and so on. The untapped potential of these data in healthcare analytics is vast. However, given that much of medical information is a cause-and-effect science, new embedding methods are required to ensure the learning representations reflect the comprehensive interplays between medical concepts and their relationships over time. Unlike one-hot encoding, a distributed representation should preserve these complex interactions as high-quality inputs for machine learning-based healthcare analytics tasks. Therefore, we propose a novel attentive dual embedding method called MC2Vec. MC2Vec captures the proximity relationships between medical concepts through a two-step optimization framework that recursively refines the embedding for superior output. The framework comprises a Skip-gram model to generate the initial embedding and an attentive CBOW model to fine-tune the embedding with temporal information gleaned from sequences of patient visits. Experiments with two public datasets demonstrate that MC2Vec produces embeddings of higher quality than five state-of-the-art methods.
KW - attention mechanism
KW - dual embedding
KW - med2Vec
KW - medical concept embedding
UR - http://www.scopus.com/inward/record.url?scp=85073253728&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2019.8852429
DO - 10.1109/IJCNN.2019.8852429
M3 - Conference contribution
AN - SCOPUS:85073253728
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
Y2 - 14 July 2019 through 19 July 2019
ER -