TY - GEN
T1 - Attributed Network Embedding in Streaming Style
AU - Wu, Anbiao
AU - Yuan, Ye
AU - Li, Changsheng
AU - Ma, Yuliang
AU - Zhang, Hao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Attributed network embedding (ANE) learns low-dimensional embeddings for nodes in attributed graphs, which can facilitate several data analysis tasks. However, existing ANE methods fail to handle scenarios in which attributes are generated continuously. Continuously generated attributes accumulate, incurring high storage costs in existing methods. Furthermore, because storage limitations force old attributes to be discarded as new ones are generated, existing methods struggle to integrate the new attribute information into embeddings generated from old attributes. Therefore, we propose a novel ANE framework named SANE (Streaming-style ANE), featuring a 'memory' capability: when the embeddings are updated for new attributes, old attribute information can be partly preserved. In SANE, we first define forward and backward affinity between nodes and attributes by viewing a node as a source or target node. The definition guides quick computation of affinity vectors that integrate both topological and attribute information. Meanwhile, we propose an augmentation strategy that enriches node attribute information to enhance the quality of node embeddings. Leveraging the augmented attributes, we iteratively generate forward and backward affinity vectors, which quantify node-attribute affinity in two directions. Subsequently, we achieve a streaming-style update of node embeddings by applying matrix sketching to these iteratively generated vectors. Furthermore, capitalizing on the mergeability of matrix sketching, we efficiently integrate information from newly generated attributes into node embeddings. Extensive experiments on five real datasets demonstrate that SANE surpasses state-of-the-art algorithms in node classification and link prediction. SANE's ability to quickly incorporate new attribute information into embeddings is validated through simulation experiments.
AB - Attributed network embedding (ANE) learns low-dimensional embeddings for nodes in attributed graphs, which can facilitate several data analysis tasks. However, existing ANE methods fail to handle scenarios in which attributes are generated continuously. Continuously generated attributes accumulate, incurring high storage costs in existing methods. Furthermore, because storage limitations force old attributes to be discarded as new ones are generated, existing methods struggle to integrate the new attribute information into embeddings generated from old attributes. Therefore, we propose a novel ANE framework named SANE (Streaming-style ANE), featuring a 'memory' capability: when the embeddings are updated for new attributes, old attribute information can be partly preserved. In SANE, we first define forward and backward affinity between nodes and attributes by viewing a node as a source or target node. The definition guides quick computation of affinity vectors that integrate both topological and attribute information. Meanwhile, we propose an augmentation strategy that enriches node attribute information to enhance the quality of node embeddings. Leveraging the augmented attributes, we iteratively generate forward and backward affinity vectors, which quantify node-attribute affinity in two directions. Subsequently, we achieve a streaming-style update of node embeddings by applying matrix sketching to these iteratively generated vectors. Furthermore, capitalizing on the mergeability of matrix sketching, we efficiently integrate information from newly generated attributes into node embeddings. Extensive experiments on five real datasets demonstrate that SANE surpasses state-of-the-art algorithms in node classification and link prediction. SANE's ability to quickly incorporate new attribute information into embeddings is validated through simulation experiments.
KW - Attributed network
KW - matrix sketching
KW - network embedding
UR - http://www.scopus.com/inward/record.url?scp=85200455503&partnerID=8YFLogxK
U2 - 10.1109/ICDE60146.2024.00243
DO - 10.1109/ICDE60146.2024.00243
M3 - Conference contribution
AN - SCOPUS:85200455503
T3 - Proceedings - International Conference on Data Engineering
SP - 3138
EP - 3150
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
Y2 - 13 May 2024 through 17 May 2024
ER -