ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers

  • Xiaoyan Yuan
  • , Wei Wang*
  • , Han Liu
  • , Jian Chen
  • , Xiping Hu
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Self-supervised learning (SSL) has garnered increasing attention in electrocardiogram (ECG) analysis for its effectiveness in resource-limited settings. Existing state-of-the-art SSL methods rely on time-frequency detail reconstruction, but due to the inherent redundancy of ECG signals and individual variability, these approaches often yield suboptimal performance. In contrast, discrete label prediction becomes a superior pre-training objective by encouraging models to efficiently abstract ECG high-level semantics. However, the continuity and significant variability of ECG signals pose a challenge in generating semantically discrete labels. To address this issue, we propose an ECG pretraining framework with a self-distillation semantic tokenizer (ECG2TOK), which maps continuous ECG signals into discrete labels for self-supervised training. Specifically, the tokenizer extracts semantically aware embeddings of ECG by self-distillation and performs online clustering to generate semantically rich discrete labels. Subsequently, the SSL model is trained in conjunction with masking strategies and discrete label prediction to facilitate the abstraction of high-level semantic representations. We evaluate ECG2TOK in six downstream tasks, demonstrating that ECG2TOK efficiently achieves state-of-the-art performance and up to a 30.73% AUC increase in low-resource scenarios. Moreover, visualization experiments demonstrate that the discrete labels generated by ECG2TOK exhibit consistent semantics closely associated with clinical features. Our code is available on https://github.com/YXYanova/ECG2TOK.

Original languageEnglish
Title of host publicationProceedings of the 34th International Joint Conference on Artificial Intelligence, IJCAI 2025
EditorsJames Kwok
PublisherInternational Joint Conferences on Artificial Intelligence
Pages9990-9998
Number of pages9
ISBN (Electronic)9781956792065
DOIs
Publication statusPublished - 2025
Externally publishedYes
Event34th Internationa Joint Conference on Artificial Intelligence, IJCAI 2025 - Montreal, Canada
Duration: 16 Aug 202522 Aug 2025

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
ISSN (Print)1045-0823

Conference

Conference34th Internationa Joint Conference on Artificial Intelligence, IJCAI 2025
Country/TerritoryCanada
CityMontreal
Period16/08/2522/08/25

Fingerprint

Dive into the research topics of 'ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers'. Together they form a unique fingerprint.

Cite this