Sparse and Hierarchical Transformer for Survival Analysis on Whole Slide Images

Rui Yan; Zhilong Lv; Zhidong Yang; Senlin Lin; Chunhou Zheng; Fa Zhang

doi:10.1109/JBHI.2023.3307584

Corresponding author: Fa Zhang

School of Medical and Technology

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Transformer-based methods provide a good opportunity for modeling the global context of gigapixel whole slide images (WSIs). However, two main problems remain in applying Transformers to WSI-based survival analysis. First, the training data for survival analysis is limited, which makes models prone to overfitting. This problem is even worse for Transformer-based models, which require large-scale data to train. Second, a WSI has extremely high resolution (up to 150,000 × 150,000 pixels) and is typically organized as a multi-resolution pyramid. A vanilla Transformer cannot model the hierarchical structure of a WSI (such as patch cluster-level relationships), which makes it incapable of learning a hierarchical WSI representation. To address these problems, we propose a novel Sparse and Hierarchical Transformer (SH-Transformer) for survival analysis. Specifically, we introduce sparse self-attention to alleviate the overfitting problem and propose a hierarchical Transformer structure to learn the hierarchical WSI representation. Experimental results on three WSI datasets show that the proposed framework outperforms state-of-the-art methods.

Original language	English
Pages (from-to)	7-18
Number of pages	12
Journal	IEEE Journal of Biomedical and Health Informatics
Volume	28
Issue number	1
DOIs	https://doi.org/10.1109/JBHI.2023.3307584
Publication status	Published - 1 Jan 2024

Keywords

Hierarchical representation
pathological image analysis
sparse transformer
survival analysis

Cite this

@article{d0b98b10218d4b1ca44169880b212be9,
title = "Sparse and Hierarchical Transformer for Survival Analysis on Whole Slide Images",
abstract = "The Transformer-based methods provide a good opportunity for modeling the global context of gigapixel whole slide image (WSI), however, there are still two main problems in applying Transformer to WSI-based survival analysis task. First, the training data for survival analysis is limited, which makes the model prone to overfitting. This problem is even worse for Transformer-based models which require large-scale data to train. Second, WSI is of extremely high resolution (up to 150,000 × 150,000 pixels) and is typically organized as a multi-resolution pyramid. Vanilla Transformer cannot model the hierarchical structure of WSI (such as patch cluster-level relationships), which makes it incapable of learning hierarchical WSI representation. To address these problems, in this article, we propose a novel Sparse and Hierarchical Transformer (SH-Transformer) for survival analysis. Specifically, we introduce sparse self-attention to alleviate the overfitting problem, and propose a hierarchical Transformer structure to learn the hierarchical WSI representation. Experimental results based on three WSI datasets show that the proposed framework outperforms the state-of-the-art methods.",
keywords = "Hierarchical representation, pathological image analysis, sparse transformer, survival analysis",
author = "Rui Yan and Zhilong Lv and Zhidong Yang and Senlin Lin and Chunhou Zheng and Fa Zhang",
note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",
year = "2024",
month = jan,
day = "1",
doi = "10.1109/JBHI.2023.3307584",
language = "English",
volume = "28",
pages = "7--18",
journal = "IEEE Journal of Biomedical and Health Informatics",
issn = "2168-2194",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "1",
}

TY  - JOUR
T1  - Sparse and Hierarchical Transformer for Survival Analysis on Whole Slide Images
AU  - Yan, Rui
AU  - Lv, Zhilong
AU  - Yang, Zhidong
AU  - Lin, Senlin
AU  - Zheng, Chunhou
AU  - Zhang, Fa
PY  - 2024/1/1
Y1  - 2024/1/1
N2  - The Transformer-based methods provide a good opportunity for modeling the global context of gigapixel whole slide image (WSI), however, there are still two main problems in applying Transformer to WSI-based survival analysis task. First, the training data for survival analysis is limited, which makes the model prone to overfitting. This problem is even worse for Transformer-based models which require large-scale data to train. Second, WSI is of extremely high resolution (up to 150,000 × 150,000 pixels) and is typically organized as a multi-resolution pyramid. Vanilla Transformer cannot model the hierarchical structure of WSI (such as patch cluster-level relationships), which makes it incapable of learning hierarchical WSI representation. To address these problems, in this article, we propose a novel Sparse and Hierarchical Transformer (SH-Transformer) for survival analysis. Specifically, we introduce sparse self-attention to alleviate the overfitting problem, and propose a hierarchical Transformer structure to learn the hierarchical WSI representation. Experimental results based on three WSI datasets show that the proposed framework outperforms the state-of-the-art methods.
KW  - Hierarchical representation
KW  - pathological image analysis
KW  - sparse transformer
KW  - survival analysis
UR  - http://www.scopus.com/inward/record.url?scp=85168719849&partnerID=8YFLogxK
U2  - 10.1109/JBHI.2023.3307584
DO  - 10.1109/JBHI.2023.3307584
M3  - Article
C2  - 37607153
AN  - SCOPUS:85168719849
SN  - 2168-2194
VL  - 28
SP  - 7
EP  - 18
JO  - IEEE Journal of Biomedical and Health Informatics
JF  - IEEE Journal of Biomedical and Health Informatics
IS  - 1
ER  -
