Change point detection for high dimensional data via kernel measure with application to human aging brain data

Jinjuan Wang; Na Li; Zhen Meng; Qizhai Li

doi:10.1002/sim.9881

Jinjuan Wang, Na Li, Zhen Meng^*, Qizhai Li^*

^*Corresponding author for this work

School of Mathematics and Statistics

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying the existence and locations of change points is a common task in many areas of applied statistics. Existing change point detection methods may produce unsatisfactory results for high-dimensional data because they rely on distributional assumptions that are hard to verify in practice. Moreover, some methods require parameters (such as the number of change points) to be estimated beforehand, making their power sensitive to these values. Here, we propose a kernel-based U-statistic to identify change points (KUCP) for high-dimensional data, which is free of distributional assumptions and sup-parameter estimation. Specifically, we employ a kernel function to describe similarities among the subjects and construct a U-statistic to test for the existence of a change point at a given location. The asymptotic properties of the U-statistic are derived. We also develop a procedure to locate the change points sequentially via a dichotomy algorithm. Extensive simulations demonstrate that KUCP has higher sensitivity in identifying the existence of change points and higher accuracy in locating them than its counterparts. We further illustrate its practical utility by analyzing gene expression data from the human brain to detect the time point at which gene expression profiles begin to change, which has been reported to be closely related to brain aging.
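The abstract does not give the form of the KUCP statistic, so the following is only a rough sketch of the general idea it describes: a kernel-based U-statistic evaluated at a candidate location, combined with a dichotomy (binary segmentation) search. As a stand-in, it uses the standard unbiased estimate of the squared maximum mean discrepancy with a Gaussian kernel, which is not the authors' statistic. The function names, the `bandwidth`, the fixed `threshold`, and `min_size` are all assumptions made for illustration; the paper calibrates its test through asymptotic results rather than a hand-picked threshold.

```python
import math
import random


def gaussian_gram(X, bandwidth):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    n = len(X)
    K = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = K[j][i] = math.exp(-sq / (2.0 * bandwidth ** 2))
    return K


def mmd_u(K, lo, t, hi):
    """Unbiased squared-MMD U-statistic between samples lo..t-1 and t..hi-1."""
    m, n = t - lo, hi - t
    within_left = sum(K[i][j] for i in range(lo, t) for j in range(lo, t) if i != j)
    within_right = sum(K[i][j] for i in range(t, hi) for j in range(t, hi) if i != j)
    between = sum(K[i][j] for i in range(lo, t) for j in range(t, hi))
    return (within_left / (m * (m - 1))
            + within_right / (n * (n - 1))
            - 2.0 * between / (m * n))


def best_split(K, lo, hi, min_size):
    """Return (statistic, location) with the largest statistic, or None."""
    candidates = range(lo + min_size, hi - min_size + 1)
    scored = [(mmd_u(K, lo, t, hi), t) for t in candidates]
    return max(scored) if scored else None


def binary_segmentation(K, lo, hi, threshold, min_size=5):
    """Recursively split [lo, hi) wherever the best statistic exceeds threshold.

    The fixed threshold is a hypothetical stand-in for a calibrated test.
    """
    best = best_split(K, lo, hi, min_size)
    if best is None or best[0] <= threshold:
        return []
    _, t = best
    return (binary_segmentation(K, lo, t, threshold, min_size)
            + [t]
            + binary_segmentation(K, t, hi, threshold, min_size))


# Toy data: 60 observations in 10 dimensions with a mean shift after index 30.
random.seed(0)
d = 10
X = ([[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(30)]
     + [[random.gauss(2.0, 1.0) for _ in range(d)] for _ in range(30)])
K = gaussian_gram(X, bandwidth=math.sqrt(d))
found = binary_segmentation(K, 0, len(X), threshold=0.3)
print(found)
```

Computing the Gram matrix once and reusing it for every candidate split keeps the search independent of the data dimension after the first step, which is one reason kernel summaries are convenient for high-dimensional data.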

Original language	English
Pages (from-to)	4644-4663
Number of pages	20
Journal	Statistics in Medicine
Volume	42
Issue number	25
DOIs	https://doi.org/10.1002/sim.9881
Publication status	Published - 10 Nov 2023

Keywords

U-statistic
change point detection
gene expression profile
high dimensional data
kernel-based method

Access to Document

10.1002/sim.9881

Cite this

Wang, J., Li, N., Meng, Z., & Li, Q. (2023). Change point detection for high dimensional data via kernel measure with application to human aging brain data. Statistics in Medicine, 42(25), 4644-4663. https://doi.org/10.1002/sim.9881

@article{6e8a9c432c5e443ca659945a406ef011,

title = "Change point detection for high dimensional data via kernel measure with application to human aging brain data",

abstract = "Identifying the existence and locations of change points is a common task in many areas of applied statistics. Existing change point detection methods may produce unsatisfactory results for high-dimensional data because they rely on distributional assumptions that are hard to verify in practice. Moreover, some methods require parameters (such as the number of change points) to be estimated beforehand, making their power sensitive to these values. Here, we propose a kernel-based U-statistic to identify change points (KUCP) for high-dimensional data, which is free of distributional assumptions and sup-parameter estimation. Specifically, we employ a kernel function to describe similarities among the subjects and construct a U-statistic to test for the existence of a change point at a given location. The asymptotic properties of the U-statistic are derived. We also develop a procedure to locate the change points sequentially via a dichotomy algorithm. Extensive simulations demonstrate that KUCP has higher sensitivity in identifying the existence of change points and higher accuracy in locating them than its counterparts. We further illustrate its practical utility by analyzing gene expression data from the human brain to detect the time point at which gene expression profiles begin to change, which has been reported to be closely related to brain aging.",

keywords = "U-statistic, change point detection, gene expression profile, high dimensional data, kernel-based method",

author = "Jinjuan Wang and Na Li and Zhen Meng and Qizhai Li",

note = "Publisher Copyright: {\textcopyright} 2023 John Wiley \& Sons Ltd.",

year = "2023",

month = nov,

day = "10",

doi = "10.1002/sim.9881",

language = "English",

volume = "42",

pages = "4644--4663",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "John Wiley and Sons Ltd",

number = "25",

}

TY - JOUR

T1 - Change point detection for high dimensional data via kernel measure with application to human aging brain data

AU - Wang, Jinjuan

AU - Li, Na

AU - Meng, Zhen

AU - Li, Qizhai

PY - 2023/11/10

Y1 - 2023/11/10

N2 - Identifying the existence and locations of change points is a common task in many areas of applied statistics. Existing change point detection methods may produce unsatisfactory results for high-dimensional data because they rely on distributional assumptions that are hard to verify in practice. Moreover, some methods require parameters (such as the number of change points) to be estimated beforehand, making their power sensitive to these values. Here, we propose a kernel-based U-statistic to identify change points (KUCP) for high-dimensional data, which is free of distributional assumptions and sup-parameter estimation. Specifically, we employ a kernel function to describe similarities among the subjects and construct a U-statistic to test for the existence of a change point at a given location. The asymptotic properties of the U-statistic are derived. We also develop a procedure to locate the change points sequentially via a dichotomy algorithm. Extensive simulations demonstrate that KUCP has higher sensitivity in identifying the existence of change points and higher accuracy in locating them than its counterparts. We further illustrate its practical utility by analyzing gene expression data from the human brain to detect the time point at which gene expression profiles begin to change, which has been reported to be closely related to brain aging.

KW - U-statistic

KW - change point detection

KW - gene expression profile

KW - high dimensional data

KW - kernel-based method

UR - http://www.scopus.com/inward/record.url?scp=85169450183&partnerID=8YFLogxK

U2 - 10.1002/sim.9881

DO - 10.1002/sim.9881

M3 - Article

C2 - 37649243

AN - SCOPUS:85169450183

SN - 0277-6715

VL - 42

SP - 4644

EP - 4663

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 25

ER -
