TY - JOUR
T1 - A resample-replace lasso procedure for combining high-dimensional markers with limit of detection
AU - Wang, Jinjuan
AU - Zhao, Yunpeng
AU - Tang, Larry L.
AU - Mueller, Claudius
AU - Li, Qizhai
N1 - Publisher Copyright:
© 2021 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2022
Y1 - 2022
N2 - In disease screening, a biomarker combination developed by combining multiple markers tends to have a higher sensitivity than an individual marker. Parametric methods for marker combination rely on the inverse of covariance matrices, which is often a non-trivial problem for high-dimensional data generated by modern high-throughput technologies. Additionally, another common problem in disease diagnosis is the existence of a limit of detection (LOD) for an instrument; that is, when a biomarker's value falls below the limit, it cannot be observed and is assigned an NA value. To handle these two challenges in combining high-dimensional biomarkers in the presence of LOD, we propose a resample-replace lasso procedure. We first impute the values below LOD and then use the graphical lasso method to estimate the means and precision matrices for the high-dimensional biomarkers. The simulation results show that our method outperforms alternative methods such as substituting NA values with LOD values or removing observations that have NA values. A real case analysis on a protein profiling study of glioblastoma patients on their survival status indicates that the biomarker combination obtained through the proposed method is more accurate in distinguishing between the two groups.
AB - In disease screening, a biomarker combination developed by combining multiple markers tends to have a higher sensitivity than an individual marker. Parametric methods for marker combination rely on the inverse of covariance matrices, which is often a non-trivial problem for high-dimensional data generated by modern high-throughput technologies. Additionally, another common problem in disease diagnosis is the existence of a limit of detection (LOD) for an instrument; that is, when a biomarker's value falls below the limit, it cannot be observed and is assigned an NA value. To handle these two challenges in combining high-dimensional biomarkers in the presence of LOD, we propose a resample-replace lasso procedure. We first impute the values below LOD and then use the graphical lasso method to estimate the means and precision matrices for the high-dimensional biomarkers. The simulation results show that our method outperforms alternative methods such as substituting NA values with LOD values or removing observations that have NA values. A real case analysis on a protein profiling study of glioblastoma patients on their survival status indicates that the biomarker combination obtained through the proposed method is more accurate in distinguishing between the two groups.
KW - Limit of detection (LOD)
KW - area under the receiver operating characteristic curve (AUC)
KW - graphical lasso
KW - high-dimensional data
KW - imputation
KW - precision matrix
UR - http://www.scopus.com/inward/record.url?scp=85115299045&partnerID=8YFLogxK
U2 - 10.1080/02664763.2021.1977785
DO - 10.1080/02664763.2021.1977785
M3 - Article
AN - SCOPUS:85115299045
SN - 0266-4763
VL - 49
SP - 4278
EP - 4293
JO - Journal of Applied Statistics
JF - Journal of Applied Statistics
IS - 16
ER -