An Intelligible Risk Stratification Model Based on Pairwise and Size Constrained Kmeans

Longfei Han, Senlin Luo, Huaiqing Wang, Limin Pan, Xincheng Ma, Tiemei Zhang

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Having a system to stratify individuals according to risk is key to clinical disease prevention. It allows individuals identified at different risk tiers to benefit from further investigation and intervention. However, the same risk score estimated for two different people does not mean that they need the same further investigation, nor that their health conditions are similar. Moreover, users do not know a priori what most of the risk tiers are, or how many tiers risk stratification should produce. In this paper, the proposed pairwise and size constrained Kmeans (PSCKmeans) method simultaneously integrates limited supervised information and size constraints to screen the high-risk population based on similarity measurement, and obtains a feasible and balanced stratification that avoids clusters with few points. Results on the China Health and Nutrition Survey public dataset and a follow-up dataset show that the proposed PSCKmeans method can naturally grade the risk of diabetes into four tiers, and achieves 73.8%, 85.1%, and 0.95% sensitivity, specificity, and ratio of minimum to expected, respectively, on testing data. The proposed method compares favorably with eight previous semisupervised clustering methods; this demonstrates that semisupervised clustering that unifies multiple forms of constraints can guide a partition that is more relevant to the domain and can find new categories through prior knowledge. Finally, this risk stratification model can serve as a tool for risk stratification of clinical disease and can be used to guide further intervention for people with similar health conditions.
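The idea of combining pairwise and size constraints can be illustrated with a small sketch. It is not the authors' PSCKmeans algorithm: it shows a single greedy assignment pass in which each point takes the nearest center, pays a penalty for every violated must-link or cannot-link pair, and may not join a cluster that is already full. The function names, the `penalty` weight, and the `max_size` cap are all hypothetical, and the ratio of minimum to expected is assumed here to be the smallest cluster size divided by n/k.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def constrained_assign(points, centers, must_link, cannot_link,
                       max_size, penalty=10.0):
    """One greedy assignment pass (illustrative stand-in, not PSCKmeans).

    Each point goes to the cheapest cluster that still has room, where
    cost = squared distance + penalty per violated pairwise constraint
    with respect to points that were assigned earlier in the pass.
    """
    labels = [None] * len(points)
    sizes = [0] * len(centers)
    for i, p in enumerate(points):
        best, best_cost = None, float("inf")
        for c, center in enumerate(centers):
            if sizes[c] >= max_size:      # hard size constraint
                continue
            cost = sq_dist(p, center)
            for a, b in must_link:        # soft must-link penalty
                if i in (a, b):
                    j = b if a == i else a
                    if labels[j] is not None and labels[j] != c:
                        cost += penalty
            for a, b in cannot_link:      # soft cannot-link penalty
                if i in (a, b):
                    j = b if a == i else a
                    if labels[j] is not None and labels[j] == c:
                        cost += penalty
            if cost < best_cost:
                best, best_cost = c, cost
        if best is None:
            raise ValueError("size caps leave no room for every point")
        labels[i] = best
        sizes[best] += 1
    return labels


def ratio_min_to_expected(labels, k):
    """Smallest cluster size divided by the balanced size n / k (assumed definition)."""
    sizes = Counter(labels)
    smallest = min(sizes.get(c, 0) for c in range(k))
    return smallest / (len(labels) / k)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = high risk)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: six 1-D points, two centers, at most three points per cluster.
points = [(0.0,), (0.2,), (0.4,), (0.6,), (0.8,), (1.0,)]
centers = [(0.1,), (0.9,)]
labels = constrained_assign(points, centers,
                            must_link=[(2, 3)], cannot_link=[(0, 1)],
                            max_size=3)
print(labels, ratio_min_to_expected(labels, k=2))
```

In this toy run the cannot-link pair pushes point 1 away from its nearest center, the must-link pair pulls point 3 toward point 2, and the size cap forces the last point into the remaining cluster. A full method would alternate such an assignment step with center updates, and would typically solve the size-constrained assignment exactly rather than greedily.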

Original language: English
Article number: 7762039
Pages (from-to): 1288-1296
Number of pages: 9
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 21
Issue number: 5
Publication status: Published - Sept 2017

Keywords

  • Pairwise constraints
  • risk assessment
  • semisupervised clustering
  • size constraints
  • type 2 diabetes
