Abstract
A system for stratifying individuals by risk is key to clinical disease prevention, because individuals identified at different risk tiers can then benefit from further investigation and intervention. However, the same estimated risk score for two people does not mean that they need the same further investigation, nor does it indicate that their health conditions are similar. Moreover, users usually do not know a priori what the risk tiers are or how many tiers a risk stratification should contain. In this paper, the proposed pairwise and size constrained Kmeans (PSCKmeans) method simultaneously integrates limited supervised information and size constraints to screen the high-risk population based on similarity measurement, and it obtains a feasible and balanced stratification that avoids clusters with few points. Results on the China Health and Nutrition Survey public dataset and a follow-up dataset show that the proposed PSCKmeans method can naturally grade the risk of diabetes into four tiers and achieves a sensitivity of 73.8%, a specificity of 85.1%, and a ratio of minimum to expected of 0.95% on testing data. The proposed method compares favorably with eight previous semisupervised clustering methods, demonstrating that semisupervised clustering that unifies multiple forms of constraints can guide a partition that is more relevant to the domain and can find new categories through prior knowledge. Finally, this risk stratification model can serve as a tool for risk stratification of clinical disease and can support further intervention for people with similar health conditions.
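The three evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. In particular, the abstract does not define "ratio of minimum to expected", so the sketch assumes it means the size of the smallest cluster divided by the size each cluster would have under a perfectly even split (n / k). The function names and the toy data are hypothetical.

```python
from collections import Counter


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Both inputs are sequences of 0/1, where 1 marks the positive
    (for example, high-risk) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def min_to_expected_ratio(labels, n_clusters):
    """Assumed definition: smallest cluster size / (n / k).

    A value near 1 indicates a balanced partition; a value near 0
    indicates that at least one cluster holds very few points.
    Cluster ids are assumed to be 0 .. n_clusters - 1.
    """
    counts = Counter(labels)
    smallest = min(counts.get(c, 0) for c in range(n_clusters))
    expected = len(labels) / n_clusters
    return smallest / expected


# Toy data, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
tiers = [0, 0, 1, 1, 2, 2, 3, 3]  # four risk tiers, two people each

sens, spec = sensitivity_specificity(y_true, y_pred)
ratio = min_to_expected_ratio(tiers, n_clusters=4)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} ratio={ratio:.2f}")
```

Under this assumed definition, a size-constrained method would be expected to push the ratio toward 1, since the constraint discourages clusters with few points.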
| Original language | English |
|---|---|
| Article number | 7762039 |
| Pages (from-to) | 1288-1296 |
| Number of pages | 9 |
| Journal | IEEE Journal of Biomedical and Health Informatics |
| Volume | 21 |
| Issue number | 5 |
| DOIs | |
| Publication status | Published - Sept 2017 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Pairwise constraints
- risk assessment
- semisupervised clustering
- size constraints
- type 2 diabetes
Title
An Intelligible Risk Stratification Model Based on Pairwise and Size Constrained Kmeans