TY - JOUR
T1 - Nationwide prediction of type 2 diabetes comorbidities
AU - Dworzynski, Piotr
AU - Aasbrenn, Martin
AU - Rostgaard, Klaus
AU - Melbye, Mads
AU - Gerds, Thomas Alexander
AU - Hjalgrim, Henrik
AU - Pers, Tune H
PY - 2020/2/4
Y1 - 2020/2/4
AB - Identification of individuals at risk of developing disease comorbidities represents an important task in tackling the growing personal and societal burdens associated with chronic diseases. We employed machine learning techniques to investigate to what extent data from longitudinal, nationwide Danish health registers can be used to predict individuals at high risk of developing type 2 diabetes (T2D) comorbidities. Leveraging logistic regression, random forest and gradient boosting models and register data spanning hospitalizations, drug prescriptions and contacts with primary care contractors from >200,000 individuals newly diagnosed with T2D, we predicted five-year risk of heart failure (HF), myocardial infarction (MI), stroke (ST), cardiovascular disease (CVD) and chronic kidney disease (CKD). For HF, MI, CVD, and CKD, register-based models outperformed a reference model leveraging canonical individual characteristics by achieving area under the receiver operating characteristic curve improvements of 0.06, 0.03, 0.04, and 0.07, respectively. The top 1,000 patients predicted to be at highest risk exhibited observed incidence ratios exceeding 4.99, 3.52, 1.97 and 4.71, respectively. In summary, prediction of T2D comorbidities utilizing Danish registers led to consistent albeit modest performance improvements over reference models, suggesting that register data could be leveraged to systematically identify individuals at risk of developing disease comorbidities.
KW - Cardiovascular Diseases/epidemiology
KW - Comorbidity
KW - Denmark/epidemiology
KW - Diabetes Mellitus, Type 2/epidemiology
KW - Female
KW - Heart Failure/epidemiology
KW - Humans
KW - Male
KW - Middle Aged
KW - Myocardial Infarction/epidemiology
KW - Registries
KW - Renal Insufficiency, Chronic/epidemiology
KW - Stroke/epidemiology
U2 - 10.1038/s41598-020-58601-7
DO - 10.1038/s41598-020-58601-7
M3 - Journal article
C2 - 32019971
SN - 2045-2322
VL - 10
SP - 1776
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 1776
ER -