TY - JOUR
T1 - Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction
AU - Sun, Jiangming
AU - Wang, Yunpeng
AU - Folkersen, Lasse
AU - Borné, Yan
AU - Amlien, Inge
AU - Buil, Alfonso
AU - Orho-Melander, Marju
AU - Børglum, Anders D
AU - Hougaard, David M
AU - Regeneron Genetics Center
AU - Melander, Olle
AU - Engström, Gunnar
AU - Werge, Thomas
AU - Lage, Kasper
N1 - © 2021. The Author(s).
PY - 2021/9/6
Y1 - 2021/9/6
N2 - A promise of genomics in precision medicine is to provide individualized genetic risk predictions. Polygenic risk scores (PRS), computed by aggregating effects from many genomic variants, have been developed as a useful tool in complex disease research. However, the application of PRS as a tool for predicting an individual's disease susceptibility in a clinical setting is challenging because PRS typically provide a relative measure of risk evaluated at the level of a group of people but not at the individual level. Here, we introduce a machine-learning technique, Mondrian Cross-Conformal Prediction (MCCP), to estimate the confidence bounds of PRS-to-disease-risk prediction. MCCP can report a conditional probability of disease status for each individual and give a prediction at a desired error level. Moreover, with a user-defined prediction error rate, MCCP can estimate the proportion of the sample (coverage) with a correct prediction.
AB - A promise of genomics in precision medicine is to provide individualized genetic risk predictions. Polygenic risk scores (PRS), computed by aggregating effects from many genomic variants, have been developed as a useful tool in complex disease research. However, the application of PRS as a tool for predicting an individual's disease susceptibility in a clinical setting is challenging because PRS typically provide a relative measure of risk evaluated at the level of a group of people but not at the individual level. Here, we introduce a machine-learning technique, Mondrian Cross-Conformal Prediction (MCCP), to estimate the confidence bounds of PRS-to-disease-risk prediction. MCCP can report a conditional probability of disease status for each individual and give a prediction at a desired error level. Moreover, with a user-defined prediction error rate, MCCP can estimate the proportion of the sample (coverage) with a correct prediction.
KW - Age Factors
KW - Biological Specimen Banks
KW - Breast Neoplasms/genetics
KW - Coronary Artery Disease/genetics
KW - Diabetes Mellitus, Type 2/genetics
KW - Female
KW - Genetic Predisposition to Disease/genetics
KW - Humans
KW - Inflammatory Bowel Diseases/genetics
KW - Machine Learning
KW - Male
KW - Multifactorial Inheritance/genetics
KW - Reproducibility of Results
KW - Schizophrenia/genetics
KW - Sweden
KW - United Kingdom
UR - http://www.scopus.com/inward/record.url?scp=85114717685&partnerID=8YFLogxK
U2 - 10.1038/s41467-021-25014-7
DO - 10.1038/s41467-021-25014-7
M3 - Journal article
C2 - 34489429
VL - 12
SP - 5276
JO - Nature Communications
JF - Nature Communications
SN - 2041-1722
IS - 1
M1 - 5276
ER -