Research
The Capital Region of Denmark - a part of Copenhagen University Hospital
Published

Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts

Research output: Contribution to journal > Journal article > Research > peer-review

Harvard

Atabaki-Pasdar, N, Ohlsson, M, Viñuela, A, Frau, F, Pomares-Millan, H, Haid, M, Jones, AG, Thomas, EL, Koivula, RW, Kurbasic, A, Mutie, PM, Fitipaldi, H, Fernandez, J, Dawed, AY, Giordano, GN, Forgie, IM, McDonald, TJ, Rutters, F, Cederberg, H, Chabanova, E, Dale, M, Masi, FD, Thomas, CE, Allin, KH, Hansen, TH, Heggie, A, Hong, M-G, Elders, PJM, Kennedy, G, Kokkola, T, Pedersen, HK, Mahajan, A, McEvoy, D, Pattou, F, Raverdy, V, Häussler, RS, Sharma, S, Thomsen, HS, Vangipurapu, J, Vestergaard, H, 't Hart, LM, Adamski, J, Musholt, PB, Brage, S, Brunak, S, Dermitzakis, E, Frost, G, Hansen, T, Laakso, M, Pedersen, O, Ridderstråle, M, Ruetten, H, Hattersley, AT, Walker, M, Beulens, JWJ, Mari, A, Schwenk, JM, Gupta, R, McCarthy, MI, Pearson, ER, Bell, JD, Pavo, I & Franks, PW 2020, 'Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts', PLOS Medicine, vol. 17, no. 6, pp. e1003149. https://doi.org/10.1371/journal.pmed.1003149

APA

Atabaki-Pasdar, N., Ohlsson, M., Viñuela, A., Frau, F., Pomares-Millan, H., Haid, M., Jones, A. G., Thomas, E. L., Koivula, R. W., Kurbasic, A., Mutie, P. M., Fitipaldi, H., Fernandez, J., Dawed, A. Y., Giordano, G. N., Forgie, I. M., McDonald, T. J., Rutters, F., Cederberg, H., ... Franks, P. W. (2020). Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts. PLOS Medicine, 17(6), e1003149. https://doi.org/10.1371/journal.pmed.1003149

CBE

Atabaki-Pasdar N, Ohlsson M, Viñuela A, Frau F, Pomares-Millan H, Haid M, Jones AG, Thomas EL, Koivula RW, Kurbasic A, Mutie PM, Fitipaldi H, Fernandez J, Dawed AY, Giordano GN, Forgie IM, McDonald TJ, Rutters F, Cederberg H, Chabanova E, Dale M, Masi FD, Thomas CE, Allin KH, Hansen TH, Heggie A, Hong M-G, Elders PJM, Kennedy G, Kokkola T, Pedersen HK, Mahajan A, McEvoy D, Pattou F, Raverdy V, Häussler RS, Sharma S, Thomsen HS, Vangipurapu J, Vestergaard H, 't Hart LM, Adamski J, Musholt PB, Brage S, Brunak S, Dermitzakis E, Frost G, Hansen T, Laakso M, Pedersen O, Ridderstråle M, Ruetten H, Hattersley AT, Walker M, Beulens JWJ, Mari A, Schwenk JM, Gupta R, McCarthy MI, Pearson ER, Bell JD, Pavo I, Franks PW. 2020. Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts. PLOS Medicine. 17(6):e1003149. https://doi.org/10.1371/journal.pmed.1003149

Author

Atabaki-Pasdar, Naeimeh ; Ohlsson, Mattias ; Viñuela, Ana ; Frau, Francesca ; Pomares-Millan, Hugo ; Haid, Mark ; Jones, Angus G ; Thomas, E Louise ; Koivula, Robert W ; Kurbasic, Azra ; Mutie, Pascal M ; Fitipaldi, Hugo ; Fernandez, Juan ; Dawed, Adem Y ; Giordano, Giuseppe N ; Forgie, Ian M ; McDonald, Timothy J ; Rutters, Femke ; Cederberg, Henna ; Chabanova, Elizaveta ; Dale, Matilda ; Masi, Federico De ; Thomas, Cecilia Engel ; Allin, Kristine H ; Hansen, Tue H ; Heggie, Alison ; Hong, Mun-Gwan ; Elders, Petra J M ; Kennedy, Gwen ; Kokkola, Tarja ; Pedersen, Helle Krogh ; Mahajan, Anubha ; McEvoy, Donna ; Pattou, Francois ; Raverdy, Violeta ; Häussler, Ragna S ; Sharma, Sapna ; Thomsen, Henrik S ; Vangipurapu, Jagadish ; Vestergaard, Henrik ; 't Hart, Leen M ; Adamski, Jerzy ; Musholt, Petra B ; Brage, Soren ; Brunak, Søren ; Dermitzakis, Emmanouil ; Frost, Gary ; Hansen, Torben ; Laakso, Markku ; Pedersen, Oluf ; Ridderstråle, Martin ; Ruetten, Hartmut ; Hattersley, Andrew T ; Walker, Mark ; Beulens, Joline W J ; Mari, Andrea ; Schwenk, Jochen M ; Gupta, Ramneek ; McCarthy, Mark I ; Pearson, Ewan R ; Bell, Jimmy D ; Pavo, Imre ; Franks, Paul W. / Predicting and elucidating the etiology of fatty liver disease : A machine learning modeling and validation study in the IMI DIRECT cohorts. In: PLOS Medicine. 2020 ; Vol. 17, No. 6. pp. e1003149.

Bibtex

@article{1bbb4532c1c14c6aa605ce9b6b864dcc,
title = "Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts",
abstract = "BACKGROUND: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. METHODS AND FINDINGS: We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (<5% or ≥5%) available for 1,514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and random forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% CI 0.82, 0.86; p < 0.001), which compared with a ROCAUC of 0.82 (95% CI 0.81, 0.83; p < 0.001) for a model including 9 clinically accessible variables. The IMI DIRECT prediction models outperformed existing noninvasive NAFLD prediction tools. One limitation is that these analyses were performed in adults of European ancestry residing in northern Europe, and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that differ from those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome of liver fat quantity (<5% or ≥5%) rather than a continuous one. CONCLUSIONS: In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see: https://www.predictliverfat.org/) and made it available to the community. TRIAL REGISTRATION: ClinicalTrials.gov NCT03814915.",
author = "Naeimeh Atabaki-Pasdar and Mattias Ohlsson and Ana Vi{\~n}uela and Francesca Frau and Hugo Pomares-Millan and Mark Haid and Jones, {Angus G} and Thomas, {E Louise} and Koivula, {Robert W} and Azra Kurbasic and Mutie, {Pascal M} and Hugo Fitipaldi and Juan Fernandez and Dawed, {Adem Y} and Giordano, {Giuseppe N} and Forgie, {Ian M} and McDonald, {Timothy J} and Femke Rutters and Henna Cederberg and Elizaveta Chabanova and Matilda Dale and Masi, {Federico De} and Thomas, {Cecilia Engel} and Allin, {Kristine H} and Hansen, {Tue H} and Alison Heggie and Mun-Gwan Hong and Elders, {Petra J M} and Gwen Kennedy and Tarja Kokkola and Pedersen, {Helle Krogh} and Anubha Mahajan and Donna McEvoy and Francois Pattou and Violeta Raverdy and H{\"a}ussler, {Ragna S} and Sapna Sharma and Thomsen, {Henrik S} and Jagadish Vangipurapu and Henrik Vestergaard and {'t Hart}, {Leen M} and Jerzy Adamski and Musholt, {Petra B} and Soren Brage and S{\o}ren Brunak and Emmanouil Dermitzakis and Gary Frost and Torben Hansen and Markku Laakso and Oluf Pedersen and Martin Ridderstr{\aa}le and Hartmut Ruetten and Hattersley, {Andrew T} and Mark Walker and Beulens, {Joline W J} and Andrea Mari and Schwenk, {Jochen M} and Ramneek Gupta and McCarthy, {Mark I} and Pearson, {Ewan R} and Bell, {Jimmy D} and Imre Pavo and Franks, {Paul W}",
year = "2020",
month = jun,
doi = "10.1371/journal.pmed.1003149",
language = "English",
volume = "17",
pages = "e1003149",
journal = "PLOS Medicine",
issn = "1549-1277",
publisher = "Public Library of Science",
number = "6",

}

RIS

TY - JOUR

T1 - Predicting and elucidating the etiology of fatty liver disease

T2 - A machine learning modeling and validation study in the IMI DIRECT cohorts

AU - Atabaki-Pasdar, Naeimeh

AU - Ohlsson, Mattias

AU - Viñuela, Ana

AU - Frau, Francesca

AU - Pomares-Millan, Hugo

AU - Haid, Mark

AU - Jones, Angus G

AU - Thomas, E Louise

AU - Koivula, Robert W

AU - Kurbasic, Azra

AU - Mutie, Pascal M

AU - Fitipaldi, Hugo

AU - Fernandez, Juan

AU - Dawed, Adem Y

AU - Giordano, Giuseppe N

AU - Forgie, Ian M

AU - McDonald, Timothy J

AU - Rutters, Femke

AU - Cederberg, Henna

AU - Chabanova, Elizaveta

AU - Dale, Matilda

AU - Masi, Federico De

AU - Thomas, Cecilia Engel

AU - Allin, Kristine H

AU - Hansen, Tue H

AU - Heggie, Alison

AU - Hong, Mun-Gwan

AU - Elders, Petra J M

AU - Kennedy, Gwen

AU - Kokkola, Tarja

AU - Pedersen, Helle Krogh

AU - Mahajan, Anubha

AU - McEvoy, Donna

AU - Pattou, Francois

AU - Raverdy, Violeta

AU - Häussler, Ragna S

AU - Sharma, Sapna

AU - Thomsen, Henrik S

AU - Vangipurapu, Jagadish

AU - Vestergaard, Henrik

AU - 't Hart, Leen M

AU - Adamski, Jerzy

AU - Musholt, Petra B

AU - Brage, Soren

AU - Brunak, Søren

AU - Dermitzakis, Emmanouil

AU - Frost, Gary

AU - Hansen, Torben

AU - Laakso, Markku

AU - Pedersen, Oluf

AU - Ridderstråle, Martin

AU - Ruetten, Hartmut

AU - Hattersley, Andrew T

AU - Walker, Mark

AU - Beulens, Joline W J

AU - Mari, Andrea

AU - Schwenk, Jochen M

AU - Gupta, Ramneek

AU - McCarthy, Mark I

AU - Pearson, Ewan R

AU - Bell, Jimmy D

AU - Pavo, Imre

AU - Franks, Paul W

PY - 2020/6

Y1 - 2020/6

N2 - BACKGROUND: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. METHODS AND FINDINGS: We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (<5% or ≥5%) available for 1,514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and random forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% CI 0.82, 0.86; p < 0.001), which compared with a ROCAUC of 0.82 (95% CI 0.81, 0.83; p < 0.001) for a model including 9 clinically accessible variables. The IMI DIRECT prediction models outperformed existing noninvasive NAFLD prediction tools. One limitation is that these analyses were performed in adults of European ancestry residing in northern Europe, and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that differ from those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome of liver fat quantity (<5% or ≥5%) rather than a continuous one. CONCLUSIONS: In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see: https://www.predictliverfat.org/) and made it available to the community. TRIAL REGISTRATION: ClinicalTrials.gov NCT03814915.


UR - http://www.scopus.com/inward/record.url?scp=85086754493&partnerID=8YFLogxK

U2 - 10.1371/journal.pmed.1003149

DO - 10.1371/journal.pmed.1003149

M3 - Journal article

C2 - 32559194

VL - 17

SP - e1003149

JO - PLOS Medicine

JF - PLOS Medicine

SN - 1549-1277

IS - 6

ER -

ID: 60273751
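
The abstract describes models trained on a binary liver fat outcome (<5% or ≥5%) and evaluated by ROC AUC. The following is a minimal Python sketch of those two ideas only: dichotomizing a liver fat measurement at 5% and computing ROC AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names, the toy liver fat values, and the predicted risks are hypothetical and are not taken from the study; the sketch does not reproduce the authors' LASSO and random forest pipeline or their cross-validation.

```python
from typing import List, Sequence


def binarize_liver_fat(fat_percent: Sequence[float], threshold: float = 5.0) -> List[int]:
    """Label each value 1 if liver fat is >= threshold (in %), else 0."""
    return [1 if value >= threshold else 0 for value in fat_percent]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison (Mann-Whitney formulation).

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT study data): liver fat in % and a model's predicted risk.
liver_fat = [2.1, 7.4, 4.9, 12.0, 5.0, 3.3]
predicted_risk = [0.20, 0.80, 0.55, 0.90, 0.50, 0.10]

labels = binarize_liver_fat(liver_fat)
auc = roc_auc(labels, predicted_risk)
print(labels, round(auc, 3))
```

With these toy values, 8 of the 9 positive-negative pairs are ranked correctly, so the sketch reports an AUC of about 0.889. The pairwise loop is quadratic in the number of participants, which is acceptable for illustration; a rank-based implementation would be preferable for cohorts of realistic size.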