TY - JOUR
T1 - The Danish Lymphoid Cancer Research (DALY-CARE) Data Resource
T2 - The Basis for Developing Data-Driven Hematology
AU - Brieghel, Christian
AU - Werling, Mikkel
AU - Frederiksen, Casper Møller
AU - Parviz, Mehdi
AU - Lacoppidan, Thomas
AU - Faitova, Tereza
AU - Teglgaard, Rebecca Svanberg
AU - Vainer, Noomi
AU - da Cunha-Bang, Caspar
AU - Rotbain, Emelie Curovic
AU - Agius, Rudi
AU - Niemann, Carsten Utoft
N1 - © 2025 Brieghel et al.
PY - 2025
Y1 - 2025
N2 - BACKGROUND: Lymphoid-lineage cancers (LC; International Classification of Diseases, 10th revision [ICD10] C81.x-C90.x, C91.1-C91.9, C95.1, C95.7, C95.9, D47.2, D47.9B, and E85.8A) share many epidemiological and clinical features, which favor meta-learning when developing medical artificial intelligence (mAI). However, large, shared datasets are largely unavailable, which limits mAI research. AIM: To create a large-scale data repository for patients with LC to develop data-driven hematology. METHODS: We gathered electronic health data and built open-source processing pipelines to create a comprehensive data resource for Danish LC Research (DALY-CARE) approved for epidemiological, molecular, and data-driven research. RESULTS: We included all Danish adults registered with LC diagnoses since 2002 (n=65,774) and combined 10 nationwide registers, electronic health records (EHR), and laboratory data on a high-powered cloud computer to develop a secure research environment. Among others, the data include treatments (i.e., 21,750 cytoreductive treatment plans, 21.3M outpatient prescriptions, and 12.7M in-hospital administrations), biochemical analyses (77.3M), comorbidity (14.8M ICD10 codes), pathology codes (4.5M), treatment procedures (8.3M), surgical procedures (1.0M), radiological examinations (3.3M), vital signs (18.3M values), and survival data. We herein describe the data infrastructure and exemplify how DALY-CARE has been used for molecular studies, for real-world evidence evaluating the efficacy of care, and for mAI deployed directly into EHR systems. CONCLUSION: The DALY-CARE data resource allows for the development of near real-time decision-support tools and extrapolation of clinical trial results to clinical practice, thereby improving care for patients with LC while facilitating the streamlining of health data infrastructure across cohorts and medical specialties.
UR - http://www.scopus.com/inward/record.url?scp=85219082955&partnerID=8YFLogxK
U2 - 10.2147/CLEP.S479672
DO - 10.2147/CLEP.S479672
M3 - Journal article
C2 - 39996155
SN - 1179-1349
VL - 17
SP - 131
EP - 145
JO - Clinical Epidemiology
JF - Clinical Epidemiology
ER -