Prediction of bloodstream infection using machine learning based primarily on biochemical data

Ramtin Zargari Marandi, Frederik Boetius Hertz*, Jesper Qvist Thomassen, Steen Christian Rasmussen, Ruth Frikke-Schmidt, Niels Frimodt-Møller, Karen Leth Nielsen, Cameron Ross MacPherson

*Corresponding author for this work

Abstract

Early diagnosis of bloodstream infection (BSI) is crucial for informed antibiotic use. This study developed a machine learning (ML) approach for early BSI detection using a comprehensive dataset from Rigshospitalet, Denmark (2010-2020). The dataset included 144,398 samples from adult patients, containing blood culture results, demographics, and up to 36 biochemical variables. Positive blood cultures were observed in 6.4% of samples, mostly caused by Staphylococcus aureus, Escherichia coli, and Enterococcus faecium. Of the samples, 80% (N = 43,351 patients) were used for ML model development and five-fold cross-validation, and the remaining 20% (N = 10,837) for independent testing. Among seven models, LightGBM performed best, achieving an AUC of 0.69 on the test set. It was more accurate in detecting negatives, with a negative predictive value (NPV) of 0.96 and specificity of 0.74, compared to a positive predictive value (PPV) of 0.13 and sensitivity of 0.54. SHapley Additive exPlanations (SHAP) identified platelets, leukocytes, and the neutrophil-to-lymphocyte ratio as the top three predictive features. The model showed higher sensitivity (average 0.66) for common pathogens, e.g., 0.71 for E. coli. The results highlight the potential of biochemical variables as diagnostic factors for BSI and suggest that clinical use should focus on identifying patients at low risk; performance can be further enhanced in future investigations.
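
The gap between the reported NPV and PPV follows largely from the low prevalence of positive cultures. The minimal sketch below, which is not taken from the study, applies Bayes' rule to the sensitivity and specificity quoted in the abstract. It assumes that the test-set prevalence equals the overall 6.4% positivity rate, which the abstract does not state, and the function name `predictive_values` is illustrative only.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) derived from sensitivity, specificity and prevalence."""
    # Expected fractions of all samples falling in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # P(infected | predicted positive)
    npv = true_neg / (true_neg + false_neg)  # P(not infected | predicted negative)
    return ppv, npv


# Sensitivity and specificity as reported in the abstract.
# ASSUMPTION: prevalence in the test set equals the overall 6.4% positivity rate.
ppv, npv = predictive_values(sensitivity=0.54, specificity=0.74, prevalence=0.064)
print(f"PPV ~ {ppv:.2f}, NPV ~ {npv:.2f}")
```

Under that assumption the sketch gives a PPV of about 0.12 and an NPV of about 0.96, close to the reported 0.13 and 0.96. The small PPV difference is presumably due to rounding or to a slightly different prevalence in the test set.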

Original language: English
Article number: 17478
Journal: Scientific Reports
Volume: 15
Issue number: 1
Pages (from-to): 17478
ISSN: 2045-2322
Publication status: Published - 20 May 2025

Keywords

  • Artificial intelligence
  • Biomarkers
  • Classification models
  • Clinical utility
  • Diagnosis
  • Electronic medical records
  • Infection management
  • Interpretability
  • Real-world data
