TY - JOUR
T1 - Ancestry-Associated Performance Variability of Open-Source AI Models for EGFR Prediction in Lung Cancer
AU - Rakaee, Mehrdad
AU - Nassar, Amin H.
AU - Tafavvoghi, Masoud
AU - Jabar, Falah
AU - Bou Farhat, Elias
AU - Adib, Elio
AU - Andersen, Sigve
AU - Busund, Lill Tove Rasmussen
AU - Pøhl, Mette
AU - Helland, Åslaug
AU - Gusev, Alexander
AU - Ricciuti, Biagio
AU - Sholl, Lynette M.
AU - Donnem, Tom
AU - Kwiatkowski, David J.
N1 - Publisher Copyright: © 2026 Rakaee M et al.
PY - 2026
Y1 - 2026
N2 - Importance: Artificial intelligence (AI) models are emerging as rapid, low-cost tools for predicting targetable genomic alterations directly from routine pathology slides. Although these approaches could accelerate treatment decisions in lung cancer, little is known about whether their performance is consistent across diverse patient populations and tissue contexts. Objective: To evaluate the performance and generalizability of 2 open-source AI pathology models for predicting EGFR mutation status in lung adenocarcinoma (LUAD) across independent cohorts and ancestral subgroups. Design, Setting, and Participants: This cohort study included patients with LUAD from 2 cohorts: Dana-Farber Cancer Institute (DFCI) from June 2013 to November 2023, and a European-based trial (TNM-I) from August 2016 to February 2022. All patients had paired next-generation sequencing data and hematoxylin-eosin–stained whole-slide images. In the DFCI cohort, genetic ancestry was inferred using germline genotype data. Data analyses were performed from July 2025 to September 2025. Main Outcomes: The primary outcome was model performance for predicting EGFR mutation status, measured as the area under the receiver operating characteristic curve (AUC), evaluated overall and across ancestry subgroups and sample types. Results: Overall, 2098 patients with LUAD were included (mean [SD] age, 66.6 [10.3] years; 1315 female individuals [63%] and 783 male individuals [37%]). In the DFCI cohort (n = 1759; 54 African, 101 American, 95 Asian, 1465 European), EGFR mutations were detected in 432 patients (25%). One AI-pathology model achieved an AUC of 0.83 (95% CI, 0.81-0.85) compared with 0.68 (95% CI, 0.65-0.70) for the other model. In the TNM-I cohort (n = 339), EGFR mutations were detected in 50 patients (15%), with AUCs of 0.81 (95% CI, 0.74-0.88) and 0.75 (95% CI, 0.68-0.83), respectively. In ancestry-stratified analyses of the DFCI cohort, AUCs for the higher-performing model were 0.84 (95% CI, 0.81-0.86) in patients of European ancestry, 0.85 (95% CI, 0.72-0.94) in African ancestry, and 0.68 (95% CI, 0.55-0.78) in Asian ancestry. In sample type analyses, performance declined in pleural (AUC, 0.66; 95% CI, 0.56-0.76) compared with lung specimens (AUC, 0.86; 95% CI, 0.83-0.88). AI-guided triage analyses showed a potential 57% reduction in rapid EGFR testing, while maintaining sensitivity of 0.84 and specificity of 0.99. Conclusions: This cohort study found that AI-based pathology tools may serve as preliminary adjuncts for EGFR prediction in lung cancer, though performance differences by ancestry warrant careful interpretation.
UR - https://www.scopus.com/pages/publications/105030557836
U2 - 10.1001/jamaoncol.2025.6430
DO - 10.1001/jamaoncol.2025.6430
M3 - Journal article
C2 - 41678173
AN - SCOPUS:105030557836
SN - 2374-2437
JO - JAMA Oncology
JF - JAMA Oncology
ER -