TY - JOUR
T1 - Deep Learning for Malignancy Risk Estimation of Pulmonary Nodules Detected at Low-Dose Screening CT
AU - Venkadesh, Kiran Vaidhya
AU - Setio, Arnaud A A
AU - Schreuder, Anton
AU - Scholten, Ernst T
AU - Chung, Kaman
AU - Wille, Mathilde M W
AU - Saghir, Zaigham
AU - van Ginneken, Bram
AU - Prokop, Mathias
AU - Jacobs, Colin
PY - 2021/8
Y1 - 2021/8
N2 - Background: Accurate estimation of the malignancy risk of pulmonary nodules at chest CT is crucial for optimizing management in lung cancer screening. Purpose: To develop and validate a deep learning (DL) algorithm for malignancy risk estimation of pulmonary nodules detected at screening CT. Materials and Methods: In this retrospective study, the DL algorithm was developed with 16 077 nodules (1249 malignant) collected between 2002 and 2004 from the National Lung Screening Trial. External validation was performed in the following three cohorts collected between 2004 and 2010 from the Danish Lung Cancer Screening Trial: a full cohort containing all 883 nodules (65 malignant) and two cancer-enriched cohorts with size matching (175 nodules, 59 malignant) and without size matching (177 nodules, 59 malignant) of benign nodules selected at random. Algorithm performance was measured by using the area under the receiver operating characteristic curve (AUC) and compared with that of the Pan-Canadian Early Detection of Lung Cancer (PanCan) model in the full cohort and a group of 11 clinicians composed of four thoracic radiologists, five radiology residents, and two pulmonologists in the cancer-enriched cohorts. Results: The DL algorithm significantly outperformed the PanCan model in the full cohort (AUC, 0.93 [95% CI: 0.89, 0.96] vs 0.90 [95% CI: 0.86, 0.93]; P = .046). The algorithm performed comparably to thoracic radiologists in cancer-enriched cohorts with both random benign nodules (AUC, 0.96 [95% CI: 0.93, 0.99] vs 0.90 [95% CI: 0.81, 0.98]; P = .11) and size-matched benign nodules (AUC, 0.86 [95% CI: 0.80, 0.91] vs 0.82 [95% CI: 0.74, 0.89]; P = .26). Conclusion: The deep learning algorithm showed excellent performance, comparable to thoracic radiologists, for malignancy risk estimation of pulmonary nodules detected at screening CT. This algorithm has the potential to provide reliable and reproducible malignancy risk scores for clinicians, which may help optimize management in lung cancer screening. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Tammemägi in this issue.
AB - Background: Accurate estimation of the malignancy risk of pulmonary nodules at chest CT is crucial for optimizing management in lung cancer screening. Purpose: To develop and validate a deep learning (DL) algorithm for malignancy risk estimation of pulmonary nodules detected at screening CT. Materials and Methods: In this retrospective study, the DL algorithm was developed with 16 077 nodules (1249 malignant) collected between 2002 and 2004 from the National Lung Screening Trial. External validation was performed in the following three cohorts collected between 2004 and 2010 from the Danish Lung Cancer Screening Trial: a full cohort containing all 883 nodules (65 malignant) and two cancer-enriched cohorts with size matching (175 nodules, 59 malignant) and without size matching (177 nodules, 59 malignant) of benign nodules selected at random. Algorithm performance was measured by using the area under the receiver operating characteristic curve (AUC) and compared with that of the Pan-Canadian Early Detection of Lung Cancer (PanCan) model in the full cohort and a group of 11 clinicians composed of four thoracic radiologists, five radiology residents, and two pulmonologists in the cancer-enriched cohorts. Results: The DL algorithm significantly outperformed the PanCan model in the full cohort (AUC, 0.93 [95% CI: 0.89, 0.96] vs 0.90 [95% CI: 0.86, 0.93]; P = .046). The algorithm performed comparably to thoracic radiologists in cancer-enriched cohorts with both random benign nodules (AUC, 0.96 [95% CI: 0.93, 0.99] vs 0.90 [95% CI: 0.81, 0.98]; P = .11) and size-matched benign nodules (AUC, 0.86 [95% CI: 0.80, 0.91] vs 0.82 [95% CI: 0.74, 0.89]; P = .26). Conclusion: The deep learning algorithm showed excellent performance, comparable to thoracic radiologists, for malignancy risk estimation of pulmonary nodules detected at screening CT. This algorithm has the potential to provide reliable and reproducible malignancy risk scores for clinicians, which may help optimize management in lung cancer screening. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Tammemägi in this issue.
UR - http://www.scopus.com/inward/record.url?scp=85111280462&partnerID=8YFLogxK
U2 - 10.1148/radiol.2021204433
DO - 10.1148/radiol.2021204433
M3 - Journal article
C2 - 34003056
SN - 0033-8419
VL - 300
SP - 438
EP - 447
JO - Radiology
JF - Radiology
IS - 2
ER -