TY - JOUR
T1 - Development of a national deep learning-based auto-segmentation model for the heart on clinical delineations from the DBCG RT nation cohort
AU - Skarsø, Emma Riis
AU - Refsgaard, Lasse
AU - Saini, Abhilasha
AU - Sloth Møller, Ditte
AU - Lorenzen, Ebbe Laugaard
AU - Maae, Else
AU - Andersen, Karen
AU - Maraldo, Maja Vestmø
AU - Milo, Marie Louise
AU - Nyeng, Tine Bisballe
AU - Vrou Offersen, Birgitte
AU - Korreman, Stine Sofia
PY - 2023/10
Y1 - 2023/10
N2 - BACKGROUND: This study aimed to investigate the feasibility of developing a deep learning-based auto-segmentation model for the heart trained on clinical delineations. MATERIAL AND METHODS: This study included two datasets. The first dataset contained clinical heart delineations from the DBCG RT Nation study (1,561 patients). The second dataset was smaller (114 patients) but contained corrected heart delineations. Before the model was trained on the clinical delineations, outlier detection was performed to remove cases with gross deviations from the delineation guideline. No outlier detection was performed for the dataset with corrected heart delineations. Both models were trained with a 3D full-resolution nnUNet. The models were evaluated with the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95) and mean surface distance (MSD). The difference between the models was tested with the Mann-Whitney U-test. The balance of dataset quantity versus quality was investigated by stepwise reducing the cohort size for the model trained on clinical delineations. RESULTS: During outlier detection, 137 patients were excluded from the clinical cohort due to non-compliance with delineation guidelines. The model trained on the curated clinical cohort achieved a median DSC of 0.96 (IQR 0.94-0.96), a median HD95 of 4.00 mm (IQR 3.00 mm-6.00 mm) and a median MSD of 1.49 mm (IQR 1.12 mm-2.02 mm). The model trained on the dedicated and corrected cohort achieved a median DSC of 0.95 (IQR 0.93-0.96), a median HD95 of 5.65 mm (IQR 3.37 mm-8.62 mm) and a median MSD of 1.63 mm (IQR 1.35 mm-2.11 mm). The difference between the two models was non-significant for all metrics (p > 0.05). Reducing the cohort size produced no significant difference in any metric (p > 0.05). However, with the smallest cohort size, a few outlier structures were found. CONCLUSIONS: This study demonstrated that a deep learning-based auto-segmentation model trained on curated clinical delineations performs on par with a model trained on dedicated delineations, making it easier to develop multi-institutional auto-segmentation models.
AB - BACKGROUND: This study aimed to investigate the feasibility of developing a deep learning-based auto-segmentation model for the heart trained on clinical delineations. MATERIAL AND METHODS: This study included two datasets. The first dataset contained clinical heart delineations from the DBCG RT Nation study (1,561 patients). The second dataset was smaller (114 patients) but contained corrected heart delineations. Before the model was trained on the clinical delineations, outlier detection was performed to remove cases with gross deviations from the delineation guideline. No outlier detection was performed for the dataset with corrected heart delineations. Both models were trained with a 3D full-resolution nnUNet. The models were evaluated with the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95) and mean surface distance (MSD). The difference between the models was tested with the Mann-Whitney U-test. The balance of dataset quantity versus quality was investigated by stepwise reducing the cohort size for the model trained on clinical delineations. RESULTS: During outlier detection, 137 patients were excluded from the clinical cohort due to non-compliance with delineation guidelines. The model trained on the curated clinical cohort achieved a median DSC of 0.96 (IQR 0.94-0.96), a median HD95 of 4.00 mm (IQR 3.00 mm-6.00 mm) and a median MSD of 1.49 mm (IQR 1.12 mm-2.02 mm). The model trained on the dedicated and corrected cohort achieved a median DSC of 0.95 (IQR 0.93-0.96), a median HD95 of 5.65 mm (IQR 3.37 mm-8.62 mm) and a median MSD of 1.63 mm (IQR 1.35 mm-2.11 mm). The difference between the two models was non-significant for all metrics (p > 0.05). Reducing the cohort size produced no significant difference in any metric (p > 0.05). However, with the smallest cohort size, a few outlier structures were found. CONCLUSIONS: This study demonstrated that a deep learning-based auto-segmentation model trained on curated clinical delineations performs on par with a model trained on dedicated delineations, making it easier to develop multi-institutional auto-segmentation models.
KW - Benchmarking
KW - Deep Learning
KW - Heart
KW - Humans
KW - Image Processing, Computer-Assisted
KW - Patient Compliance
KW - whole heart
KW - clinical delineations
KW - radiotherapy
KW - breast cancer
KW - Deep learning-based auto-segmentation
UR - http://www.scopus.com/inward/record.url?scp=85171300140&partnerID=8YFLogxK
U2 - 10.1080/0284186X.2023.2252582
DO - 10.1080/0284186X.2023.2252582
M3 - Journal article
C2 - 37712509
SN - 0284-186X
VL - 62
SP - 1201
EP - 1207
JO - Acta Oncologica
JF - Acta Oncologica
IS - 10
ER -