TY - JOUR
T1 - Early life body size and puberty markers as predictors of breast cancer risk later in life
T2 - A neural network analysis
AU - Svendsen, Sara M S
AU - Pedersen, Dorthe C
AU - Jensen, Britt W
AU - Aarestrup, Julie
AU - Mellemkjær, Lene
AU - Bjerregaard, Lise G
AU - Baker, Jennifer L
N1 - Copyright: © 2024 Svendsen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2024
Y1 - 2024
N2 - BACKGROUND: The early life factors of birthweight, child weight, height, body mass index (BMI) and pubertal timing are associated with risks of breast cancer. However, the predictive value of these factors in relation to breast cancer is largely unknown. Therefore, using a machine learning approach, we examined whether birthweight, childhood weights, heights, BMIs, and pubertal timing individually and in combination were predictive of breast cancer. METHODS: We used information on birthweight, childhood height and weight, and pubertal timing assessed by the onset of the growth spurt (OGS) from 164,216 girls born 1930-1996 from the Copenhagen School Health Records Register. Of these, 10,002 women were diagnosed with breast cancer during 1977-2019 according to a nationwide breast cancer database. We developed a feed-forward neural network, which was trained and tested on early life body size measures individually and in various combinations. Evaluation metrics were examined to identify the best performing model. RESULTS: The highest area under the receiver operating curve (AUC) was achieved in a model that included birthweight, childhood heights, weights and age at OGS (AUC = 0.600). A model based on childhood heights and weights had a comparable AUC value (AUC = 0.598), whereas a model including only childhood heights had the lowest AUC value (AUC = 0.572). The sensitivity of the models ranged from 0.698 to 0.760 while the precision ranged from 0.071 to 0.076. CONCLUSION: We found that the best performing network was based on birthweight, childhood weights, heights and age at OGS as the input features. Nonetheless, this performance was only slightly better than the model including childhood heights and weights. Further, although the performance of our networks was relatively low, it was similar to those from previous studies including well-established risk factors. As such, our results suggest that childhood body size may add additional value to breast cancer prediction models.
AB - BACKGROUND: The early life factors of birthweight, child weight, height, body mass index (BMI) and pubertal timing are associated with risks of breast cancer. However, the predictive value of these factors in relation to breast cancer is largely unknown. Therefore, using a machine learning approach, we examined whether birthweight, childhood weights, heights, BMIs, and pubertal timing individually and in combination were predictive of breast cancer. METHODS: We used information on birthweight, childhood height and weight, and pubertal timing assessed by the onset of the growth spurt (OGS) from 164,216 girls born 1930-1996 from the Copenhagen School Health Records Register. Of these, 10,002 women were diagnosed with breast cancer during 1977-2019 according to a nationwide breast cancer database. We developed a feed-forward neural network, which was trained and tested on early life body size measures individually and in various combinations. Evaluation metrics were examined to identify the best performing model. RESULTS: The highest area under the receiver operating curve (AUC) was achieved in a model that included birthweight, childhood heights, weights and age at OGS (AUC = 0.600). A model based on childhood heights and weights had a comparable AUC value (AUC = 0.598), whereas a model including only childhood heights had the lowest AUC value (AUC = 0.572). The sensitivity of the models ranged from 0.698 to 0.760 while the precision ranged from 0.071 to 0.076. CONCLUSION: We found that the best performing network was based on birthweight, childhood weights, heights and age at OGS as the input features. Nonetheless, this performance was only slightly better than the model including childhood heights and weights. Further, although the performance of our networks was relatively low, it was similar to those from previous studies including well-established risk factors. As such, our results suggest that childhood body size may add additional value to breast cancer prediction models.
KW - Child
KW - Humans
KW - Female
KW - Breast Neoplasms/diagnosis
KW - Birth Weight
KW - Body Height
KW - Body Size
KW - Puberty
KW - Body Mass Index
KW - Risk Factors
KW - Neural Networks, Computer
UR - http://www.scopus.com/inward/record.url?scp=85184683770&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0296835
DO - 10.1371/journal.pone.0296835
M3 - Journal article
C2 - 38335218
SN - 1932-6203
VL - 19
SP - e0296835
JO - PLoS One
JF - PLoS One
IS - 2
M1 - e0296835
ER -