PURPOSE: Decades of research have identified multiple genetic variants associated with breast cancer etiology. However, there is no database that archives breast cancer genes and variants responsible for predisposition. We set out to build a dynamic repository of curated breast cancer genes.
METHODS: A comprehensive literature search was performed in PubMed and Google Scholar, followed by data extraction and harmonization for downstream analysis.
RESULTS: Using a subset of 345 studies, we cataloged 652 breast cancer-associated loci across the genome. Most of these lay in non-coding regions (101 intergenic and 345 intronic), whereas only 158 were located within an exon. Based on the reported odds ratios, 429 loci increased disease risk, 198 conferred protection against breast cancer, and 25 were reported both to increase risk and to confer protection. Chromosomal ideogram analysis indicated that chromosomes 17 and 19 have the highest density of breast cancer loci. We also manually annotated and collated breast cancer genes for which an association between a rare-monogenic variant and breast cancer has previously been documented. Finally, network and functional enrichment analysis revealed that steroid metabolism and DNA repair pathways were predominant among breast cancer genes and variants.
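The odds-ratio grouping reported above can be illustrated with a minimal sketch. It assumes that a locus counts as risk-increasing when every reported odds ratio exceeds 1, as protective when every one is below 1, and as "both" when the reports disagree. The abstract does not state the exact rule used, so the threshold handling, the function name `classify_locus`, and the toy loci below are all hypothetical.

```python
from collections import Counter


def classify_locus(odds_ratios):
    """Label a locus from its reported odds ratios (assumed rule, not the authors').

    OR > 1 is read as increased risk, OR < 1 as protection; a locus with
    reports on both sides of 1 is labelled "both".
    """
    if not odds_ratios:
        raise ValueError("at least one odds ratio is required")
    if any(o <= 0 for o in odds_ratios):
        raise ValueError("odds ratios must be positive")

    increases = any(o > 1.0 for o in odds_ratios)
    decreases = any(o < 1.0 for o in odds_ratios)

    if increases and decreases:
        return "both"
    if increases:
        return "risk"
    if decreases:
        return "protective"
    return "neutral"  # every report is exactly 1.0


# Hypothetical toy data: locus name -> odds ratios reported across studies.
toy_loci = {
    "locus_A": [1.25, 1.10],
    "locus_B": [0.85],
    "locus_C": [1.30, 0.90],
}

labels = {name: classify_locus(ors) for name, ors in toy_loci.items()}
counts = Counter(labels.values())
print(labels)
print(counts)
```

Tallying the labels with `Counter` gives the same kind of per-category totals as those quoted in the results, here only for the illustrative toy loci.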
CONCLUSIONS: We have built an online interactive catalog of curated breast cancer genes (https://cbcg.dk). This will expedite clinical diagnostics and support the ongoing efforts in managing breast cancer etiology. Moreover, the database will serve as an essential repository when designing new breast cancer multigene panels.
- Breast cancer
- Common-polygenic variants
- DNA repair pathways
- Genetic predisposition
- Rare-monogenic variants