Abstract
DNA methylation plays an important role in both normal human development and risk of disease. The most widely used method of assessing DNA methylation uses BeadChips, generating an epigenome-wide “snapshot” of >450,000 observations (probe measurements) per assay. However, these measurements are not equally reliable, and little consideration is given to the consequences for research. We correlated repeat measurements of the same DNA samples using the Illumina HumanMethylation450K and the Infinium MethylationEPIC BeadChips in 350 blood DNA samples. Probes that were reliably measured were more heritable and showed consistent associations with environmental exposures and gene expression, as well as greater cross-tissue concordance. Unreliable probes were less replicable and generated an unknown volume of false negatives. This serves as a lesson for working with DNA methylation data, but the lesson applies equally to other data: as we generate ever greater volumes of data, failure to document reliability risks harming reproducibility.

Although DNA methylation data are used widely by researchers in many fields, the reliability of these data is surprisingly variable. Our findings remind us that, in an age of increasingly big data, research is only as robust as its foundations. We hope that our findings will improve the integrity of DNA methylation studies. We also hope that they serve as a cautionary reminder for those generating and using big data of any type: reliability is a fundamental aspect of replicability. Conducting analyses with reliable data will improve the chances of replicable findings, which might lead to more actionable targets for further research. To the extent that reliable data improve replicability, the knock-on effect will be more public confidence in research and less effort spent trying to replicate findings that are bound to fail.

DNA methylation is an important mechanism of gene regulation. The most popular method of measuring methylation is to use BeadChips that contain probes to index hundreds of thousands of methylation sites at once. However, these probes are not equally reliable. In blood DNA, unreliable probes were less heritable and less likely to index gene expression, and their associations were less replicable. This has serious downstream consequences for reproducible science and should serve as a caution for all data scientists regardless of discipline.
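The idea of correlating repeat measurements per probe can be illustrated with a minimal sketch. It assumes Pearson correlation as the reliability statistic, which the abstract does not specify; the function names, probe identifiers, and beta values below are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation across samples: correlation undefined
    return sxy / sqrt(sxx * syy)


def probe_reliability(run_a, run_b):
    """Correlate repeat measurements probe by probe.

    run_a, run_b: dicts mapping a probe ID to a list of methylation values,
    one per sample, with samples in the same order in both runs.
    """
    return {
        probe: pearson(run_a[probe], run_b[probe])
        for probe in run_a
        if probe in run_b
    }


# Made-up toy values for two hypothetical probes measured twice in four samples.
run_a = {
    "probe_1": [0.10, 0.40, 0.70, 0.90],
    "probe_2": [0.50, 0.52, 0.49, 0.51],
}
run_b = {
    "probe_1": [0.12, 0.38, 0.72, 0.88],
    "probe_2": [0.51, 0.49, 0.52, 0.50],
}

reliability = probe_reliability(run_a, run_b)
for probe, r in reliability.items():
    print(probe, round(r, 3))
```

In this toy data, the first probe varies widely between samples and the repeat run tracks it closely, whereas the second probe barely varies and its repeat values do not agree; a low between-sample variance is one plausible reason a probe might show poor repeat-measurement correlation.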
| Original language | English |
|---|---|
| Article number | 100014 |
| Journal | Patterns (New York, N.Y.) |
| Volume | 1 |
| Issue number | 2 |
| Pages (from-to) | 100014 |
| ISSN | 2666-3899 |
| DOI | |
| Status | Published - 8 May 2020 |