The epimutacionsData package is a repository of datasets for the epimutacions package. It includes 2 datasets to use as examples.
The following code shows how to access the data:
library(ExperimentHub)
eh <- ExperimentHub()
query(eh, c("epimutacionsData"))
## ExperimentHub with 3 records
## # snapshotDate(): 2022-10-24
## # $dataprovider: GEO, Illumina 450k array
## # $species: Homo sapiens
## # $rdataclass: RGChannelSet, GenomicRatioSet, GRanges
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH6690"]]'
##
## title
## EH6690 | Control and case samples
## EH6691 | Reference panel
## EH6692 | Candidate epimutations
In the Illumina 450K array (Reproducibility 2012), probes are unequally distributed along the genome, which limits the number of regions that can fulfil the requirements to be considered an epimutation. We therefore computed a dataset containing the regions that are candidates to become an epimutation.
To define the candidate epimutations, we relied on the clustering from bumphunter (Jaffe et al. 2012). We defined a primary dataset with all the CpGs from the Illumina 450K array. Then, we ran bumphunter and selected those regions with at least 3 CpGs. As a result, we found 40408 candidate epimutations, which are available in the candRegsGR dataset.
candRegsGR <- eh[["EH6692"]]
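The selection rule described above (group neighbouring CpGs into clusters, then keep clusters with at least 3 CpGs) can be illustrated on toy data. This is only a minimal sketch and not the authors' pipeline: it uses base R as a stand-in for the bumphunter clustering, and the probe positions, the `max_gap` threshold, and all variable names are hypothetical.

```r
# Hypothetical toy probes: chromosome and position, sorted within chromosome.
probes <- data.frame(
  chr = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
  pos = c(100, 200, 350, 5000, 5100, 20000, 50, 120, 10000)
)
max_gap <- 300  # assumed gap threshold, chosen only for this illustration

# A new cluster starts at the first probe, at every chromosome change,
# and whenever the distance to the previous probe exceeds max_gap.
n <- nrow(probes)
new_cluster <- c(
  TRUE,
  probes$chr[-1] != probes$chr[-n] | diff(probes$pos) > max_gap
)
probes$cluster <- cumsum(new_cluster)

# Keep only clusters that contain at least 3 CpGs.
cluster_size <- table(probes$cluster)
keep <- as.integer(names(cluster_size)[cluster_size >= 3])
kept <- probes[probes$cluster %in% keep, ]

# Summarise each retained cluster as one candidate region.
candidates <- do.call(rbind, lapply(split(kept, kept$cluster), function(d) {
  data.frame(chr = d$chr[1], start = min(d$pos), end = max(d$pos), n_cpg = nrow(d))
}))
candidates
```

With real array data, the same idea applies to all 450K probe coordinates; the regions actually shipped in candRegsGR come from the authors' bumphunter run, not from this sketch.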
The package includes an RGChannelSet class reference panel (reference_panel), which contains 22 whole cord blood samples from healthy children born via caesarean section from the GSE127824 cohort (Gervin et al. 2019). The reference panel can be found in the EH6691 record of the eh object:
reference_panel <- eh[["EH6691"]]
The methy dataset includes 51 DNA methylation profiles of whole blood samples: 48 controls from the GSE104812 cohort (Shi et al. 2018) and 3 cases from the GSE97362 cohort (Butcher et al. 2017). It is a GenomicRatioSet class object.
methy <- eh[["EH6690"]]
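Once retrieved, the object can be inspected with the usual minfi and SummarizedExperiment accessors. The following is a minimal sketch, assuming the record downloads successfully and that minfi is installed; the variable name `betas` is introduced here only for illustration.

```r
library(ExperimentHub)
library(minfi)

# Retrieve the control and case samples from ExperimentHub.
eh <- ExperimentHub()
methy <- eh[["EH6690"]]

# Dimensions: CpG probes in rows, samples in columns.
dim(methy)

# Sample-level annotation stored with the object.
head(colData(methy))

# Matrix of methylation beta values (probes x samples).
betas <- getBeta(methy)
betas[1:5, 1:3]
```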
The IDAT files contain raw microarray intensities of 4 case samples from the GSE131350 cohort. The files are located in the external data of the epimutacionsData package:
library(minfi)
baseDir <- system.file("extdata", package = "epimutacionsData")
targets <- read.metharray.sheet(baseDir)
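The sample sheet only lists the files; to load the intensities themselves, one option is minfi's `read.metharray.exp()`. This is a sketch of a possible next step rather than a step taken from the package documentation, and the object name `rgset` is an assumption made here.

```r
library(minfi)

# Locate the IDAT files shipped with epimutacionsData and read the sample sheet.
baseDir <- system.file("extdata", package = "epimutacionsData")
targets <- read.metharray.sheet(baseDir)

# Read the raw red/green channel intensities into an RGChannelSet.
rgset <- read.metharray.exp(targets = targets)
rgset
```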
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] minfi_1.44.0 bumphunter_1.40.0
## [3] locfit_1.5-9.6 iterators_1.0.14
## [5] foreach_1.5.2 Biostrings_2.66.0
## [7] XVector_0.38.0 SummarizedExperiment_1.28.0
## [9] Biobase_2.58.0 MatrixGenerics_1.10.0
## [11] matrixStats_0.62.0 GenomicRanges_1.50.0
## [13] GenomeInfoDb_1.34.0 IRanges_2.32.0
## [15] S4Vectors_0.36.0 epimutacionsData_1.2.0
## [17] ExperimentHub_2.6.0 AnnotationHub_3.6.0
## [19] BiocFileCache_2.6.0 dbplyr_2.2.1
## [21] BiocGenerics_0.44.0 BiocStyle_2.26.0
##
## loaded via a namespace (and not attached):
## [1] plyr_1.8.7 splines_4.2.1
## [3] BiocParallel_1.32.0 digest_0.6.30
## [5] htmltools_0.5.3 fansi_1.0.3
## [7] magrittr_2.0.3 memoise_2.0.1
## [9] tzdb_0.3.0 limma_3.54.0
## [11] readr_2.1.3 annotate_1.76.0
## [13] askpass_1.1 siggenes_1.72.0
## [15] prettyunits_1.1.1 blob_1.2.3
## [17] rappdirs_0.3.3 xfun_0.34
## [19] dplyr_1.0.10 crayon_1.5.2
## [21] RCurl_1.98-1.9 jsonlite_1.8.3
## [23] genefilter_1.80.0 GEOquery_2.66.0
## [25] survival_3.4-0 glue_1.6.2
## [27] zlibbioc_1.44.0 DelayedArray_0.24.0
## [29] Rhdf5lib_1.20.0 HDF5Array_1.26.0
## [31] DBI_1.1.3 rngtools_1.5.2
## [33] Rcpp_1.0.9 xtable_1.8-4
## [35] progress_1.2.2 bit_4.0.4
## [37] mclust_6.0.0 preprocessCore_1.60.0
## [39] httr_1.4.4 RColorBrewer_1.1-3
## [41] ellipsis_0.3.2 pkgconfig_2.0.3
## [43] reshape_0.8.9 XML_3.99-0.12
## [45] sass_0.4.2 utf8_1.2.2
## [47] tidyselect_1.2.0 rlang_1.0.6
## [49] later_1.3.0 AnnotationDbi_1.60.0
## [51] BiocVersion_3.16.0 tools_4.2.1
## [53] cachem_1.0.6 cli_3.4.1
## [55] generics_0.1.3 RSQLite_2.2.18
## [57] evaluate_0.17 stringr_1.4.1
## [59] fastmap_1.1.0 yaml_2.3.6
## [61] knitr_1.40 bit64_4.0.5
## [63] beanplot_1.3.1 scrime_1.3.5
## [65] purrr_0.3.5 KEGGREST_1.38.0
## [67] nlme_3.1-160 doRNG_1.8.2
## [69] sparseMatrixStats_1.10.0 mime_0.12
## [71] nor1mix_1.3-0 xml2_1.3.3
## [73] biomaRt_2.54.0 compiler_4.2.1
## [75] filelock_1.0.2 curl_4.3.3
## [77] png_0.1-7 interactiveDisplayBase_1.36.0
## [79] tibble_3.1.8 bslib_0.4.0
## [81] stringi_1.7.8 GenomicFeatures_1.50.1
## [83] lattice_0.20-45 Matrix_1.5-1
## [85] multtest_2.54.0 vctrs_0.5.0
## [87] pillar_1.8.1 lifecycle_1.0.3
## [89] rhdf5filters_1.10.0 BiocManager_1.30.19
## [91] jquerylib_0.1.4 data.table_1.14.4
## [93] bitops_1.0-7 httpuv_1.6.6
## [95] rtracklayer_1.58.0 R6_2.5.1
## [97] BiocIO_1.8.0 bookdown_0.29
## [99] promises_1.2.0.1 codetools_0.2-18
## [101] MASS_7.3-58.1 assertthat_0.2.1
## [103] rhdf5_2.42.0 openssl_2.0.4
## [105] rjson_0.2.21 withr_2.5.0
## [107] GenomicAlignments_1.34.0 Rsamtools_2.14.0
## [109] GenomeInfoDbData_1.2.9 hms_1.1.2
## [111] quadprog_1.5-8 grid_4.2.1
## [113] tidyr_1.2.1 base64_2.0.1
## [115] rmarkdown_2.17 DelayedMatrixStats_1.20.0
## [117] illuminaio_0.40.0 shiny_1.7.3
## [119] restfulr_0.0.15
Butcher, Darci T, Cheryl Cytrynbaum, Andrei L Turinsky, Michelle T Siu, Michal Inbar-Feigenberg, Roberto Mendoza-Londono, David Chitayat, et al. 2017. “CHARGE and Kabuki Syndromes: Gene-Specific DNA Methylation Signatures Identify Epigenetic Mechanisms Linking These Clinically Overlapping Conditions.” The American Journal of Human Genetics 100 (5): 773–88.
Gervin, Kristina, Lucas A Salas, Kelly M Bakulski, Menno C Van Zelm, Devin C Koestler, John K Wiencke, Liesbeth Duijts, et al. 2019. “Systematic Evaluation and Validation of Reference and Library Selection Methods for Deconvolution of Cord Blood DNA Methylation Data.” Clinical Epigenetics 11 (1): 1–15.
Jaffe, Andrew E, Peter Murakami, Hwajin Lee, Jeffrey T Leek, M Daniele Fallin, Andrew P Feinberg, and Rafael A Irizarry. 2012. “Bump Hunting to Identify Differentially Methylated Regions in Epigenetic Epidemiology Studies.” International Journal of Epidemiology 41 (1): 200–209.
Reproducibility, Unrivaled Assay. 2012. “Infinium HumanMethylation450 BeadChip.”
Shi, Lei, Fan Jiang, Fengxiu Ouyang, Jun Zhang, Zhimin Wang, and Xiaoming Shen. 2018. “DNA Methylation Markers in Combination with Skeletal and Dental Ages to Improve Age Estimation in Children.” Forensic Science International: Genetics 33: 1–9.