In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” means that this implementation does not consider a segmentation of the genome when sampling blocks; see the segmented block bootstrap vignette for the alternative implementation.
First, we load the DNase hypersensitivity peaks in the A549 cell line, downloaded from AnnotationHub and pre-processed as described in the nullrangesData package.
library(nullrangesData)
dhs <- DHSA549Hg38()
library(nullranges)
The following chunk of code benchmarks the bootstrap and permutation schemes, first within chromosomes (withinChrom=TRUE) and then across chromosomes (withinChrom=FALSE, the default). The default type is bootstrap, and the default for withinChrom is FALSE (bootstrapping with blocks moving across chromosomes).
set.seed(5) # reproducibility
library(microbenchmark)
blockLength <- 5e5
# time each scheme 10 times: permute (p_) vs bootstrap (b_),
# within vs across chromosomes
microbenchmark(
  list=alist(
    p_within=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=TRUE),
    b_within=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=TRUE),
    p_across=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=FALSE),
    b_across=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=FALSE)
  ), times=10)
## Unit: milliseconds
## expr min lq mean median uq max neval cld
## p_within 1611.0315 1643.5174 1771.0713 1659.9201 1703.3682 2736.9732 10 c
## b_within 1451.0458 1484.8843 1523.6632 1532.7313 1562.1934 1606.0634 10 b
## p_across 358.9839 385.4329 406.2684 416.1468 421.4092 442.6818 10 a
## b_across 406.2425 433.9784 452.2765 452.4915 459.0252 518.9751 10 a
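To make the difference between type="permute" and type="bootstrap" concrete, here is a minimal base R sketch on a toy chromosome. It is a conceptual illustration only, not the nullranges implementation: the names seq_len_toy, block_length, perm_blocks and boot_blocks are hypothetical, and the sketch assumes fixed, non-overlapping blocks.

```r
# Conceptual sketch only (not how nullranges is implemented):
# a toy chromosome of length 300 tiled into blocks of length 100.
set.seed(1)
seq_len_toy  <- 300   # hypothetical chromosome length
block_length <- 100   # hypothetical block length
n_blocks <- ceiling(seq_len_toy / block_length)

# Permutation: every original block is used exactly once, in shuffled order.
perm_blocks <- sample(n_blocks, n_blocks, replace = FALSE)

# Bootstrap: blocks are drawn with replacement, so a block may appear
# several times while another is left out.
boot_blocks <- sample(n_blocks, n_blocks, replace = TRUE)

sort(perm_blocks)   # always 1 2 3
```

In both cases the features inside a block travel together, which is what preserves local clustering of ranges in the resampled data.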
We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.
library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
              IRanges(start=c(1,101,121,201,
                              101,201,216,231,401,
                              1,101),
                      width=c(20, 5, 5, 30,
                              20, 5, 5, 5, 30,
                              80, 40)),
              seqlengths=c(chr1=300,chr2=450,chr3=200),
              chr=factor(seq_nms))
The following function uses functionality from plotgardener to plot the ranges. Note that the plotting helper function uses the chr metadata column to color ranges by chromosome of origin.
suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
  pageCreate(width = 5, height = 2, xgrid = 0,
             ygrid = 0, showGuides = FALSE)
  # draw one track per chromosome, stacked vertically
  for (i in seq_along(seqlevels(gr))) {
    chrom <- seqlevels(gr)[i]
    chromend <- seqlengths(gr)[[chrom]]
    suppressMessages({
      p <- pgParams(chromstart = 0, chromend = chromend,
                    x = 0.5, width = 4*chromend/500, height = 0.5,
                    at = seq(0, chromend, 50),
                    fill = colorby("chr", palette=palette.colors))
      prngs <- plotRanges(data = gr, params = p,
                          chrom = chrom,
                          y = 0.25 + (i-1)*.7,
                          just = c("left", "bottom"))
      annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
    })
  }
}
plotGRanges(gr)
Visualizing two permutations of blocks within chromosome:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
  plotGRanges(gr_prime)
}
Visualizing two bootstraps within chromosome:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
  plotGRanges(gr_prime)
}
Visualizing two permutations of blocks across chromosomes. Here we use larger blocks than before (a blockLength of 200 rather than 100).
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
  plotGRanges(gr_prime)
}
Visualizing two bootstraps across chromosome:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
  plotGRanges(gr_prime)
}
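The within- versus across-chromosome distinction can also be sketched in base R, using the same chromosome lengths as the synthetic ranges. This is a simplified illustration under stated assumptions, not the nullranges algorithm: it tiles each chromosome into fixed blocks, whereas nullranges may choose block positions differently, and the names seqlens, blocks, within_src and across_src are hypothetical.

```r
# Conceptual sketch only: where do resampled blocks come from?
seqlens <- c(chr1 = 300, chr2 = 450, chr3 = 200)
block_length <- 100

# Tile each chromosome into consecutive blocks (the last one may be shorter).
blocks <- do.call(rbind, lapply(names(seqlens), function(chr) {
  starts <- seq(1, seqlens[[chr]], by = block_length)
  data.frame(chr = chr, start = starts,
             end = pmin(starts + block_length - 1, seqlens[[chr]]))
}))

set.seed(1)
# Within chromosome: each destination block is filled from a block
# drawn (with replacement) from the same chromosome.
within_src <- unlist(
  lapply(split(seq_len(nrow(blocks)), blocks$chr),
         function(idx) idx[sample.int(length(idx), replace = TRUE)]),
  use.names = FALSE)

# Across chromosomes: source blocks are drawn from the genome-wide pool.
across_src <- sample.int(nrow(blocks), replace = TRUE)

# Compare destination chromosome with the chromosome of each source block.
data.frame(dest   = blocks$chr,
           within = blocks$chr[within_src],
           across = blocks$chr[across_src])
```

In the within column the source chromosome always matches the destination, which corresponds to the plots above where colors stay on their own track; in the across column blocks can land on any chromosome.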
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.0
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] microbenchmark_1.4.9 purrr_0.3.4
## [3] ggridges_0.5.3 tidyr_1.2.0
## [5] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.22.0
## [7] AnnotationFilter_1.22.0 GenomicFeatures_1.50.2
## [9] AnnotationDbi_1.60.0 patchwork_1.1.1
## [11] plyranges_1.18.0 nullrangesData_1.3.0
## [13] ExperimentHub_2.6.0 AnnotationHub_3.6.0
## [15] BiocFileCache_2.6.0 dbplyr_2.2.1
## [17] ggplot2_3.3.6 plotgardener_1.4.1
## [19] nullranges_1.4.0 InteractionSet_1.26.0
## [21] SummarizedExperiment_1.28.0 Biobase_2.58.0
## [23] MatrixGenerics_1.10.0 matrixStats_0.62.0
## [25] GenomicRanges_1.50.1 GenomeInfoDb_1.34.2
## [27] IRanges_2.32.0 S4Vectors_0.36.0
## [29] BiocGenerics_0.44.0
##
## loaded via a namespace (and not attached):
## [1] plyr_1.8.7 RcppHMM_1.2.2
## [3] lazyeval_0.2.2 splines_4.2.1
## [5] BiocParallel_1.32.1 TH.data_1.1-1
## [7] digest_0.6.29 yulab.utils_0.0.5
## [9] htmltools_0.5.2 fansi_1.0.3
## [11] magrittr_2.0.3 memoise_2.0.1
## [13] ks_1.13.5 Biostrings_2.66.0
## [15] sandwich_3.0-2 prettyunits_1.1.1
## [17] jpeg_0.1-9 colorspace_2.0-3
## [19] blob_1.2.3 rappdirs_0.3.3
## [21] xfun_0.31 dplyr_1.0.9
## [23] crayon_1.5.1 RCurl_1.98-1.7
## [25] jsonlite_1.8.0 survival_3.3-1
## [27] zoo_1.8-10 glue_1.6.2
## [29] gtable_0.3.0 zlibbioc_1.44.0
## [31] XVector_0.38.0 strawr_0.0.9
## [33] DelayedArray_0.24.0 scales_1.2.0
## [35] mvtnorm_1.1-3 DBI_1.1.3
## [37] Rcpp_1.0.9 xtable_1.8-4
## [39] progress_1.2.2 gridGraphics_0.5-1
## [41] bit_4.0.4 mclust_5.4.10
## [43] httr_1.4.3 RColorBrewer_1.1-3
## [45] speedglm_0.3-4 ellipsis_0.3.2
## [47] pkgconfig_2.0.3 XML_3.99-0.10
## [49] farver_2.1.1 sass_0.4.1
## [51] utf8_1.2.2 DNAcopy_1.72.0
## [53] ggplotify_0.1.0 tidyselect_1.1.2
## [55] labeling_0.4.2 rlang_1.0.4
## [57] later_1.3.0 munsell_0.5.0
## [59] BiocVersion_3.16.0 tools_4.2.1
## [61] cachem_1.0.6 cli_3.3.0
## [63] generics_0.1.3 RSQLite_2.2.14
## [65] evaluate_0.15 stringr_1.4.0
## [67] fastmap_1.1.0 yaml_2.3.5
## [69] knitr_1.39 bit64_4.0.5
## [71] KEGGREST_1.38.0 mime_0.12
## [73] pracma_2.3.8 xml2_1.3.3
## [75] biomaRt_2.54.0 compiler_4.2.1
## [77] filelock_1.0.2 curl_4.3.2
## [79] png_0.1-7 interactiveDisplayBase_1.36.0
## [81] tibble_3.1.7 bslib_0.3.1
## [83] stringi_1.7.8 highr_0.9
## [85] lattice_0.20-45 ProtGenerics_1.30.0
## [87] Matrix_1.4-1 vctrs_0.4.1
## [89] pillar_1.7.0 lifecycle_1.0.1
## [91] BiocManager_1.30.18 jquerylib_0.1.4
## [93] data.table_1.14.2 bitops_1.0-7
## [95] httpuv_1.6.5 rtracklayer_1.58.0
## [97] R6_2.5.1 BiocIO_1.8.0
## [99] promises_1.2.0.1 KernSmooth_2.23-20
## [101] codetools_0.2-18 MASS_7.3-58
## [103] assertthat_0.2.1 rjson_0.2.21
## [105] withr_2.5.0 GenomicAlignments_1.34.0
## [107] Rsamtools_2.14.0 multcomp_1.4-19
## [109] GenomeInfoDbData_1.2.8 parallel_4.2.1
## [111] hms_1.1.1 rmarkdown_2.14
## [113] shiny_1.7.1 restfulr_0.0.15