The gDRtestData package is intended to store and generate example data that can be used throughout the gDR suite.
Since gDR is a computational suite for drug response data from any experiment, a synthetic dataset is also needed for testing and exploration.
The package is built on two functions that generate the synthetic sets of cell lines and drugs.
cell_lines <- create_synthetic_cell_lines()
drugs <- create_synthetic_drugs()
These base objects can be extended with additional information.
cl_rep <- add_data_replicates(cell_lines)
head(cl_rep)
cl_conc <- add_concentration(cell_lines)
head(cl_conc)
Alternatively, the user can do both with one function:
df_layout <- prepareData(cell_lines, drugs)
head(df_layout)
Additionally, the user may supplement the response data with day 0 records.
df_layout_small <- prepareData(cell_lines[seq_len(2), ], drugs[seq_len(4), ])
df_layout_small$Duration <- 72
df_layout_small$ReadoutValue <- 0
df_layout_small_with_Day0 <- add_day0_data(df_layout_small)
head(df_layout_small_with_Day0)
In a further step, the user may generate a set of synthetic results:
hill <- generate_hill_coef(cell_lines, drugs)
ec50_met <- generate_ec50(cell_lines, drugs)
einf_met <- generate_e_inf(cell_lines, drugs)
Or the user can create full response data with one function (for single-agent):
response_data <- prepareMergedData(cell_lines, drugs)
head(response_data)
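To illustrate how the three synthetic parameters generated above (Hill coefficient, EC50 and E_inf) relate to a response value, here is a minimal base R sketch of a four-parameter logistic (Hill) curve. It is an assumption that this standard form matches what the package uses internally; the function name `hill_response`, the baseline `e_0 = 1` and all parameter values are hypothetical and chosen only for illustration.

```r
# Illustrative four-parameter logistic (Hill) curve; not gDRtestData code.
# conc: drug concentration; ec50: half-maximal concentration;
# hill: Hill coefficient (steepness); e_inf: response at infinite dose;
# e_0: response without drug (assumed to be 1 here).
hill_response <- function(conc, ec50, hill, e_inf, e_0 = 1) {
  e_inf + (e_0 - e_inf) / (1 + (conc / ec50)^hill)
}

# Hypothetical parameter values, chosen only for illustration.
conc <- 10^seq(-3, 1, by = 0.5)
response <- hill_response(conc, ec50 = 0.1, hill = 2, e_inf = 0.2)

print(data.frame(conc = conc, response = round(response, 3)))
```

In this sketch the response at `conc == ec50` lies halfway between `e_0` and `e_inf`, and a larger Hill coefficient makes the transition between the two plateaus steeper.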
SUMMARY
Step | Function | Output (data.table) |
---|---|---|
0 | create_synthetic_cell_lines() | synthetic cell lines |
0 | create_synthetic_drugs() | synthetic drugs |
1 | prepareData() | cell lines and drugs merged with replicates and concentrations |
2 | prepareMergedData() | full response data for single-agent |
2 | prepareComboMergedData() | full response data for combo |
2 | prepareCodilutionData() | full response data for co-dilution |
The gDR data model is built on the MultiAssayExperiment (MAE) structure. A detailed description of the gDR data model can be found in the gDRcore package vignette.
In inst/testdata the user may find a set of qs files that are examples of the gDR data model for different data types. The file synthetic_list.yml contains a list of these datasets. Currently available are:
#> * combo_2dose_nonoise,
#> * combo_2dose_nonoise2,
#> * combo_2dose_nonoise3,
#> * combo_codilution_small,
#> * combo_codilution,
#> * combo_matrix_small,
#> * combo_matrix,
#> * combo_triple,
#> * medium,
#> * small_no_noise,
#> * small,
#> * wLigand.
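The bundled files can be located from an installed copy of the package with base R. The sketch below is a hedged example: it assumes that the contents of inst/testdata are installed under the testdata directory of the package (the usual behaviour for files in inst/), and that the qs package is available for reading. The variable names are hypothetical.

```r
# Locate the installed testdata directory. system.file() returns ""
# when the package or the directory is not available.
testdata_dir <- system.file("testdata", package = "gDRtestData")

qs_files <- character(0)
if (nzchar(testdata_dir)) {
  # Collect the full paths of all qs files shipped with the package.
  qs_files <- list.files(testdata_dir, pattern = "\\.qs$", full.names = TRUE)
  print(basename(qs_files))
}

# Reading a file needs the 'qs' package (an assumption about the setup).
if (length(qs_files) > 0 && requireNamespace("qs", quietly = TRUE)) {
  example_object <- qs::qread(qs_files[1])
  print(class(example_object))
}
```

The guards keep the sketch from failing when gDRtestData or qs is not installed; in that case nothing is printed.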
The script generate_example_data.R, which shows how to generate and process the above-mentioned datasets, is in the inst/scripts directory.
All key functions can be found in the gDRcore package, in the script generate_wrappers.R.
In inst/annotation_data the user can find CSV files used in gDRcore for testing annotation functions.
Files not mentioned above are used for testing gDR suite functionality.
sessionInfo()
#> R Under development (unstable) (2024-10-21 r87258)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] gDRtestData_1.5.0 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] backports_1.5.0 digest_0.6.37 R6_2.5.1
#> [4] bookdown_0.41 fastmap_1.2.0 xfun_0.48
#> [7] cachem_1.1.0 knitr_1.48 htmltools_0.5.8.1
#> [10] rmarkdown_2.28 lifecycle_1.0.4 cli_3.6.3
#> [13] sass_0.4.9 data.table_1.16.2 jquerylib_0.1.4
#> [16] compiler_4.5.0 tools_4.5.0 checkmate_2.3.2
#> [19] evaluate_1.0.1 bslib_0.8.0 yaml_2.3.10
#> [22] BiocManager_1.30.25 jsonlite_1.8.9 rlang_1.1.4