-
Notifications
You must be signed in to change notification settings - Fork 39
Description
I am running chromVAR on human scATACseq data using 720 human motifs from the JASPAR2024 database. When I calculate differential accessibility I find that all 720 motifs have adjusted p-values < 0.05, and when I calculate differential variability, I find that 715/720 motifs have adjusted p-values < 0.05.
Is it typical for this many motifs to be found to be differentially accessible/variable? It seems unlikely that all of the test motifs would be differentially accessible between cell types.
The tSNE embedding that I generate based on the chromVAR results very specifically recapitulates our cell type clusters. When I look at specific motifs from known lineage-defining TF’s, they also are very specifically enriched in the appropriate cell types.
My code and sessionInfo() are below.
library(chromVAR)
library(motifmatchr)
library(Matrix)
library(SummarizedExperiment)
library(BiocParallel)
library(Signac)
library(Seurat)
library(BSgenome.Hsapiens.UCSC.hg38)
library(EnsDb.Hsapiens.v86)
library(JASPAR2024)
library(tidyverse)
library(TFBSTools)
set.seed(2025)
# Set Up
register(MulticoreParam(8, progressbar = TRUE))
# Load data
seu <- readRDS("seu.rds")
DefaultAssay(seu) <- "peaks"
# we remove the scaffolds, as some of these peaks are not in the BSgenome
main.chroms <- standardChromosomes(BSgenome.Hsapiens.UCSC.hg38)
keep.peaks <- which(as.character(seqnames(granges(seu))) %in% main.chroms)
seu[["peaks"]] <- subset(seu[["peaks"]],
features = rownames(seu[["peaks"]])[keep.peaks])
# Get peaks
peaks <- sort(seu@assays$peaks@ranges)
# Get counts
my_counts_matrix <- as.matrix(seu@assays$peaks@data)
fragment_counts <- SummarizedExperiment(assays =
list(counts = my_counts_matrix),
rowRanges = peaks)
# Adding GC content
fragment_counts <- addGCBias(fragment_counts, genome = BSgenome.Hsapiens.UCSC.hg38)
# Filtering
counts_filtered <- filterPeaks(fragment_counts, non_overlapping = TRUE)
# Get motifs
jaspar <- JASPAR2024()
sq24 <- RSQLite::dbConnect(RSQLite::SQLite(), db(jaspar))
motifs <- TFBSTools::getMatrixSet(sq24, list(species = "Homo sapiens"))
# Match motifs
motif_ix <- matchMotifs(motifs, counts_filtered,
genome = BSgenome.Hsapiens.UCSC.hg38)
# Compute deviations
dev <- computeDeviations(object = counts_filtered, annotations = motif_ix)
# Add colData
colData(dev) <- DataFrame([email protected])
variability <- computeVariability(dev)
plotVariability(variability, use_plotly = FALSE)
# Differential Activity and Variability
diff_acc <- differentialDeviations(dev, "celltype_level_3")
diff_var <- differentialVariability(dev, "celltype_level_3")
tsne_results <- deviationsTsne(dev, threshold = 1.5, perplexity = 10,
shiny = FALSE)
tsne_plots <-
plotDeviationsTsne(dev,
tsne_results,
sample_column = "celltype_level_3",
annotation = “MYOG”,
shiny = FALSE)
R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Rocky Linux 8.10 (Green Obsidian)
Matrix products: default
BLAS/LAPACK: /grid/meyer/home/chapin/mambaforge/envs/snakemake-r-analysis/lib/libopenblasp-r0.3.27.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] BiocParallel_1.36.0 SummarizedExperiment_1.32.0
[3] Biobase_2.62.0 GenomicRanges_1.54.1
[5] GenomeInfoDb_1.38.8 IRanges_2.36.0
[7] S4Vectors_0.40.2 BiocGenerics_0.48.1
[9] MatrixGenerics_1.14.0 matrixStats_1.5.0
[11] motifmatchr_1.24.0 data.table_1.17.0
[13] SnapATAC_1.0.0 rhdf5_2.46.1
[15] Matrix_1.6-5 lubridate_1.9.4
[17] forcats_1.0.0 stringr_1.5.1
[19] dplyr_1.1.4 purrr_1.0.4
[21] readr_2.1.5 tidyr_1.3.1
[23] tibble_3.3.0 ggplot2_3.5.2 [51/1973]
[25] tidyverse_2.0.0 Seurat_5.2.1
[27] SeuratObject_5.0.2 sp_2.2-0
[29] chromVAR_1.24.0
loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.22 splines_4.3.3
[3] later_1.4.1 BiocIO_1.12.0
[5] bitops_1.0-9 R.oo_1.27.0
[7] polyclip_1.10-7 XML_3.99-0.18
[9] DirichletMultinomial_1.44.0 fastDummies_1.7.5
[11] lifecycle_1.0.4 tcltk_4.3.3
[13] edgeR_4.0.16 doParallel_1.0.17
[15] vroom_1.6.5 globals_0.16.3
[17] lattice_0.22-6 MASS_7.3-60.0.1
[19] magrittr_2.0.3 limma_3.58.1
[21] plotly_4.10.4 plot3D_1.4.1
[23] yaml_2.3.10 httpuv_1.6.15
[25] sctransform_0.4.1 spam_2.11-1
[27] spatstat.sparse_3.1-0 reticulate_1.41.0
[29] cowplot_1.1.3 pbapply_1.7-2
[31] DBI_1.2.3 CNEr_1.38.0
[33] RColorBrewer_1.1-3 abind_1.4-8
[35] zlibbioc_1.48.2 Rtsne_0.17
[37] R.utils_2.13.0 RCurl_1.98-1.17
[39] misc3d_0.9-1 GenomeInfoDbData_1.2.11
[41] ggrepel_0.9.6 irlba_2.3.5.1
[43] listenv_0.9.1 spatstat.utils_3.1-2
[45] seqLogo_1.68.0 goftest_1.2-3
[47] RSpectra_0.16-2 bigmemory_4.6.4
[49] spatstat.random_3.3-2 annotate_1.80.0
[51] fitdistrplus_1.2-2 parallelly_1.42.0
[53] codetools_0.2-20 DelayedArray_0.28.0
[55] DT_0.33 tidyselect_1.2.1
[57] farver_2.1.2 viridis_0.6.5
[59] spatstat.explore_3.3-4 GenomicAlignments_1.38.2 [16/1973]
[61] jsonlite_1.9.0 progressr_0.15.1
[63] iterators_1.0.14 ggridges_0.5.6
[65] survival_3.8-3 foreach_1.5.2
[67] tools_4.3.3 snow_0.4-4
[69] TFMPvalue_0.0.9 ica_1.0-3
[71] Rcpp_1.1.0 glue_1.8.0
[73] gridExtra_2.3 SparseArray_1.2.4
[75] withr_3.0.2 fastmap_1.2.0
[77] rhdf5filters_1.14.1 caTools_1.18.3
[79] digest_0.6.37 timechange_0.3.0
[81] R6_2.6.1 mime_0.12
[83] colorspace_2.1-1 scattermore_1.2
[85] GO.db_3.18.0 gtools_3.9.5
[87] poweRlaw_1.0.0 tensor_1.5
[89] dichromat_2.0-0.1 spatstat.data_3.1-4
[91] RSQLite_2.3.9 R.methodsS3_1.8.2
[93] generics_0.1.3 rtracklayer_1.62.0
[95] httr_1.4.7 htmlwidgets_1.6.4
[97] S4Arrays_1.2.1 TFBSTools_1.40.0
[99] uwot_0.2.3 pkgconfig_2.0.3
[101] gtable_0.3.6 blob_1.2.4
[103] lmtest_0.9-40 XVector_0.42.0
[105] htmltools_0.5.8.1 dotCall64_1.2
[107] scales_1.4.0 png_0.1-8
[109] doSNOW_1.0.20 spatstat.univar_3.1-1
[111] bigmemory.sri_0.1.8 uuid_1.2-1
[113] tzdb_0.4.0 reshape2_1.4.4
[115] rjson_0.2.23 nlme_3.1-167
[117] cachem_1.1.0 zoo_1.8-13
[119] KernSmooth_2.23-26 parallel_4.3.3
[121] miniUI_0.1.1.1 AnnotationDbi_1.64.1
[123] restfulr_0.0.15 pillar_1.11.0
[125] grid_4.3.3 vctrs_0.6.5
[127] RANN_2.6.2 promises_1.3.2
[129] xtable_1.8-4 cluster_2.1.8
[131] locfit_1.5-9.12 cli_3.6.5
[133] compiler_4.3.3 Rsamtools_2.18.0
[135] rlang_1.1.6 crayon_1.5.3
[137] future.apply_1.11.3 plyr_1.8.9
[139] stringi_1.8.4 deldir_2.0-4
[141] viridisLite_0.4.2 Biostrings_2.70.3
[143] lazyeval_0.2.2 spatstat.geom_3.3-5
[145] RcppHNSW_0.6.0 BSgenome_1.70.2
[147] hms_1.1.3 patchwork_1.3.0
[149] bit64_4.6.0-1 future_1.34.0
[151] Rhdf5lib_1.24.2 statmod_1.5.0
[153] KEGGREST_1.42.0 shiny_1.10.0
[155] ROCR_1.0-11 igraph_2.1.4
[157] memoise_2.0.1 bit_4.5.0.1