Skip to content

All motifs found to be differentially accessible #126

@sarah-chapin

Description

@sarah-chapin

I am running chromVAR on human scATACseq data using 720 human motifs from the JASPAR2024 database. When I calculate differential accessibility I find that all 720 motifs have adjusted p-values < 0.05, and when I calculate differential variability, I find that 715/720 motifs have adjusted p-values < 0.05.

Is it typical for this many motifs to be found to be differentially accessible/variable? It seems unlikely that all of the test motifs would be differentially accessible between cell types.

The tSNE embedding that I generate based on the chromVAR results very specifically recapitulates our cell type clusters. When I look at specific motifs from known lineage-defining TF’s, they also are very specifically enriched in the appropriate cell types.

My code and sessionInfo() are below.

library(chromVAR)
library(motifmatchr)
library(Matrix)
library(SummarizedExperiment)
library(BiocParallel)
library(Signac)
library(Seurat)
library(BSgenome.Hsapiens.UCSC.hg38)
library(EnsDb.Hsapiens.v86)
library(JASPAR2024)
library(tidyverse)
library(TFBSTools)
set.seed(2025)

# Set Up
register(MulticoreParam(8, progressbar = TRUE))

# Load data
seu <- readRDS("seu.rds")

DefaultAssay(seu) <- "peaks"

# we remove the scaffolds, as some of these peaks are not in the BSgenome
main.chroms <- standardChromosomes(BSgenome.Hsapiens.UCSC.hg38)
keep.peaks <- which(as.character(seqnames(granges(seu))) %in% main.chroms)
seu[["peaks"]] <- subset(seu[["peaks"]],
                        features = rownames(seu[["peaks"]])[keep.peaks])

# Get peaks 
peaks <- sort(seu@assays$peaks@ranges)

# Get counts
my_counts_matrix <- as.matrix(seu@assays$peaks@data)

fragment_counts <- SummarizedExperiment(assays = 
                                          list(counts = my_counts_matrix),
                                        rowRanges = peaks)
# Adding GC content
fragment_counts <- addGCBias(fragment_counts, genome = BSgenome.Hsapiens.UCSC.hg38)

# Filtering
counts_filtered <- filterPeaks(fragment_counts, non_overlapping = TRUE)

# Get motifs
jaspar <- JASPAR2024()
sq24 <- RSQLite::dbConnect(RSQLite::SQLite(), db(jaspar))
motifs <- TFBSTools::getMatrixSet(sq24, list(species = "Homo sapiens"))

# Match motifs
motif_ix <- matchMotifs(motifs, counts_filtered, 
                        genome = BSgenome.Hsapiens.UCSC.hg38)


# Compute deviations
dev <- computeDeviations(object = counts_filtered, annotations = motif_ix)

# Add colData

colData(dev) <- DataFrame([email protected])

variability <- computeVariability(dev)
plotVariability(variability, use_plotly = FALSE)

# Differential Activity and Variability
diff_acc <- differentialDeviations(dev, "celltype_level_3")
diff_var <- differentialVariability(dev, "celltype_level_3")

tsne_results <- deviationsTsne(dev, threshold = 1.5, perplexity = 10, 
                               shiny = FALSE)

tsne_plots <-
  plotDeviationsTsne(dev,
                     tsne_results,
                     sample_column = "celltype_level_3",
annotation = “MYOG”,
                     shiny = FALSE) 
               R version 4.3.3 (2024-02-29)                                                                                                                                                                      
Platform: x86_64-conda-linux-gnu (64-bit)                                                                                                                                                                        
Running under: Rocky Linux 8.10 (Green Obsidian)                                                                                                                                                                 
                                                                                                                                                                                                                 
Matrix products: default                                                                                                                                                                                         
BLAS/LAPACK: /grid/meyer/home/chapin/mambaforge/envs/snakemake-r-analysis/lib/libopenblasp-r0.3.27.so;  LAPACK version 3.12.0                                                                                    
                                                                                                                                                                                                                 
locale:                                                                                                                                                                                                          
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C                                                                                                                                                                     
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8                                                                                                                                                           
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8                                                                                                                                                          
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                                                                                                                                                                        
 [9] LC_ADDRESS=C               LC_TELEPHONE=C                                                                                                                                                                   
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C                                                                                                                                                              
                                                                                                                                                                                                                 
time zone: America/New_York                                                                                                                                                                                      
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] BiocParallel_1.36.0         SummarizedExperiment_1.32.0
 [3] Biobase_2.62.0              GenomicRanges_1.54.1
 [5] GenomeInfoDb_1.38.8         IRanges_2.36.0
 [7] S4Vectors_0.40.2            BiocGenerics_0.48.1
 [9] MatrixGenerics_1.14.0       matrixStats_1.5.0
[11] motifmatchr_1.24.0          data.table_1.17.0
[13] SnapATAC_1.0.0              rhdf5_2.46.1
[15] Matrix_1.6-5                lubridate_1.9.4
[17] forcats_1.0.0               stringr_1.5.1
[19] dplyr_1.1.4                 purrr_1.0.4
[21] readr_2.1.5                 tidyr_1.3.1
[23] tibble_3.3.0                ggplot2_3.5.2                                                                                                                                                          [51/1973]
[25] tidyverse_2.0.0             Seurat_5.2.1
[27] SeuratObject_5.0.2          sp_2.2-0
[29] chromVAR_1.24.0

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22            splines_4.3.3
  [3] later_1.4.1                 BiocIO_1.12.0
  [5] bitops_1.0-9                R.oo_1.27.0
  [7] polyclip_1.10-7             XML_3.99-0.18
  [9] DirichletMultinomial_1.44.0 fastDummies_1.7.5
 [11] lifecycle_1.0.4             tcltk_4.3.3
 [13] edgeR_4.0.16                doParallel_1.0.17
 [15] vroom_1.6.5                 globals_0.16.3
 [17] lattice_0.22-6              MASS_7.3-60.0.1
 [19] magrittr_2.0.3              limma_3.58.1
 [21] plotly_4.10.4               plot3D_1.4.1
 [23] yaml_2.3.10                 httpuv_1.6.15
 [25] sctransform_0.4.1           spam_2.11-1
 [27] spatstat.sparse_3.1-0       reticulate_1.41.0
 [29] cowplot_1.1.3               pbapply_1.7-2
 [31] DBI_1.2.3                   CNEr_1.38.0
 [33] RColorBrewer_1.1-3          abind_1.4-8
 [35] zlibbioc_1.48.2             Rtsne_0.17
 [37] R.utils_2.13.0              RCurl_1.98-1.17
 [39] misc3d_0.9-1                GenomeInfoDbData_1.2.11
 [41] ggrepel_0.9.6               irlba_2.3.5.1
 [43] listenv_0.9.1               spatstat.utils_3.1-2
 [45] seqLogo_1.68.0              goftest_1.2-3
 [47] RSpectra_0.16-2             bigmemory_4.6.4
 [49] spatstat.random_3.3-2       annotate_1.80.0
 [51] fitdistrplus_1.2-2          parallelly_1.42.0
 [53] codetools_0.2-20            DelayedArray_0.28.0
 [55] DT_0.33                     tidyselect_1.2.1
 [57] farver_2.1.2                viridis_0.6.5
 [59] spatstat.explore_3.3-4      GenomicAlignments_1.38.2                                                                                                                                              [16/1973]
 [61] jsonlite_1.9.0              progressr_0.15.1
 [63] iterators_1.0.14            ggridges_0.5.6
 [65] survival_3.8-3              foreach_1.5.2
 [67] tools_4.3.3                 snow_0.4-4
 [69] TFMPvalue_0.0.9             ica_1.0-3
 [71] Rcpp_1.1.0                  glue_1.8.0
 [73] gridExtra_2.3               SparseArray_1.2.4
 [75] withr_3.0.2                 fastmap_1.2.0
 [77] rhdf5filters_1.14.1         caTools_1.18.3
 [79] digest_0.6.37               timechange_0.3.0
 [81] R6_2.6.1                    mime_0.12
 [83] colorspace_2.1-1            scattermore_1.2
 [85] GO.db_3.18.0                gtools_3.9.5
 [87] poweRlaw_1.0.0              tensor_1.5
 [89] dichromat_2.0-0.1           spatstat.data_3.1-4
 [91] RSQLite_2.3.9               R.methodsS3_1.8.2
 [93] generics_0.1.3              rtracklayer_1.62.0
 [95] httr_1.4.7                  htmlwidgets_1.6.4
 [97] S4Arrays_1.2.1              TFBSTools_1.40.0
 [99] uwot_0.2.3                  pkgconfig_2.0.3
[101] gtable_0.3.6                blob_1.2.4
[103] lmtest_0.9-40               XVector_0.42.0
[105] htmltools_0.5.8.1           dotCall64_1.2
[107] scales_1.4.0                png_0.1-8
[109] doSNOW_1.0.20               spatstat.univar_3.1-1
[111] bigmemory.sri_0.1.8         uuid_1.2-1
[113] tzdb_0.4.0                  reshape2_1.4.4
[115] rjson_0.2.23                nlme_3.1-167
[117] cachem_1.1.0                zoo_1.8-13
[119] KernSmooth_2.23-26          parallel_4.3.3
[121] miniUI_0.1.1.1              AnnotationDbi_1.64.1
[123] restfulr_0.0.15             pillar_1.11.0
[125] grid_4.3.3                  vctrs_0.6.5
[127] RANN_2.6.2                  promises_1.3.2
[129] xtable_1.8-4                cluster_2.1.8
[131] locfit_1.5-9.12             cli_3.6.5
[133] compiler_4.3.3              Rsamtools_2.18.0
[135] rlang_1.1.6                 crayon_1.5.3
[137] future.apply_1.11.3         plyr_1.8.9
[139] stringi_1.8.4               deldir_2.0-4
[141] viridisLite_0.4.2           Biostrings_2.70.3
[143] lazyeval_0.2.2              spatstat.geom_3.3-5
[145] RcppHNSW_0.6.0              BSgenome_1.70.2
[147] hms_1.1.3                   patchwork_1.3.0
[149] bit64_4.6.0-1               future_1.34.0
[151] Rhdf5lib_1.24.2             statmod_1.5.0
[153] KEGGREST_1.42.0             shiny_1.10.0
[155] ROCR_1.0-11                 igraph_2.1.4
[157] memoise_2.0.1               bit_4.5.0.1

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions