Compute precision weights for pseudobulk using the delta method to approximate the variance of the log2 counts per million, accounting for variation in the number of cells and for gene expression variance across cells within each sample. By default, the number of cells is used; if specified, the delta method is used. Note that processAssays() uses the number of cells as weights when no weights are specified.
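As a rough illustration of the delta method mentioned above, here is the generic first-order approximation for the variance of a log2-transformed positive quantity. This is a sketch under stated assumptions, not necessarily the exact expression pbWeights() evaluates. The symbols X (a count-scale quantity), mu (its mean), sigma^2 (its variance) and phi (a negative binomial overdispersion) are introduced here for illustration only.

```latex
% First-order delta method for g(X) = \log_2 X, with E[X] = \mu and Var[X] = \sigma^2
\operatorname{Var}\left[\log_2 X\right]
  \approx \left(\frac{d}{d\mu}\log_2 \mu\right)^{2}\sigma^{2}
  = \frac{\sigma^{2}}{\mu^{2}\,(\ln 2)^{2}}

% Assuming a negative binomial variance \sigma^2 = \mu + \phi\,\mu^2:
\operatorname{Var}\left[\log_2 X\right]
  \approx \frac{1/\mu + \phi}{(\ln 2)^{2}}
```

A precision weight is then the reciprocal of this approximate variance, so observations with larger means (for example, from samples with more cells) or smaller overdispersion receive larger weights.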
Usage
pbWeights(
sce,
sample_id,
cluster_id,
geneList = NULL,
method = c("delta", "ncells"),
shrink = TRUE,
prior.count = 0.5,
maxRatio = 20,
h5adBlockSizes = 1e+09,
details = FALSE,
verbose = TRUE
)
Arguments
- sce
SingleCellExperiment where counts(sce) stores the raw count data at the single-cell level
- sample_id
character string specifying which variable to use as sample id
- cluster_id
character string specifying which variable to use as cluster id
- geneList
list of genes to be included for each cell type
- method
select method to compute precision weights.
'delta': use the delta method based on a normal approximation to a negative binomial model; slower but can increase power.
'ncells': use the number of cells; this is faster, and subsequent arguments are ignored. Included for testing
- shrink
Defaults to TRUE. Use empirical Bayes variance shrinkage from limma to shrink estimates of expression variance across cells within each sample
- prior.count
Defaults to 0.5. Count added to each observation at the pseudobulk level. This is scaled by the number of cells before being added at the cell level
- maxRatio
When computing precision as the reciprocal of variance, 1/(x+tau), select tau to cap the ratio between the largest and smallest precision at maxRatio
- h5adBlockSizes
set the automatic block size (in bytes) for DelayedArray to read an H5AD file. Larger values use more memory but are faster.
- details
include data.frame of cell-level statistics as attr(., "details")
- verbose
Show messages, defaults to TRUE
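The maxRatio argument above can be illustrated with a small standalone sketch. It assumes that x is a vector of non-negative variance estimates and that tau is the smallest non-negative offset for which the ratio of the largest to the smallest precision 1/(x+tau) does not exceed maxRatio. The helper chooseTau() is hypothetical and written only for this illustration; the package's internal computation may differ.

```r
# Hypothetical helper (not part of the package): pick tau >= 0 so that
# max(1 / (x + tau)) / min(1 / (x + tau)) <= maxRatio
chooseTau <- function(x, maxRatio = 20) {
  stopifnot(all(x >= 0), maxRatio > 1)
  # The ratio of precisions equals (max(x) + tau) / (min(x) + tau).
  # Setting this equal to maxRatio and solving for tau gives:
  tau <- (max(x) - maxRatio * min(x)) / (maxRatio - 1)
  # If the ratio is already within bounds, no offset is needed
  max(tau, 0)
}

x <- c(0.01, 0.5, 2, 10)            # made-up variance estimates
tau <- chooseTau(x, maxRatio = 20)
w <- 1 / (x + tau)                  # regularized precision weights
max(w) / min(w)                     # equals maxRatio (20) up to floating point
```

Without the offset, the smallest variance here would receive a weight 1000 times larger than the largest variance; the offset keeps any single observation from dominating the fit.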
Examples
library(muscat)
library(dreamlet)
data(example_sce)
# create pseudobulk for each sample and cell cluster
pb <- aggregateToPseudoBulk(example_sce,
assay = "counts",
sample_id = "sample_id",
cluster_id = "cluster_id",
verbose = FALSE
)
# Get expressed genes for each cell type
geneList = getExprGeneNames(pb)
# Create precision weights for pseudobulk
# By default, weights are set to cell count,
# which is also what processAssays() uses
# when no weights are specified
weightsList <- pbWeights(example_sce,
sample_id = "sample_id",
cluster_id = "cluster_id",
geneList = geneList
)
#> Processing: B cells
#> Computing library sizes...
#> Processing samples...
#> Processing: CD14+ Monocytes
#> Computing library sizes...
#> Processing samples...
#> Processing: CD4 T cells
#> Computing library sizes...
#> Processing samples...
#> Processing: CD8 T cells
#> Computing library sizes...
#> Processing samples...
#> Processing: FCGR3A+ Monocytes
#> Computing library sizes...
#> Processing samples...
# voom-style normalization using initial weights
res.proc <- processAssays(pb, ~group_id, weightsList = weightsList)
#> B cells...
#> 0.28 secs
#> CD14+ Monocytes...
#> 0.38 secs
#> CD4 T cells...
#> 0.3 secs
#> CD8 T cells...
#> 0.2 secs
#> FCGR3A+ Monocytes...
#> 0.4 secs