Find Highly Variable Genes — findHVGs • lucida

Compute the sum of squared residuals for each gene and compare it to a chi-squared null distribution. This gives very liberal p-values, since the null distribution is often not satisfied with real data.
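To see why the p-values are liberal, here is a minimal simulation in base R. It does not use lucida. It assumes the residuals are standard normal under the null and that the chi-squared degrees of freedom equal the number of cells; the degrees of freedom used by findHVGs() may differ. The names residNull, residOver, pNull and pOver are illustrative only.

```r
set.seed(42)
nGenes <- 500
nCells <- 200

# Well-calibrated case: residuals really are standard normal
residNull <- matrix(rnorm(nGenes * nCells), nrow = nGenes)

# Mild misspecification: residual standard deviation is 1.2 instead of 1
residOver <- residNull * 1.2

# Upper-tail chi-squared p-value for each gene's sum of squared residuals
# (df = number of cells is an assumption made for this sketch)
pNull <- pchisq(rowSums(residNull^2), df = nCells, lower.tail = FALSE)
pOver <- pchisq(rowSums(residOver^2), df = nCells, lower.tail = FALSE)

# Fraction of genes called significant at 0.05 in each scenario
fracNull <- mean(pNull < 0.05)
fracOver <- mean(pOver < 0.05)
```

Comparing fracNull with fracOver shows how a modest departure from the assumed residual variance pushes many more genes below the significance threshold, which matches the near-zero p-values typically seen on real data.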

Usage

findHVGs(x, fit)

Arguments

x: ResidualMatrixGLM object, e.g. as returned by residuals(fit, sce)
fit: model fit from lucida()

Value

tibble storing

ID: gene identifier
sumSq: sum of squared residuals for the gene
P.Value: p-value that sumSq is larger than expected under the null
FDR: false discovery rate
isHVG: TRUE if FDR < 0.05
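The relationship between these columns can be sketched in base R as follows. This is not the lucida implementation: sketchHVGTable is a hypothetical helper, it takes a plain genes-by-cells numeric matrix rather than a ResidualMatrixGLM, it returns a data.frame rather than a tibble, and both the degrees of freedom (number of cells) and the Benjamini-Hochberg adjustment are assumptions.

```r
# Hypothetical helper, not part of lucida: builds a table shaped like the
# findHVGs() result from a plain genes x cells matrix of residuals.
sketchHVGTable <- function(resid, df = ncol(resid), cutoff = 0.05) {
  # Sum of squared residuals per gene
  sumSq <- unname(rowSums(resid^2))
  # Upper-tail chi-squared p-value (df is an assumption of this sketch)
  pval <- pchisq(sumSq, df = df, lower.tail = FALSE)
  # Benjamini-Hochberg adjustment (assumed; the source only says "FDR")
  fdr <- p.adjust(pval, method = "BH")
  data.frame(
    ID = rownames(resid),
    sumSq = sumSq,
    P.Value = pval,
    FDR = fdr,
    isHVG = fdr < cutoff
  )
}

# Toy input: 5 genes, 100 cells, with gene1 made three times more variable
set.seed(1)
resid <- matrix(rnorm(5 * 100), nrow = 5,
                dimnames = list(paste0("gene", 1:5), NULL))
resid["gene1", ] <- resid["gene1", ] * 3

tab <- sketchHVGTable(resid)
```

In this toy input, gene1 has inflated residual variance, so it ends up with a large sumSq and is flagged in isHVG.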

Examples

library(SingleCellExperiment)

# Load example data
data(example_sce, package="muscat")
sce <- example_sce

# Compute library size for each cell
sce$libSize <- colSums(counts(sce))

# Fit joint model on all cells
fit <- lucida(sce, ~ 1)
#> Analyze all cells jointly...
#> all 

# Extract residuals as a ResidualMatrixGLM
res <- residuals(fit, sce)

# Find highly variable genes
findHVGs(res, fit)
#> # A tibble: 1,205 × 5
#>    ID         sumSq   P.Value       FDR isHVG
#>    <chr>      <dbl>     <dbl>     <dbl> <lgl>
#>  1 HES4       5033. 0         0         TRUE 
#>  2 ISG15    226595. 0         0         TRUE 
#>  3 AURKAIP1   7030. 0         0         TRUE 
#>  4 MRPL20     4081. 1.88e-225 2.05e-225 TRUE 
#>  5 SSU72      5723. 0         0         TRUE 
#>  6 RER1       7515. 0         0         TRUE 
#>  7 RPL22      8038. 0         0         TRUE 
#>  8 PARK7      9331. 0         0         TRUE 
#>  9 ENO1      18893. 0         0         TRUE 
#> 10 FBXO6      4181. 4.45e-239 4.96e-239 TRUE 
#> # ℹ 1,195 more rows