Fit GLMM For Many Features — glmmFitFeatures-class • BatchRegression

Fit regression model y ~ design + X_features[,j] for each feature j
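
As a rough illustration of the per-feature model, the base R sketch below fits y ~ design + X_features[,j] once per column with `lm.fit()`. It deliberately ignores the random effect, and the simulated inputs and the helper `fit_one` are assumptions made for illustration, not part of this package.

```r
# Simulated inputs; names and sizes are illustrative only.
set.seed(1)
n <- 50
p <- 3
design <- cbind(`(Intercept)` = 1, Days = rnorm(n))
X_features <- matrix(rnorm(n * p), n, p,
                     dimnames = list(NULL, paste0("feature_", seq_len(p))))
y <- drop(design %*% c(1, 2)) + 0.5 * X_features[, 1] + rnorm(n)

# Hypothetical helper: fit the shared design plus feature column j
fit_one <- function(j) {
  lm.fit(cbind(design, x = X_features[, j]), y)$coefficients
}

# One row of coefficients per feature
coefs <- t(sapply(seq_len(p), fit_one))
rownames(coefs) <- colnames(X_features)
coefs
```

Each row holds the coefficients of one model: the shared design columns followed by the feature-specific coefficient `x`.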

Usage

glmmFitFeatures(
  y,
  design,
  data,
  family,
  U,
  s,
  weights = NULL,
  offset = NULL,
  dcmpMethod = c("general", "categorical"),
  delta = NULL,
  delta.range = c(-10, 10),
  tol = 0.001,
  tol.eta = 0.001,
  detail = 1,
  nthreads = 1,
  fastApprox = FALSE,
  lambda = 0,
  verbose = TRUE,
  ...
)

# S4 method for class 'ANY,ANY,matrix'
glmmFitFeatures(
  y,
  design,
  data,
  family,
  U,
  s,
  weights = NULL,
  offset = NULL,
  dcmpMethod = c("general", "categorical"),
  delta = NULL,
  delta.range = c(-10, 10),
  tol = 0.001,
  tol.eta = 0.001,
  detail = 1,
  nthreads = 1,
  fastApprox = FALSE,
  lambda = 0,
  verbose = TRUE,
  ...
)

Arguments

y: response vector
design: design matrix shared across all models
data: feature matrix with model j using feature j
family: a description of the error distribution and link function to be used in the model, just like for glm(). Also supports negative binomial as the string "nb:theta".
U: eigen-vectors of random effect
s: eigen-values of random effect
weights: sample-level weights
offset: sample-level offset values
dcmpMethod: use the "general" method (default) for the SVD of Z. If Z is a categorical design matrix, use the faster "categorical" method
delta: if NULL, estimate delta; if a value is given, use it as a fixed value
delta.range: min and max values (in log space) of the search space for delta when fitting the random effect
tol: convergence criterion for the 1D search of the delta space
tol.eta: convergence criterion in the PQL iteration
detail: level of model detail returned, with LEAST = 0, LOW = 1, MEDIUM = 2, HIGH = 3, MOST = 4, MAX = 5. LEAST (beta), LOW (beta, se, sigSq, rdf), MEDIUM (vcov), HIGH (residuals), MOST (hatvalues), MAX (deviance residuals)
nthreads: number of threads. Each model is fit serially; the analysis is parallelized across features
fastApprox: default FALSE. If TRUE, use pre-projection on the working response from an initial regression fit on only the design. Under the null for data, this is a very good approximation and much faster
lambda: ridge shrinkage parameter
verbose: show progress
...: other args
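
To make the roles of U, s, delta.range and tol more concrete, here is a minimal base R sketch for the Gaussian case. It assumes the random-effect covariance is K = ZZ' with eigen-decomposition K = U diag(s) U', and that Var(y) = sigSq * (K + delta * I). This parametrization and all names below are assumptions for illustration; the package's internal definitions may differ.

```r
# Simulated grouped data; all names here are illustrative.
set.seed(1)
n_groups <- 10
n_per <- 8
g <- factor(rep(seq_len(n_groups), each = n_per))
n <- length(g)
X <- cbind(1, rnorm(n))                        # fixed-effect design
u <- rnorm(n_groups, sd = 1.5)                 # group-level random effects
y <- drop(X %*% c(2, 0.5)) + u[as.integer(g)] + rnorm(n)

# Eigen-decomposition of the assumed random-effect covariance K = Z Z'
Z <- model.matrix(~ 0 + g)
K <- tcrossprod(Z)
eig <- eigen(K, symmetric = TRUE)
U <- eig$vectors
s <- pmax(eig$values, 0)                       # clip tiny negative round-off

# After rotation by U', observations are independent
# with variance sigSq * (s + delta)
y_rot <- drop(crossprod(U, y))
X_rot <- crossprod(U, X)

# Negative profile log-likelihood (ML) as a function of log(delta)
neg_loglik <- function(log_delta) {
  v <- s + exp(log_delta)
  fit <- lm.wfit(X_rot, y_rot, w = 1 / v)
  sigSq <- sum(fit$residuals^2 / v) / n
  0.5 * (n * log(2 * pi * sigSq) + sum(log(v)) + n)
}

# 1D search over log(delta), using the same bounds and tolerance
# as the defaults delta.range = c(-10, 10) and tol = 0.001
opt <- optimize(neg_loglik, interval = c(-10, 10), tol = 0.001)
delta_hat <- exp(opt$minimum)
delta_hat
```

Searching in log space keeps delta positive and lets a single bounded interval cover several orders of magnitude.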

Value

List of parameter estimates with entries coef, se, sigSq, rdf, and others depending on detail

Examples

library(fastglmm)
library(lme4)
#> Loading required package: Matrix
#> 
#> Attaching package: ‘lme4’
#> The following object is masked from ‘package:nlme’:
#> 
#>     lmList

set.seed(1)
sleepstudy$V = rnorm(nrow(sleepstudy))

# lmer
fit <- lmer( Reaction ~ Days + V + (1 | Subject), sleepstudy)
fit
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + V + (1 | Subject)
#>    Data: sleepstudy
#> REML criterion at convergence: 1782.453
#> Random effects:
#>  Groups   Name        Std.Dev.
#>  Subject  (Intercept) 37.12   
#>  Residual             31.06   
#> Number of obs: 180, groups:  Subject, 18
#> Fixed Effects:
#> (Intercept)         Days            V  
#>     251.457       10.472       -1.295  

# prepare response, design and random effect
y = sleepstudy$Reaction
design = model.matrix(~ Days, sleepstudy)
dcmp = indicator_decomp(sleepstudy$Subject)

data = as.matrix(sleepstudy$V)
rownames(data) = rownames(sleepstudy)
colnames(data) = paste0("SNP_", seq(ncol(data)))

fit1 = glmmFitFeatures(y, design, data, family=gaussian(), U=dcmp$vectors, s=dcmp$values)
coef(fit1)
#>       (Intercept)     Days         x
#> SNP_1    251.4575 10.47224 -1.295992