Fit regression model y ~ design + X_features[,j] for each feature j
Usage
glmFitFeatures(
y,
design,
data,
family,
weights,
offset,
detail = 1,
doCoxReid = length(y) < 100,
shareTheta = FALSE,
fastApprox = FALSE,
nthreads = 1,
epsilon = 1e-08,
maxit = 25,
epsilon_nb = 1e-04,
maxit_nb = 5,
lambda = 0,
...
)
# S4 method for class 'ANY,ANY,matrix'
glmFitFeatures(
y,
design,
data,
family,
weights,
offset,
detail = 1,
doCoxReid = length(y) < 100,
shareTheta = FALSE,
fastApprox = FALSE,
nthreads = 1,
epsilon = 1e-08,
maxit = 25,
epsilon_nb = 1e-04,
maxit_nb = 5,
lambda = 0,
...
)
Arguments
- y
response vector
- design
design matrix shared across all models
- data
feature matrix with model j using feature j
- family
a description of the error distribution and link function to be used in the model, just like for glm(). Also supports the negative binomial as the string "nb:theta"; see Details below
- weights
vector of sample-level weights
- offset
vector of sample-level offset values
- detail
level of model detail returned: LEAST = 0 (beta), LOW = 1 (beta, se, sigSq, rdf), MEDIUM = 2 (vcov), HIGH = 3 (residuals), MOST = 4 (hatvalues), MAX = 5 (deviance residuals)
- doCoxReid
use the Cox-Reid adjustment when estimating overdispersion for negative binomial models. Defaults to TRUE for fewer than 100 samples
- shareTheta
estimate theta from the design matrix alone and share it across all features, instead of re-estimating it for each feature
- fastApprox
default FALSE. If TRUE, pre-project the working response using an initial regression fit on the design alone. Under the null for data, this is a very good approximation and _much_ faster
- nthreads
number of threads. Each model is fit serially; the analysis is parallelized across features
- epsilon
tolerance for GLM IRLS
- maxit
max iterations for GLM IRLS
- epsilon_nb
tolerance for negative binomial theta estimation
- maxit_nb
max iterations for negative binomial theta estimation
- lambda
ridge shrinkage parameter
- ...
other args
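As a sketch of how the fastApprox and nthreads arguments combine, using objects like those constructed in the Examples section below (the exact speed-up depends on the data; this call is illustrative, not a benchmark):

```r
# Hypothetical: same fits as the basic call, but using the fast
# pre-projection approximation and two threads across features
fit_fast <- glmFitFeatures(y, design, X, "poisson",
                           fastApprox = TRUE, nthreads = 2)
```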
Value
List of parameter estimates with entries coef, se, dispersion, rdf, and others depending on detail
Details
Generalized linear models can be fit with a family as in glm(): gaussian(), poisson(), binomial(), binomial("probit"), quasibinomial(), quasipoisson(), negative.binomial(theta), "nb", or "nb:theta". An array of entries of the form "nb:theta" is also accepted, where theta is the overdispersion parameter of the negative binomial distribution
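A sketch of the family specifications listed above (the function-based families come from base R and MASS; the "nb:theta" strings are, by assumption, parsed by glmFitFeatures itself):

```r
# Function-based families, as accepted by glm():
gaussian()                          # Gaussian with identity link
binomial("probit")                  # binomial with probit link
MASS::negative.binomial(theta = 2)  # NB with fixed theta, from MASS

# String forms described in Details (handled internally):
fam1 <- "nb"     # negative binomial
fam2 <- "nb:2"   # negative binomial with theta = 2
```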
Examples
n <- 100 # number of samples
p <- 10 # number of features
nc <- 3 # number shared covariates
set.seed(1)
y <- rpois(n, 10)
X <- cbind(1, matrix(rnorm(n * p), n, p))
colnames(X) <- seq(ncol(X))
design <- matrix(rnorm(n * nc), n, nc)
# fit regressions with model j including X[,j]
fit <- glmFitFeatures(y, design, X, "poisson")
fit
#> glmFitFeatures
#>
#> coefs(4): V1, V2, V3, x
#> features(11): 1, 2, ..., 10, 11
#> family: poisson/log
#> Estimated: se, dispersion, rdf, varFitted, mu_mean, y_mean
#>
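The same data can be fit with a negative binomial family; a hedged sketch reusing the simulated objects above (the vector form assumes one "nb:theta" entry per feature, per Details):

```r
# negative binomial fit with theta fixed via the string form
fit_nb <- glmFitFeatures(y, design, X, "nb:10")

# or supply a per-feature theta as an array of "nb:theta" strings
thetas <- paste0("nb:", rep(10, ncol(X)))
fit_nb2 <- glmFitFeatures(y, design, X, thetas)
```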