Evaluate multivariate BIC while accounting for correlation between response variables. For n samples, p responses and m parameters per response, the multivariate BIC is $$n \log\det(\Sigma) + \log(n) \, (p m + 0.5 p (p+1))$$ where \(\Sigma\) is the residual covariance matrix. This formula extends the standard univariate BIC to the multivariate case. For one response the standard penalty is \(\log(n) m\); the formula here adds one extra \(\log(n)\) for the residual variance, but since only the difference between two models matters, this constant does not affect model comparison. Estimating the \(p \times p\) covariance matrix requires \(0.5 p (p+1)\) parameters. When p exceeds the residual degrees of freedom (\(p > n - m\)), the residual covariance matrix \(\Sigma\) is not full rank. In this case the pseudo-determinant is used instead.
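As a rough illustration of the rank-deficient case, the log pseudo-determinant can be taken as the sum of the logs of the numerically non-zero eigenvalues. The helper name `log_pseudo_det` and its tolerance are assumptions made for this sketch; the package's own `logDet` may handle the details differently.

```r
# Hypothetical helper (not part of the package): log pseudo-determinant,
# i.e. the sum of log eigenvalues that are numerically non-zero.
log_pseudo_det <- function(Sigma, tol = 1e-10) {
  ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
  keep <- ev > tol * max(ev)   # drop zero (and tiny negative) eigenvalues
  sum(log(ev[keep]))
}

# Full-rank case: agrees with the ordinary log determinant
S_full <- diag(c(2, 3))
log_pseudo_det(S_full)         # log(6)

# Rank-deficient case: 5 responses but only 3 residual vectors
set.seed(1)
R <- matrix(rnorm(3 * 5), nrow = 3)   # 3 samples x 5 responses
S_def <- crossprod(R) / nrow(R)       # 5 x 5 matrix of rank at most 3

log_pseudo_det(S_def)                     # finite
as.numeric(determinant(S_def)$modulus)    # ordinary log determinant: -Inf or very large and negative
```

Using a relative tolerance keeps the result stable when the covariance matrix is rescaled; an absolute cutoff would be another reasonable choice.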

mvIC_fit(
  exprObj,
  formula,
  data,
  criterion = c("BIC", "sum BIC", "AIC", "AICC", "CAIC", "sum AIC"),
  shrink.method = c("EB", "none", "var_equal", "var_unequal"),
  nparamsMethod = c("edf", "countLevels", "lme4"),
  pca = TRUE,
  verbose = FALSE,
  ...
)

Arguments

exprObj

matrix of expression data (g genes x n samples), or ExpressionSet, or EList returned by voom() from the limma package

formula

specifies variables for the linear (mixed) model. Must only specify covariates, since the rows of exprObj are automatically used as the response. e.g.: ~ a + b + (1|c). Formulas with only fixed effects also work, in which case lmFit() followed by contrasts.fit() is run.

data

data.frame with columns corresponding to formula

criterion

multivariate criterion ('AIC', 'BIC') or summed score assuming independence of responses ('sum AIC', 'sum BIC')

shrink.method

Shrink covariance estimates to be positive definite. Using "var_equal" assumes all variances on the diagonal are equal. This method is the fastest because it runs in linear time. Using "var_unequal" allows each response to have its own variance term, but this method runs in quadratic time. Using "none" does not apply shrinkage, and is only valid when there are very few responses.

nparamsMethod

"edf": effective degrees of freedom. "countLevels": count the number of levels in each random effect. "lme4": number of variance components, as used by lme4. See description in nparam.

pca

use PCA to transform variables

verbose

Default FALSE. Print messages

...

additional arguments passed to logDet

Value

multivariate BIC value

Details

Evaluate multivariate BIC on matrix of response variables. Smaller is better.
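A minimal base-R sketch of the formula from the description, applied literally with no shrinkage and no PCA, so the values will not match the output of mvIC_fit(). The function name `mv_bic_sketch` is hypothetical, and using the maximum-likelihood residual covariance (dividing by n) is an assumption of this sketch.

```r
# Hypothetical helper: literal evaluation of
#   n * logDet(Sigma) + log(n) * (p*m + 0.5*p*(p+1))
# Here Y is samples x responses (the transpose of the exprObj layout),
# and the design matrix is assumed to have full column rank.
mv_bic_sketch <- function(Y, formula, data) {
  X <- model.matrix(formula, data)
  fit <- lm.fit(X, Y)
  R <- fit$residuals                  # n x p residual matrix
  n <- nrow(R); p <- ncol(R); m <- ncol(X)
  Sigma <- crossprod(R) / n           # assumed ML estimate of residual covariance
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  n * logdet + log(n) * (p * m + 0.5 * p * (p + 1))
}

Y <- as.matrix(iris[, c("Sepal.Width", "Sepal.Length")])

score1 <- mv_bic_sketch(Y, ~ Species, iris)
score2 <- mv_bic_sketch(Y, ~ Petal.Width + Petal.Length + Species, iris)

# Only the difference between models matters; the smaller score is preferred
score1 - score2
```

Because the off-diagonal terms of Sigma enter the determinant, this score accounts for correlation between responses, unlike a sum of per-response BIC values.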

References

Pauler DK (1998). “The Schwarz criterion and related methods for normal linear models.” Biometrika, 85(1), 13–27.

Bedrick EJ, Tsai C (1994). “Model selection for multivariate regression in small samples.” Biometrics, 226–231.

Wu T, Chen P, Yan Y (2013). “The weighted average information criterion for multivariate regression model selection.” Signal Processing, 93(1), 49–55.

Examples


library(mvIC)

# create matrix of responses (rows are responses, columns are samples)
Y = with(iris, rbind(Sepal.Width, Sepal.Length))

# Evaluate model 1
mvIC_fit( Y, ~ Species, data=iris)
#> 		Multivariate IC score
#> 
#>   Samples:	 150 
#>   Responses:	 2 
#>   Coef param:	 3 
#>   Cov param:	 2 
#>   Regression:	 lm 
#>   Shrink method: EB 
#>   lambda:	 1 
#>   Criterion:	 BIC 
#>   Score:	 389.4871 
#> 

# Evaluate model 2
# smaller mvIC means better model
mvIC_fit( Y, ~ Petal.Width + Petal.Length + Species, data=iris)
#> 		Multivariate IC score
#> 
#>   Samples:	 150 
#>   Responses:	 2 
#>   Coef param:	 5 
#>   Cov param:	 2 
#>   Regression:	 lm 
#>   Shrink method: EB 
#>   lambda:	 1 
#>   Criterion:	 BIC 
#>   Score:	 216.5552 
#>