Compute the coefficient estimate and its standard error from a regression model, given the z-statistic, the sample size, and the standard deviations of the response and covariate
Details
Given a z-statistic, we want to recover the coefficient value from a linear regression. Adapting the approach of Zhu et al. (2016, Methods eqn 6), we estimate the coefficient as $$\beta_{linear} = \frac{z \cdot sd_y}{sd_x \sqrt{n + z^2}},$$ where \(n\) is the sample size, \(sd_x\) is the standard deviation of the covariate and \(sd_y\) is the standard deviation of the response. For a model with no other covariates, this transformation recovers the coefficient estimate almost exactly (for a single covariate with an intercept, the exact expression has \(n - 2\) in place of \(n\) under the square root, which is negligible for large \(n\)). With additional covariates, it is approximate.
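The formula can be checked directly in base R. The sketch below is not the package's implementation: `beta_from_z` is a hypothetical helper written only for illustration, and it assumes the standard error is recovered as the coefficient divided by the z-statistic.

```r
# Hypothetical helper (not part of the package): apply the formula above.
beta_from_z <- function(z, n, sd_x, sd_y) {
  beta <- z * sd_y / (sd_x * sqrt(n + z^2))
  # assumption: the standard error is recovered as beta / z
  data.frame(coef = beta, se = beta / z)
}

# simulate data with a single covariate
set.seed(1)
n <- 100
x <- rnorm(n, 0, 7)
y <- 3 * x + rnorm(n)

# fit the model and extract the test statistic
fit <- lm(y ~ x)
z <- coef(summary(fit))[2, "t value"]
est <- coef(fit)[["x"]]

# estimate recovered from summary statistics only
res <- beta_from_z(z, n, sd(x), sd(y))

# relative difference from the fitted coefficient
abs(res$coef - est) / abs(est)

# using n - 2 in place of n reproduces the lm estimate in this
# single-covariate case
exact <- z * sd(y) / (sd(x) * sqrt(n - 2 + z^2))
all.equal(exact, est)
```

The small relative difference in the first comparison reflects the use of \(n\) rather than \(n - 2\) under the square root.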
The coefficient estimate from linear regression can be converted to the logistic scale using the approach of Pirinen et al. (2013) according to $$\beta_{logistic} = \frac{\beta_{linear}}{\phi(1-\phi)},$$ where \(\phi\) is the case ratio in the logistic regression, i.e. the fraction of samples that are cases. This approximates the coefficient as if the model had been fit with logistic regression.
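As a rough illustration of this rescaling, the sketch below fits a linear regression to a 0/1 response, divides the slope by \(\phi(1-\phi)\), and compares the result with the slope from a logistic fit. `linear_to_logistic` is a hypothetical helper, not a function from the package, and the simulated effect size is chosen to be small, which is when the approximation is expected to work well.

```r
# Hypothetical helper (not part of the package): rescale a linear-scale
# coefficient to the logistic scale using the case ratio phi.
linear_to_logistic <- function(beta_linear, phi) {
  beta_linear / (phi * (1 - phi))
}

# simulate a binary response with a small effect of x
set.seed(1)
n <- 1000
x <- rbinom(n, 2, 0.3)
y <- rbinom(n, size = 1, prob = plogis(0.1 * x))

# case ratio: fraction of samples with y == 1
phi <- mean(y)

# slope from linear regression on the 0/1 response
beta_linear <- coef(lm(y ~ x))[["x"]]

# slope from logistic regression on the same data
beta_glm <- coef(glm(y ~ x, family = binomial))[["x"]]

# compare the converted and directly fitted coefficients
c(converted = linear_to_logistic(beta_linear, phi), logistic = beta_glm)
```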
References
Zhu, et al. (2016). Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nature Genetics. 48:481–487 doi:10.1038/ng.3538
Pirinen, Donnelly and Spencer. (2013). Efficient computation with a linear mixed model on large-scale data sets with applications to genetic studies. Ann. Appl. Stat. 7(1): 369-390 doi:10.1214/12-AOAS586
Examples
# Linear regression
#------------------
# simulate data
set.seed(1)
n = 100
x = rnorm(n, 0, 7)
y = x*3 + rnorm(n)
data = data.frame(x, y)
# fit regression model
fit <- lm(y ~ x, data=data)
# get z-statistic (lm reports it in the 't value' column)
z = coef(summary(fit))[2,'t value']
# coef and se from regression model
coef(summary(fit))[2,-4]
#>     Estimate   Std. Error      t value
#>   2.99984852   0.01538958 194.92730329
# coef and se from summary statistics
coef_from_z(z, n, sd(x), sd(y))
#>      coef         se method
#> 1 2.99977 0.01538917 linear
# Logistic regression
#--------------------
# simulate data
n = 1000
p = .3
x = rbinom(n, 2, p)
eta = x*.1
y = rbinom(n, size=1, prob=plogis(eta))
data = data.frame(x, y)
# fit regression model
fit <- glm(y ~ x, data=data, family=binomial)
# get z-statistic
z = coef(summary(fit))[2,3]
# get case ratio
phi = sum(y) / length(y)
# coef and se from regression model
coef(summary(fit))[2,]
#>   Estimate Std. Error    z value   Pr(>|z|)
#> 0.14614232 0.09590596 1.52380859 0.12755653
# when phi is given, coef is transformed to logistic scale
coef_from_z(z, n, sd(x), phi=phi)
#>        coef         se   method
#> 1 0.1471922 0.09659496 logistic