Relative risk and odds ratio are often confused or misinterpreted. Especially while coefficients in logistic regression are directly interpreted as (adjusted) odds ratio, they are unwittingly translated as (adjusted) relative risks in many public health studies. In that relative risks are useful in many thousands of applications, along with odds ratio, we propose a software tool to easily convert from odds ratio to relative risks under logistic regression. Unlike adjusted odds ratio conditional on other confounders, adjusted relative risks may vary depending on other confounders in the logistic model so we also analytically examine the effect of those confounders on the adjusted relative risk.
Let us first define adjusted relative risks of binary exposure \(X\) on binary outcome \(Y\) conditional on \(\mathbf{Z}\).
\[\frac{p(Y = 1 \mid X = 1, \mathbf{Z} )}{p(Y = 1 \mid X = 0, \mathbf{Z})}\]
Generally speaking, when exposure variable of \(X\) is continuous or ordinal, we can define adjusted relative risks as ratio between probability of observing \(Y = 1\) when \(X = x+1\) over \(X = x\) conditional on \(\mathbf{Z}\). Unlike adjusted odds ratio, these ratio depend on baseline value of exposure \(x\) under logistic regression.
\[\frac{p(Y = 1 \mid X = x+1, \mathbf{Z} )}{p(Y = 1 \mid X = x, \mathbf{Z})}\]
On the other hand, when exposure variable is nominal, it is
impossible to compare the probabilities in one unit change. Therefore,
when a type of exposure variable is factor, we allow users
to specify two values of exposure variable including baseline (\(x_{0}\)) and comparative level (\(x_{1}\)) and derive the relative risks
given those two exposure levels.
\[\frac{p(Y = 1 \mid X = x_{1}, \mathbf{Z} )}{p(Y = 1 \mid X = x_{0}, \mathbf{Z})}\] The above is more generalized version. By setting \(x_{1} = 1\) and \(x_{0} = 0\) we can go back to binary case.
Consider nominal outcome of interest that could take more than two values. Denote a value of outcome of \(Y\) as \(0, 1, 2, \ldots, K\) and treat \(Y=0\) as reference.
\[\ln\left( \frac{p(Y_{i} = j | X_{i} = x, \mathbf{Z})}{p(Y_{i} = 0 | X_{i} = x, \mathbf{Z})} \right) = \alpha_{j} + \beta_{j} x + \mathbf{\gamma}^{T} \mathbf{z}_{i}\]
Multinomial logistic regression is widely used for studies from diverse disciplines but unfortunately, we have commonly found the literatures that used relative risk from multinomial logistic regression without full discussion of its derivation or its varying value of conditioning covariates.
In this case, relative risks of reference group (\(Y_{j} = 0\)) for subject \(i\) is :
\[\frac{p( Y_{i} = 0 | X_{i} = x_{1}, \mathbf{Z} = \mathbf{z}_{i} ) }{p(Y_{i} = 0 | X_{i} = x_{0}, \mathbf{Z} = \mathbf{z}_{i} )} = \frac{1}{ 1 + \sum_{k=1}^{K} \exp(a_{k} + \beta_{k} x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) } \left( \frac{1}{ 1 + \sum_{k=1}^{K} \exp(a_{k} + \beta_{k} x_{0} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) } \right)^{-1} \\ = \left( 1 + \sum\limits_{k=1}^{K} \exp( \alpha_{k} + \beta_{k} x_{0} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) \right) \big/ \left(1 + \sum\limits_{k=1}^{K} \exp(\alpha_{k} + \beta_{k}x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) \right)\]
In this case, relative risks of \(Y_{j} = j\) (\(j=1,2,\ldots, K\))
\[\frac{p( Y_{i} = j | X_{i} = x_{1}, \mathbf{Z} = \mathbf{z}_{i} ) }{p(Y_{i} = j | X_{i} = x_{0}, \mathbf{Z} = \mathbf{z}_{i} )} = \frac{\exp(a_{j} + \beta_{j} x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) }{ 1 + \sum_{k=1}^{K} \exp(a_{k} + \beta_{k} x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) } \left( \frac{\exp(a_{j} + \beta_{j} x_{0} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) }{ 1 + \sum_{k=1}^{K} \exp(a_{k} + \beta_{k} x_{0} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) } \right)^{-1} \\ = \frac{\exp( \beta_{j} (x_{1} - x_{0} )) \left( 1 + \sum\limits_{k=1}^{K} \exp( \alpha_{k} + \beta_{k} x_{0} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) \right) }{ 1 + \sum\limits_{k=1}^{K} \exp(\alpha_{k} + \beta_{k}x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z}_{i} ) }\]
Observe that relative risks for each of \(K+1\) possible outcomes are all dependent on the regression coefficients of other groups and conditioning coefficient values (\(\mathbf{z}_{i}\)).
Other than relative risks, relative risk ratio (RRR) between response of j and response of 0 is often of interest.
\[ \frac{p( Y_{i} = j | X_{i} = x_{1}, \mathbf{Z} = \mathbf{z}_{i} ) }{p(Y_{i} = j | X_{i} = x_{0}, \mathbf{Z} = \mathbf{z}_{i} )} \big/ \frac{p( Y_{i} = 0 | X_{i} = x_{1}, \mathbf{Z} = \mathbf{z}_{i} ) }{p(Y_{i} = 0 | X_{i} = x_{0}, \mathbf{Z} = \mathbf{z}_{i} )} = \exp\left( \beta_{j}(x_{1} - x_{0}) \right) \]
In case of (adjusted) odds ratio derived from logistic regression, we
can directly obtain variance-covariance matrix for coefficients using
glm function in R. However, deriving variance
of adjusted relative risks, as a function of those coefficients, is more
challenging.
We first provide a estimated variance of relative risk using Delta
method upon estimated variance of odds ratio from glm. The
second method to estimate variance is using sampling variance of
bootstrap samples.
Let \(\boldsymbol{\beta}\) be a vector of coefficients used in logistic regression and among them \(\beta_{1}\) is a coefficient associated with an exposure variable of interest taking a value of \(x_{0}\) as baseline level and \(x_{1}\) as comparative level. Then we can represent the adjusted relative risk as a function of \(\boldsymbol{\beta}\) conditional on \(\mathbf{Z} = \mathbf{z}\):
\[g(\boldsymbol{\beta}) = \frac{1 + \exp(-\beta_{0} - \beta_{1} x_{0} - \boldsymbol{\beta}^{T}_{2:p} \mathbf{z}) }{ 1 + \exp (-\beta_{0} - \beta_{1} x_{1} - \boldsymbol{\beta}^{T}_{2:p} \mathbf{z}) }\]
Then by Delta method, \[var[g(\boldsymbol{\beta})] = \left\{\frac{\partial g(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} \right\}^{T} var(\boldsymbol{\beta}) \left\{\frac{\partial g(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} \right\}\] Note that \(\frac{\partial g(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}\) is \(p \times 1\) and \(var(\boldsymbol{\beta})\) is \(p \times p\), so \(var[g(\boldsymbol{\beta})]\) is a scalar value.
A \(p \times p\) matrix of \(var{(\boldsymbol{\beta})}\) is obtained by
summary(fit)$cov.unscaled when fit is a
glm object.
Let \(e_{0} = \exp(-\beta_{0} - \beta_{1} x_{0} - \boldsymbol{\beta}^{T}_{2:p} \mathbf{z})\) and \(e_{1} = \exp (-\beta_{0} - \beta_{1} x_{1} - \boldsymbol{\beta}^{T}_{2:p} \mathbf{z})\).
\[\frac{\partial g(\boldsymbol{\beta})}{\partial \beta_{0}} = \frac{- e_{1} + e_{0}}{(1 + e_{1})^2 } = \frac{e_{0}(1 - \exp(-\beta_{1}(x_{1} - x_{0}) ) ) }{(1 + e_{1})^2}\] \[\frac{\partial g(\boldsymbol{\beta})}{\partial \beta_{1}} = \frac{-x_{1} e_{1}( 1 + e_{0}) + x_{0} e_{0}(1 + e_{1}) }{(1 + e_{1})^2 }\] For any \(j = 2,3,\ldots, p\) where \(z_{j}\) is a covariate of which effect is associated with \(\beta_{j}\): \[\frac{\partial g(\boldsymbol{\beta})}{\partial \beta_{j}} = \frac{z_{j}(e_{0} - e_{1} ) }{ (1+e_{1})^2} = \frac{1 - \exp(-(x_{1} - x_{0})\beta_{1}) }{(1 + e_{1})^2}\]
By combining information of estimated \(var{(\boldsymbol{\beta})}\) and \(\frac{\partial g(\boldsymbol{\hat{\beta}})}{\partial \boldsymbol{\hat{\beta}}}\), we can derive the estimated variance of \(g(\boldsymbol{\beta})\).
Similar to glm(), multinom() also produces
Hessian matrix (fit$Hessian for multinom
object of fit) of the coefficients of which inverse is
covariance matrix. Therefore, similar to binary logistic regression, we
can use Delta method to estimate the variance of relative risk.
Let us consider the response variable of \(Y\) taking \((K+1)\) categorical values, \(Y=0,1,2, \ldots, K\), and having one binary exposure variable \(X\) and \(p\) conditional covariates \(\mathbf{Z}\). Then we can represent the adjusted relative risk of \(k \in \{1,2, \ldots, K\}\) response as a function of \(\boldsymbol{\Theta} = (\boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\Gamma})\) where \(\boldsymbol{\alpha} = (\alpha_{1}, \alpha_{2}, \cdots, \alpha_{K})\); \(\boldsymbol{\beta} = (\beta_{1}, \beta_{2}, \cdots, \beta_{K})\); and \(\Gamma\) is \((K \times p)\) matrix of which \(k^{th}\) row has \(\boldsymbol{\gamma}_{k} = (\beta_{1}, \beta_{2}, \cdots, \beta_{K})\).
For \(j \in \{1,2,\ldots, K\}\), let \(g_{j}(\boldsymbol{\Theta})\) be a relative risk of \(Y = j\) conditional on \(\mathbf{Z} = \mathbf{z}\).
\[g_{j}(\boldsymbol{\Theta}) = \frac{p( Y = j | X = x_{1}, \mathbf{Z} = \mathbf{z}) }{p(Y = j| X = x_{0}, \mathbf{Z} = \mathbf{z})}\]
For simplicity, \(e_{1k}(\mathbf{z}) = \exp(\alpha_{k} + \beta_{k}x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z})\) and \(e_{0k} = \exp(\alpha_{k} + \beta_{k}x_{1} + \boldsymbol{\gamma}^{T}_{k} \mathbf{z})\).
\[\frac{\partial g_{j}(\boldsymbol{\Theta})}{\partial \alpha_{i}} = (1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{z}))^{-2} \exp(\beta_{j}(x_{1} - x_{0}) ) \left\{ e_{1i}(\mathbf{z}) (1 + \sum\limits_{k=1}^{K} e_{0k}(\mathbf{z}) ) - e_{0i}(\mathbf{z}) ( 1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{Z}) ) \right\} \] If \(i \neq j\): \[\frac{\partial g_{j}(\boldsymbol{\Theta})}{\partial \beta_{i}} = (1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{z}))^{-2} \exp(\beta_{j}(x_{1} - x_{0}) ) \left\{ x_{1} e_{1i}(\mathbf{z}) (1 + \sum\limits_{k=1}^{K} e_{0k}(\mathbf{z}) ) - x_{0} e_{0i}(\mathbf{z}) ( 1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{Z}) ) \right\} \]
If \(i=j\): \[\frac{\partial g(\boldsymbol{\Theta})}{\partial \beta_{j}} =(1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{z}))^{-2} \exp(\beta_{j}(x_{1} - x_{0}) ) \left[ x_{1} e_{1i}(\mathbf{z}) (1 + \sum\limits_{k=1}^{K} e_{0k}(\mathbf{z}) ) - \left\{ (x_{1} - x_{0}) ( 1 + \sum\limits_{k=1}^{K} e_{0k}(\mathbf{Z})) + x_{0} e_{0i}(\mathbf{z}) \right\}( 1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{Z}) ) \right]\]
For \(l = 1,2,\ldots,p\): \[\frac{\partial g_{j}(\boldsymbol{\Theta})}{\partial \gamma_{jl}} = (1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{z}))^{-2} \exp(\beta_{j}(x_{1} - x_{0}) ) z_{l} \left\{ e_{1i}(\mathbf{z}) (1 + \sum\limits_{k=1}^{K} e_{0k}(\mathbf{z}) ) - e_{0i}(\mathbf{z}) ( 1 + \sum\limits_{k=1}^{K} e_{1k}(\mathbf{Z}) ) \right\} \] On the other hand, relative risks of \(Y = 0\), \(g_{0}(\boldsymbol{\theta})\) can be also derived as similar way without \(\exp(\beta_{j}(x_{1} - x_{0}) )\) in the nominator.
Then a total of \(q = (K + K + K\times p)\) vector of \(\frac{\partial g_{j}(\boldsymbol{\Theta}) }{\partial \boldsymbol{\Theta}}\) for \(j=0,1,2,\ldots, K\) can be constructed and let \(\mathbf{G}\) be a \((K \times q)\) matrix of each row is \(\frac{\partial g_{j}(\boldsymbol{\Theta}) }{\partial \boldsymbol{\Theta}}\).
Then by Delta method,
\[\begin{bmatrix} var\left[
g_{0}(\boldsymbol{\Theta}) \right] \\ var\left[
g_{1}(\boldsymbol{\Theta}) \right] \\ \vdots \\ var\left[
g_{K}(\boldsymbol{\Theta}) \right] \end{bmatrix} =
\mathbf{G}~var[\boldsymbol{\Theta}]~ \mathbf{G}^{-1}\], where
\(var[\boldsymbol{\Theta}]\) is from
Hessian matrix of fit$Hessian.
In both of logistic regression and multinomial logistic regression, having nominal exposure variable makes derivation more complicated but we can extend the binary exposure variable case.
All of logisticRR, nominalRR,
multiRR, and multinRR, we add a logical input
of boot: by setting boot = TRUE those
functions print out a vector of n.boot number of (adjusted)
relative risks.
As a first example, we generate hypothetical data of size \(n=500\).
library(logisticRR)
n <- 500
set.seed(1234)
X <- rbinom(n, 1, 0.3)
W <- rbinom(n, 1, 0.3); W[sample(1:n, n/3)] = 2
Z <- rep(0, n)
Z[sample(1:n, n/2)] <- "female"; Z <- ifelse(Z == 0, "male", Z)
dummyZ <- ifelse(Z == "female", 1, 0)
Y <- rbinom(n, 1, plogis(X - W + 2*dummyZ))
dat <- as.data.frame(cbind(Y, X, W, Z))
dat$X <- as.numeric(dat$X); dat$X <- ifelse(dat$X == 2, 1, 0)
dat$Y <- as.numeric(dat$Y); dat$Y <- ifelse(dat$Y == 2, 1, 0)
dat$W <- as.factor(dat$W)
dat$Z <- as.factor(dat$Z)
head(dat)
#> Y X W Z
#> 1 0 0 0 male
#> 2 0 0 1 female
#> 3 0 0 2 female
#> 4 0 0 0 female
#> 5 0 0 0 male
#> 6 0 0 2 maleThe code below estimates variance of adjusted relative risks of
binary \(X\) on binary outcome of \(Y\) by generating n.boot = 200
bootstrap samples. Because we do not specify baseline level of exposure
variable (basecov) nor the value of conditioning covariates
of W and Z (fixcov), baseline
exposure level is set to 0 as default. Since W
and Z are both factor, they are fixed to their first level
which are 0 and female.
simresult200 <- logisticRR(Y ~ X + W + Z, data = dat, boot = TRUE, n.boot = 200)
simresult200$RR
var(simresult200$boot.rr)
simresult200$delta.var
## print out conditioning
simresult200$fix.cov
This time we increase the number of bootstrap samples to
n.boot = 1000. Note that sampling variance gets closer to
the estimated variance using Delta method (delta.var).
simresult1000 <- logisticRR(Y ~ X + W + Z, data = dat, boot = TRUE, n.boot = 1000)
var(simresult1000$boot.rr)
simresult1000$delta.var
We have a total of six combination of confounder variables. By the assumption made in logistic regression, adjusted odds ratio is consistent against of these confounders but adjusted relative risk is not.
levels(dat$W)
levels(dat$Z)
adjusted <- cbind(rep(levels(dat$W), 2), rep(levels(dat$Z), each = 3))
adjusted <- as.data.frame(adjusted)
names(adjusted) <- c("W", "Z")
Adjusted relative risk tends to be higher for male and for higher
level of W.
## compare with odds ratio
results <- list()
for(i in 1:nrow(adjusted)){
results[[i]] <- logisticRR(Y ~ X + W + Z, data = dat, fixcov = adjusted[i,], boot = FALSE)
}
## adjusted relative risk
# female
print(c(results[[1]]$RR, results[[2]]$RR, results[[3]]$RR))
# male
print(c(results[[4]]$RR, results[[5]]$RR, results[[6]]$RR))
## adjusted odds ratio
## all the same : by the assumption of logistic regression
print(exp(coefficients(results[[1]]$fit)[2]))
# betas <- coefficients(fit)
# exposed <- exp(-predict(fit, expose.cov, type = "link"))
# unexposed <- exp(-predict(fit, unexpose.cov, type = "link"))
# RR <- (1 + unexposed) / (1 + exposed)
Let us change the prevalence of exposure variable (\(X\)).
dat2 <- dat
dat2$Y <- ifelse(dat$Y == 1, rbinom(n, 1, 0.2), rbinom(n, 1, 0.01))
## compare with odds ratio
results2 <- list()
for(i in 1:nrow(adjusted)){
results2[[i]] <- logisticRR(Y ~ X + W + Z, data = dat2, fixcov = adjusted[i,], boot = TRUE, n.boot = 1000)
}
## adjusted relative risk
# female
print(c(results2[[1]]$RR, results2[[2]]$RR, results2[[3]]$RR))
# male
print(c(results2[[4]]$RR, results2[[5]]$RR, results2[[6]]$RR))
## adjusted odds ratio
## all the same : by the assumption of logistic regression
print(exp(coefficients(results2[[1]]$fit)[2]))
# betas <- coefficients(fit)
# exposed <- exp(-predict(fit, expose.cov, type = "link"))
# unexposed <- exp(-predict(fit, unexpose.cov, type = "link"))
# RR <- (1 + unexposed) / (1 + exposed)
RRboot = data.frame(rr = c(results2[[1]]$boot.rr, results2[[2]]$boot.rr, results2[[3]]$boot.rr,
results2[[4]]$boot.rr, results2[[5]]$boot.rr, results2[[6]]$boot.rr),
Z = c(rep("female", 3000), rep("male", 3000)),
W = c( rep(c(rep("0", 1000), rep("1", 1000), rep("2", 1000)), 2)))
#pdf("Figure/results2.pdf", width = 19, height = 9)
par(mfrow = c(1,1), mar = c(3,3,3,3), cex.lab = 2,
cex.main = 1, cex.axis = 1, tcl = 0.5, omi = c(0.5,0.5,0,0))
bx = boxplot(rr ~ interaction(Z, W), data = RRboot, ylim = c(0,2.5),
col = rep(c("palegreen", "dodgerblue"), 3),
outcol = rep(c("palegreen", "dodgerblue"), 3),
main= "Adjusted relative risks",
xlab="", ylab= "", mgp = c(4,1,0),
yaxt = 'n', at = seq(1,6,1), outline = FALSE)
mtext("(Z,W)", 1, 4, cex = 1)
mtext("Adjusted relative risk", 2, 4, cex = 1)
axis(2, at = seq(0, 2.5, 0.5), labels = seq(0, 2.5, 0.5))
for(p in 1:6){
points(p, results2[[p]]$RR, type = "p", pch = 20, cex = 3)
}
abline(h = exp(coefficients(results2[[1]]$fit)[2]), col = "red", lwd = 2)
When reponse variable takes more than two values, multinomial logistic regression is widely used to reveal association between the response variable and exposure variable. In that case, relative risk of each category compared to the reference category can be considered, conditional on other fixed covariates. Other than (adjusted) relative risk, relative risks ratio (RRR) is often of interest in multinomial logistic regression.
library(logisticRR)
dat$multiY <- ifelse(dat$X == 1, rbinom(n, 1, 0.8) + dat$Y, rbinom(n, 1, 0.2) + dat$Y)
multiresult <- multiRR(multiY ~ X + W + Z, data = dat, boot = TRUE, n.boot = 1000)
apply(multiresult$boot.rr, 2, sd)
sqrt(multiresult$delta.var)
## relative risk ratio (RRR)
print(multiresult$RRR)
## relative risk (RR)
print(multiresult$RR)
Similar to the binary reponse, in multinomial logistic regression model, categorical exposure variable can be introduced; in this case, baseline value and comparative value of exposure variable should be specified.
multinresult <- multinRR(multiY ~ W + X + Z, data = dat, basecov = 0, comparecov = 1, boot = TRUE, n.boot = 1000)
apply(multinresult$boot.rr, 2, sd)
sqrt(multinresult$delta.var)
## relative risk ratio (RRR)
print(multinresult$RRR)
## relative risk (RR)
print(multinresult$RR)
We introduce airquality
data to illustrate the use of logisticRR. You can download
the data set by :
data("airquality")
Because outcome variable of Ozone is continuous, we are
going to binarize this variable into ozone1 (top 10% take 1
and 0 otherwise), ozone2 (top 20% take 1 and 0 otherwise),
and ozone3 (top 30% take 1 and 0 otherwise).
# delete observations having NAs
ozonedat <- na.omit(airquality)
# define binary ozone level
ozonedat$ozone1 <- ifelse(ozonedat$Ozone < quantile(ozonedat$Ozone, prob = 0.1), 1, 0)
ozonedat$ozone2 <- ifelse(ozonedat$Ozone < quantile(ozonedat$Ozone, prob = 0.2), 1, 0)
ozonedat$ozone3 <- ifelse(ozonedat$Ozone < quantile(ozonedat$Ozone, prob = 0.3), 1, 0)
As an exposure variable of main interest, we use numerical
Temp (temperature). Because a range of Temp is
wide we devide it by 10 and use Temp2 in the model so that
one unit change of Temp2 is ten unit (10 Fahrenheit) in
Temp.
summary(ozonedat$Temp)
ozonedat$Temp2 <- ozonedat$Temp / 10
summary(ozonedat$Temp2)
As other confounding variables, we chose solar radiation
(Solar.R) and average wind speed (Wind) so
that formula used for glm is
ozone1 ~ Temp2 + Solar.R + Wind, for example. We specify
conditioning confounder values as an average of each variable by
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)).
ozone.fit1 <- logisticRR(ozone1 ~ Temp2 + Solar.R + Wind, data = ozonedat, basecov = min(ozonedat$Temp2),
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)),
boot = FALSE)
ozone.fit2 <- logisticRR(ozone2 ~ Temp2 + Solar.R + Wind, data = ozonedat, basecov = min(ozonedat$Temp2),
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)),
boot = FALSE)
ozone.fit3 <- logisticRR(ozone3 ~ Temp2 + Solar.R + Wind, data = ozonedat, basecov = min(ozonedat$Temp2),
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)),
boot = FALSE)
As prevalence of outcome is smaller (ozone1 <
ozone2 < ozone3), estimated adjusted
relative risk is closer to adjusted odds ratio.
print(c(ozone.fit1$RR, ozone.fit2$RR, ozone.fit3$RR))
## odds ratio
exp(ozone.fit1$fit$coefficients[2])
Next we are going to use nominalRR when an exposure
variable is converted into nominal variable of Temp.factor
having three categories – low, medium, and
high.
Note that adjusted relative risk when
basecov = "low", comparecov = "medium" is the reciprocal of
that when basecov = "medium", comparecov = "low".
# define binary ozone level
ozonedat$ozone1 <- ifelse(ozonedat$Ozone < quantile(ozonedat$Ozone, prob = 0.1), 1, 0)
ozonedat$Temp.factor <- ifelse(ozonedat$Temp <= quantile(ozonedat$Temp, prob = 0.25), "low",
ifelse(ozonedat$Temp > quantile(ozonedat$Temp, prob = 0.8), "high", "medium"))
ozonedat$Temp.factor <- as.factor(ozonedat$Temp.factor)
ozone.fit.factor <- nominalRR(ozone1 ~ Temp.factor + Solar.R + Wind, data = ozonedat,
basecov = "low", comparecov = "medium",
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)), boot = FALSE)
ozone.fit.factor2 <- nominalRR(ozone1 ~ Temp.factor + Solar.R + Wind, data = ozonedat,
basecov = "medium", comparecov = "low",
fixcov = data.frame(Solar.R = mean(ozonedat$Solar.R), Wind = mean(ozonedat$Wind)), boot = FALSE)
print(c(ozone.fit.factor$RR, ozone.fit.factor2$RR))
print(1/ozone.fit.factor2$RR)