Title: | Copula-Based Sensitivity Analysis for Observational Causal Inference |
---|---|
Description: | Implements the copula-based sensitivity analysis method, as discussed in Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding <arXiv:2102.09412>, with Gaussian copula adopted in particular. |
Authors: | Jiajing Zheng [aut, cre], Alexander Franks [aut], Alex D'Amour [ctb] |
Maintainer: | Jiajing Zheng <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2025-02-15 03:55:47 UTC |
Source: | https://github.com/jiajingz/copsens |
Calibrates the naive estimates to account for unobserved confounding when outcome variables are binary. The calibration can be done with user-specific sensitivity parameter or with our pre-provided calibration methods, the worst-case calibration for a single contrast or multivariate calibration for multiple contrasts.
bcalibrate( y, tr, t, gamma, R2 = NULL, mu_y_t = NULL, mu_u_tr = NULL, mu_u_t = NULL, cov_u_t = NULL, nU = NULL, nsim = 4000, ... )
bcalibrate( y, tr, t, gamma, R2 = NULL, mu_y_t = NULL, mu_u_tr = NULL, mu_u_t = NULL, cov_u_t = NULL, nU = NULL, nsim = 4000, ... )
y |
|
tr |
|
t |
|
gamma |
a vector specifying the direction of sensitivity parameters. |
R2 |
an optional scalar or vector specifying the proportion of residual variance in outcome given the treatment that can be explained by confounders, which determines the magnitude of sensitivity parameters. |
mu_y_t |
an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding. |
mu_u_tr |
an optional matrix of conditional confounder means for all observed treatments with latent variables in columns. |
mu_u_t |
an optional matrix of conditional confounder means for treatments of interest with latent variables in columns. |
cov_u_t |
an optional covariance matrix of confounders conditional on treatments. |
nU |
Number of latent confounders to consider. |
nsim |
an optional scalar specifying the number of sample draws. |
... |
A data.frame
with naive and calibrated estimates of population average outcome receiving
treatment t
.
# load the example data # y <- GaussianT_BinaryY$y tr <- subset(GaussianT_BinaryY, select = -c(y)) t1 <- tr[1:5,] t2 <- rep(0, times = ncol(tr)) # calibration # est_b <- bcalibrate(y = y, tr = tr, t = rbind(t1, t2), nU = 3, gamma = c(1.27, -0.28, 0), R2 = c(0.2, 0.7)) est_b_rr <- list(est_df = est_b$est_df[1:5,] / as.numeric(est_b$est_df[6,]), R2 = c(0.2, 0.7)) plot_estimates(est_b_rr)
# load the example data # y <- GaussianT_BinaryY$y tr <- subset(GaussianT_BinaryY, select = -c(y)) t1 <- tr[1:5,] t2 <- rep(0, times = ncol(tr)) # calibration # est_b <- bcalibrate(y = y, tr = tr, t = rbind(t1, t2), nU = 3, gamma = c(1.27, -0.28, 0), R2 = c(0.2, 0.7)) est_b_rr <- list(est_df = est_b$est_df[1:5,] / as.numeric(est_b$est_df[6,]), R2 = c(0.2, 0.7)) plot_estimates(est_b_rr)
Calculate Robustness Value When Executing Worstcase Calibration
cal_rv( y, tr, t1, t2, mu_y_dt = NULL, sigma_y_t = NULL, mu_u_dt = NULL, cov_u_t = NULL, nU = NULL, ... )
cal_rv( y, tr, t1, t2, mu_y_dt = NULL, sigma_y_t = NULL, mu_u_dt = NULL, cov_u_t = NULL, nU = NULL, ... )
y |
|
tr |
|
t1 |
|
t2 |
|
mu_y_dt |
an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding. |
sigma_y_t |
an optional scalar of the standard deviation of outcome conditional on treatments. |
mu_u_dt |
an optional matrix of difference in conditional confounder means, |
cov_u_t |
an optional covariance matrix of confounders conditional on treatments. |
nU |
Number of latent confounders to consider. |
... |
A numeric vector
with elements being the robustness value or NA
if the ignorance region doesn't
contains 0 for each contrast of interest.
# load the example data # y <- GaussianT_GaussianY$y tr <- subset(GaussianT_GaussianY, select = -c(y)) # calculate robustness value # cal_rv(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,])
# load the example data # y <- GaussianT_GaussianY$y tr <- subset(GaussianT_GaussianY, select = -c(y)) # calculate robustness value # cal_rv(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,])
Calibrate Estimate of Intervention Mean for Binary Outcome
cali_mean_ybinary_algm(i, gamma, mu_u_tr, mu_u_t, mu_y_t, nsim = 4000)
cali_mean_ybinary_algm(i, gamma, mu_u_tr, mu_u_t, mu_y_t, nsim = 4000)
i |
Observation index. |
gamma |
Scalar or vector specifying the sensitivity parameters. |
mu_u_tr |
Matrix of conditional confounder means for all observed treatments with latent variables in columns. |
mu_u_t |
Matrix of conditional confounder means for treatments of interest with latent variables in columns. |
mu_y_t |
Scalar or vector that contains naive estimates of treatment effects ignoring confounding. |
nsim |
Number of simulation sample draws. |
Scalar of calibrated intervention mean.
A dataset containing Gaussian treatments and binary outcomes of 10,000 observations.
GaussianT_BinaryY
GaussianT_BinaryY
A data frame with eleven variables: one binary outcome, y
, and ten Gaussian
treatments, t1
, t2
, ..., t10
.
For data generating process, see data-raw/Data_Generation.R
.
A dataset containing Gaussian treatments and outcomes of 10,000 observations.
GaussianT_GaussianY
GaussianT_GaussianY
A data frame with eleven variables: one Gaussian outcome, y
, and ten Gaussian
treatments, t1
, t2
, ..., t10
.
For data generating process, see data-raw/Data_Generation.R
.
Calibrates the naive estimates to account for unobserved confounding when outcome variables are Gaussian. The calibration can be done with user-specific sensitivity parameters or with our pre-provided calibration methods, the worst-case calibration for a single contrast or multivariate calibration for multiple contrasts.
gcalibrate( y, tr, t1, t2, calitype = c("worstcase", "multicali", "null"), mu_y_dt = NULL, sigma_y_t = NULL, mu_u_dt = NULL, cov_u_t = NULL, nU = NULL, R2 = 1, gamma = NULL, R2_constr = 1, nc_index = NULL, ... )
gcalibrate( y, tr, t1, t2, calitype = c("worstcase", "multicali", "null"), mu_y_dt = NULL, sigma_y_t = NULL, mu_u_dt = NULL, cov_u_t = NULL, nU = NULL, R2 = 1, gamma = NULL, R2_constr = 1, nc_index = NULL, ... )
y |
|
tr |
|
t1 |
|
t2 |
|
calitype |
character. The calibration method to be applied. Can be one of: |
mu_y_dt |
an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding. |
sigma_y_t |
an optional scalar of the standard deviation of outcome conditional on treatments. |
mu_u_dt |
an optional matrix of difference in conditional confounder means, |
cov_u_t |
an optional covariance matrix of confounders conditional on treatments. |
nU |
Number of latent confounders to consider. |
R2 |
an optional scalar or vector specifying the proportion of residual variance in outcome given the treatment that can be explained by confounders. |
gamma |
sensitivity parameter vector. Must be given when |
R2_constr |
an optional scalar or vector specifying the upper limit constraint on |
nc_index |
an optional vector containing indexes of negative control treatments. If not |
... |
further arguments passed to |
gcalibrate
returns a list containing the following components:
est_df
a data.frame
with naive and calibrated estimates of average treatment effects.
R2
a vector of with elements corresponding to columns of
est_df
.
gamma
a matrix returned when calitype = "multicali"
or "worstcase"
.
If calitype = "multicali"
, optimized gamma are in columns,
respectively resulting in estimates in columns of est_df
.
If calitype = "worstcase"
, gamma are in rows,
which respectively lead to the worstcase ignorance region with for each contrast of interest.
rv
a numeric vector
returned when calitype = "worstcase"
,
with elements being the robustness value or NA
if the ignorance region doesn't
contains 0 for each contrast of interest.
# load the example data # y <- GaussianT_GaussianY$y tr <- subset(GaussianT_GaussianY, select = -c(y)) # worst-case calibration # t1 <- data.frame(diag(ncol(tr))) t2 <- data.frame(matrix(0, nrow = ncol(tr), ncol = ncol(tr))) colnames(t1) = colnames(t2) <- colnames(tr) est_g1 <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3, calitype = "worstcase", R2 = c(0.3, 1)) plot_estimates(est_g1) # with negative conotrls # est_g1_nc <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3, calitype = "worstcase", R2 = c(0.3, 1), nc_index = c(3, 6)) plot_estimates(est_g1_nc) # multivariate calibration # est_g2 <- gcalibrate(y = y, tr = tr, t1 = tr[1:10,], t2 = tr[11:20,], nU = 3, calitype = "multicali", R2_constr = c(1, 0.15)) plot_estimates(est_g2) # user-specified calibration # est_g3 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,], nU = 3, calitype = "null", gamma = c(0.96, -0.29, 0), R2 = c(0.2, 0.6, 1)) plot_estimates(est_g3) # apply gamma that maximizes the bias for the first contrast considered in est_g1 # est_g4 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,], nU = 3, calitype = "null", gamma = est_g1$gamma[1,], R2 = c(0.2, 0.6, 1)) plot_estimates(est_g4)
# load the example data # y <- GaussianT_GaussianY$y tr <- subset(GaussianT_GaussianY, select = -c(y)) # worst-case calibration # t1 <- data.frame(diag(ncol(tr))) t2 <- data.frame(matrix(0, nrow = ncol(tr), ncol = ncol(tr))) colnames(t1) = colnames(t2) <- colnames(tr) est_g1 <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3, calitype = "worstcase", R2 = c(0.3, 1)) plot_estimates(est_g1) # with negative conotrls # est_g1_nc <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3, calitype = "worstcase", R2 = c(0.3, 1), nc_index = c(3, 6)) plot_estimates(est_g1_nc) # multivariate calibration # est_g2 <- gcalibrate(y = y, tr = tr, t1 = tr[1:10,], t2 = tr[11:20,], nU = 3, calitype = "multicali", R2_constr = c(1, 0.15)) plot_estimates(est_g2) # user-specified calibration # est_g3 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,], nU = 3, calitype = "null", gamma = c(0.96, -0.29, 0), R2 = c(0.2, 0.6, 1)) plot_estimates(est_g3) # apply gamma that maximizes the bias for the first contrast considered in est_g1 # est_g4 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,], nU = 3, calitype = "null", gamma = est_g1$gamma[1,], R2 = c(0.2, 0.6, 1)) plot_estimates(est_g4)
Obtain Optimized Sensitivity Parameters Using Multivariate Calibration Criterion
get_opt_gamma( mu_y_dt, mu_u_dt, cov_u_t, sigma_y_t, R2_constr = 1, normtype = "L2", idx = NULL, ... )
get_opt_gamma( mu_y_dt, mu_u_dt, cov_u_t, sigma_y_t, R2_constr = 1, normtype = "L2", idx = NULL, ... )
mu_y_dt |
Scalar or vector that contains naive estimates of treatment effects ignoring confounding. |
mu_u_dt |
Matrix of difference in conditional confounder means, |
cov_u_t |
Covariance matrix of confounders conditional on treatments. |
sigma_y_t |
Scalar of the standard deviation of outcome conditional on treatments. |
R2_constr |
an optional scalar or vector specifying the upper limit constraint on |
normtype |
character. Optional function |
idx |
A zero-one vector with 1 in the i-th coordinate if the i-th outcome to be applied with the MCC calibration over, otherwise 0. |
... |
further arguments passed to |
Optimized sensitivity parameters.
The dataset consists of estimates of treatment effects of 17 genes, which are likely to affect mouse weight, by using the null treatments approach from Miao et al. (2020), assuming that at least half of the confounded treatments have no causal effect on the outcome.
mice_est_nulltr
mice_est_nulltr
A data frame with 17 rows and 6 variables:
mean estimates of genes' treatment effects on mouse body weight
2.5% percentile of the estimates of genes' treatment effects on mouse body weight
97.5% percentile of the estimates of genes' treatment effects on mouse body weight
5% percentile of the estimates of genes' treatment effects on mouse body weight
95% percentile of the estimates of genes' treatment effects on mouse body weight
significance
https://arxiv.org/abs/2011.04504
A dataset are collected from 287 mice, including the body weight, 37 gene expressions, and 5 single nucleotide polymorphisms.
micedata
micedata
A data frame with forty-three variables: the mice body weight, y
,
5 single nucleotide polymorphisms, rs3663003
, rs4136518
, rs3694833
,
rs4231406
, rs3661189
, and the rest are thirty-seven genes.
https://arxiv.org/abs/2011.04504
Visualize Estimates of Treatment Effects
plot_estimates(est, show_rv = TRUE, order = "naive", labels = NULL, ...)
plot_estimates(est, show_rv = TRUE, order = "naive", labels = NULL, ...)
est |
an return object from |
show_rv |
logical. Whether robustness values should be printed in the plot or not? Available only for the "worstcase" calibration. |
order |
character. The type of order used to plot treatment effects from left to right.
Can be one of the following: |
labels |
character. Labels of treatments. |
... |
further arguments passed to |
A graph plotting ignorance regions of the causal estimands of interest.
For examples, please refer to bcalibrate
or gcalibrate