Package 'CopSens' reference manual

Title:	Copula-Based Sensitivity Analysis for Observational Causal Inference
Description:	Implements the copula-based sensitivity analysis method, as discussed in Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding <arXiv:2102.09412>, with Gaussian copula adopted in particular.
Authors:	Jiajing Zheng [aut, cre], Alexander Franks [aut], Alex D'Amour [ctb]
Maintainer:	Jiajing Zheng <[email protected]>
License:	GPL-3
Version:	0.1.0
Built:	2025-03-17 04:05:18 UTC
Source:	https://github.com/jiajingz/copsens

Calibration for Binary Outcomes

Description

Calibrates the naive estimates to account for unobserved confounding when outcome variables are binary. The calibration can be done with user-specified sensitivity parameters or with one of the provided calibration methods: worst-case calibration for a single contrast, or multivariate calibration for multiple contrasts.

Usage

bcalibrate(
  y,
  tr,
  t,
  gamma,
  R2 = NULL,
  mu_y_t = NULL,
  mu_u_tr = NULL,
  mu_u_t = NULL,
  cov_u_t = NULL,
  nU = NULL,
  nsim = 4000,
  ...
)

Arguments

`y`	`data.frame`, `matrix` or `vector`. Binary outcome variable.
`tr`	`data.frame`. Treatment variables with rows corresponding to observations and columns to variables.
`t`	`data.frame`. Treatment arms of interest. May contain a single or multiple treatments in rows.
`gamma`	a vector specifying the direction of sensitivity parameters.
`R2`	an optional scalar or vector specifying the proportion of residual variance in outcome given the treatment that can be explained by confounders, which determines the magnitude of sensitivity parameters.
`mu_y_t`	an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding.
`mu_u_tr`	an optional matrix of conditional confounder means for all observed treatments with latent variables in columns.
`mu_u_t`	an optional matrix of conditional confounder means for treatments of interest with latent variables in columns.
`cov_u_t`	an optional covariance matrix of confounders conditional on treatments.
`nU`	Number of latent confounders to consider.
`nsim`	an optional scalar specifying the number of sample draws.
`...`	further arguments passed to `kEstimate` or `pca`.

Value

A data.frame with naive and calibrated estimates of population average outcome receiving treatment t.

Examples


# load the example data #
y <- GaussianT_BinaryY$y
tr <- subset(GaussianT_BinaryY, select = -c(y))
t1 <- tr[1:5,]
t2 <- rep(0, times = ncol(tr))
# calibration #
est_b <- bcalibrate(y = y, tr = tr, t = rbind(t1, t2),
                    nU = 3, gamma = c(1.27, -0.28, 0),
                    R2 = c(0.2, 0.7))
# risk ratio relative to t2; sweep() aligns the divisor with the columns of est_df #
est_b_rr <- list(est_df = sweep(est_b$est_df[1:5,], 2,
                                as.numeric(est_b$est_df[6,]), "/"),
                 R2 = c(0.2, 0.7))
plot_estimates(est_b_rr)
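
The risk-ratio step in the example divides each treatment row by a reference row. A minimal base-R sketch of that step on a toy data frame follows; the column names and values are made up for illustration and are not output from `bcalibrate`. Dividing a data.frame by a plain vector recycles the vector down the rows, so `sweep` is used here to match each divisor to its column.

```r
# Toy stand-in for est_df: rows are treatment arms, columns are estimate types.
# Values and column names are hypothetical.
est_df <- data.frame(
  R2_0   = c(0.20, 0.40, 0.10),
  R2_0.2 = c(0.30, 0.60, 0.20),
  R2_0.7 = c(0.25, 0.50, 0.50)
)

# Use the last row as the reference arm.
ref <- as.numeric(est_df[3, ])

# Divide every column by the matching entry of the reference row.
rr <- sweep(est_df[1:2, ], 2, ref, "/")
print(rr)
```

Each column of `rr` is then a ratio against the same column of the reference arm, which is what a per-column risk ratio requires.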

Calculate Robustness Value When Executing Worstcase Calibration

Description

Calculate Robustness Value When Executing Worstcase Calibration

Usage

cal_rv(
  y,
  tr,
  t1,
  t2,
  mu_y_dt = NULL,
  sigma_y_t = NULL,
  mu_u_dt = NULL,
  cov_u_t = NULL,
  nU = NULL,
  ...
)

Arguments

`y`	`data.frame`, `matrix` or `vector`. Gaussian outcome variable.
`tr`	`data.frame`. Treatment variables with rows corresponding to observations and columns to variables.
`t1`	`data.frame`. First treatment arms of interest. May contain a single or multiple treatments in rows.
`t2`	`data.frame`. Second treatment arms of interest. Must have the same number of rows as `t1`.
`mu_y_dt`	an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding.
`sigma_y_t`	an optional scalar of the standard deviation of outcome conditional on treatments.
`mu_u_dt`	an optional matrix of difference in conditional confounder means, $E(U \mid t1) - E(U \mid t2)$ , with latent variables in columns.
`cov_u_t`	an optional covariance matrix of confounders conditional on treatments.
`nU`	Number of latent confounders to consider.
`...`	further arguments passed to `kEstimate` or `pca`.

Value

A numeric vector with one element per contrast of interest: the robustness value, or NA if the ignorance region does not contain 0.

Examples


# load the example data #
y <- GaussianT_GaussianY$y
tr <- subset(GaussianT_GaussianY, select = -c(y))
# calculate robustness value #
cal_rv(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,])
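
As a rough illustration of what a robustness value measures, the sketch below assumes the worst-case ignorance region for a contrast is symmetric around the naive estimate with half-width `sqrt(R2) * bound`, where `bound` is the half-width at $R^2 = 1$. This form is an assumption made for the example, not a statement of how `cal_rv` is implemented; `rv_sketch`, `naive` and `bound` are hypothetical names. Under that assumption, the robustness value is the smallest $R^2$ at which the region reaches 0, and it is NA when even $R^2 = 1$ does not reach 0.

```r
# Hypothetical helper: smallest R2 at which naive +/- sqrt(R2) * bound covers 0.
rv_sketch <- function(naive, bound) {
  rv <- (naive / bound)^2      # solve |naive| = sqrt(R2) * bound for R2
  rv[rv > 1] <- NA_real_       # region never reaches 0 for any R2 in [0, 1]
  rv
}

# Made-up naive estimates and half-widths for three contrasts.
naive <- c(0.5, -1.2, 2.0)
bound <- c(1.0, 2.4, 1.5)
rv <- rv_sketch(naive, bound)
print(rv)
```

A small robustness value means a weak confounder could already explain away the effect, while NA marks a contrast whose sign survives under every admissible $R^2$.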

Calibrate Estimate of Intervention Mean for Binary Outcome

Description

Calibrate Estimate of Intervention Mean for Binary Outcome

Usage

cali_mean_ybinary_algm(i, gamma, mu_u_tr, mu_u_t, mu_y_t, nsim = 4000)

Arguments

`i`	Observation index.
`gamma`	Scalar or vector specifying the sensitivity parameters.
`mu_u_tr`	Matrix of conditional confounder means for all observed treatments with latent variables in columns.
`mu_u_t`	Matrix of conditional confounder means for treatments of interest with latent variables in columns.
`mu_y_t`	Scalar or vector that contains naive estimates of treatment effects ignoring confounding.
`nsim`	Number of simulation sample draws.

Value

Scalar of calibrated intervention mean.

Dataset with Gaussian Treatments and Binary Outcomes

Description

A dataset containing Gaussian treatments and binary outcomes of 10,000 observations.

Usage

GaussianT_BinaryY

Format

A data frame with eleven variables: one binary outcome, y, and ten Gaussian treatments, t1, t2, ..., t10.

Source

For data generating process, see data-raw/Data_Generation.R.

Dataset with Gaussian Treatments and Outcomes

Description

A dataset containing Gaussian treatments and outcomes of 10,000 observations.

Usage

GaussianT_GaussianY

Format

A data frame with eleven variables: one Gaussian outcome, y, and ten Gaussian treatments, t1, t2, ..., t10.

Source

For data generating process, see data-raw/Data_Generation.R.

Calibration for Gaussian Outcomes

Description

Calibrates the naive estimates to account for unobserved confounding when outcome variables are Gaussian. The calibration can be done with user-specified sensitivity parameters or with one of the provided calibration methods: worst-case calibration for a single contrast, or multivariate calibration for multiple contrasts.

Usage

gcalibrate(
  y,
  tr,
  t1,
  t2,
  calitype = c("worstcase", "multicali", "null"),
  mu_y_dt = NULL,
  sigma_y_t = NULL,
  mu_u_dt = NULL,
  cov_u_t = NULL,
  nU = NULL,
  R2 = 1,
  gamma = NULL,
  R2_constr = 1,
  nc_index = NULL,
  ...
)

Arguments

`y`	`data.frame`, `matrix` or `vector`. Gaussian outcome variable.
`tr`	`data.frame`. Treatment variables with rows corresponding to observations and columns to variables.
`t1`	`data.frame`. First treatment arms of interest. May contain a single or multiple treatments in rows.
`t2`	`data.frame`. Second treatment arms of interest. Must have the same number of rows as `t1`.
`calitype`	character. The calibration method to be applied. Can be one of: "worstcase" - apply worst-case calibration when considering a single contrast. "multicali" - apply multivariate calibration when considering multiple contrasts. "null" - apply calibration with user-specified sensitivity parameter, $\gamma$ .
`mu_y_dt`	an optional scalar or vector that contains naive estimates of treatment effects ignoring confounding.
`sigma_y_t`	an optional scalar of the standard deviation of outcome conditional on treatments.
`mu_u_dt`	an optional matrix of difference in conditional confounder means, $E(U \mid t1) - E(U \mid t2)$ , with latent variables in columns.
`cov_u_t`	an optional covariance matrix of confounders conditional on treatments.
`nU`	Number of latent confounders to consider.
`R2`	an optional scalar or vector specifying the proportion of residual variance in outcome given the treatment that can be explained by confounders.
`gamma`	sensitivity parameter vector. Must be given when `calitype = "null"`.
`R2_constr`	an optional scalar or vector specifying the upper limit constraint on $R^2$ . By default, `R2_constr = 1`.
`nc_index`	an optional vector containing the indices of negative control treatments. If not `NULL`, worst-case calibration is executed with constraints imposed by the negative control treatments.
`...`	further arguments passed to `kEstimate`, `pca` or `get_opt_gamma`.

Value

gcalibrate returns a list containing the following components:

est_df: a data.frame with naive and calibrated estimates of average treatment effects.
R2: a vector of $R^2$ with elements corresponding to columns of est_df.
gamma: a matrix returned when calitype = "multicali" or "worstcase". If calitype = "multicali", the optimized gamma vectors are in columns, each producing the estimates in the corresponding column of est_df. If calitype = "worstcase", the gamma vectors are in rows, each giving the worst-case ignorance region with $R^2=1$ for the corresponding contrast of interest.
rv: a numeric vector returned when calitype = "worstcase", with one element per contrast of interest: the robustness value, or NA if the ignorance region does not contain 0.

Examples

# load the example data #
y <- GaussianT_GaussianY$y
tr <- subset(GaussianT_GaussianY, select = -c(y))

# worst-case calibration #
t1 <- data.frame(diag(ncol(tr)))
t2 <- data.frame(matrix(0, nrow = ncol(tr), ncol = ncol(tr)))
colnames(t1) <- colnames(t2) <- colnames(tr)
est_g1 <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3,
                     calitype = "worstcase", R2 = c(0.3, 1))
plot_estimates(est_g1)
# with negative controls #
est_g1_nc <- gcalibrate(y = y, tr = tr, t1 = t1, t2 = t2, nU = 3,
                        calitype = "worstcase", R2 = c(0.3, 1), nc_index = c(3, 6))
plot_estimates(est_g1_nc)


# multivariate calibration #
est_g2 <- gcalibrate(y = y, tr = tr, t1 = tr[1:10,], t2 = tr[11:20,], nU = 3,
                     calitype = "multicali", R2_constr = c(1, 0.15))
plot_estimates(est_g2)

# user-specified calibration #
est_g3 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,],
                     nU = 3, calitype = "null",
                     gamma = c(0.96, -0.29, 0), R2 = c(0.2, 0.6, 1))
plot_estimates(est_g3)
# apply gamma that maximizes the bias for the first contrast considered in est_g1 #
est_g4 <- gcalibrate(y = y, tr = tr, t1 = tr[1:2,], t2 = tr[3:4,],
                     nU = 3, calitype = "null",
                     gamma = est_g1$gamma[1,], R2 = c(0.2, 0.6, 1))
plot_estimates(est_g4)
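
The worst-case example builds one contrast per treatment: row i of `t1` sets treatment i to 1 and all others to 0, while `t2` is all zeros. The sketch below reproduces that construction on a hypothetical three-treatment data frame (`tr_toy`, with made-up values) so the shape of `t1` and `t2` is visible without loading the package data.

```r
set.seed(1)

# Hypothetical treatment data: 4 observations, 3 treatments.
tr_toy <- data.frame(t1 = rnorm(4), t2 = rnorm(4), t3 = rnorm(4))
p <- ncol(tr_toy)

# One row per contrast: unit change in a single treatment versus all zeros.
t1_toy <- data.frame(diag(p))
t2_toy <- data.frame(matrix(0, nrow = p, ncol = p))
colnames(t1_toy) <- colnames(t2_toy) <- colnames(tr_toy)

# Row i of the difference isolates treatment i.
contrast <- t1_toy - t2_toy
print(contrast)
```

Both data frames have the same number of rows and the same column names as the treatment data, which is what the `t1` and `t2` arguments require.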

Obtain Optimized Sensitivity Parameters Using Multivariate Calibration Criterion

Description

Obtain Optimized Sensitivity Parameters Using Multivariate Calibration Criterion

Usage

get_opt_gamma(
  mu_y_dt,
  mu_u_dt,
  cov_u_t,
  sigma_y_t,
  R2_constr = 1,
  normtype = "L2",
  idx = NULL,
  ...
)

Arguments

`mu_y_dt`	Scalar or vector that contains naive estimates of treatment effects ignoring confounding.
`mu_u_dt`	Matrix of difference in conditional confounder means, $E(U \mid t1) - E(U \mid t2)$ , with latent variables in columns.
`cov_u_t`	Covariance matrix of confounders conditional on treatments.
`sigma_y_t`	Scalar of the standard deviation of outcome conditional on treatments.
`R2_constr`	an optional scalar or vector specifying the upper limit constraint on $R^2$ . By default, `R2_constr = 1`.
`normtype`	character. Optional function `m` for the multivariate calibration criterion. By default, the L2 norm will be applied. "L1" - apply the L1 norm, `sum(abs(x))`. "L2" - apply the L2 norm, `sqrt(sum(x^2))`. "Inf" - apply the infinity norm, `max(abs(x))`.
`idx`	A zero-one vector with 1 in the i-th coordinate if the i-th outcome is to be included in the MCC calibration, and 0 otherwise.
`...`	further arguments passed to `solve`

Value

Optimized sensitivity parameters.
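
The three `normtype` options correspond to the expressions given in the argument description. A small base-R sketch of how such a selector could look is shown below; `norm_sketch` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper: pick a norm by name, mirroring the documented options.
norm_sketch <- function(x, normtype = c("L2", "L1", "Inf")) {
  normtype <- match.arg(normtype)
  switch(normtype,
    "L1"  = sum(abs(x)),      # L1 norm
    "L2"  = sqrt(sum(x^2)),   # L2 norm (default)
    "Inf" = max(abs(x))       # infinity norm
  )
}

x <- c(3, -4)
l1   <- norm_sketch(x, "L1")
l2   <- norm_sketch(x)
linf <- norm_sketch(x, "Inf")
print(c(L1 = l1, L2 = l2, Inf = linf))
```

The L1 norm weights all contrasts equally, the L2 norm penalizes large calibrated effects more heavily, and the infinity norm looks only at the largest one.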

Estimates of genes' effects on mouse body weight using the null treatments approach from Miao et al. (2020)

Description

The dataset consists of estimates of treatment effects of 17 genes, which are likely to affect mouse weight, by using the null treatments approach from Miao et al. (2020), assuming that at least half of the confounded treatments have no causal effect on the outcome.

Usage

mice_est_nulltr

Format

A data frame with 17 rows and 6 variables:

esti: mean estimates of genes' treatment effects on mouse body weight
X2.5.: 2.5th percentile of the estimates of genes' treatment effects on mouse body weight
X97.5.: 97.5th percentile of the estimates of genes' treatment effects on mouse body weight
X5.: 5th percentile of the estimates of genes' treatment effects on mouse body weight
X95.: 95th percentile of the estimates of genes' treatment effects on mouse body weight
signif: significance
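
The column names `X2.5.`, `X97.5.` and so on look like percentile labels that have been converted to syntactically valid R names. This is an inference from the names, not something the dataset documentation states. The sketch below shows how base R produces names of this form from `quantile` output; `draws` is hypothetical data.

```r
set.seed(1)

# Hypothetical draws of a single effect estimate.
draws <- rnorm(1000)

# quantile() labels its output with percent signs.
q <- quantile(draws, probs = c(0.025, 0.975, 0.05, 0.95))
print(names(q))

# data.frame() passes column names through make.names() by default,
# which replaces "%" with "." and prefixes names that start with a digit.
summary_df <- data.frame(esti = mean(draws), t(q))
print(colnames(summary_df))
```

Columns named this way can be accessed directly, for example `summary_df$X2.5.`, without backticks.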

Source

https://arxiv.org/abs/2011.04504

Body weight and gene expressions of 287 mice

Description

A dataset collected from 287 mice, including body weight, 37 gene expressions, and 5 single nucleotide polymorphisms.

Usage

micedata

Format

A data frame with forty-three variables: the mouse body weight, y; five single nucleotide polymorphisms, rs3663003, rs4136518, rs3694833, rs4231406 and rs3661189; and thirty-seven gene expressions.

Source

https://arxiv.org/abs/2011.04504

Visualize Estimates of Treatment Effects

Description

Visualize Estimates of Treatment Effects

Usage

plot_estimates(est, show_rv = TRUE, order = "naive", labels = NULL, ...)

Arguments

`est`	a return object from `gcalibrate` or `bcalibrate`, or a `data.frame` containing estimates of treatment effects with estimate types in columns and contrasts of interest in rows.
`show_rv`	logical. Whether robustness values should be printed in the plot. Available only for the "worstcase" calibration.
`order`	character. The type of order used to plot treatment effects from left to right. Can be one of the following: "naive" - order by the naive estimate from smallest to largest. "worstcase" - place all treatments with negative robust effects on the left, with positive robust effects on the right, and all sensitive ones in the middle. Within the negative robust group, order treatments by the upper bound of the worst-case ignorance region from smallest to largest; within the positive robust group, order treatments by the lower bound of the worst-case ignorance region from smallest to largest; and within the sensitive group, order by the naive estimate from smallest to largest.
`labels`	character. Labels of treatments.
`...`	further arguments passed to `theme`

Value

A graph plotting ignorance regions of the causal estimands of interest.

Note

For examples, please refer to `bcalibrate` or `gcalibrate`.
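
The "worstcase" ordering described under `order` can be sketched in base R as a two-key sort. The data frame below and its column names (`label`, `naive`, `lower`, `upper`) are hypothetical and chosen for illustration; they are not the structure `plot_estimates` uses internally.

```r
# Hypothetical estimates: naive value plus worst-case ignorance region bounds.
est_toy <- data.frame(
  label = c("a", "b", "c", "d", "e", "f"),
  naive = c(-2.0, 0.3, 1.5, -1.0, -0.2, 2.5),
  lower = c(-3.0, -0.5, 0.4, -1.8, -1.0, 1.0),
  upper = c(-1.0, 1.1, 2.6, -0.2, 0.6, 4.0),
  stringsAsFactors = FALSE
)

# Group 1: robustly negative, group 2: sensitive, group 3: robustly positive.
group <- ifelse(est_toy$upper < 0, 1L, ifelse(est_toy$lower > 0, 3L, 2L))

# Within-group key: upper bound, naive estimate, or lower bound respectively.
key <- ifelse(group == 1L, est_toy$upper,
              ifelse(group == 3L, est_toy$lower, est_toy$naive))

# Sort by group first, then by the within-group key, smallest to largest.
ord <- order(group, key)
print(est_toy$label[ord])
```

Sorting on the bound closest to zero within each robust group places the effects that are nearest to being explained away next to the sensitive group in the middle.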
