Project 3 Part 1: Functions 💉#
Instrumental Variables
—TODO your name here
Collaboration Statement
TODO brief statement on the nature of your collaboration.
TODO your collaborator’s names here
Tip
You’ll notice that the non-function cells of this notebook are contained in an `if __name__ == "__main__":` block. This indicates to Python that the code should only be run when the file is executed directly, not when it is imported as a module. If you write additional tests or cells that are not functions, make sure to add them within an `if __name__ == "__main__":` block as well.
Part 1 Rubric#
| Section | Points |
|---|---|
| Instrumental variable estimators | 3 |
| Characterizing compliers | 0.5 |
| Assumption violation simulation | 1.5 |
| Assumption violation widget | 1.5 |
| Total | 6.5 pts |
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
from ipywidgets import interact_manual
rng = np.random.default_rng(42)
1. Instrumental Variable Estimators#
We’ll begin by implementing two instrumental variable estimators: the Wald estimator and the two-stage least squares (TSLS) estimator, with covariates.
1.1 Wald Estimator#
We saw in class that the Wald estimator is given by:

$$\hat{\tau}_{\text{Wald}} = \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[T \mid Z=1] - E[T \mid Z=0]}$$

This quantity can be interpreted as the ATE under the linear outcome assumption, or the LATE under the monotonicity assumption. It can also be expressed as follows:

$$\hat{\tau}_{\text{Wald}} = \frac{\text{ITT}}{P(\text{compliers})}$$

where the ITT is the intent-to-treat effect of the instrument on the outcome:

$$\text{ITT} = E[Y \mid Z=1] - E[Y \mid Z=0]$$

and \(P(\text{compliers})\) is the proportion of compliers in the sample. See Activity 12 for additional references on how we computed the proportion of compliers, never takers, and always takers:

$$P(\text{never takers}) = P(T=0 \mid Z=1), \qquad P(\text{always takers}) = P(T=1 \mid Z=0)$$

$$P(\text{compliers}) = 1 - P(\text{never takers}) - P(\text{always takers})$$
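As a quick worked check on the toy dataset used in the tests below (`Z = [1, 0, 1, 1]`, `T = [1, 0, 1, 0]`, `Y = [2, 1, 3, 2]`):

$$\text{ITT} = \frac{2+3+2}{3} - 1 = \frac{4}{3}, \qquad P(\text{compliers}) = \frac{2}{3} - 0 = \frac{2}{3}, \qquad \hat{\tau}_{\text{Wald}} = \frac{4/3}{2/3} = 2$$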
Complete the functions below to implement the Wald estimator and related quantities.
Tip
With pandas and numpy logical indexing, the implementation of these functions should be brief and not need any for loops.
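For instance, a conditional mean like \(E[Y \mid Z=1]\) can be computed with a boolean mask. A minimal sketch on a throwaway dataframe (illustrative only, not the assignment solution):

```python
import pandas as pd

# toy dataframe, unrelated to the test data below
toy = pd.DataFrame({'Z': [0, 1, 1, 0], 'Y': [1.0, 2.0, 4.0, 3.0]})

# E[Y | Z=1]: select rows where Z == 1, then average Y
mean_y_z1 = toy.loc[toy['Z'] == 1, 'Y'].mean()  # (2.0 + 4.0) / 2 = 3.0
mean_y_z0 = toy.loc[toy['Z'] == 0, 'Y'].mean()  # (1.0 + 3.0) / 2 = 2.0
```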
def intent_to_treat(df, iv_col, outcome_col):
"""
Calculate the intent-to-treat (ITT) estimate of iv_col on outcome_col.
This should look very similar to our diff_in_means function from Project 1, but with a slightly different function signature.
Args:
df (pd.DataFrame): The input dataframe containing the treatment and outcome variables.
iv_col (str): The name of the instrument variable in the dataframe.
outcome_col (str): The name of the outcome variable in the dataframe.
Returns:
float: The ITT estimate of iv_col on outcome_col.
"""
# TODO your code here
pass
if __name__ == '__main__':
test_df = pd.DataFrame({
'T': [1, 0, 1, 0],
'Z': [1, 0, 1, 1],
'Y': [2, 1, 3, 2]
})
assert np.isclose(intent_to_treat(test_df, 'Z', 'Y'), (3+2+2)/3 - 1), "ITT estimate is incorrect"
# these tests are not exhaustive, so adding additional asserts is recommended
print("All asserts for intent_to_treat passed!")
def prop_never_takers(df, treat_col, instrument_col):
"""
Calculate the proportion of never takers in the dataset.
Args:
df (pd.DataFrame): The input dataframe containing the treatment, outcome, and instrument variables.
treat_col (str): The name of the treatment variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
Returns:
float: The proportion of never takers in the dataset.
"""
# TODO your code here
pass
def prop_always_takers(df, treat_col, instrument_col):
"""
Calculate the proportion of always takers in the dataset.
Args:
df (pd.DataFrame): The input dataframe containing the treatment and instrument variables.
treat_col (str): The name of the treatment variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
Returns:
float: The proportion of always takers in the dataset.
"""
# TODO your code here
pass
def prop_compliers(df, treat_col, instrument_col):
"""
Calculate the proportion of compliers in the dataset.
This can be computed either as:
E[T | Z=1] - E[T | Z=0]
or
1 - P(never takers) - P(always takers)
Args:
df (pd.DataFrame): The input dataframe containing the treatment and instrument variables.
treat_col (str): The name of the treatment variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
Returns:
float: The proportion of compliers in the dataset.
"""
# TODO your code here
pass
if __name__ == '__main__':
test_df = pd.DataFrame({
'T': [1, 0, 1, 0],
'Z': [1, 0, 1, 1],
'Y': [2, 1, 3, 2]
})
assert np.isclose(prop_always_takers(test_df, 'T', 'Z'), 0), "prop_always_takers should be 0"
assert np.isclose(prop_never_takers(test_df, 'T', 'Z'), 1/3), "prop_never_takers should be 1/3"
assert np.isclose(prop_compliers(test_df, 'T', 'Z'), 2/3), "prop_compliers should be 2/3"
# these tests are not exhaustive, so adding additional asserts is recommended
print("All asserts for compliance status proportions passed!")
def wald_estimator(df, treat_col='T', outcome_col='Y', instrument_col='Z'):
"""
Calculate the Wald estimator for the given dataset.
Can be implemented in a number of different ways, including:
- direct calculation
- using the intent_to_treat and prop_compliers functions
Args:
df (pd.DataFrame): The input dataframe containing the treatment, outcome, and instrument variables.
treat_col (str): The name of the treatment variable in the dataframe.
outcome_col (str): The name of the outcome variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
Returns:
float: The Wald estimator for the given dataset.
"""
# TODO your code here
pass
if __name__ == '__main__':
test_df = pd.DataFrame({
'T': [1, 0, 1, 0],
'Z': [1, 0, 1, 1],
'Y': [2, 1, 3, 2]
})
assert np.isclose(wald_estimator(test_df, 'T', 'Y', 'Z'), (4/3) / (2/3)), "wald_estimator is incorrect"
print("All asserts for wald_estimator passed!")
1.2 Two-Stage Least Squares (TSLS) estimator with covariates#
The two-stage least squares estimator that we implemented in Worksheet 5 produces the same estimate as the Wald estimator.
However, the advantage of the TSLS estimator is that it can be extended to include covariates, which is critical for the case when the instrument unconfoundedness assumption is satisfied only when conditioning on covariates. This is a so-called “conditional instrument” as shown on slide 14 of Lecture 15.
Specifically, we can fit the following models for the first and second stages:

$$\text{First stage:} \quad T = \alpha_0 + \alpha_1 Z + \alpha_2 X + \epsilon$$

$$\text{Second stage:} \quad Y = \beta_0 + \beta_1 \hat{T} + \beta_2 X + \epsilon$$

with \(\hat{T}\) being the predicted treatment from the first stage, and \(\beta_1\) being the causal effect of interest. Similar to our regression implementations for observational studies, the covariates \(X\) are confounders between the instrument and the outcome, and we can include multiple covariates in the model.
Even in the case where the instrument is completely randomized, the inclusion of covariates can still improve the precision of the causal effect estimate.
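As a point of reference, a two-stage fit with `statsmodels` formulas might look like the following sketch (the simulated dataset, its coefficients, and the `T_hat` column name are illustrative, not the assignment's):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# simulate a small dataset with an observed covariate X, an instrument Z,
# a treatment T, and an outcome Y whose simulated effect of T is 2.0
rng_demo = np.random.default_rng(0)
n = 500
X = rng_demo.normal(size=n)
Z = rng_demo.choice([0, 1], size=n)
T = (0.8 * Z + 0.5 * X + rng_demo.normal(size=n) > 0.5).astype(int)
Y = 2.0 * T + 1.0 * X + rng_demo.normal(size=n)
demo = pd.DataFrame({'X': X, 'Z': Z, 'T': T, 'Y': Y})

# first stage: regress T on Z and the covariate, then predict T-hat
first_stage = smf.ols('T ~ 1 + Z + X', data=demo).fit()
demo['T_hat'] = first_stage.predict(demo)

# second stage: regress Y on T-hat and the same covariate;
# the coefficient on T-hat should be near the simulated effect of 2.0
second_stage = smf.ols('Y ~ 1 + T_hat + X', data=demo).fit()
print(second_stage.params['T_hat'])
```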
def tsls_with_covariates(df, treat_col='T', outcome_col='Y', instrument_col='Z', covariates=None):
"""
Calculate the two-stage least squares estimate for the given dataset, optionally conditioning on covariates.
This function is an extension of the two-stage least squares estimator implemented in WS 5, so
you can use that implementation as a starting point.
Args:
df (pd.DataFrame): The input dataframe containing the treatment, outcome, and instrument variables.
treat_col (str): The name of the treatment variable in the dataframe.
outcome_col (str): The name of the outcome variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
covariates (list[str]): The covariates to condition on, defaults to None.
Returns:
float: The two-stage least squares estimate for the given dataset.
"""
# TODO fit the first stage model, optionally conditioning on covariates
# TODO add the predicted treatment values to the dataframe
# TODO fit the second stage model, optionally conditioning on covariates
# TODO return the causal effect estimate from the second stage model
pass
if __name__ == '__main__':
# Test the function with a simple dataset
df = pd.DataFrame({
'T': [1, 0, 1, 0],
'Z': [1, 0, 1, 1],
'Y': [2, 1, 3, 2],
'X': [1, 1, 1, 1]
})
# Test without covariates, should be the same as the Wald estimator
assert np.isclose(tsls_with_covariates(df), wald_estimator(df)), "tsls should be the same as wald_estimator"
# Test with including a constant covariate
assert np.isclose(tsls_with_covariates(df, covariates=['X']), (4/3) / (2/3))
# these tests are not exhaustive, so adding additional asserts is recommended
print("All asserts for tsls_with_covariates passed!")
2. Characterizing Compliers#
A fundamental challenge in instrumental variable estimation is that we cannot tell from the data whether an individual unit is a complier. This is because multiple compliance types can produce the same observed combination of instrument and treatment status in IV studies.
We have the following table for the observed instrument and treatment status, assuming monotonicity (no defiers):
| Z | T | Compliance status |
|---|---|---|
| 0 | 0 | Never taker or complier |
| 0 | 1 | Always taker |
| 1 | 0 | Never taker |
| 1 | 1 | Always taker or complier |
In the case where \(Z=0\) and \(T=0\), we can’t tell whether the unit is a complier, or a never taker who just happened to be in the \(Z=0\) group.
In the case where \(Z=1\) and \(T=1\), we can’t tell whether the unit is a complier, or an always taker who just happened to be in the \(Z=1\) group.
Instead, we have to rely on summary statistics to characterize compliers in the sample. For example, if age is a covariate in our study, there are ways to estimate the average age of compliers in the sample, which we can then compare to the average age of all units in the sample.
We will use the following formula from Marbach and Hangartner 2020 to estimate \(E[X \mid \text{complier}]\), where \(X\) is a covariate of interest:

$$E[X \mid \text{complier}] = \frac{E[X] - P(\text{never taker})\,E[X \mid \text{never taker}] - P(\text{always taker})\,E[X \mid \text{always taker}]}{P(\text{complier})}$$
This quantity can be understood as a “weighted average” of the covariate \(X\) across the three compliance statuses, where the weights are inversely proportional to the compliance status proportions.
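As a quick check against the test below (`Z = [1, 0, 1, 0, 1, 0]`, `T = [0, 0, 1, 1, 1, 0]`, `X = [1, 2, 3, 4, 5, 5]`): the never-taker and always-taker proportions are each \(1/3\) with covariate means 1 and 4, and the overall mean is \(10/3\), so

$$E[X \mid \text{complier}] = \frac{10/3 - (1/3)(1) - (1/3)(4)}{1/3} = 5$$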
The conditional means in the expression above can be estimated by:

1. selecting the subset of the dataframe that are never takers (\(T=0\) AND \(Z=1\)) or always takers (\(T=1\) AND \(Z=0\)), as sketched below
2. calculating the mean of \(X\) in the selected subset
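A generic sketch of these two steps on hypothetical column names `T`, `Z`, and `X` (illustrative only, not the full `complier_covariate_mean` implementation):

```python
import pandas as pd

demo = pd.DataFrame({'Z': [1, 0, 1, 0], 'T': [0, 0, 1, 1], 'X': [1.0, 2.0, 3.0, 4.0]})

# never takers are untreated despite encouragement (T=0 AND Z=1);
# always takers are treated despite no encouragement (T=1 AND Z=0)
nt_df = demo[(demo['T'] == 0) & (demo['Z'] == 1)]
at_df = demo[(demo['T'] == 1) & (demo['Z'] == 0)]

nt_mean = nt_df['X'].mean()  # mean of X among never takers
at_mean = at_df['X'].mean()  # mean of X among always takers
```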
def complier_covariate_mean(df, covariate_col, treat_col, instrument_col):
"""
Calculate the mean of a covariate among compliers in the dataset.
Args:
df (pd.DataFrame): The input dataframe containing the treatment, instrument, and covariate variables.
covariate_col (str): The name of the covariate variable in the dataframe.
treat_col (str): The name of the treatment variable in the dataframe.
instrument_col (str): The name of the instrument variable in the dataframe.
Returns:
float: The mean of the covariate among compliers in the dataset.
"""
df = df.copy()
# TODO select the subset of units that are never takers (nt) or always takers (at)
nt_df = None
at_df = None
# TODO calculate the mean of the covariate in the selected subsets
nt_mean = None
at_mean = None
# TODO calculate the overall mean
overall_mean = None
# TODO calculate the compliance status proportions
prop_nt = None
prop_at = None
prop_co = None
# TODO calculate the complier mean
complier_mean = None
return complier_mean
if __name__ == '__main__':
# Test the function with a simple dataset
df = pd.DataFrame({
'Z': [1, 0, 1, 0, 1, 0,],
'T': [0, 0, 1, 1, 1, 0,],
'Y': [2, 1, 3, 2, 5, 4,],
'X': [1, 2, 3, 4, 5, 5,]
})
assert np.isclose(complier_covariate_mean(df, 'X', 'T', 'Z'), 5), "complier_covariate_mean should be 5"
3. Simulation study of IV assumption violations#
We’ll now extend the simulation study that generates \(Z\), \(X\), \(T\), and \(Y\) from Worksheet 5 to include violations of the following 3 IV assumptions:
instrument unconfoundedness
relevance
exclusion restriction
Additionally, we’ll quantify the bias and variance for such violations.
3.1 Building the simulation#
The first variable we’ll generate is a binary unobserved confounder \(X\):

$$X \sim \text{Bernoulli}(0.5)$$

This corresponds to:
X = rng.choice([0, 1], size=n_samples)
Note
In this simulation, we’ll make the confounder \(X\) observable so that we can include it as a covariate in the two-stage least squares estimation.
Instrument Unconfoundedness#
We’ll next generate a binary instrument \(Z\) in the same way as \(X\):

$$Z \sim \text{Bernoulli}(0.5)$$

To enable a violation of the instrument unconfoundedness assumption, we include a boolean `unconfoundedness_violation` parameter. If `unconfoundedness_violation = True`, the instrument \(Z\) is instead generated so that it is affected by the confounder \(X\).
Relevance#
To strengthen or weaken the relevance assumption, we can manipulate the strength of the relationship between the instrument and the treatment, which we call \(\pi\) below. Specifically, \(\pi\) will take on values from 0 to 1, where 0 is a total violation of the relevance assumption and 1 is a “fully relevant” relationship. Modify the simulation from Worksheet 5 so that the treatment is generated from the instrument scaled by \(\pi\), the confounder \(X\), and a noise term, where:

- \(\pi\) is the strength of the relationship between the instrument and the treatment, `instrument_strength` in the function
- \(\epsilon \sim \text{N}(0, 1)\) is the noise term, sampled from a normal distribution with mean 0 and standard deviation 1

One possible functional form is sketched below.
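A minimal sketch of a threshold-style treatment assignment (an assumed form for illustration; your Worksheet 5 simulation may use a different one):

```python
import numpy as np

rng_demo = np.random.default_rng(0)
n = 1000
Z = rng_demo.choice([0, 1], size=n)
X = rng_demo.choice([0, 1], size=n)

pi = 0.8  # instrument_strength
eps = rng_demo.normal(0, 1, size=n)

# assumed form: Z's contribution to T is scaled by pi, and X confounds T;
# with pi = 0, T no longer depends on Z (a total relevance violation)
T = (pi * Z + X + eps > 1).astype(int)
```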
Exclusion Restriction#
Finally, to enable a violation of the exclusion restriction, we include a boolean `exclusion_restriction_violation` parameter. Without a violation, the outcome is generated as

$$Y = \tau T + \gamma X + \epsilon$$

and when `exclusion_restriction_violation = True`, the instrument \(Z\) additionally has a direct effect on \(Y\). Here:

- \(\tau\) is the treatment effect, `treatment_effect` in the function
- \(\gamma\) is the effect of the confounder \(X\) on the outcome, `confounder_effect` in the function
- \(\epsilon \sim \text{N}(0, 1)\) is the noise term, sampled from a normal distribution with mean 0 and standard deviation 1
def sim_iv_assumptions(n_samples=1000, treatment_effect=1,
confounder_effect=3,
instrument_strength=1,
exclusion_restriction_violation=False,
unconfoundedness_violation=False):
"""
Simulate an IV study with a binary instrument and confounded treatment and outcome,
with potential violations of the relevance, exclusion restriction, and unconfoundedness assumptions.
Args:
n_samples (int): The number of samples to simulate, defaults to 1000
treatment_effect (float): The effect of the treatment on the outcome, defaults to 1
confounder_effect (float): The effect of the confounder on the outcome, defaults to 3
instrument_strength (float): The strength of the instrument's effect on the treatment, between 0 and 1, defaults to 1
exclusion_restriction_violation (bool): Whether the instrument directly affects the outcome, defaults to False
unconfoundedness_violation (bool): Whether the confounder affects the instrument, defaults to False
Returns:
A pandas DataFrame with the simulated data:
- X: The confounder variable
- Z: The instrument variable
- T: The treatment variable
- Y: The outcome variable
"""
# TODO generate binary confounder
X = 0
# TODO generate binary instrument
Z = 0
# TODO update Z based on unconfoundedness_violation
# TODO generate treatment dependent on Z, X, and noise, incorporating instrument_strength
T = 0
# TODO generate outcome dependent on T, X, Z, and noise, incorporating treatment_effect and confounder_effect
Y = 0
# TODO update Y based on exclusion_restriction_violation
#return pd.DataFrame({'X': X, 'Z': Z, 'T': T, 'Y': Y})
if __name__ == '__main__':
# Test the function with no violations
df = sim_iv_assumptions(n_samples=1000, treatment_effect=1, confounder_effect=3, instrument_strength=1, exclusion_restriction_violation=False, unconfoundedness_violation=False)
assert df[['X', 'Z']].corr().iloc[0, 1] < 0.1, "X and Z should be relatively uncorrelated with no instrument unconfoundedness violation"
# make sure Z and T are integers
assert df['Z'].dtype == int, "Z should be an integer"
assert df['T'].dtype == int, "T should be an integer"
# Test the function with instrument unconfoundedness violation
df = sim_iv_assumptions(n_samples=1000, treatment_effect=1, confounder_effect=3, instrument_strength=1, exclusion_restriction_violation=False, unconfoundedness_violation=True)
assert df[['X', 'Z']].corr().iloc[0, 1] > 0.4, "X and Z should be positively correlated with instrument unconfoundedness violation"
# Test the function with relevance violation
df = sim_iv_assumptions(n_samples=1000, treatment_effect=1, confounder_effect=3, instrument_strength=0, exclusion_restriction_violation=False, unconfoundedness_violation=False)
assert df[['T', 'Z']].corr().iloc[0, 1] < 0.1, "T and Z should be relatively uncorrelated with instrument_strength = 0"
# these tests are not exhaustive, so adding additional asserts is recommended
print("All asserts for sim_iv_assumptions passed!")
3.2 Quantifying the bias and variance of treatment effect estimates#
In addition to visually inspecting the bias and variance of the treatment effect estimate as we’ve done throughout the semester, we can also quantify them directly.
Bias#
The bias of a treatment effect estimate \(\hat{\tau}\) can be calculated as follows:

$$\text{Bias}(\hat{\tau}) = E[\hat{\tau}] - \tau$$

Where \(\tau\) is the true treatment effect and \(E[\hat{\tau}]\) is the expected value (mean) of the treatment effect estimate.
Variance#
The variance of a treatment effect estimate can be calculated as follows:

$$\text{Var}(\hat{\tau}) = E\left[\left(\hat{\tau} - E[\hat{\tau}]\right)^2\right]$$
This can be conveniently calculated using np.var:
np.var(estimated_effects_list)
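For example, with the toy values used in the test below, the two formulas work out as follows:

```python
import numpy as np

true_effect = 1.0
estimates = [0.0, 1.0, 0.0, 1.0]

bias_value = np.mean(estimates) - true_effect  # E[tau_hat] - tau = -0.5
variance_value = np.var(estimates)             # E[(tau_hat - E[tau_hat])^2] = 0.25
```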
def bias(estimated_effects_list, true_effect):
"""
Calculate the bias of a treatment effect estimate.
Args:
estimated_effects_list (list): A list of estimated treatment effect estimates
true_effect (float): The true treatment effect
Returns:
float: The bias of the treatment effect estimate
"""
# TODO your code here
pass
if __name__ == '__main__':
true_effect = 1
estimated_effects_list = [0, 1, 0, 1]
assert np.isclose(bias(estimated_effects_list, true_effect), -0.5), "Bias should be -0.5"
# these tests are not exhaustive, so adding additional asserts is recommended
print("All asserts for bias passed!")
4. Assumption violation simulation widget#
Finally, we’ll extend the `plot_iv_estimates` function from Worksheet 5, where we compared naive regression to IV estimation.
The widget should have interactive sliders for the strength of the instrument, as well as toggles for the violations of the instrument unconfoundedness and exclusion restriction assumptions. Additionally, the widget should have a toggle for whether to include the confounder \(X\) as a covariate in the two-stage least squares estimation.
Make sure to limit the range of the instrument strength slider to 0 to 1 with a step size of 0.1.
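A minimal sketch of the decorator pattern with `interact_manual`, where a `(min, max, step)` tuple produces a slider and a bool produces a checkbox (the function here is a stand-in, not the plotting widget below):

```python
from ipywidgets import interact_manual

@interact_manual(iv_strength=(0.0, 1.0, 0.1), unconfoundedness_violation=False)
def preview_settings(iv_strength=0.5, unconfoundedness_violation=False):
    # stand-in body: just echo the chosen widget values
    print(iv_strength, unconfoundedness_violation)
```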
def naive_regression(df):
"""
Perform a naive regression of the outcome on the treatment, which is biased due to the omitted confounder.
Args:
df (pd.DataFrame): The dataframe containing the data
Returns:
float: The estimated coefficient on the treatment
"""
naive_formula = 'Y ~ 1 + T'
naive_model = smf.ols(naive_formula, data=df).fit()
return naive_model.params['T']
# TODO add interact_manual decorator, with iv_strength slider of range 0 to 1 and step size of 0.1
def plot_iv_estimate_assumptions(iv_strength=0.5, unconfoundedness_violation=False, exclusion_restriction_violation=False, include_confounder=False):
"""ArithmeticError
Plots the distribution of the IV and naive estimates of the treatment effect for the given assumptions.
Additionally displays the bias and variance of the IV and naive estimates.
Holds the treatment effect constant at 1, and the confounder effect at 5.
Args:
iv_strength (float): The strength of the instrument, defaults to 0.5
unconfoundedness_violation (bool): Whether to include a violation of the instrument unconfoundedness assumption, defaults to False
exclusion_restriction_violation (bool): Whether to include a violation of the exclusion restriction assumption, defaults to False
include_confounder (bool): Whether to include the confounder X as a covariate in the two-stage least squares estimation, defaults to False
"""
treatment_effect = 1
confounder_effect = 5
n_datasets = 1000
iv_estimates = []
naive_estimates = []
fig = plt.figure()
for _ in range(n_datasets):
# Simulate an IV dataset with the given assumptions
iv_df = sim_iv_assumptions(treatment_effect=treatment_effect,
confounder_effect=confounder_effect,
instrument_strength=iv_strength,
exclusion_restriction_violation=exclusion_restriction_violation,
unconfoundedness_violation=unconfoundedness_violation)
# TODO compute the causal effect estimate from the naive regression
naive_estimates.append("TODO")
# TODO compute the causal effect estimate using tsls, optionally including the confounder X as a covariate if include_confounder
covariates = None
iv_estimates.append("TODO")
# Plot the true effect
plt.axvline(treatment_effect, color='black', linestyle='--', label='True Effect')
# TODO plot kdeplots of IV estimates and naive estimates
# TODO annotate the title with the bias and variance of the IV and naive estimates
iv_bias = None
iv_variance = None
regression_bias = None
regression_variance = None
plt.title(f'IV Bias: {iv_bias:.2f}, IV Variance: {iv_variance:.2f}\n Regression Bias: {regression_bias:.2f}, Regression Variance: {regression_variance:.2f}')
plt.xlim(-9, 11)
plt.xticks(range(-9, 12, 1))
plt.legend();
Tip
It may be helpful to refer to the instrumental variable DAG when thinking about why you see certain changes in the bias and variance in the widget.
1. Begin by using the default values for the widget, where `instrument_strength = 0.5`, `unconfoundedness_violation = False`, `exclusion_restriction_violation = False`, and `include_confounder = False`. Compare the bias and variance of the IV and regression estimates and briefly describe what you observe.

2. Now vary the `instrument_strength` slider. Describe what you observe in terms of bias and variance when you:
    - increase the strength of the instrument
    - decrease the strength of the instrument
    - set the strength of the instrument to the extremes of 0 and 1

3. Next, set the `instrument_strength` to 0.5 and toggle the `exclusion_restriction_violation` checkbox. When this assumption is violated, does it primarily affect the bias or the variance of the treatment effect estimate?

4. Similarly, reset the `instrument_strength` to 0.5 and toggle the `unconfoundedness_violation` checkbox. When this assumption is violated, does it primarily affect the bias or the variance of the treatment effect estimate? What do you observe when you introduce `include_confounder = True`, and why does this occur?

5. Finally, consider a valid IV setting where there is no violation of the instrument unconfoundedness or exclusion restriction assumptions, and the instrument is strong (highly relevant to the treatment). Describe what happens to the variance of the IV estimate when you set `include_confounder = True`, and briefly discuss why you think this occurs.
TODO your responses below: