Causal Catalog#

Assumptions#

Consistency#

Definition

The observed outcome \(Y\) for a unit equals that unit's potential outcome \(Y(t)\) under the treatment \(T=t\) that the unit actually received.

Mathematical definition

\(Y = Y(T)\)

For binary treatment, that means:

  • \(Y = Y(1) \text{ if } T=1\)
  • \(Y = Y(0) \text{ if } T=0\)
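As a minimal sketch of the binary case, the snippet below builds the observed outcome from a pair of potential outcomes. The unit values and the helper name `observed_outcome` are hypothetical and chosen only for illustration; in real data we never see both potential outcomes for the same unit.

```python
def observed_outcome(t, y0, y1):
    """Consistency for binary treatment: Y = Y(1) if T = 1, otherwise Y = Y(0)."""
    return y1 if t == 1 else y0

# Hypothetical units: (treatment T, potential outcome Y(0), potential outcome Y(1))
units = [(1, 3.0, 5.0), (0, 2.0, 2.5), (1, 4.0, 4.0), (0, 1.0, 3.0)]

observed = [observed_outcome(t, y0, y1) for t, y0, y1 in units]
print(observed)  # [5.0, 2.0, 4.0, 1.0]

# Equivalent "switching" form: Y = T * Y(1) + (1 - T) * Y(0)
switching = [t * y1 + (1 - t) * y0 for t, y0, y1 in units]
print(switching == observed)  # True
```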

Intuition/Examples

Activity 3 consistency scenarios

Exchangeability#

Definition

The distributions of \(Y(0)\) and \(Y(1)\) are the same in the \(T=1\) and \(T=0\) groups. Also known as ignorability.

Mathematical definition

\(Y(0), Y(1) \perp T\)
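One way to see what this buys us is a small simulation. The sketch below assumes a made-up data-generating process (sleep raises exam scores, caffeine adds 2 points for everyone); the variable names and numbers are illustrative assumptions, not course data. When \(T\) is assigned by coin flip, the difference in group means lands near the true effect; when poorly rested students are more likely to take caffeine, it does not.

```python
import random
from statistics import mean

random.seed(0)
n = 20000

# Hypothetical potential outcomes: sleep drives scores, caffeine adds 2 points.
sleep = [random.gauss(0, 1) for _ in range(n)]
y0 = [70 + 5 * s + random.gauss(0, 1) for s in sleep]
y1 = [a + 2 for a in y0]  # true ATE = 2

def diff_in_means(t, y0, y1):
    # Observed outcome via consistency, then treated mean minus control mean.
    y = [b if ti == 1 else a for ti, a, b in zip(t, y0, y1)]
    treated = [yi for ti, yi in zip(t, y) if ti == 1]
    control = [yi for ti, yi in zip(t, y) if ti == 0]
    return mean(treated) - mean(control)

# Exchangeable: T is a coin flip, independent of Y(0) and Y(1).
t_random = [random.randint(0, 1) for _ in range(n)]

# Not exchangeable: students with below-average sleep are more likely to take caffeine.
t_confounded = [1 if random.random() < (0.8 if s < 0 else 0.2) else 0 for s in sleep]

est_random = diff_in_means(t_random, y0, y1)
est_confounded = diff_in_means(t_confounded, y0, y1)
print(round(est_random, 2), round(est_confounded, 2))
```

In the confounded case the treated group starts out with lower \(Y(0)\), so the naive comparison can even get the sign of the effect wrong.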

Intuition/Examples

Lecture 3 (slide 12): Caffeine and exam performance exchangeability brainstorm

Conditional Exchangeability#

Definition

Conditional on some variables \(X\), the distributions of \(Y(0)\) and \(Y(1)\) are the same in the \(T=1\) and \(T=0\) groups. Also known as unconfoundedness. We use the backdoor criterion to determine which variables \(X\) to condition on.

Mathematical definition

\(Y(0), Y(1) \perp T \mid X\)
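To illustrate how conditioning on \(X\) is used in practice, here is a rough sketch of stratified adjustment with a single binary covariate. The simulated data and the helper `mean_outcome` are assumptions made for this example: we compare treated and untreated units within each level of \(X\), then average those differences weighted by \(P(X = x)\).

```python
import random
from statistics import mean

random.seed(1)
n = 20000

# Hypothetical data: X raises both the chance of treatment and the outcome.
x = [random.randint(0, 1) for _ in range(n)]
t = [1 if random.random() < (0.8 if xi == 1 else 0.2) else 0 for xi in x]
y0 = [10 * xi + random.gauss(0, 1) for xi in x]
y1 = [a + 3 for a in y0]  # true ATE = 3
y = [b if ti == 1 else a for ti, a, b in zip(t, y0, y1)]

def mean_outcome(t_val, x_val=None):
    """Mean observed outcome among units with T = t_val (and X = x_val if given)."""
    return mean([yi for yi, ti, xi in zip(y, t, x)
                 if ti == t_val and (x_val is None or xi == x_val)])

# Naive comparison ignores X and is confounded.
naive = mean_outcome(1) - mean_outcome(0)

# Adjusted estimate: within-stratum differences, weighted by P(X = x).
adjusted = 0.0
for x_val in (0, 1):
    weight = x.count(x_val) / n
    adjusted += weight * (mean_outcome(1, x_val) - mean_outcome(0, x_val))

print(round(naive, 2), round(adjusted, 2))
```

The adjusted estimate is only trustworthy here because the simulation was built so that \(X\) is the sole confounder; with real data that has to be argued from the causal graph.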

Intuition/Examples

Activity 8 on how to determine which variables satisfy the backdoor criterion

Positivity#

Definition

Every unit has a non-zero probability of receiving the treatment, and every unit has a non-zero probability of not receiving the treatment. We can also view it as: every subgroup \(x\) in our sample has to have some units that received the treatment, and some units that did not receive the treatment. Also known as overlap.

Mathematical definition

For all values of covariates \(x \in X\):
         \(0 < P(T = 1 \mid X = x) < 1\)
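A simple empirical check, sketched below, is to estimate \(P(T = 1 \mid X = x)\) in each observed subgroup and flag subgroups where everyone (or no one) was treated. The function names and toy data are hypothetical. Note that this only checks the sample: a subgroup can pass here by chance, and a population-level violation can hide in subgroups we never observed.

```python
from collections import defaultdict

def treatment_rates(t, x):
    """Empirical P(T = 1 | X = x) for each observed covariate value x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [number untreated, number treated]
    for ti, xi in zip(t, x):
        counts[xi][ti] += 1
    return {xi: treated / (untreated + treated)
            for xi, (untreated, treated) in counts.items()}

def positivity_violations(t, x):
    """Subgroups whose sample contains only treated or only untreated units."""
    return sorted(xi for xi, p in treatment_rates(t, x).items() if p in (0.0, 1.0))

# Hypothetical sample: every unit in the "mid" subgroup received treatment.
x = ["low", "low", "mid", "mid", "high", "high"]
t = [1, 0, 1, 1, 0, 1]

print(treatment_rates(t, x))        # {'low': 0.5, 'mid': 1.0, 'high': 0.5}
print(positivity_violations(t, x))  # ['mid']
```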

Intuition/Examples

See Lecture 8, beginning at slide 25, for intuition and a PollEverywhere example, and Activity 9 for a look at bins of covariates in Yeager et al. (2019).

Study Designs#

Randomized Experiments#

Assumptions needed

  • Consistency
  • No interference

Assumptions ensured

  • Exchangeability

Causal quantities identified

  • Average treatment effect (ATE)
  • Average treatment effect on the treated (ATT)
  • Average treatment effect on the untreated (ATU)
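Randomization makes \(T\) independent of the potential outcomes, so the three quantities coincide in expectation even when effects differ across units. The sketch below illustrates this with simulated individual effects (hypothetical numbers), using knowledge of both potential outcomes that we would not have in a real study.

```python
import random
from statistics import mean

random.seed(2)
n = 20000

# Hypothetical heterogeneous individual effects Y(1) - Y(0), averaging 1.0.
effect = [random.gauss(1.0, 2.0) for _ in range(n)]

# Randomized assignment: a fair coin flip for every unit.
t = [random.randint(0, 1) for _ in range(n)]

ate = mean(effect)                                        # all units
att = mean([e for e, ti in zip(effect, t) if ti == 1])    # treated units only
atu = mean([e for e, ti in zip(effect, t) if ti == 0])    # untreated units only

print(round(ate, 2), round(att, 2), round(atu, 2))
```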

Pros/cons#

Advantages

  • Mitigates the impact of confounding variables
  • Exchangeability is ensured by design
  • Results can be interpreted causally

Disadvantages

  • Can be expensive to run
  • Some experiments cannot be randomized for ethical reasons
  • Potential biases in how the random assignment is carried out
  • It is not always possible to design a randomized experiment for a given setting

Observational Studies#

Assumptions needed

  • Consistency
  • No interference
  • Conditional exchangeability / unconfoundedness
  • Positivity

Assumptions ensured

  • None 😞

Causal quantities identified

  • Average treatment effect (ATE)
  • Average treatment effect on the treated (ATT)
  • Average treatment effect on the untreated (ATU)

Instrumental Variables#

Assumptions needed

  • Consistency
  • No interference
  • Relevance
  • Exclusion restriction
  • Instrument unconfoundedness
  • Linear outcome or monotonicity

Assumptions ensured by design

  • None, but does not need conditional exchangeability / unconfoundedness

Causal quantities identified

  • Average treatment effect (ATE), under the linear outcome assumption
  • Local average treatment effect (LATE), under the monotonicity assumption
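For a binary instrument, the LATE is commonly estimated with the Wald estimator: the effect of the instrument on the outcome divided by its effect on the treatment. The sketch below is a simulation under assumed conditions (a randomized instrument `z`, no defiers, and made-up shares of compliers, always-takers, and never-takers); the names and numbers are hypothetical.

```python
import random
from statistics import mean

def wald_estimate(z, t, y):
    """Wald estimator for a binary instrument Z: reduced form / first stage."""
    def mean_where(values, z_val):
        return mean([v for v, zi in zip(values, z) if zi == z_val])
    reduced_form = mean_where(y, 1) - mean_where(y, 0)  # effect of Z on Y
    first_stage = mean_where(t, 1) - mean_where(t, 0)   # effect of Z on T (relevance)
    return reduced_form / first_stage

random.seed(3)
n = 40000
z, t, y = [], [], []
for _ in range(n):
    zi = random.randint(0, 1)  # instrument assigned at random
    kind = random.choices(["complier", "always", "never"], weights=[0.5, 0.2, 0.3])[0]
    ti = {"complier": zi, "always": 1, "never": 0}[kind]  # monotonicity: no defiers
    effect = 4.0 if kind == "complier" else 1.0           # complier effect is 4
    baseline = 5.0 if kind == "always" else 0.0           # unobserved confounding
    # Exclusion restriction: Z enters the outcome only through T.
    yi = baseline + effect * ti + random.gauss(0, 1)
    z.append(zi); t.append(ti); y.append(yi)

late_hat = wald_estimate(z, t, y)
print(round(late_hat, 2))
```

The estimate recovers the effect among compliers only; the effect for always-takers and never-takers (set to 1 here) is not identified by this ratio.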

Case studies

See Activity 13 and slide 16 of Lecture 15 for our case studies!


Regression Discontinuity#

Sharp RDD#

Description

  • Treatment is completely deterministic based on the running variable
  • Treatment is "forced" once the running variable crosses the cutoff c

Assumptions needed

  • Consistency
  • No interference
  • Continuity

Assumptions ensured by design

  • None, but does not need conditional exchangeability / unconfoundedness

Causal quantities identified

  • Average treatment effect (ATE) at the cutoff
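One common way to estimate the effect at the cutoff is to fit a separate line on each side within a bandwidth \(b\) and take the difference of the two fitted values at \(c\). The sketch below does this with plain least squares on simulated data; `fit_line`, `sharp_rdd`, the bandwidth of 10, and the data-generating process (a linear trend with a jump of 3) are all assumptions for illustration.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def sharp_rdd(running, y, cutoff, bandwidth):
    """Difference of the two local linear fits evaluated at the cutoff."""
    # Center the running variable so each intercept is the fitted value at the cutoff.
    left = [(r - cutoff, yi) for r, yi in zip(running, y)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, yi) for r, yi in zip(running, y)
             if cutoff <= r <= cutoff + bandwidth]
    left_at_cutoff, _ = fit_line(*zip(*left))
    right_at_cutoff, _ = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff

random.seed(4)
c = 50
running = [random.uniform(0, 100) for _ in range(20000)]
# Hypothetical outcome: smooth linear trend plus a jump of 3 for treated units (r >= c).
y = [0.1 * r + (3.0 if r >= c else 0.0) + random.gauss(0, 1) for r in running]

estimate = sharp_rdd(running, y, cutoff=c, bandwidth=10)
print(round(estimate, 2))
```

Rerunning with different `bandwidth` values is a quick way to see how sensitive the estimate is to that choice.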

Fuzzy RDD#

Description

  • Treatment is not deterministically forced, but it is discontinuous

Assumptions needed

  • Consistency
  • No interference
  • Continuity
  • Monotonicity (no defiers)

Assumptions ensured by design

  • None, but does not need conditional exchangeability / unconfoundedness

Causal quantities identified

  • Local average treatment effect (LATE) at the cutoff
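In the fuzzy case, a standard estimator divides the jump in the outcome at the cutoff by the jump in the probability of treatment, much like the Wald ratio for instrumental variables. The sketch below assumes a simulated setting where crossing the cutoff raises the treatment probability from 0.2 to 0.8 and treatment adds 5 to the outcome; all names and numbers are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def jump_at_cutoff(running, values, cutoff, bandwidth):
    """Discontinuity in `values` at the cutoff, from local linear fits on each side."""
    left = [(r - cutoff, v) for r, v in zip(running, values)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, v) for r, v in zip(running, values)
             if cutoff <= r <= cutoff + bandwidth]
    return fit_line(*zip(*right))[0] - fit_line(*zip(*left))[0]

def fuzzy_rdd(running, t, y, cutoff, bandwidth):
    """Jump in the outcome divided by the jump in treatment probability."""
    return (jump_at_cutoff(running, y, cutoff, bandwidth)
            / jump_at_cutoff(running, t, cutoff, bandwidth))

random.seed(5)
c = 50
running = [random.uniform(0, 100) for _ in range(20000)]
# Treatment is not forced at the cutoff, but its probability jumps from 0.2 to 0.8.
t = [1 if random.random() < (0.8 if r >= c else 0.2) else 0 for r in running]
# Hypothetical outcome: smooth trend in the running variable plus 5 for treated units.
y = [0.1 * r + 5.0 * ti + random.gauss(0, 1) for r, ti in zip(running, t)]

estimate = fuzzy_rdd(running, t, y, cutoff=c, bandwidth=10)
print(round(estimate, 2))
```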

Pros/cons#

Advantages

  • Useful when randomization is not possible
  • Helps examine specific trends surrounding the cutoff of interest
  • Can visually inspect the assumption needed for both types of RDDs
  • We can check for manipulation by plotting a histogram of the running variable around the threshold and identifying any spots that would make our RDD invalid

Disadvantages

  • Data inefficient (we need a lot of data), and there isn't a guaranteed sweet spot for the bandwidth value b
  • Participants may change their behavior based on knowledge of the cutoff
  • Have to assume/guess who is a complier (same problem as with IVs)
  • Can only look at the effects at the cutoff
  • The choice of the bandwidth really matters
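The histogram check for manipulation mentioned above can be approximated numerically by counting observations in equal-width bins on either side of the cutoff. The helper `counts_near_cutoff` and the test scores below are hypothetical; this is a rough sketch of the idea, not a formal density test.

```python
def counts_near_cutoff(running, cutoff, width, n_bins):
    """Count observations in n_bins equal-width bins on each side of the cutoff."""
    below = [0] * n_bins  # below[0] is the bin immediately below the cutoff
    above = [0] * n_bins  # above[0] is the bin that starts at the cutoff
    for r in running:
        k = int(abs(r - cutoff) // width)  # how many bins away from the cutoff
        if k >= n_bins:
            continue  # too far from the cutoff to be informative
        if r >= cutoff:
            above[k] += 1
        else:
            below[k] += 1
    return below, above

# Hypothetical test scores with a passing cutoff of 50.
scores = [44, 46, 47, 48, 50, 50, 50, 51, 51, 52, 53, 57]
below, above = counts_near_cutoff(scores, cutoff=50, width=5, n_bins=2)
print(below, above)  # [3, 1] [7, 1]
```

A pile-up in the bin just above the cutoff, as in this toy data, is the kind of pattern that would suggest units are sorting around the threshold.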