Activity 15: RDDs

2025-11-03


import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import statsmodels.formula.api as smf
gov_transfers = pd.read_csv("~/COMSC-341CD/data/gov_transfers.csv")

The columns we have in the dataset are:

  • \(R\) Income_Centered: the predicted income of the household, centered so that the program's income cutoff is at 0

  • \(T\) Participation: whether the household participated in the poverty alleviation program

  • \(Y\) Support: the level of support for the government, encoded from a survey question:

    • In relation to the previous government, do you believe that the current government is worse (0), the same (1/2), or better (1)?

gov_transfers.head()
Income_Centered Participation Support
0 0.006571 0 1.0
1 0.011075 0 1.0
2 0.002424 0 1.0
3 0.007650 0 0.5
4 0.010001 0 1.0

Part 1

Let’s first visualize how treatment assignment relates to income. Use pd.cut with bins=20 and labels=False to bin Income_Centered into 20 equal-width bins, and store the result in a new column called income_bin:

# TODO your code here

gov_transfers.head()
Income_Centered Participation Support
0 0.006571 0 1.0
1 0.011075 0 1.0
2 0.002424 0 1.0
3 0.007650 0 0.5
4 0.010001 0 1.0
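The binning step can be sketched on synthetic data. This is a minimal illustration, not the activity's solution: the `demo` DataFrame and its values are made up, and `retbins=True` is an optional extra used here only to show where the bin edges fall.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for gov_transfers (values are made up for illustration)
rng = np.random.default_rng(0)
demo = pd.DataFrame({"Income_Centered": rng.uniform(-0.02, 0.02, size=500)})

# labels=False returns the integer bin index (0..19) instead of an Interval;
# retbins=True additionally returns the 21 bin edges
demo["income_bin"], edges = pd.cut(
    demo["Income_Centered"], bins=20, labels=False, retbins=True
)

print(demo.head())
print(edges.round(4))
```

Because the bins are numbered 0 through 19, the boundary between bins 9 and 10 sits at position 9.5 on a plot of income_bin, which is where the cutoff line is drawn in the plotting cells of this activity.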

Next, generate a sns.pointplot of y=Participation vs x=income_bin. Set linestyle='none' to plot the points individually.

Discuss with folks around you: Do you observe any deviations from the policy of assigning participation in the transfer program based on income? Does this suggest that the RDD is “sharp” or not?

Your response: pollev.com/tliu
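To make the sharp-versus-fuzzy distinction concrete, here is a hedged sketch on synthetic data. It computes the per-bin participation rate, the same quantity a point plot of the bin means displays, using a pandas groupby instead of seaborn. The assignment rule, the 10% deviation rate, and the column names `sharp` and `fuzzy` are assumptions made for illustration. Explicit bin edges are used instead of bins=20 so that the cutoff falls exactly on a bin boundary.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
income = rng.uniform(-0.02, 0.02, size=2000)

# Sharp rule (assumed for illustration): everyone below the cutoff participates
sharp = (income < 0).astype(int)

# Fuzzy variant: roughly 10% of households deviate from the rule
deviates = rng.random(income.size) < 0.10
fuzzy = np.where(deviates, 1 - sharp, sharp)

demo = pd.DataFrame({"Income_Centered": income, "sharp": sharp, "fuzzy": fuzzy})

# 20 equal-width bins with an edge exactly at the cutoff
edges = np.linspace(-0.02, 0.02, 21)
demo["income_bin"] = pd.cut(
    demo["Income_Centered"], bins=edges, labels=False, include_lowest=True
)

# Per-bin participation rate
rates = demo.groupby("income_bin")[["sharp", "fuzzy"]].mean()
print(rates)
```

In the sharp column every bin's rate is exactly 0 or 1. In the fuzzy column the rates still jump at the cutoff but sit strictly between 0 and 1, which is the pattern to look for in the real data.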

# TODO your code here

# convert the xticks to the same units as the original income data
plt.xticks(ticks=[0, 5, 9.5, 14, 19], labels=[-0.02, -0.01, 0, 0.01, 0.02])
# plot the cutoff as a vertical line
plt.axvline(x=9.5, color='black', linestyle='--', label='Income cutoff')

Part 2

Now we’ll look at the outcome variable, Support. Generate a sns.pointplot of y=Support vs x=income_bin. Set linestyle='none' to plot the points individually.

# TODO your code here

plt.xticks(ticks=[0, 5, 9.5, 14, 19], labels=[-0.02, -0.01, 0, 0.01, 0.02])
plt.axvline(x=9.5, color='black', linestyle='--', label='Income cutoff')

On Worksheet 5, we fit two linear regressions on either side of the cutoff to estimate the average treatment effect for an RDD. Another approach to estimating the average treatment effect is to fit a single linear regression that allows the slope to change at the cutoff:

\[ Y = \beta_0 + \beta_1 (R - c) + \beta_2 T + \beta_3 (R - c) \cdot T \]

Since \(c=0\), this simplifies to:

\[ Y = \beta_0 + \beta_1 R + \beta_2 T + \beta_3 R\cdot T \]

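At the cutoff, \(R = 0\), so the \(\beta_1\) and \(\beta_3\) terms vanish and the predicted outcome is \(\beta_0 + \beta_2\) for treated units and \(\beta_0\) for untreated units. The average treatment effect at the cutoff is therefore given by \(\beta_2\).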
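The link between this single regression and the two-regression approach can be checked numerically. The sketch below uses synthetic data from a sharp design with made-up coefficients, and it swaps in plain NumPy least squares for statsmodels so that it is self-contained. The helper `fit_line` and all numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Synthetic sharp design (all numbers here are made up for illustration)
R = rng.uniform(-0.02, 0.02, size=n)   # running variable, cutoff c = 0
T = (R < 0).astype(float)              # treated below the cutoff
true_effect = 0.1
Y = 0.5 + 3.0 * R + true_effect * T - 2.0 * R * T + rng.normal(0, 0.05, size=n)

# Single regression with design-matrix columns [1, R, T, R*T]
X = np.column_stack([np.ones(n), R, T, R * T])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)


def fit_line(r, y):
    """Return (intercept, slope) of the least-squares line y ~ r."""
    A = np.column_stack([np.ones(r.size), r])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


# Two separate regressions, one on each side of the cutoff
treated = T == 1
b_treated = fit_line(R[treated], Y[treated])
b_control = fit_line(R[~treated], Y[~treated])

print("beta_2 from single regression:", beta[2])
print("difference in intercepts:     ", b_treated[0] - b_control[0])
```

In a sharp design the two numbers agree up to floating-point error, because the interaction term lets each side of the cutoff have its own intercept and slope. When some units deviate from the assignment rule, \(T\) no longer coincides with the side of the cutoff and the two approaches can differ.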

# TODO your code here
#rdd_formula = ''
#rdd_model = smf.ols(rdd_formula, data=gov_transfers).fit()
#print(rdd_model.params['Participation'])

What is your estimate of the average treatment effect at the cutoff (rounded to 1 decimal place)?

Your response: pollev.com/tliu

Acknowledgements

This activity uses data from Nick Huntington-Klein’s causaldata package.