Activity 1: Jupyter and NumPy
2025-01-30
# import numpy
# the 'as' allows us to use np as a shorthand for numpy
import numpy as np
1. Replicating the combined contingency table
We can replicate the calculations we previously did by hand using numpy:
#                                Y=1  Y=0
contingency_overall = np.array([[273, 77],   # T=1
                                [289, 61]])  # T=0
Accessing the contingency table can be done by indexing array[row, column]. The : operator is used to select all rows or columns.
# select the first row, second column
print(contingency_overall[0, 1])
# select the second column
print(contingency_overall[:, 1])
# select the first row
print(contingency_overall[0, :])
77
[77 61]
[273 77]
# Compute the marginal probability of Y=1
# TODO your code here
# Compute the conditional probability of Y=1 given T=1
# TODO your code here
# Compute the conditional probability of Y=1 given T=0
# TODO your code here
Optional extension: if you’re already comfortable with the coding environment, you can explore the np.sum documentation, particularly the axis parameter, to see how you can equivalently compute these probabilities without slicing out the rows or columns.
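As a minimal sketch of how the axis parameter behaves, the example below uses a small made-up table (toy_counts is hypothetical and is not the activity data). Summing with axis=1 collapses the columns and gives one total per row, while axis=0 collapses the rows and gives one total per column.

```python
import numpy as np

# hypothetical counts, for illustration only (not the activity data)
#                      Y=1 Y=0
toy_counts = np.array([[8, 2],    # T=1
                       [6, 4]])   # T=0

total = np.sum(toy_counts)                # sum over every cell
row_totals = np.sum(toy_counts, axis=1)   # one total per row (per value of T)
col_totals = np.sum(toy_counts, axis=0)   # one total per column (per value of Y)

print(total)
print(row_totals)
print(col_totals)
```

For this toy table the three prints give 20, [10 10], and [14 6].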
2. Contingency tables for small and large stones
Below are the contingency tables for the two cases, stored as numpy arrays:
# contingency table for C=0, large stones
#                           Y=1  Y=0
contingency_C0 = np.array([[192, 71],   # T=1
                           [55, 25]])   # T=0
# contingency table for C=1, small stones
#                           Y=1  Y=0
contingency_C1 = np.array([[81, 6],     # T=1
                           [234, 36]])  # T=0
# prints (number of rows, number of columns)
print(contingency_C0.shape)
(2, 2)
We can verify that the two contingency tables are consistent with the overall contingency table:
# this returns a numpy array of booleans indicating whether each pair of corresponding elements in the two arrays is equal (within a small tolerance)
np.isclose(contingency_C0 + contingency_C1, contingency_overall)
array([[ True,  True],
       [ True,  True]])
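Because the tables hold integer counts, an exact comparison would also work here. As a hedged sketch on made-up arrays (part_a, part_b and combined are hypothetical), np.array_equal returns a single boolean instead of an elementwise array, and calling .all() on the result of np.isclose reduces it to a single boolean in the same way.

```python
import numpy as np

# hypothetical integer tables, for illustration only
part_a = np.array([[1, 2], [3, 4]])
part_b = np.array([[10, 20], [30, 40]])
combined = np.array([[11, 22], [33, 44]])

# elementwise tolerance-based comparison, reduced to one boolean
all_close = np.isclose(part_a + part_b, combined).all()

# exact comparison of shape and values, returns one boolean
exactly_equal = np.array_equal(part_a + part_b, combined)

print(all_close, exactly_equal)
```

A single boolean is convenient when the check feeds an assert or an if statement, whereas the elementwise array shows which cells disagree.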
Compute the expected value of \(Y\) given \(T=1\) and \(T=0\) for the case of large stones (\(C=0\)).
# TODO your code
EY_given_T1_C0 = 0
EY_given_T0_C0 = 0
# the f-string allows us to insert the value of the variable into the string
# and the :.2f allows us to round the value to 2 decimal places
print(f"Estimated expected value of Y given T=1 and C=0: {EY_given_T1_C0:.2f}")
print(f"Estimated expected value of Y given T=0 and C=0: {EY_given_T0_C0:.2f}")
Estimated expected value of Y given T=1 and C=0: 0.00
Estimated expected value of Y given T=0 and C=0: 0.00
Which treatment appears to be more successful in the large stone case (\(C=0\))?
Your response: TODO
Now let’s compute the expected value of \(Y\) given \(T=1\) and \(T=0\) for the case of small stones (\(C=1\)).
# TODO your code
EY_given_T1_C1 = 0
EY_given_T0_C1 = 0
# the f-string allows us to insert the value of the variable into the string
# and the :.2f allows us to round the value to 2 decimal places
print(f"Estimated expected value of Y given T=1 and C=1: {EY_given_T1_C1:.2f}")
print(f"Estimated expected value of Y given T=0 and C=1: {EY_given_T0_C1:.2f}")
Estimated expected value of Y given T=1 and C=1: 0.00
Estimated expected value of Y given T=0 and C=1: 0.00
Which treatment appears to be more successful in the small stone case (\(C=1\))?
Your response: TODO
Note that we’ve split out the contingency tables for each stone case, but what we’re effectively doing is computing the following conditional expectations:
\[E[Y \mid T=t, C=c]\]
which we read as “the expected value of \(Y\) given \(T\) set to \(t\) and \(C\) set to \(c\)”.
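For a binary outcome, the expected value of Y within a group is the proportion of that group with Y=1, so each conditional expectation can be estimated as a row's Y=1 count divided by that row's total. The sketch below illustrates this on made-up counts; the arrays toy_C0 and toy_C1, and the choice to combine them with np.stack, are assumptions for illustration and not part of the activity.

```python
import numpy as np

# hypothetical counts, for illustration only (not the activity data)
#                  Y=1 Y=0
toy_C0 = np.array([[3, 1],    # T=1
                   [1, 1]])   # T=0
toy_C1 = np.array([[4, 1],    # T=1
                   [9, 1]])   # T=0

# stack into one array indexed as [c, table row, table column]
stacked = np.stack([toy_C0, toy_C1])

# Y=1 count divided by the group total, for every (c, t) pair
EY_given_T_C = stacked[:, :, 0] / stacked.sum(axis=2)

# rows are c=0 then c=1; columns follow the table row order (T=1, then T=0)
print(EY_given_T_C)
```

Note that the second index follows the row order of the tables, so position 0 corresponds to T=1 and position 1 to T=0.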