Plotting and bootstrap demo
2025-02-13
# imports we'll need
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
The Normal distribution
In addition to the categorical distributions we saw in worksheet 1, we will frequently encounter the normal distribution. It has the following properties:
centered and symmetric about the mean \(\mu\) (or \(E[X]\))
spread determined by the standard deviation \(\sigma\)
the variance is \(\sigma^2\) (or \(V[X]\))
We write a random variable \(X\) that follows a normal distribution as \(X \sim \mathcal{N}(\mu, \sigma^2)\).
Tip
Watch whether the software you are using takes the variance or the standard deviation as input. In numpy, it is the standard deviation, which differs from how we write the distribution above.
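To make the tip concrete, here is a minimal sketch of converting a variance into the standard deviation that numpy expects. The values (mean 10, variance 4), the seed, and the names `variance`, `sigma`, and `draws` are illustrative assumptions, not part of the worksheet.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # illustrative seed

variance = 4.0             # sigma^2, as written in N(mu, sigma^2)
sigma = np.sqrt(variance)  # numpy's `scale` argument expects sigma, not sigma^2

# Draw from N(10, 4) by passing the standard deviation (2), not the variance (4)
draws = rng.normal(loc=10, scale=sigma, size=100_000)

print(draws.std())  # close to 2, the standard deviation
print(draws.var())  # close to 4, the variance
```

Passing `scale=4` here would instead sample from a distribution with variance 16, which is an easy mistake to make when translating from the \(\mathcal{N}(\mu, \sigma^2)\) notation.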
We can draw random samples from the normal distribution using numpy.random.Generator.normal():
# This creates a generator with a fixed seed, so we can generate the same random numbers
rng = np.random.default_rng(seed=42)
# generate 50000 samples from a normal distribution with E[X]=10 and standard deviation 1
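A minimal sketch of the draw described in the comment above. The variable name `samples` is an assumption for illustration and is not taken from the original notebook.

```python
import numpy as np

# Generator with a fixed seed, so the draws are reproducible
rng = np.random.default_rng(seed=42)

# 50000 draws with mean (loc) 10 and standard deviation (scale) 1
samples = rng.normal(loc=10, scale=1, size=50000)

print(samples.shape)   # (50000,)
print(samples.mean())  # close to 10
print(samples.std())   # close to 1
```

With this many draws, the sample mean and standard deviation should land very near the parameters we passed in, which is a quick sanity check that `loc` and `scale` were used as intended.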
For bootstrap samples, we can use numpy.random.Generator.choice() with replace=True to draw samples, with replacement, from our dataset.
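The resampling step can be sketched as follows. This is one possible approach under stated assumptions: the dataset `data` is a simulated stand-in, and the number of resamples (`n_boot = 1000`), the choice of the mean as the statistic, and the percentile interval are illustrative choices rather than something the worksheet prescribes.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Stand-in dataset (assumption): 200 observations from a normal distribution
data = rng.normal(loc=10, scale=1, size=200)

n_boot = 1000                   # number of bootstrap resamples (illustrative)
boot_means = np.empty(n_boot)

for i in range(n_boot):
    # Resample the data with replacement, keeping the original sample size
    resample = rng.choice(data, size=data.size, replace=True)
    boot_means[i] = resample.mean()

# Percentile interval for the mean, taken from the bootstrap distribution
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(data.mean(), ci_low, ci_high)
```

Each resample has the same size as the original dataset, so the spread of `boot_means` approximates the sampling variability of the mean for a sample of that size.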