Lecture 8

Distributions

Normal distributions and the sampling distribution



Dr Lincoln Colling

14 Nov 2022


Psychology as a Science

Plan for today

Today we’ll learn about the sampling distribution

But before we can do that we need to know what distributions are, where they come from, and how to describe them

  • The binomial distribution

  • The normal distribution

    • Processes that produce normal distributions

    • Processes that don’t produce normal distributions

    • Describing normal distributions

    • Describing departures from the normal distribution

  • Distributions and samples

    • The Central Limit Theorem
  • The Standard Error of the Mean

The Binomial Distribution

  • The binomial distribution is one of the simplest distributions you’ll come across

  • To see where it comes from, we’ll just build one!

  • We can build one by flipping a coin (multiple times) and counting up the number of heads that we get

Figure 1: Possible sequences after coin flips

Figure 2: Distribution of number of heads after coin flips

  • In Figure 1 we can see the possible sequences of events that can happen when we flip a coin (⚈ = heads and ⚆ = tails). Figure 2 doesn’t look very interesting at the moment.

  • In Figure 2 we just count up the number of sequences that lead to 0 heads, 1 head, 2 heads, etc

  • As we flip more coins the distribution of number of heads takes on a characteristic shape

  • This is the binomial distribution
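The coin-flipping construction can be sketched in a few lines of Python (a sketch, not part of the lecture; the function name `count_heads` is mine):

```python
import random
from collections import Counter

rng = random.Random(1)

def count_heads(n_flips):
    """Flip a fair coin n_flips times and count the number of heads."""
    return sum(rng.choice(["H", "T"]) == "H" for _ in range(n_flips))

# Repeat the 10-flip experiment many times and tally the outcomes;
# the tallies approximate the binomial distribution for n = 10
tallies = Counter(count_heads(10) for _ in range(10_000))
```

Plotting `tallies` would reproduce the characteristic shape in Figure 2: most repetitions give around 5 heads, and very few give 0 or 10.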

The binomial distribution

  • The binomial distribution is just an idealised representation of the process that generates sequences of heads and tails when we flip a coin

    • Or any other process that gives rise to binary data
  • It’s an idealisation, but natural processes do give rise to binomial distributions

  • In the bean machine (Figure 3) balls fall from the top and bounce off pegs as they fall

    • Balls can bounce one of two directions (left or right; binary outcome)
  • Most of the balls collect near the middle, and fewer balls are found at the edges

Figure 3: Example of the bean machine

The normal distribution

Flipping coins might seem a long way from anything you might want to study in psychology, but the shape of the binomial distribution might be familiar to you

  • The binomial distribution has a shape that is similar to the normal distribution

But there are a few key differences:

  1. The binomial distribution is bounded at 0 and n (number of coins)

    • The normal distribution can range from \(-\infty\) to \(+\infty\)
  2. The binomial distribution is discrete (0, 1, 2, 3 etc, but no 2.5)

    • The normal distribution is continuous

The normal distribution is a mathematical abstraction, but we can use it as a model of real-life populations that are produced by certain kinds of natural processes

Processes that produce normal distributions

To see how a natural process can give rise to a normal distribution, let’s play a board game!

There’s only 1 rule: You roll the dice n times (number of rounds), add up all the values, and move that many spaces. That is your score

  • We can play any number of rounds

  • And we’ll play with friends, because you can’t get a distribution of scores if you play by yourself!

If we have enough players who play enough rounds then the distribution of scores across all the players will take on a characteristic shape

Figure 4: Distribution of players’ position from the starting point
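The additive dice game can be sketched in Python (a sketch; the function name `play` and the specific numbers of rounds and players are mine):

```python
import random
from statistics import mean

rng = random.Random(42)

def play(n_rounds):
    """Roll a six-sided die n_rounds times and return the total score."""
    return sum(rng.randint(1, 6) for _ in range(n_rounds))

# Many players each play 20 rounds; their scores pile up symmetrically
# around the expected total of 20 * 3.5 = 70
scores = [play(20) for _ in range(5_000)]
```

A histogram of `scores` would show the bell shape in Figure 4: adding up many independent rolls produces an approximately normal distribution.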

Processes that produce normal distributions

  • A player’s score in the dice game is determined by adding up the values of each roll

  • So after each roll their score can increase by some amount

The dice game might look artificial, but it isn’t that different from some natural processes

For example, developmental processes might look pretty similar to the dice game

Think about height:

  • At each point in time some value (growth) can be added to a person’s current height

  • So if we looked at the distribution of heights in the population then we might find something that looks similar to a normal distribution

A key factor that results in the normal distribution shape is this adding up of values

Processes that don’t produce normal distributions

Let’s change the rules of the game

  • Instead of adding up the value of each roll, we’ll multiply them (e.g., roll a 1, 2, and 4 and your score is 8)

  • The distribution of scores is now skewed, with most players having low scores and a few players having very high scores

  • Can you think of a process that operates like this in the real world?

    • How about interest or returns on investments?

    • Maybe this explains the shape of real world wealth distributions
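Changing one line of the dice-game sketch, from adding to multiplying, produces the skew (a sketch; names and parameters are mine):

```python
import random
from statistics import mean, median

rng = random.Random(7)

def multiplicative_score(n_rounds):
    """Multiply (rather than add) the values of n_rounds die rolls."""
    score = 1
    for _ in range(n_rounds):
        score *= rng.randint(1, 6)
    return score

scores = [multiplicative_score(10) for _ in range(5_000)]
# Right skew: a few huge scores drag the mean well above the median
```

This mean-above-median pattern is the signature of a right-skewed distribution, like real-world wealth distributions.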

Describing normal distributions

  • The normal distribution has a characteristic bell shape but not all normal distributions are identical

  • They can vary in terms of where they are centered and how spread out they are

  • Changing \(\mu\) and \(\sigma\) changes the absolute position of points on the plot, but not the relative positions measured in units of \(\sigma\)

    • This will be a really useful property which we’ll make use of later
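This invariance is easiest to see with a tiny helper that measures position in units of \(\sigma\) (a sketch; the name `z_score` is mine):

```python
def z_score(x, mu, sigma):
    """Position of x measured in units of sigma from the centre mu."""
    return (x - mu) / sigma

# A point one sigma above the centre has z = 1 whatever mu and sigma are:
assert z_score(115, mu=100, sigma=15) == z_score(1, mu=0, sigma=1) == 1.0
```

Shifting \(\mu\) or stretching \(\sigma\) moves the absolute values, but the relative position \(z\) is unchanged.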


Describing deviations from the normal distribution

When we looked at the distribution of scores from the second dice game, we saw that it was skewed

  • Skew is a technical term to describe one way in which distributions can deviate from normal

Describing deviations from the normal distribution

  • Another way to deviate from the normal distribution is to have either fatter or skinnier tails

  • The tailedness of a distribution is given by its kurtosis.

  • Kurtosis of a distribution is often specified with reference to the normal distribution. This is excess kurtosis.
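Both quantities can be computed from the moments of the data. Here is a minimal sketch using the simple divide-by-\(n\) moment estimators (the function names are mine; statistical packages use slightly different bias corrections):

```python
from statistics import mean

def skewness(xs):
    """Third standardised moment: 0 for symmetric data."""
    m, n = mean(xs), len(xs)
    s2 = sum((x - m) ** 2 for x in xs) / n
    return (sum((x - m) ** 3 for x in xs) / n) / s2 ** 1.5

def excess_kurtosis(xs):
    """Fourth standardised moment minus 3, so the normal distribution scores 0."""
    m, n = mean(xs), len(xs)
    s2 = sum((x - m) ** 2 for x in xs) / n
    return (sum((x - m) ** 4 for x in xs) / n) / s2 ** 2 - 3
```

Positive skewness indicates a long right tail; positive excess kurtosis indicates fatter tails than the normal, and negative indicates skinnier tails.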

Distributions and samples

  • We’ve seen that whenever we look at the distribution of values that are produced by adding up numbers, we get something that looks like a normal distribution

  • In Lecture 6, we saw that the formula for the sample mean was as shown in Equation 1, below:

\[\bar{x}={\displaystyle\sum^{N}_{i=1}{\frac{x_i}{N}}} \qquad(1)\]

  • So to calculate a sample mean, we just add up a bunch of numbers

  • Let’s say I take lots of samples from a population.

    • And for each sample, I calculate the sample mean.

    • If we had to plot these sample means, then what would the distribution look like?

The sampling distribution of the mean

We can try it out.

  • Let’s say that I have a population with a mean of 100

  • And a standard deviation of 15.

  • From this population I can draw samples of 25 values

  • I’ll do this 100,000 times and plot the results in Figure 5

Figure 5: Sample means from 100,000 samples (sample size = 25)

The standard deviation of the sampling distribution of the mean (the plot in Figure 5) has a special name:

It’s called the standard error of the mean!
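We can run this simulation ourselves (a sketch, using the same population parameters as above; the function name `sample_mean` is mine):

```python
import random
from statistics import mean, stdev

rng = random.Random(0)

def sample_mean(n):
    """Draw n values from a normal population (mean 100, sd 15) and average them."""
    return mean(rng.gauss(100, 15) for _ in range(n))

# Draw many samples of 25 and record each sample mean
means = [sample_mean(25) for _ in range(10_000)]

# stdev(means) is the standard error of the mean: it comes out close to 3
```

The means cluster tightly around 100, and their standard deviation is much smaller than the population standard deviation of 15.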

The Central Limit Theorem

Before we move on to how to calculate the standard error of the mean, I want to reassure you about something

  • You might think that the sampling distribution of the mean in Figure 5 is normally distributed because the population is normally distributed

  • But this is not the case: as your sample size increases, the sampling distribution of the mean will approach a normal distribution

  • And this will happen even if the population is not normally distributed

The Central Limit Theorem

Figure 6: Distribution of the population

Figure 7: Sampling distribution of the mean (50,000 samples)

  • If the sample size is large enough, then the sampling distribution of the mean will approach a normal distribution. This occurs even if the population isn’t normally distributed
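To see the theorem at work, we can repeat the simulation with a clearly non-normal population (a sketch; my choice of a right-skewed exponential population is just for illustration):

```python
import random
from statistics import mean, stdev

rng = random.Random(3)

def sample_mean(n):
    """Average n draws from a right-skewed (exponential) population with mean 1."""
    return mean(rng.expovariate(1.0) for _ in range(n))

# Even though the population is heavily skewed, the distribution of
# sample means (n = 50) is roughly symmetric and centred on 1
means = [sample_mean(50) for _ in range(10_000)]
```

A histogram of `means` would look like Figure 7: approximately normal, despite the population looking like Figure 6.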

The standard error of the mean

  • In Lecture 7 we started talking about the spread of sample means around the population

Figure 8: (a) 10 samples with a standard deviation of 7.6 (b) 10 samples with a standard deviation of 9.21


I showed you Figure 8 (above) where the average deviation of sample means from the population mean was either small (a) or large (b)

The standard error of the mean

I asked you to imagine two scenarios, one that was a feature of the population and one that was a feature of the samples, where the average deviation of sample means from the population mean would be small (or zero).

If you managed, then great! But if not, then here they are:

  1. If the average (squared) deviation in the population is 0 then the average deviation of sample means from the population mean would be 0

    • Because all members of the population would be the same, so all samples would be the same, so all sample means would be the same

    • Conversely, if the average (squared) deviations in the population was larger, then the average deviations of sample means from the population mean would be larger

The standard error of the mean

  2. If the sample size was large (so large as to include the entire population) then the average deviation of sample means from the population mean would be 0

    • Because every sample would be identical to the population, so every sample mean would be identical to the population mean

    • Conversely, if the sample size was smaller, then the average deviations of sample means from the population mean would be larger

Let’s put these two ideas together to try to come up with a formula for the average (squared) deviations of the sample means from the population mean

Our formula will include:

  • \(n\): the sample size

  • \(\sigma^2\): the average (squared) deviations in the population (aka the variance of the population)

  • And we’ll call our result \(\sigma_{\bar{x}}^2\)

The standard error of the mean

The only way to combine \(n\) and \(\sigma^2\) so that:

  1. when \(n\) is very big \(\sigma_{\bar{x}}^2\) will be small (and vice versa) and

  2. when \(\sigma^2\) is very small \(\sigma_{\bar{x}}^2\) will be small (and vice versa)

is the formula in Equation 2, below:

\[\sigma_{\bar{x}}^2=\frac{\sigma^2}{n} \qquad(2)\]

But remember, we don’t actually know the true \(\sigma^2\) (the variance of the population); we only know \(s^2\) (the sample variance, which is our estimate of the variance in the population). So we’ll make a slight change to the formula, as in Equation 3

\[s_{\bar{x}}^2=\frac{s^2}{n} \qquad(3)\]

The standard error of the mean

  • There’s one final step to get to the formula for the standard error of the mean.

  • The formula in Equation 3 is framed in terms of the average (squared) deviations of sample means from the population mean—that is, in terms of variance.

  • But the standard error of the mean is the standard deviation of the sampling distribution

  • The standard deviation is just the square root of the variance, so we just need to take the square root of both sides of Equation 3, to get the equation in Equation 4, below:

\[s_{\bar{x}}=\frac{s}{\sqrt{n}} \qquad(4)\]

More commonly, however, you’ll see \(s_{\bar{x}}\) just written as \(\mathrm{SEM}\) for Standard Error of the Mean

And that is the formula for the standard error of the mean, and where it comes from
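Equation 4 translates directly into code (a sketch; the helper name `sem` and the example numbers are mine):

```python
from math import sqrt
from statistics import stdev  # sample standard deviation (n - 1 denominator)

def sem(sample):
    """Standard error of the mean: s / sqrt(n), as in Equation 4."""
    return stdev(sample) / sqrt(len(sample))

sem([4, 8, 6, 5, 7])  # s ≈ 1.58, n = 5, so SEM ≈ 0.71
```

Note that `statistics.stdev` already uses the \(n-1\) denominator, so it gives \(s\), the sample estimate of the population standard deviation.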

The standard error of the mean

This was, admittedly, a fairly long-winded way to get to what is essentially a very simple formula

  • However, as I have alluded to several times, the standard error of the mean is a fairly misunderstood concept

  • I hope that getting there the long way has helped you to build a better intuition of what the standard error of the mean actually is

I dislike talking about misconceptions because I think it can sometimes create them

But it’s worth talking about one prominent one

Misconception

The SEM tells you how far away the sample mean is (likely) to be from the actual population mean

But it doesn’t tell you anything about the sample mean… at least not the sample mean that you have calculated for your particular sample

The standard error of the mean

The standard error of the mean is just what we’ve defined it as:

The standard deviation of the sampling distribution

So what does this tell you?

  • It tells you how far, on average, sample means (not your sample mean) will be from the population mean

  • Your sample mean might be close to the population mean, or it might be far away from it. But the SEM doesn’t quantify this

Your sample mean is either close to or far from the population mean

  • The SEM tells you something about the consequences of a sampling process

  • Not something about your sample

So why is it even useful? More on that next week!