
5. The Central Limit Theorem


Statement of the Theorem

The central limit theorem and the law of large numbers are the two fundamental theorems of probability. Roughly, the central limit theorem states that the distribution of the sum of a large number of independent, identically distributed variables will be approximately normal, regardless of the underlying distribution. The importance of the central limit theorem is hard to overstate; indeed it is the reason that many statistical procedures work.

As usual, we start with a basic random experiment that has a sample space and a probability measure P. Suppose that X is a real-valued random variable with mean µ and standard deviation d (both of which we assume are finite). Now suppose that we repeat the experiment over and over to form a sequence of independent random variables, each with the same distribution as X (that is, we sample from the distribution of X):

X1, X2, X3, ...

Let Yn = X1 + X2 + ··· + Xn denote the n'th partial sum. Note that Mn = Yn / n is the sample mean of the first n sample variables.

Mathematical Exercise 1. Show that if X has density f then the density of Yn is f^(*n), the n-fold convolution of f with itself.

Simulation Exercise 2. In the dice experiment, select the sum random variable. For each die distribution, start with n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note the shape and location of the density function at each stage. With 20 dice, run the simulation 1000 times with an update frequency of 10. Note the apparent convergence of the empirical density function to the true density function.

In the last exercise, you should have been struck by the fact that the density of the sum becomes increasingly bell-shaped, as the sample size increases, regardless of the shape of the underlying density. Even more remarkably, this phenomenon is not just qualitative: one particular family of density functions (the normal family) describes the limiting distribution of the sum, regardless of the basic distribution we start with.
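The bell-shaped behavior described above is easy to see numerically even without the dice applet. The following sketch (plain Python, standard library only) simulates the partial sum of n = 20 uniform(0, 1) variables many times and checks that the standardized values behave like a standard normal variable:

```python
import random
import statistics

random.seed(42)

def sum_sample(n, trials=10_000):
    """Simulate the partial sum Y_n of n uniform(0, 1) variables, `trials` times."""
    return [sum(random.random() for _ in range(n)) for _ in range(trials)]

# With n = 20, Y_n has mean n/2 = 10 and variance n/12 = 5/3.
ys = sum_sample(20)
print(round(statistics.mean(ys), 2))      # close to 10
print(round(statistics.variance(ys), 2))  # close to 5/3, about 1.67

# If the distribution of Y_n is approximately normal, about 68% of the
# standardized values should fall within one standard deviation of 0.
mu, sd = 10, (20 / 12) ** 0.5
within_one = sum(abs((y - mu) / sd) < 1 for y in ys) / len(ys)
print(round(within_one, 2))  # roughly 0.68
```

A histogram of `ys` would show the bell shape directly; the 68% check is a quick quantitative proxy for it.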

Mathematical Exercise 3. Show (again!) that

  1. E(Yn) = nµ.
  2. var(Yn) = n d^2.

Simulation Exercise 4. In the dice experiment, select the sum random variable. For each die distribution, start with n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note the shape and location of the density function, and the scale on the horizontal and vertical axes, at each stage. With 20 dice, run the simulation 1000 times with an update frequency of 10. Note the apparent convergence of the empirical density function to the true density function.

We will now make the central limit theorem precise. From Exercise 3, we cannot expect Yn itself to have a limiting distribution; the variance of Yn grows to infinity and, unless µ = 0, the mean drifts to either infinity (if µ > 0) or to negative infinity (if µ < 0). Thus, to obtain a limiting distribution that is not degenerate, we need to consider, not Yn itself, but the standard score of Yn. Thus, let

Zn = (Yn - nµ) / (n^(1/2) d).

Mathematical Exercise 5. Show that E(Zn) = 0 and var(Zn) = 1.

Mathematical Exercise 6. In the definition of Zn, divide the numerator and denominator by n to show that Zn is also the standard score of the sample mean Mn.
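The algebra in Exercise 6 is worth recording; dividing the numerator and denominator of Zn by n gives

```latex
Z_n \;=\; \frac{Y_n - n\mu}{n^{1/2}\, d}
    \;=\; \frac{(Y_n - n\mu)/n}{\left(n^{1/2}\, d\right)/n}
    \;=\; \frac{M_n - \mu}{d / n^{1/2}}
```

so Zn is also the standard score of the sample mean Mn, since Mn has mean µ and standard deviation d / n^(1/2).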

The central limit theorem states that the distribution of the standard score Zn converges to the standard normal distribution as n increases to infinity.

Proof of the Central Limit Theorem

We need to show that

Fn(z) converges to F(z) as n converges to infinity for each z in R,

where Fn is the distribution function of Zn and F is the distribution function of the standard normal distribution. However, we will show instead that

Gn(t) converges to exp(t^2 / 2) as n converges to infinity for each t in R,

where Gn is the moment generating function of Zn and the expression on the right is the moment generating function of the standard normal distribution. This is a slightly less general version of the central limit theorem, because it requires that the moment generating function of the underlying distribution be finite on an interval about 0. For a proof of the general version, see, for example, Probability and Measure by Patrick Billingsley.

The following exercises make up the proof of the central limit theorem. Ultimately, the proof hinges on a generalization of a famous limit from calculus.

Mathematical Exercise 7. Suppose that an converges to a as n converges to infinity. Show that

(1 + an / n)^n converges to e^a as n converges to infinity.
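The limit in Exercise 7 is easy to check numerically. In this sketch the particular sequence a_n = 2 + 1/n (so a = 2) is just an illustrative choice:

```python
import math

# Numerical check of the limit (1 + a_n / n)^n -> e^a with a_n = 2 + 1/n, a = 2.
for n in (10, 100, 10_000, 1_000_000):
    a_n = 2 + 1 / n
    print(n, (1 + a_n / n) ** n)

print("limit:", math.exp(2))  # the values above approach e^2
```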

Now let

g(t) = E[exp(t (X - µ) / d)].

Note that g is the moment generating function of the standard score (X - µ) / d of a sample variable Xi, and Gn is the moment generating function of the standard score Zn.

Mathematical Exercise 8. Show that

  1. g(0) = 1
  2. g'(0) = 0
  3. g''(0) = 1

Mathematical Exercise 9. Show that

Zn = (1 / n^(1/2)) [(X1 - µ) / d + (X2 - µ) / d + ··· + (Xn - µ) / d].

Mathematical Exercise 10. Use properties of moment generating functions to show that

Gn(t) = [g(t / n^(1/2))]^n.

Mathematical Exercise 11. Use Taylor's theorem with remainder to show that

g(t / n^(1/2)) = 1 + g''(sn) t^2 / (2n), where |sn| <= |t| / n^(1/2).

Mathematical Exercise 12. In the context of previous exercise, show that

sn converges to 0 and hence g''(sn) converges to 1 as n converges to infinity.

Mathematical Exercise 13. Finally, show that

Gn(t) = [1 + g''(sn) t^2 / (2n)]^n converges to exp(t^2 / 2) as n converges to infinity.
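The convergence of Gn(t) to exp(t^2 / 2) can be checked concretely when sampling from a distribution whose moment generating function is known in closed form. Here the exponential distribution with mean 1 (so µ = d = 1) is used purely as an illustration:

```python
import math

# For X exponential with mean 1, mu = d = 1 and the MGF of the standard
# score X - 1 is g(t) = E[exp(t(X - 1))] = exp(-t) / (1 - t), for t < 1.
def g(t):
    return math.exp(-t) / (1 - t)

t = 0.5
for n in (10, 100, 10_000):
    G_n = g(t / math.sqrt(n)) ** n   # MGF of Z_n, by Exercise 10
    print(n, G_n)

print("limit:", math.exp(t**2 / 2))  # standard normal MGF at t
```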

Normal Approximations

The central limit theorem implies that if the sample size n is "large," then the distribution of the partial sum Yn (or equivalently the sample mean Mn) is approximately normal. This fact is of fundamental importance, because it means that we can approximate the distribution of certain statistics, even if we know very little about the underlying sampling distribution.

Of course, the term "large" is relative. Roughly, the more "abnormal" the basic distribution, the larger n must be for normal approximations to work well. The rule of thumb is that a sample size n of at least 30 will suffice, although for many distributions a smaller n will do.

Mathematical Exercise 14. Suppose that X1, X2, ..., X30 is a random sample of size 30 from the uniform distribution on (0, 1). Let Y = X1 + X2 + ··· + X30. Find normal approximations to

  1. P(13 < Y < 18).
  2. The 90th percentile of Y.
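As a sketch of how such approximations are computed in practice, here is Exercise 14 worked with Python's standard library (`statistics.NormalDist`, available in Python 3.8+):

```python
from statistics import NormalDist

# Y = X1 + ... + X30 with Xi uniform on (0, 1):
# E(Y) = 30 * (1/2) = 15, var(Y) = 30 * (1/12) = 2.5.
approx = NormalDist(mu=15, sigma=2.5 ** 0.5)

p = approx.cdf(18) - approx.cdf(13)   # approximates P(13 < Y < 18)
q90 = approx.inv_cdf(0.90)            # approximates the 90th percentile
print(round(p, 3), round(q90, 2))     # roughly 0.868 and 17.03
```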

Mathematical Exercise 15. Let M denote the sample mean from a random sample of size 50 from the distribution with density function f(x) = 3 x^(-4), x > 1. Find normal approximations to

  1. P(M > 1.6).
  2. The 60th percentile of M.

A slight technical problem arises when the sampling distribution is discrete. In this case, the partial sum also has a discrete distribution, and hence we are approximating a discrete distribution with a continuous one.

Mathematical Exercise 16. Suppose that X takes integer values, and hence so does the partial sum Yn. Show that for any integer k and any h in (0, 1], the event {k - h < Yn < k + h} is equivalent to the event {Yn = k}.

In the context of the previous exercise, different values of h lead to different normal approximations, even though the events are equivalent. The approximation decreases to 0 as h decreases to 0, and increases as h increases to 1. It is customary to split the difference by using h = 0.5 for the normal approximation. This is sometimes called the continuity correction. The continuity correction is extended to other events in the natural way, using the additivity of probability.

Mathematical Exercise 17. Let Y denote the sum of the scores of 20 fair dice. Compute the normal approximation to

P(60 <= Y <= 75).
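A sketch of Exercise 17 with the continuity correction, again using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

# One fair die has mean 7/2 and variance 35/12, so the sum Y of 20 dice
# has mean 20 * 7/2 = 70 and variance 20 * 35/12 = 175/3.
approx = NormalDist(mu=70, sigma=(175 / 3) ** 0.5)

# Continuity correction: {60 <= Y <= 75} becomes the interval (59.5, 75.5).
p = approx.cdf(75.5) - approx.cdf(59.5)
print(round(p, 3))  # roughly 0.68
```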

Simulation Exercise 18. In the dice experiment, set the die distribution to fair, select the sum random variable Y, and set n = 20. Run the simulation 1000 times, updating every 10 runs. Compute the following and compare with the result in the previous exercise:

  1. P(60 <= Y <= 75).
  2. The relative frequency of the event {60 <= Y <= 75}.

Normal Approximation to the Gamma Distribution

If Y has the gamma distribution with shape parameter k and scale parameter b, and if k is a positive integer, then

Y = X1 + X2 + ··· + Xk

where X1, X2, ..., Xk are independent and each has the exponential distribution with scale parameter b. It follows that if k is large (and not necessarily an integer), the gamma distribution can be approximated by the normal distribution with mean kb and variance k b^2.

Simulation Exercise 19. In the gamma experiment, vary k and b and note the shape of the density function. With k = 10 and b = 2, run the experiment 1000 times with an update frequency of 10 and note the apparent convergence of the empirical density function to the true density function.

Mathematical Exercise 20. Suppose that Y has the gamma distribution with shape parameter k = 10 and scale parameter b = 2. Find normal approximations to

  1. P(18 < Y < 23).
  2. The 80th percentile of Y.
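A sketch of Exercise 20 under the normal approximation, with mean kb = 20 and variance k b^2 = 40:

```python
from statistics import NormalDist

# Gamma with shape k = 10 and scale b = 2: mean kb = 20, variance k*b^2 = 40.
approx = NormalDist(mu=20, sigma=40 ** 0.5)

p = approx.cdf(23) - approx.cdf(18)   # approximates P(18 < Y < 23)
q80 = approx.inv_cdf(0.80)            # approximates the 80th percentile
print(round(p, 3), round(q80, 2))     # roughly 0.306 and 25.32
```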

Normal Approximation to the Chi-Square Distribution

The chi-square distribution with n degrees of freedom is the gamma distribution with shape parameter k = n / 2 and scale parameter b = 2. From the central limit theorem, if n is large, the chi-square distribution can be approximated by the normal distribution with mean n and variance 2n.

Simulation Exercise 21. In the chi-square experiment, vary n and note the shape of the density function. With n = 20, run the experiment 1000 times with an update frequency of 10 and note the apparent convergence of the empirical density function to the true density function.

Mathematical Exercise 22. Suppose that Y has the chi-square distribution with n = 20 degrees of freedom. Find normal approximations to

  1. P(18 < Y < 25).
  2. The 75th percentile of Y.

Normal Approximation to the Binomial Distribution

If X has the binomial distribution with parameters n and p, then

X = I1 + I2 + ··· + In

where I1, I2, ..., In are independent indicator variables with P(Ij = 1) = p for each j. It follows that if n is large, the binomial distribution with parameters n and p can be approximated by the normal distribution with mean np and variance np(1 - p). The rule of thumb is that n should be large enough for np >= 5 and n(1 - p) >= 5.

Simulation Exercise 23. In the binomial timeline experiment, vary n and p and note the shape of the density function. With n = 50 and p = 0.3, run the simulation 1000 times, updating every 10 runs. Compute the following:

  1. P(12 <= X <= 16).
  2. The relative frequency of the event {12 <= X <= 16}.

Mathematical Exercise 24. Suppose that X has the binomial distribution with parameters n = 50 and p = 0.3. Compute the normal approximation to P(12 <= X <= 16) and compare with the results of the previous exercise.
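The comparison in Exercise 24 can be sketched directly, computing the exact binomial probability alongside the continuity-corrected normal approximation:

```python
import math
from statistics import NormalDist

n, p = 50, 0.3

# Exact binomial probability P(12 <= X <= 16).
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(12, 17))

# Normal approximation: mean np = 15, variance np(1 - p) = 10.5,
# with the continuity correction {12 <= X <= 16} -> (11.5, 16.5).
approx = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)
estimate = approx.cdf(16.5) - approx.cdf(11.5)

print(round(exact, 3), round(estimate, 3))
```

Since np = 15 and n(1 - p) = 35 both exceed 5, the rule of thumb above suggests the two values should agree closely.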

Normal Approximation to the Poisson Distribution

If Y has the Poisson distribution with mean n, where n is a positive integer, then

Y = X1 + X2 + ··· + Xn

where X1, X2, ..., Xn are independent and each has the Poisson distribution with mean 1. It follows from the central limit theorem that if µ is large (and not necessarily an integer), the Poisson distribution with mean µ can be approximated by the normal distribution with mean µ and variance µ.
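As a check of this approximation (the interval P(15 <= Y <= 25) below is chosen purely for illustration), the exact Poisson probability can be compared with the continuity-corrected normal value:

```python
import math
from statistics import NormalDist

mu = 20

# Exact Poisson probability P(15 <= Y <= 25) for mean 20.
exact = sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(15, 26))

# Normal approximation with mean mu and variance mu, using the
# continuity correction {15 <= Y <= 25} -> (14.5, 25.5).
approx = NormalDist(mu=mu, sigma=mu ** 0.5)
estimate = approx.cdf(25.5) - approx.cdf(14.5)

print(round(exact, 3), round(estimate, 3))
```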

Mathematical Exercise 25. Suppose that Y has the Poisson distribution with mean 20. Find the normal approximation to

P(13 <= Y <= 16).