The central limit theorem and the law of large numbers are the two fundamental theorems of probability. Roughly, the central limit theorem states that the distribution of the sum of a large number of independent, identically distributed variables will be approximately normal, regardless of the underlying distribution. The importance of the central limit theorem is hard to overstate; indeed it is the reason that many statistical procedures work.
As usual, we start with a basic random experiment that has a sample space and a probability measure P. Suppose that X is a real-valued random variable with mean µ and standard deviation d (both of which we assume are finite). Now suppose that we repeat the experiment over and over to form a sequence of independent random variables, each with the same distribution as X (that is, we sample from the distribution of X):
X1, X2, X3, ...
Let \(Y_n = \sum_{i=1}^{n} X_i\) denote the n'th partial sum. Note that \(M_n = Y_n / n\) is the sample mean of the first n sample variables.
1. Show that if X has density f, then the density of \(Y_n\) is \(f^{*n}\), the n-fold convolution of f.
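As a quick numerical illustration (not part of the original exercise; the helper name `convolve_power` is invented for this sketch), the following computes the density of the sum of n fair dice by repeated discrete convolution:

```python
import numpy as np

def convolve_power(pmf, n):
    """Return the n-fold convolution of a discrete pmf with itself."""
    result = np.array([1.0])  # point mass at 0 (empty sum)
    for _ in range(n):
        result = np.convolve(result, pmf)
    return result

die = np.full(6, 1 / 6)            # pmf of a fair die on faces 1..6
density = convolve_power(die, 20)  # index k corresponds to the sum 20 + k
print(density.argmax() + 20)       # mode of the sum, near the mean 70
```

Plotting `density` for increasing n shows the bell shape emerging, which is exactly the phenomenon explored in the next exercise.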
2. In the dice experiment, select the sum random variable. For each die distribution, start with n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note the shape and location of the density function at each stage. With 20 dice, run the simulation 1000 times with an update frequency of 10. Note the apparent convergence of the empirical density function to the true density function.
In the last exercise, you should have been struck by the fact that the density of the sum becomes increasingly bell-shaped as the sample size increases, regardless of the shape of the underlying density. Even more remarkably, this phenomenon is not just qualitative: one particular family of density functions (the normal family) describes the limiting distribution of the sum, regardless of the basic distribution we start with.
3. Show (again!) that \(E(Y_n) = n\mu\) and \(\operatorname{var}(Y_n) = n d^2\).
4. In the dice experiment, select the sum random variable. For each die distribution, start with n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note the shape and location of the density function, and the scale on the horizontal and vertical axes, at each stage. With 20 dice, run the simulation 1000 times with an update frequency of 10. Note the apparent convergence of the empirical density function to the true density function.
We will now make the central limit theorem precise. From Exercise 3, we cannot expect Yn itself to have a limiting distribution; the variance of Yn grows to infinity and, unless µ = 0, the mean drifts to either infinity (if µ > 0) or to negative infinity (if µ < 0). Thus, to obtain a limiting distribution that is not degenerate, we need to consider not Yn itself, but the standard score of Yn. Hence, let
\[ Z_n = \frac{Y_n - n\mu}{\sqrt{n}\, d}. \]
5. Show that \(E(Z_n) = 0\) and \(\operatorname{var}(Z_n) = 1\).
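A short verification, using Exercise 3 and the scaling rules for mean and variance:
\[ E(Z_n) = \frac{E(Y_n) - n\mu}{\sqrt{n}\,d} = \frac{n\mu - n\mu}{\sqrt{n}\,d} = 0, \qquad \operatorname{var}(Z_n) = \frac{\operatorname{var}(Y_n)}{n d^2} = \frac{n d^2}{n d^2} = 1. \]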
6. In the definition of Zn, divide the numerator and denominator by n to show that Zn is also the standard score of the sample mean Mn.
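Explicitly, dividing the numerator and denominator by n gives
\[ Z_n = \frac{Y_n - n\mu}{\sqrt{n}\,d} = \frac{Y_n / n - \mu}{d / \sqrt{n}} = \frac{M_n - \mu}{d / \sqrt{n}}, \]
which is the standard score of \(M_n\), since \(E(M_n) = \mu\) and \(\operatorname{sd}(M_n) = d / \sqrt{n}\).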
The central limit theorem states that the distribution of the standard score Zn converges to the standard normal distribution as n increases to infinity.
We need to show that
\[ F_n(z) \to F(z) \text{ as } n \to \infty \text{ for each } z \in \mathbb{R}, \]
where \(F_n\) is the distribution function of \(Z_n\) and \(F\) is the distribution function of the standard normal distribution. However, we will show instead that
\[ G_n(t) \to \exp(t^2 / 2) \text{ as } n \to \infty \text{ for each } t \in \mathbb{R}, \]
where \(G_n\) is the moment generating function of \(Z_n\) and the expression on the right is the moment generating function of the standard normal distribution. This is a slightly less general version of the central limit theorem, because it requires that the moment generating function of the underlying distribution be finite on an interval about 0. For a proof of the general version, see, for example, Probability and Measure by Patrick Billingsley.
The following exercises make up the proof of the central limit theorem. Ultimately, the proof hinges on a generalization of a famous limit from calculus.
7. Suppose that \(a_n \to a\) as \(n \to \infty\). Show that \((1 + a_n / n)^n \to e^a\) as \(n \to \infty\).
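One standard route (a sketch; note that \(a_n / n \to 0\), so the logarithm is eventually defined) is to take logarithms and use the expansion \(\ln(1 + x) = x + O(x^2)\) as \(x \to 0\):
\[ n \ln\!\left(1 + \frac{a_n}{n}\right) = n\left(\frac{a_n}{n} + O\!\left(\frac{a_n^2}{n^2}\right)\right) = a_n + O\!\left(\frac{a_n^2}{n}\right) \to a, \]
and exponentiating gives the result.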
Now let \(g(t) = E\!\left[\exp\!\left(t\,\frac{X - \mu}{d}\right)\right]\). Note that g is the moment generating function of the standard score of a sample variable Xi, and Gn is the moment generating function of the standard score Zn.
8. Show that \(g(0) = 1\), \(g'(0) = 0\), and \(g''(0) = 1\).
9. Show that \(Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{d}\).
10. Use properties of moment generating functions to show that \(G_n(t) = \left[g\!\left(t / \sqrt{n}\right)\right]^n\).
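In outline: by Exercise 9, \(Z_n\) is a normalized sum of independent, identically distributed terms, so the expectation of the exponential factors:
\[ G_n(t) = E\!\left[\exp\!\left(\frac{t}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{d}\right)\right] = \prod_{i=1}^{n} E\!\left[\exp\!\left(\frac{t}{\sqrt{n}} \cdot \frac{X_i - \mu}{d}\right)\right] = \left[g\!\left(\frac{t}{\sqrt{n}}\right)\right]^n. \]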
11. Use Taylor's theorem with remainder to show that \(g\!\left(t / \sqrt{n}\right) = 1 + g''(s_n)\, \frac{t^2}{2n}\) where \(|s_n| \le |t| / \sqrt{n}\).
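The zeroth- and first-order terms are pinned down by Exercise 8: expanding g about 0 with the Lagrange form of the remainder gives, for some \(s_n\) between 0 and \(t / \sqrt{n}\),
\[ g\!\left(\frac{t}{\sqrt{n}}\right) = g(0) + g'(0)\,\frac{t}{\sqrt{n}} + \frac{g''(s_n)}{2}\left(\frac{t}{\sqrt{n}}\right)^2 = 1 + g''(s_n)\,\frac{t^2}{2n}. \]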
12. In the context of the previous exercise, show that \(s_n \to 0\) and hence \(g''(s_n) \to 1\) as \(n \to \infty\).
13. Finally, show that \(G_n(t) = \left[1 + g''(s_n)\, \frac{t^2}{2n}\right]^n \to \exp(t^2 / 2)\) as \(n \to \infty\).
The central limit theorem implies that if the sample size n is "large," then the distribution of the partial sum Yn (or equivalently the sample mean Mn) is approximately normal. This fact is of fundamental importance, because it means that we can approximate the distribution of certain statistics, even if we know very little about the underlying sampling distribution.
Of course, the term "large" is relative. Roughly, the more "abnormal" the basic distribution, the larger n must be for normal approximations to work well. The rule of thumb is that a sample size n of at least 30 will suffice, although for many distributions a smaller n will do.
14. Suppose that X1, X2, ..., X30 is a random sample of size 30 from the uniform distribution on (0, 1). Let Y = X1 + X2 + ··· + X30. Find normal approximations to:
15. Let M denote the sample mean from a random sample of size 50 from the distribution with density function \(f(x) = 3 x^{-4}\), \(x > 1\). Find normal approximations to:
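Since the specific events for Exercises 14 and 15 are not reproduced above, the following sketch simply builds the two approximating normal distributions and evaluates one placeholder event for each. For Exercise 14, a uniform(0, 1) variable has mean 1/2 and variance 1/12, so Y ≈ N(15, 2.5); for Exercise 15, the density \(3 x^{-4}\) on \((1, \infty)\) has mean 3/2 and variance 3/4, so M ≈ N(1.5, 0.015).

```python
import math
from scipy.stats import norm

# Exercise 14: Y = sum of 30 uniform(0, 1) variables
Y = norm(loc=30 * 0.5, scale=math.sqrt(30 / 12))
print(Y.cdf(16) - Y.cdf(14))   # e.g. P(14 < Y < 16), a placeholder event

# Exercise 15: M = sample mean of 50 draws, mean 3/2, variance 3/4
M = norm(loc=1.5, scale=math.sqrt(0.75 / 50))
print(M.sf(1.6))               # e.g. P(M > 1.6), a placeholder event
```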
A slight technical problem arises when the sampling distribution is discrete. In this case, the partial sum also has a discrete distribution, and hence we are approximating a discrete distribution with a continuous one.
16. Suppose that X takes integer values, and hence so does the partial sum \(Y_n\). Show that for any \(h \in (0, 1]\), the event \(\{k - h < Y_n < k + h\}\) is equivalent to the event \(\{Y_n = k\}\).
In the context of the previous exercise, different values of h lead to different normal approximations, even though the events are equivalent. The smallest approximation would be 0 when h = 0, and the approximations increase as h increases. It is customary to split the difference by using h = 0.5 for the normal approximation. This is sometimes called the continuity correction. The continuity correction is extended to other events in the natural way, using the additivity of probability.
17. Let Y denote the sum of the scores of 20 fair dice. Compute the normal approximation to \(P(60 \le Y \le 75)\).
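A sketch of this computation: a single fair die has mean 7/2 and variance 35/12, so Y is approximately normal with mean 70 and variance 700/12, and the continuity correction replaces the event \(\{60 \le Y \le 75\}\) with \(\{59.5 < Y < 75.5\}\).

```python
import math
from scipy.stats import norm

N = norm(loc=20 * 3.5, scale=math.sqrt(20 * 35 / 12))

# continuity correction: P(60 <= Y <= 75) ~ P(59.5 < N < 75.5)
print(N.cdf(75.5) - N.cdf(59.5))   # approximately 0.68
```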
18. In the dice experiment, set the die distribution to fair, select the sum random variable Y, and set n = 20. Run the simulation 1000 times, updating every 10 runs. Compute the following and compare with the result in the previous exercise:
If Y has the gamma distribution with shape parameter k and scale parameter b, and if k is a positive integer, then \(Y = \sum_{i=1}^{k} X_i\) where X1, X2, ..., Xk are independent and each has the exponential distribution with scale parameter b. It follows that if k is large (and not necessarily an integer), the gamma distribution can be approximated by the normal distribution with mean \(k b\) and variance \(k b^2\).
19. In the gamma experiment, vary k and b and note the shape of the density function. With k = 10 and b = 2, run the experiment 1000 times with an update frequency of 10 and note the apparent convergence of the empirical density function to the true density function.
20. Suppose that Y has the gamma distribution with shape parameter k = 10 and scale parameter b = 2. Find normal approximations to:
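As an illustration (the exercise's specific events are not shown above, so the event below is a placeholder), Y is approximately normal with mean kb = 20 and variance kb² = 40, and the approximation can be checked against the exact gamma probability:

```python
import math
from scipy.stats import norm, gamma

k, b = 10, 2
N = norm(loc=k * b, scale=math.sqrt(k * b**2))   # normal approximation
G = gamma(a=k, scale=b)                          # exact gamma distribution

print(N.cdf(25), G.cdf(25))   # e.g. P(Y <= 25), a placeholder event
```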
The chi-square distribution with n degrees of freedom is the gamma distribution with shape parameter k = n / 2 and scale parameter b = 2. From the central limit theorem, if n is large the chi-square distribution can be approximated by the normal distribution with mean n and variance 2n.
21. In the chi-square experiment, vary n and note the shape of the density function. With n = 20, run the experiment 1000 times with an update frequency of 10 and note the apparent convergence of the empirical density function to the true density function.
22. Suppose that Y has the chi-square distribution with n = 20 degrees of freedom. Find normal approximations to:
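As in the gamma case (again with a placeholder event, since the exercise's events are not shown), the approximating normal has mean 20 and variance 40, and the exact chi-square probability provides a check:

```python
import math
from scipy.stats import norm, chi2

n = 20
N = norm(loc=n, scale=math.sqrt(2 * n))   # normal approximation
C = chi2(df=n)                            # exact chi-square distribution

print(N.sf(25), C.sf(25))   # e.g. P(Y > 25), a placeholder event
```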
If X has the binomial distribution with parameters n and p, then \(X = \sum_{i=1}^{n} I_i\) where I1, I2, ..., In are independent indicator variables with \(P(I_j = 1) = p\) for each j. It follows that if n is large, the binomial distribution with parameters n and p can be approximated by the normal distribution with mean np and variance np(1 − p). The rule of thumb is that n should be large enough for \(np \ge 5\) and \(n(1 - p) \ge 5\).
23. In the binomial timeline experiment, vary n and p and note the shape of the density function. With n = 50 and p = 0.3, run the simulation 1000 times, updating every 10 runs. Compute the following:
24. Suppose that X has the binomial distribution with parameters n = 50 and p = 0.3. Compute the normal approximation to \(P(12 \le X \le 16)\) and compare with the results of the previous exercise.
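A sketch of this computation: X is approximately normal with mean np = 15 and variance np(1 − p) = 10.5, and the continuity correction replaces \(\{12 \le X \le 16\}\) with \(\{11.5 < X < 16.5\}\); the exact binomial value is shown for comparison.

```python
import math
from scipy.stats import norm, binom

n, p = 50, 0.3
N = norm(loc=n * p, scale=math.sqrt(n * p * (1 - p)))

# continuity correction: P(12 <= X <= 16) ~ P(11.5 < N < 16.5)
approx = N.cdf(16.5) - N.cdf(11.5)
exact = binom.cdf(16, n, p) - binom.cdf(11, n, p)
print(approx, exact)   # both approximately 0.54
```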
If Y has the Poisson distribution with mean n, where n is a positive integer, then \(Y = \sum_{i=1}^{n} X_i\) where X1, X2, ..., Xn are independent and each has the Poisson distribution with mean 1. It follows from the central limit theorem that if µ is large (and not necessarily an integer), the Poisson distribution with mean µ can be approximated by the normal distribution with mean µ and variance µ.
25. Suppose that Y has the Poisson distribution with mean 20. Find the normal approximation to \(P(13 \le Y \le 16)\).
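A sketch under the reading above (the bounds in the source are garbled, so the event is an assumption): Y is approximately normal with mean 20 and variance 20, and the continuity correction gives \(P(13 \le Y \le 16) \approx P(12.5 < N < 16.5)\); the exact Poisson value is shown for comparison.

```python
import math
from scipy.stats import norm, poisson

mu = 20
N = norm(loc=mu, scale=math.sqrt(mu))   # normal approximation to Poisson(20)

# continuity correction: P(13 <= Y <= 16) ~ P(12.5 < N < 16.5)
approx = N.cdf(16.5) - N.cdf(12.5)
exact = poisson.cdf(16, mu) - poisson.cdf(12, mu)
print(approx, exact)
```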