3. Estimation of the Variance in the Normal Model


Preliminaries

Suppose that X_1, X_2, ..., X_n is a random sample from the normal distribution with mean µ and variance d^2. In this section we will construct confidence intervals for d^2, one of the most important special cases of interval estimation. A parallel section on Tests for the Variance in the Normal Model is in the chapter on Hypothesis Testing.

As usual, we will construct the confidence intervals by finding pivotal variables for d^2. The construction depends on whether the mean µ is known or unknown; thus µ is a nuisance parameter for the problem of estimating d^2. Finally, recall that the normal family is a location-scale family.

Confidence Intervals for d^2 when µ is Known

Suppose first that µ is known, although this is usually an artificial assumption in applications. Recall that in this case, the natural estimator of d^2 is

W^2 = (1 / n) sum_{i=1}^{n} (X_i - µ)^2.

Recall also that V = n W^2 / d^2 has the chi-square distribution with n degrees of freedom, and hence is a pivotal variable for d^2. Now for k > 0 and p in (0, 1), let v_{k, p} denote the quantile of order p for the chi-square distribution with k degrees of freedom. For selected values of k and p, v_{k, p} can be obtained from the table of the chi-square distribution or from the quantile applet.
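When neither the table nor the applet is at hand, the chi-square quantile can also be computed numerically. The following is a minimal Python sketch that uses only the standard library. The function names chi2_cdf and chi2_quantile are illustrative and are not part of this project. The sketch evaluates the chi-square distribution function with a series for the incomplete gamma function and inverts it by bisection; this should be adequate for moderate degrees of freedom but is not meant as a production-quality routine.

```python
import math


def chi2_cdf(x, k):
    """Distribution function of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a e^(-z) / Gamma(a) * sum_{j >= 0} z^j / (a (a+1) ... (a+j))
    term = 1.0 / a
    total = term
    j = 0
    while abs(term) > 1e-15 * abs(total):
        j += 1
        term *= z / (a + j)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, k, tol=1e-10):
    """Quantile of order p for the chi-square distribution with k degrees of freedom."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, max(1.0, float(k))
    # Expand the bracket until the distribution function at hi reaches p.
    while chi2_cdf(hi, k) < p:
        hi *= 2.0
    # Bisection: the distribution function is increasing in x.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k, p in [(10, 0.025), (10, 0.975)]:
        print(k, p, round(chi2_quantile(p, k), 4))
```

The printed values can be compared against the chi-square table as a check on the sketch.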

Mathematical Exercise 1. Use the pivotal variable V to show that a 1 - r confidence interval, confidence upper bound, and confidence lower bound for d^2 are given as follows:

  1. [n W^2 / v_{n, 1 - r/2}, n W^2 / v_{n, r/2}].
  2. n W^2 / v_{n, r}.
  3. n W^2 / v_{n, 1 - r}.
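As an illustration of how these formulas are used, here is a small Python sketch for the case of known mean. The function names and the data are hypothetical, and the chi-square quantiles are passed in as arguments on the assumption that they have been read from the table (the values shown are the usual tabulated quantiles for 10 degrees of freedom, rounded to four decimals).

```python
def scaled_sum_of_squares(xs, mu):
    """Return n times the estimator with known mean: the sum of (x - mu)^2."""
    return sum((x - mu) ** 2 for x in xs)


def variance_interval_known_mean(xs, mu, v_low, v_high):
    """Two-sided interval for the variance when the mean mu is known.

    v_low is the chi-square quantile of order r/2 and v_high the quantile
    of order 1 - r/2, both with n = len(xs) degrees of freedom.
    """
    total = scaled_sum_of_squares(xs, mu)
    return total / v_high, total / v_low


def variance_bound_known_mean(xs, mu, v):
    """One-sided bound: pass the quantile of order r for the upper bound,
    or the quantile of order 1 - r for the lower bound."""
    return scaled_sum_of_squares(xs, mu) / v


if __name__ == "__main__":
    data = [8, 12, 9, 11, 10, 13, 7, 10, 11, 9]  # hypothetical sample, n = 10
    mu = 10                                      # assumed known mean
    # Tabulated quantiles for 10 degrees of freedom (r = 0.05).
    lower, upper = variance_interval_known_mean(data, mu, 3.2470, 20.4832)
    print("two-sided interval:", lower, upper)
    print("upper bound:", variance_bound_known_mean(data, mu, 3.9403))
    print("lower bound:", variance_bound_known_mean(data, mu, 18.3070))
```

Note that the larger quantile produces the smaller endpoint, because the pivotal variable is a decreasing function of the variance.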

Note that we have used the equal-tail choice in the construction of the two-sided interval, but the interval is not symmetric about the sample variance W^2 (unlike the confidence intervals for µ, which are always symmetric about the sample mean M).

Confidence Intervals for d^2 when µ is Unknown

Consider now the more realistic case in which µ, as well as d^2, is unknown. In this case, the sample variance is

S^2 = [1 / (n - 1)] sum_{i=1}^{n} (X_i - M)^2,

where M = (1 / n) sum_{i=1}^{n} X_i is the sample mean. Recall that

V = (n - 1) S^2 / d^2

has the chi-square distribution with n - 1 degrees of freedom, and hence is a pivotal variable for d^2.

Mathematical Exercise 2. Use the pivotal variable V to show that a 1 - r confidence interval, confidence upper bound, and confidence lower bound for d^2 are given as follows:

  1. [(n - 1) S^2 / v_{n-1, 1 - r/2}, (n - 1) S^2 / v_{n-1, r/2}].
  2. (n - 1) S^2 / v_{n-1, r}.
  3. (n - 1) S^2 / v_{n-1, 1 - r}.
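The same computation for the case of unknown mean can be sketched in Python as follows. The function names and the sample are hypothetical; the quantiles are assumed to come from the chi-square table with n - 1 degrees of freedom (here 9 degrees of freedom, rounded to four decimals). One function works from summary statistics, since that is how data are often reported, and the other from raw data.

```python
import statistics


def variance_interval_from_summary(n, s2, v_low, v_high):
    """Two-sided interval for the variance from the sample size n and the
    sample variance s2, when the mean is unknown.

    v_low and v_high are the chi-square quantiles of orders r/2 and
    1 - r/2 with n - 1 degrees of freedom.
    """
    scaled = (n - 1) * s2
    return scaled / v_high, scaled / v_low


def variance_interval_from_data(xs, v_low, v_high):
    """Same interval, computed from raw data."""
    # statistics.variance uses the divisor n - 1, matching the sample variance.
    return variance_interval_from_summary(len(xs), statistics.variance(xs), v_low, v_high)


if __name__ == "__main__":
    data = [4.9, 5.6, 5.1, 4.7, 5.3, 5.0, 5.4, 4.8, 5.2, 5.0]  # hypothetical, n = 10
    # Tabulated quantiles for 9 degrees of freedom (r = 0.05).
    lower, upper = variance_interval_from_data(data, 2.7004, 19.0228)
    print("sample variance:", statistics.variance(data))
    print("95% interval for the variance:", lower, upper)
```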

Simulation Exercise 3. Use the variance estimation experiment to explore the procedure. Select the normal distribution. Use various parameter values, confidence levels, sample sizes, and interval types. For each configuration, run the experiment 1000 times with an update frequency of 10. As the simulation runs, note that the confidence interval successfully captures the standard deviation if and only if the value of the pivotal variable is between the quantiles. Note the size and location of the confidence intervals and note how well the proportion of successful intervals approximates the theoretical confidence level.
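For readers without access to the applet, the following Python sketch mimics the experiment under stated assumptions: sample size 10, a 95% two-sided interval, unknown mean, and tabulated chi-square quantiles for 9 degrees of freedom. It is a stand-in for illustration, not the applet's own code, and the name simulate_normal is hypothetical. Each run records whether the interval captures the true variance and whether the pivotal variable lies between the quantiles, so the two events can be compared.

```python
import random
import statistics

# Tabulated chi-square quantiles of orders 0.025 and 0.975, 9 degrees of freedom.
V_LOW, V_HIGH = 2.7004, 19.0228


def simulate_normal(runs=1000, n=10, mu=0.0, d=2.0, seed=1):
    """Return (proportion of successful intervals, whether the two events always agreed)."""
    rng = random.Random(seed)
    successes = 0
    always_agree = True
    for _ in range(runs):
        xs = [rng.gauss(mu, d) for _ in range(n)]
        scaled = (n - 1) * statistics.variance(xs)
        pivot = scaled / d ** 2
        lower, upper = scaled / V_HIGH, scaled / V_LOW
        captured = lower <= d ** 2 <= upper
        between_quantiles = V_LOW <= pivot <= V_HIGH
        always_agree = always_agree and (captured == between_quantiles)
        successes += captured
    return successes / runs, always_agree


if __name__ == "__main__":
    proportion, agree = simulate_normal()
    print("proportion of successful intervals:", proportion)
    print("capture event matched pivot event in every run:", agree)
```

Since the square root is increasing, capturing the variance is the same event as capturing the standard deviation with the square roots of the endpoints.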

Non-Normal Distributions

One of the key assumptions that we made was that the underlying distribution is normal. Of course, in real statistical problems, we are unlikely to know much about the underlying distribution, let alone whether or not it is normal. Even when the underlying distribution is not normal, the procedures of this section are still used to construct approximate confidence intervals for the variance. You will see in the simulation exercises below that this procedure is not nearly as robust as that of constructing interval estimates for the mean. Nonetheless, if the distribution is not too far from normal, the procedure usually works well.

Simulation Exercise 4. In the variance estimation experiment, select the gamma distribution. Use various parameter values, confidence levels, sample sizes, and interval types. For each configuration, run the experiment 1000 times with an update frequency of 10. Note the size and location of the confidence intervals and note how well the proportion of successful intervals approximates the theoretical confidence level.

Simulation Exercise 5. In the variance estimation experiment, select the uniform distribution. Use various parameter values, confidence levels, sample sizes, and interval types. For each configuration, run the experiment 1000 times with an update frequency of 10. Note the size and location of the confidence intervals and note how well the proportion of successful intervals approximates the theoretical confidence level.
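A rough version of these non-normal experiments can also be sketched in Python. The code below is an illustration under assumptions, not the applet: it fixes the sample size at 10 and the nominal level at 95%, uses tabulated chi-square quantiles for 9 degrees of freedom, and picks one gamma distribution (shape 1, scale 1) and one uniform distribution (on the interval from 0 to 1) as examples. The name coverage is hypothetical.

```python
import random
import statistics

# Tabulated chi-square quantiles of orders 0.025 and 0.975, 9 degrees of freedom.
V_LOW, V_HIGH = 2.7004, 19.0228


def coverage(sampler, true_variance, runs=1000, n=10):
    """Proportion of runs in which the normal-theory interval captures true_variance.

    sampler is a function of no arguments that returns one observation.
    """
    successes = 0
    for _ in range(runs):
        xs = [sampler() for _ in range(n)]
        scaled = (n - 1) * statistics.variance(xs)
        if scaled / V_HIGH <= true_variance <= scaled / V_LOW:
            successes += 1
    return successes / runs


if __name__ == "__main__":
    rng = random.Random(2024)
    experiments = [
        ("normal, mean 0, variance 1", lambda: rng.gauss(0.0, 1.0), 1.0),
        # Gamma with shape a and scale b has variance a * b^2.
        ("gamma, shape 1, scale 1", lambda: rng.gammavariate(1.0, 1.0), 1.0),
        # Uniform on (0, 1) has variance 1/12.
        ("uniform on (0, 1)", lambda: rng.random(), 1.0 / 12.0),
    ]
    for name, sampler, variance in experiments:
        print(name, coverage(sampler, variance))
```

Comparing the three proportions with the nominal 0.95 gives a sense of how sensitive the procedure is to the shape of the underlying distribution.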

Computational Exercises

Mathematical Exercise 6. For both procedures, show that a 1 - r confidence interval, lower bound, and upper bound for d can be obtained by taking the square roots of the corresponding confidence bounds for d^2.

Mathematical Exercise 7. Suppose that the weight of a bag of potato chips (in grams) is a random variable with unknown mean µ and variance d^2. A sample of 75 bags has mean 250 and standard deviation 10. Construct the 90% confidence interval for d.

Mathematical Exercise 8. At a telemarketing firm, the length of a telephone solicitation (in seconds) is a random variable with unknown mean µ and variance d^2. A sample of 50 calls has mean length 300 and standard deviation 30. Construct the 95% confidence upper bound for d.

Data Analysis Exercise 9. Using Michelson's data, construct the 95% two-sided confidence interval, the confidence upper bound, and the confidence lower bound for the standard deviation of the speed of light in air. Assume that the "true value" is the known mean.

Data Analysis Exercise 10. Using Cavendish's data, construct the 95% confidence interval, confidence upper bound, and confidence lower bound for the standard deviation of the density of the earth. Assume that the "true value" is the known mean.

Data Analysis Exercise 11. Using Short's data, construct the 95% two-sided confidence interval, the confidence upper bound, and the confidence lower bound for the standard deviation of the parallax of the sun. Assume that the "true value" is the known mean.

Data Analysis Exercise 12. For the length of a Setosa iris petal in Fisher's iris data, construct the 90% confidence interval for d.