
5. Tests in the Two-Sample Normal Model


In this section, we will study hypothesis tests in the two-sample normal model and in the bivariate normal model. This section parallels the section on Estimation in the Two Sample Normal Model in the chapter on Interval Estimation.

The Two-Sample Normal Model

Suppose first that $X = (X_1, X_2, \ldots, X_{n_1})$ is a random sample of size $n_1$ from the normal distribution with mean $\mu_1$ and variance $d_1^2$ and that $Y = (Y_1, Y_2, \ldots, Y_{n_2})$ is a random sample of size $n_2$ from the normal distribution with mean $\mu_2$ and variance $d_2^2$. Moreover, suppose that the samples $X$ and $Y$ are independent.

This type of situation arises frequently when the random variables represent a measurement of interest for the objects of the population, and the samples correspond to two different treatments. For example, we might be interested in the blood pressure of a certain population of patients. The X vector records the blood pressures of a control sample, while the Y vector records the blood pressures of the sample receiving a new drug. Similarly, we might be interested in the yield of an acre of corn. The X vector records the yields of a sample receiving one type of fertilizer, while the Y vector records the yields of a sample receiving a different type of fertilizer.

Usually our interest is in a comparison of the parameters (either the mean or variance) for the two sampling distributions. In this section we will construct tests for the ratio of the variances and for the difference of the means. As with previous estimation problems we have studied, the procedures vary depending on what parameters are known or unknown. Also as before, key elements in the construction of the tests are the sample means and sample variances and the special properties of these statistics when the sampling distribution is normal. We will use the following notation:

  1. $M_1 = \frac{1}{n_1} \sum_{i=1}^{n_1} X_i$.
  2. $W_1^2 = \frac{1}{n_1} \sum_{i=1}^{n_1} (X_i - \mu_1)^2$.
  3. $S_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (X_i - M_1)^2$.
  4. $M_2 = \frac{1}{n_2} \sum_{i=1}^{n_2} Y_i$.
  5. $W_2^2 = \frac{1}{n_2} \sum_{i=1}^{n_2} (Y_i - \mu_2)^2$.
  6. $S_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (Y_i - M_2)^2$.
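
As a concrete illustration of this notation, the following Python sketch computes the three statistics for each of two small samples. The data values and the "known" means passed as mu are hypothetical, chosen only so that the arithmetic is easy to check by hand; the function name sample_stats is likewise just an illustrative choice.

```python
from statistics import fmean

def sample_stats(data, mu):
    """Return (M, W^2, S^2) for one sample; mu is the (assumed known) distribution mean."""
    n = len(data)
    m = fmean(data)                                  # sample mean M
    w_sq = sum((v - mu) ** 2 for v in data) / n      # W^2: deviations from the known mean
    s_sq = sum((v - m) ** 2 for v in data) / (n - 1) # S^2: deviations from the sample mean
    return m, w_sq, s_sq

# Hypothetical data, for illustration only.
x = [4.0, 6.0, 5.0, 7.0, 3.0]
y = [8.0, 6.0, 9.0, 7.0]

m1, w1_sq, s1_sq = sample_stats(x, mu=5.0)   # assumes mu_1 = 5
m2, w2_sq, s2_sq = sample_stats(y, mu=7.0)   # assumes mu_2 = 7
print(m1, w1_sq, s1_sq)
print(m2, w2_sq, s2_sq)
```

Note that W^2 divides by the sample size because the true mean is used, while S^2 divides by one less than the sample size because the mean is estimated from the data.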

Tests for $d_2^2 / d_1^2$ when $\mu_1$, $\mu_2$ are Known

We will first consider tests for the ratio of the variances $d_2^2 / d_1^2$ when the means $\mu_1$, $\mu_2$ are known. Usually, of course, this is an unrealistic assumption. Our basic test statistic is

$F_0 = (W_1^2 / W_2^2) a_0$ where $a_0 > 0$.

Mathematical Exercise 1. Show that if $d_2^2 / d_1^2 = a_0$ then $F_0$ has the F distribution with $n_1$ degrees of freedom in the numerator and $n_2$ degrees of freedom in the denominator.

For $p$ in $(0, 1)$ and for $m > 0$ and $n > 0$, let $f_{m, n, p}$ denote the quantile of order $p$ for the F distribution with $m$ degrees of freedom in the numerator and $n$ degrees of freedom in the denominator.

Mathematical Exercise 2. Show that the following tests have significance level r:

  1. Reject $H_0$: $d_2^2 / d_1^2 = a_0$ versus $H_1$: $d_2^2 / d_1^2 \ne a_0$ if and only if $F_0 > f_{n_1, n_2, 1 - r/2}$ or $F_0 < f_{n_1, n_2, r/2}$.
  2. Reject $H_0$: $d_2^2 / d_1^2 \le a_0$ versus $H_1$: $d_2^2 / d_1^2 > a_0$ if and only if $F_0 < f_{n_1, n_2, r}$.
  3. Reject $H_0$: $d_2^2 / d_1^2 \ge a_0$ versus $H_1$: $d_2^2 / d_1^2 < a_0$ if and only if $F_0 > f_{n_1, n_2, 1 - r}$.

Mathematical Exercise 3. For each of the tests in Exercise 2, show that we fail to reject $H_0$ at significance level $r$ if and only if $a_0$ is in the corresponding $1 - r$ level confidence interval.
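
The duality in Exercise 3 can be checked numerically for the two-sided test. The sketch below is not a proof, only an illustration under stated assumptions: it takes the two-sided confidence interval to be the one obtained by inverting the same pivot variable, namely from $(W_2^2 / W_1^2) f_{n_1, n_2, r/2}$ to $(W_2^2 / W_1^2) f_{n_1, n_2, 1 - r/2}$, it uses SciPy's `scipy.stats.f.ppf` for the F quantiles, and the numerical values of the statistics are hypothetical.

```python
from scipy.stats import f

def ratio_test_known_means(w1_sq, w2_sq, n1, n2, a0, r):
    """Two-sided test of H0: d2^2 / d1^2 = a0 with known means; returns (F0, reject)."""
    f0 = (w1_sq / w2_sq) * a0
    lower = f.ppf(r / 2, n1, n2)        # quantile f_{n1, n2, r/2}
    upper = f.ppf(1 - r / 2, n1, n2)    # quantile f_{n1, n2, 1 - r/2}
    return f0, bool(f0 < lower or f0 > upper)

def ratio_interval_known_means(w1_sq, w2_sq, n1, n2, r):
    """Two-sided 1 - r interval for d2^2 / d1^2 (assumed form: pivot inversion)."""
    scale = w2_sq / w1_sq
    return scale * f.ppf(r / 2, n1, n2), scale * f.ppf(1 - r / 2, n1, n2)

# Hypothetical values of the statistics, for illustration only.
w1_sq, w2_sq, n1, n2, r = 2.0, 1.5, 5, 4, 0.10
low, high = ratio_interval_known_means(w1_sq, w2_sq, n1, n2, r)

for a0 in (0.05, 1.0, 10.0):
    f0, reject = ratio_test_known_means(w1_sq, w2_sq, n1, n2, a0, r)
    inside = low < a0 < high
    # We fail to reject exactly when a0 lies inside the interval.
    print(a0, f0, reject, inside)
```

For every value of a0 tried, the reject flag is the negation of the inside flag, which is what the exercise asks you to prove in general.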

Tests for $d_2^2 / d_1^2$ when $\mu_1$, $\mu_2$ are Unknown

Next we will consider tests for the ratio of the variances $d_2^2 / d_1^2$ under the more realistic assumption that the means $\mu_1$, $\mu_2$ are unknown. In this case, our basic test statistic is

$F_0 = (S_1^2 / S_2^2) a_0$ where $a_0 > 0$.

Mathematical Exercise 4. Show that if $d_2^2 / d_1^2 = a_0$ then $F_0$ has the F distribution with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator.

Mathematical Exercise 5. Show that the following tests have significance level r:

  1. Reject $H_0$: $d_2^2 / d_1^2 = a_0$ versus $H_1$: $d_2^2 / d_1^2 \ne a_0$ if and only if $F_0 > f_{n_1 - 1, n_2 - 1, 1 - r/2}$ or $F_0 < f_{n_1 - 1, n_2 - 1, r/2}$.
  2. Reject $H_0$: $d_2^2 / d_1^2 \le a_0$ versus $H_1$: $d_2^2 / d_1^2 > a_0$ if and only if $F_0 < f_{n_1 - 1, n_2 - 1, r}$.
  3. Reject $H_0$: $d_2^2 / d_1^2 \ge a_0$ versus $H_1$: $d_2^2 / d_1^2 < a_0$ if and only if $F_0 > f_{n_1 - 1, n_2 - 1, 1 - r}$.

Mathematical Exercise 6. For each of the tests in Exercise 5, show that we fail to reject $H_0$ at significance level $r$ if and only if $a_0$ is in the corresponding $1 - r$ level confidence interval.
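
The three tests of Exercise 5 can be carried out from summary statistics alone. The following is a minimal sketch, assuming SciPy is available for the F quantiles (`scipy.stats.f.ppf`); the function name, the dictionary keys, and the summary statistics in the example call are all hypothetical.

```python
from scipy.stats import f

def variance_ratio_tests(s1_sq, s2_sq, n1, n2, a0=1.0, r=0.10):
    """Return F0 and the reject decisions for the three tests with unknown means."""
    f0 = (s1_sq / s2_sq) * a0

    def q(p):
        # Quantile of order p for the F distribution with (n1 - 1, n2 - 1) degrees of freedom.
        return f.ppf(p, n1 - 1, n2 - 1)

    return {
        "F0": f0,
        "two_sided": bool(f0 > q(1 - r / 2) or f0 < q(r / 2)),  # H1: ratio != a0
        "ratio_greater": bool(f0 < q(r)),                       # H1: ratio > a0
        "ratio_less": bool(f0 > q(1 - r)),                      # H1: ratio < a0
    }

# Hypothetical summary statistics: S1^2 = 4, S2^2 = 25, n1 = 21, n2 = 16.
result = variance_ratio_tests(s1_sq=4.0, s2_sq=25.0, n1=21, n2=16)
print(result)
```

Because the statistic has the first sample's variance in the numerator, small values of F0 are the evidence that the ratio of the second variance to the first exceeds a0. That is why the test against "ratio greater" rejects in the lower tail.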

Tests for $\mu_2 - \mu_1$ when $d_1$, $d_2$ are Known

Next we will consider tests for the difference of the means $\mu_2 - \mu_1$ under the assumption that the standard deviations $d_1$ and $d_2$ are known. Of course, this is usually an unrealistic assumption. Our test statistic is

$Z_0 = [(M_2 - M_1) - a_0] / (d_1^2 / n_1 + d_2^2 / n_2)^{1/2}$.

Mathematical Exercise 7. Show that $Z_0$ has the normal distribution with mean $[(\mu_2 - \mu_1) - a_0] / (d_1^2 / n_1 + d_2^2 / n_2)^{1/2}$ and variance 1. In particular, $Z_0$ has the standard normal distribution when $\mu_2 - \mu_1 = a_0$.

As usual, let $z_p$ denote the quantile of order $p$ for the standard normal distribution. For selected values of $p$, values of $z_p$ can be obtained from the quantile applet.

Mathematical Exercise 8. Show that the following tests have significance level r:

  1. Reject $H_0$: $\mu_2 - \mu_1 = a_0$ versus $H_1$: $\mu_2 - \mu_1 \ne a_0$ if and only if $Z_0 > z_{1 - r/2}$ or $Z_0 < -z_{1 - r/2}$.
  2. Reject $H_0$: $\mu_2 - \mu_1 \le a_0$ versus $H_1$: $\mu_2 - \mu_1 > a_0$ if and only if $Z_0 > z_{1 - r}$.
  3. Reject $H_0$: $\mu_2 - \mu_1 \ge a_0$ versus $H_1$: $\mu_2 - \mu_1 < a_0$ if and only if $Z_0 < -z_{1 - r}$.

Mathematical Exercise 9. For each of the tests in Exercise 8, show that we fail to reject $H_0$ at significance level $r$ if and only if $a_0$ is in the corresponding $1 - r$ level confidence interval.
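
The tests of Exercise 8 need only the standard normal quantiles, which the Python standard library provides through `statistics.NormalDist`. The sketch below is illustrative only: the function name, dictionary keys, and all numerical inputs (sample means, known standard deviations, sample sizes) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_tests(m1, m2, d1, d2, n1, n2, a0=0.0, r=0.10):
    """Return Z0 and the reject decisions for the three tests with known d1, d2."""
    z0 = ((m2 - m1) - a0) / sqrt(d1**2 / n1 + d2**2 / n2)
    z = NormalDist().inv_cdf            # z(p) is the standard normal quantile of order p
    return {
        "Z0": z0,
        "two_sided": abs(z0) > z(1 - r / 2),     # H1: mu2 - mu1 != a0
        "difference_greater": z0 > z(1 - r),     # H1: mu2 - mu1 > a0
        "difference_less": z0 < -z(1 - r),       # H1: mu2 - mu1 < a0
    }

# Hypothetical inputs: M1 = 10, M2 = 11, d1 = 2, d2 = 3, n1 = 16, n2 = 36.
result = z_tests(m1=10.0, m2=11.0, d1=2.0, d2=3.0, n1=16, n2=36)
print(result)
```

With these inputs the one-sided test against "difference greater" rejects at the 10% level while the two-sided test does not, since the two-sided test splits the significance level between both tails.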

Tests for $\mu_2 - \mu_1$ when $d_1$, $d_2$ are Unknown

Finally, we will consider tests for the difference of the means under the more realistic assumption that the standard deviations $d_1$ and $d_2$ are unknown, but equal:

$d_1 = d_2 = d$.

This assumption is reasonable if there is an inherent variability in the measurement variables that does not change even when different treatments are applied to the objects in the population. Recall that the pooled estimate of the common variance $d^2$ is

$S^2 = [(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2] / (n_1 + n_2 - 2)$.

Our basic test statistic is

$T_0 = [(M_2 - M_1) - a_0] / [S (1 / n_1 + 1 / n_2)^{1/2}]$.

Mathematical Exercise 10. Show that if $\mu_2 - \mu_1 = a_0$, then $T_0$ has the t distribution with $n = n_1 + n_2 - 2$ degrees of freedom.

As usual, for $k > 0$ and $p$ in $(0, 1)$ let $t_{k, p}$ denote the quantile of order $p$ for the t distribution with $k$ degrees of freedom. For selected values of $k$ and $p$, values of $t_{k, p}$ are given in the quantile applet.

Mathematical Exercise 11. Show that the following tests have significance level r:

  1. Reject $H_0$: $\mu_2 - \mu_1 = a_0$ versus $H_1$: $\mu_2 - \mu_1 \ne a_0$ if and only if $T_0 > t_{n, 1 - r/2}$ or $T_0 < -t_{n, 1 - r/2}$.
  2. Reject $H_0$: $\mu_2 - \mu_1 \le a_0$ versus $H_1$: $\mu_2 - \mu_1 > a_0$ if and only if $T_0 > t_{n, 1 - r}$.
  3. Reject $H_0$: $\mu_2 - \mu_1 \ge a_0$ versus $H_1$: $\mu_2 - \mu_1 < a_0$ if and only if $T_0 < -t_{n, 1 - r}$.

Mathematical Exercise 12. For each of the tests in Exercise 11, show that we fail to reject $H_0$ at significance level $r$ if and only if $a_0$ is in the corresponding $1 - r$ level confidence interval.
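
The pooled variance, the test statistic, and the three decisions of Exercise 11 can be computed from summary statistics as follows. This is a sketch under the equal-variance assumption stated above, using SciPy's `scipy.stats.t.ppf` for the t quantiles; the function name, dictionary keys, and the summary statistics in the example call are hypothetical.

```python
from math import sqrt
from scipy.stats import t

def pooled_t_tests(m1, s1, n1, m2, s2, n2, a0=0.0, r=0.10):
    """Return S^2, T0, and the reject decisions for the three pooled t tests."""
    n = n1 + n2 - 2                                       # degrees of freedom
    s_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / n      # pooled variance estimate S^2
    t0 = ((m2 - m1) - a0) / (sqrt(s_sq) * sqrt(1 / n1 + 1 / n2))
    return {
        "S2": s_sq,
        "T0": t0,
        "two_sided": bool(abs(t0) > t.ppf(1 - r / 2, n)),   # H1: mu2 - mu1 != a0
        "difference_greater": bool(t0 > t.ppf(1 - r, n)),   # H1: mu2 - mu1 > a0
        "difference_less": bool(t0 < -t.ppf(1 - r, n)),     # H1: mu2 - mu1 < a0
    }

# Hypothetical summary statistics: two samples of size 10.
result = pooled_t_tests(m1=20.0, s1=3.0, n1=10, m2=24.0, s2=4.0, n2=10)
print(result)
```

The function takes sample standard deviations rather than variances, so the squares are formed inside; with equal sample sizes the pooled variance is simply the average of the two sample variances.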

Tests in the Bivariate Normal Model

In this subsection, we consider a model that is superficially similar to the two-sample normal model, but is actually much simpler. Suppose that

$(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$

is a random sample of size $n$ from the bivariate normal distribution with

$E(X) = \mu_1$, $E(Y) = \mu_2$, $\text{var}(X) = d_1^2$, $\text{var}(Y) = d_2^2$, $\text{cov}(X, Y) = d_{1,2}$.

Thus, instead of a pair of samples, we have a sample of pairs. This type of model frequently arises in before and after experiments, in which a measurement of interest is recorded for a sample of n objects from the population, both before and after a treatment. For example, we could record the blood pressure of a sample of n patients, before and after the administration of a certain drug.

Mathematical Exercise 13. Show that $Y_1 - X_1, Y_2 - X_2, \ldots, Y_n - X_n$ is a random sample of size $n$ from the normal distribution with mean $\mu_2 - \mu_1$ and variance $d^2 = d_1^2 + d_2^2 - 2 d_{1,2}$.

Thus, the differences fit the one-sample normal model that we have already studied. In particular, for tests of $\mu_2 - \mu_1$, see the section on Tests of the Mean in the Normal Model and for tests of $d^2$, see the section on Tests of the Variance in the Normal Model.
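
The reduction to the one-sample model can be sketched in Python as follows. The before and after measurements are hypothetical, and the test applied to the differences is assumed to be the usual one-sample t test (statistic equal to the sample mean minus a0, divided by the sample standard deviation over the square root of n, compared with a t quantile with n - 1 degrees of freedom). The sketch also checks the sample analogue of the variance formula in Exercise 13, which is handy when only summary statistics are reported.

```python
from math import sqrt
from statistics import fmean, variance
from scipy.stats import t

# Hypothetical before (x) and after (y) measurements on the same six subjects.
x = [120.0, 132.0, 128.0, 141.0, 135.0, 126.0]
y = [115.0, 130.0, 121.0, 138.0, 136.0, 119.0]
n = len(x)

# The differences form a single sample, so one-sample methods apply.
diffs = [yi - xi for xi, yi in zip(x, y)]
m, s_sq = fmean(diffs), variance(diffs)

# Sample analogue of d^2 = d1^2 + d2^2 - 2 d12, using the sample covariance.
mx, my = fmean(x), fmean(y)
s12 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
identity_gap = s_sq - (variance(x) + variance(y) - 2 * s12)   # zero up to rounding

# Two-sided one-sample t test of H0: mu2 - mu1 = a0 at significance level r.
a0, r = 0.0, 0.10
t0 = (m - a0) / sqrt(s_sq / n)
reject_two_sided = bool(abs(t0) > t.ppf(1 - r / 2, n - 1))
print(m, s_sq, identity_gap, t0, reject_two_sided)
```

Only the differences enter the test, so the degrees of freedom are n - 1 rather than the 2n - 2 that the two-sample pooled test would use for samples of the same size.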

Computational Exercises

Mathematical Exercise 14. A new drug is being developed to reduce a certain blood chemical. A sample of 36 patients is given a placebo while a sample of 49 patients is given the drug. The statistics (in mg) are $m_1 = 87$, $s_1 = 4$, $m_2 = 63$, $s_2 = 6$. Test the following at the 10% significance level:

  1. $H_0$: $d_1 = d_2$ versus $H_1$: $d_1 \ne d_2$.
  2. $H_0$: $\mu_1 \le \mu_2$ versus $H_1$: $\mu_1 > \mu_2$ (assuming $d_1 = d_2$).
  3. Based on part 2, is the drug effective?

Mathematical Exercise 15. A company claims that an herbal supplement improves intelligence. A sample of 25 persons is given a standard IQ test before and after taking the supplement. The before and after statistics are $m_1 = 105$, $s_1 = 13$, $m_2 = 110$, $s_2 = 17$, $s_{1,2} = 190$ (the sample covariance). At the 10% significance level, do you believe the company's claim?

Data Analysis Exercise 16. In Fisher's iris data, consider the petal length variable for the samples of Versicolor and Virginica irises. Test the following at the 10% significance level:

  1. $H_0$: $d_1 = d_2$ versus $H_1$: $d_1 \ne d_2$.
  2. $H_0$: $\mu_1 \le \mu_2$ versus $H_1$: $\mu_1 > \mu_2$ (assuming $d_1 = d_2$).

Mathematical Exercise 17. A plant has two machines that produce a circular rod whose diameter (in cm) is critical. A sample of 100 rods from the first machine has mean 10.3 and standard deviation 1.2. A sample of 100 rods from the second machine has mean 9.8 and standard deviation 1.6. Test the following:

  1. $H_0$: $d_1 = d_2$ versus $H_1$: $d_1 \ne d_2$.
  2. $H_0$: $\mu_1 = \mu_2$ versus $H_1$: $\mu_1 \ne \mu_2$ (assuming $d_1 = d_2$).