Suppose that we have a random experiment with a random variable X of interest. Assume additionally that X is discrete with density function f on a finite set S. We repeat the experiment n times to generate a random sample of size n from the distribution of X:
X1, X2, ..., Xn.
Recall that these are independent variables, each with the distribution of X.
In this section, we assume that the distribution of X is unknown. For a given density function f0, we will test the hypotheses
H0: f = f0 versus H1: f ≠ f0.
The test that we will construct is known as the goodness of fit test for the conjectured density f0. As usual, our challenge in developing the test is to find a good test statistic--one that gives us information about the hypotheses and whose distribution, under the null hypothesis, is known, at least approximately.
Suppose that S = {x1, x2, ..., xk}. To simplify the notation, let
pj = f0(xj) for j = 1, 2, ..., k.
Now let Nj = #{i in {1, 2, ..., n}: Xi = xj} for j = 1, 2, ..., k.
1. Show that under the null hypothesis, Nj has the binomial distribution with parameters n and pj, so that in particular E(Nj) = npj.
Exercise 1 indicates how we might begin to construct our test: for each j we can compare the observed frequency of xj (namely Nj) with the expected frequency of value xj (namely npj), under the null hypothesis. Specifically, our test statistic will be
V = (N1 - np1)^2 / (np1) + (N2 - np2)^2 / (np2) + ··· + (Nk - npk)^2 / (npk).
Note that the test statistic is based on the squared errors (the differences between the expected frequencies and the observed frequencies). The reason that the squared errors are scaled as they are is the following crucial fact, which we will accept without proof: Under the null hypothesis, as n increases to infinity, the distribution of V converges to the chi-square distribution with k - 1 degrees of freedom.
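For concreteness, here is a minimal computational sketch (Python with numpy; not part of the original text) that computes V for a simulated sample, assuming a fair six-sided die as the conjectured density f0 and a sample of size n = 50. The variable names are illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)
    k, n = 6, 50
    p0 = np.full(k, 1 / 6)                                   # conjectured density f0: pj = 1/6
    sample = rng.choice(np.arange(1, k + 1), size=n, p=p0)   # X1, ..., Xn

    # Observed frequencies Nj and expected frequencies n pj under the null hypothesis
    N = np.array([(sample == x).sum() for x in range(1, k + 1)])
    expected = n * p0

    V = ((N - expected) ** 2 / expected).sum()
    print(V)   # approximately chi-square with k - 1 = 5 degrees of freedom when H0 holds

Repeating this computation many times and plotting a histogram of the resulting values of V gives an empirical picture of the convergence to the chi-square distribution with k - 1 degrees of freedom.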
As usual, for m > 0 and r in (0, 1), we will let vm, r denote the quantile of order r for the chi-square distribution with m degrees of freedom. For selected values of m and r, vm, r can be obtained from the table of the chi-square distribution.
2. Show that the following test has approximate significance level α: Reject H0: f = f0 versus H1: f ≠ f0 if and only if V > vk - 1, 1 - α.
Again, the test is approximate and works best when n is large. Just how large n needs to be depends on the pj; the rule of thumb is that the test will work well if each expected frequency npj is at least 1 and at least 80% of them are at least 5.
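The test in Exercise 2 can be sketched in code as follows (Python with scipy; the helper name goodness_of_fit_test and the fair-die example are ours, not part of the text). The critical value vk - 1, 1 - α is the chi-square quantile of order 1 - α with k - 1 degrees of freedom.

    import numpy as np
    from scipy.stats import chi2

    def goodness_of_fit_test(sample, values, p0, alpha=0.1):
        """Return (V, critical value, reject H0?) for the conjectured density p0."""
        n, k = len(sample), len(values)
        N = np.array([(sample == x).sum() for x in values])  # observed frequencies Nj
        expected = n * np.asarray(p0)                        # expected frequencies n pj
        V = ((N - expected) ** 2 / expected).sum()
        v_crit = chi2.ppf(1 - alpha, k - 1)                  # quantile v_{k-1, 1-alpha}
        return V, v_crit, V > v_crit

    rng = np.random.default_rng(1)
    values = np.arange(1, 7)
    p_fair = np.full(6, 1 / 6)
    sample = rng.choice(values, size=50, p=p_fair)
    print(goodness_of_fit_test(sample, values, p_fair, alpha=0.1))

The same computation can also be checked against scipy.stats.chisquare, which returns V together with the corresponding p-value.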
Let I be the indicator variable that takes the value 1 when the null hypothesis is rejected and the value 0 when it is not.
3. Suppose that the sampling and test distributions are the same. Explain why P(I = 1) is (approximately) the significance level of the test.
4. Suppose that the sampling and test distributions are different. Explain why P(I = 1) is the power of the test at the sampling distribution.
In the simulation exercises below, you will be able to judge the quality of the test empirically.
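The following sketch mimics this empirical approach in code (Python with numpy and scipy; the helper name rejects is illustrative, and the ace-six flat probabilities, 1/4 for faces 1 and 6 and 1/8 for the others, are our assumption about that distribution). When the sampling and test distributions agree, the average of the indicator I over the runs estimates the significance level; when they differ, it estimates the power.

    import numpy as np
    from scipy.stats import chi2

    def rejects(sampling_p, test_p, n=50, alpha=0.1, rng=None):
        """One run: return the indicator I (1 if H0: f = test_p is rejected, else 0)."""
        if rng is None:
            rng = np.random.default_rng()
        k = len(test_p)
        N = rng.multinomial(n, sampling_p)               # observed frequencies N1, ..., Nk
        expected = n * np.asarray(test_p)
        V = ((N - expected) ** 2 / expected).sum()
        return int(V > chi2.ppf(1 - alpha, k - 1))

    rng = np.random.default_rng(2)
    fair = np.full(6, 1 / 6)
    ace_six = np.array([1/4, 1/8, 1/8, 1/8, 1/8, 1/4])
    runs = 1000

    # Sampling distribution equal to the test distribution: the mean of I estimates the significance level.
    print(np.mean([rejects(fair, fair, rng=rng) for _ in range(runs)]))

    # Sampling distribution different from the test distribution: the mean of I estimates the power.
    print(np.mean([rejects(ace_six, fair, rng=rng) for _ in range(runs)]))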
5. In the
chi-square dice experiment, set the sampling distribution to fair, the sample size to 50,
and the significance level to 0.1. Set the test distribution as indicated below and in
each case, run the simulation 1000 times. In case (a), give the empirical estimate of the
significance level of the test and compare with 0.1. In the other cases, give the
empirical estimate of the power of the test. Rank the distributions in (b)-(d) in
increasing order of apparent power. Do your results seem reasonable?
6. In the chi-square dice experiment, set the
sampling distribution to ace-six flats, the sample size to 50, and the significance level
to 0.1. Set the test distribution as indicated below and in each case, run the simulation
1000 times. In case (a), give the empirical estimate of the significance level of the test
and compare with 0.1. In the other cases, give the empirical estimate of the power of the
test. Rank the distributions in (b)-(d) in increasing order of apparent power. Do your
results seem reasonable?
7. In the chi-square dice experiment, set the
sampling distribution to the symmetric, unimodal distribution, the sample size to 50, and
the significance level to 0.1. Set the test distribution as indicated below and in each
case, run the simulation 1000 times. In case (a), give the empirical estimate of the
significance level of the test and compare with 0.1. In the other cases, give the
empirical estimate of the power of the test. Rank the distributions in (b)-(d) in
increasing order of apparent power. Do your results seem reasonable?
8. In the chi-square dice experiment, set the
sampling distribution to the distribution skewed right, the sample size to 50, and the
significance level to 0.1. Set the test distribution as indicated below and in each case,
run the simulation 1000 times. In case (a), give the empirical estimate of the
significance level of the test and compare with 0.1. In the other cases, give the
empirical estimate of the power of the test. Rank the distributions in (b)-(d) in
increasing order of apparent power. Do your results seem reasonable?
9. Suppose
that D1 and D2 are different distributions. Is the
power of the test with sampling distribution D1 and test distribution D2
the same as the power of the test with sampling distribution D2 and
test distribution D1? Make a conjecture based on your results in
Exercises 5-8.
10. In the chi-square dice experiment, set the
sampling and test distributions to fair and the significance level to 0.05. Run the
experiment 1000 times for each of the following sample sizes. In each case, give the
empirical estimate of the significance level and compare with 0.05.
11. In the chi-square dice experiment, set the
sampling distribution to fair, the test distribution to ace-six flats, and the
significance level to 0.05. Run the experiment 1000 times for each of the following sample
sizes. In each case, give the empirical estimate of the power of the test. Do the powers
seem to be converging?
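Exercise 11 can also be approximated in code with a sketch like the following (Python; the sample sizes and the ace-six flat probabilities, 1/4 for faces 1 and 6 and 1/8 for the others, are illustrative assumptions). It estimates the power for a sequence of sample sizes with the sampling distribution fair and the test distribution ace-six flats.

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(3)
    fair = np.full(6, 1 / 6)
    ace_six = np.array([1/4, 1/8, 1/8, 1/8, 1/8, 1/4])
    alpha, runs = 0.05, 1000
    crit = chi2.ppf(1 - alpha, len(ace_six) - 1)     # critical value with 5 degrees of freedom

    for n in (10, 20, 50, 100, 200):                 # illustrative sample sizes
        reject = 0
        for _ in range(runs):
            N = rng.multinomial(n, fair)             # sample from the fair die
            V = ((N - n * ace_six) ** 2 / (n * ace_six)).sum()
            reject += V > crit
        print(n, reject / runs)                      # empirical power at sample size n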
For a descriptive goodness of fit test, see the section on Probability Plots in the chapter on Random Samples.