Introduction

The Basic Statistical Model

As usual, our starting point is a random experiment with a sample space and a probability measure P. In the basic statistical model, we have an observable random variable X taking values in a set S. In general, X can have quite a complicated structure. For example, if the experiment is to sample n objects from a population and record various measurements of interest, then

X = (X₁, X₂, ..., X_n)

where X_i is the vector of measurements for the ith object. The most important special case occurs when X₁, X₂, ..., X_n are independent and identically distributed. In this case, we have a random sample of size n from the common distribution.

Suppose also that the distribution of X depends on a parameter a taking values in a parameter space A. Usually, a is a vector of real parameters, so that A is a subset of R^k for some k and

a = (a₁, a₂, ..., a_k).

Confidence Sets

A confidence set is a subset A(X) of the parameter space A that depends only on the data variable X and not on any unknown parameters. Thus, in a sense, a confidence set is a set-valued statistic. A confidence set is an estimate of a in the sense that we hope a is in A(X) with high probability. In particular, the confidence level is the smallest probability that a is in A(X):

min{P[a ∈ A(X) | a]: a ∈ A}.

Usually, we try to construct a confidence set for a with a prescribed confidence level 1 - r, where 0 < r < 1. Typical confidence levels are 0.9, 0.95, and 0.99. Sometimes the best we can do is to construct a confidence set whose confidence level is at least 1 - r; this is called a conservative 1 - r confidence set for a.

Note that when we run the experiment and observe the data x, the computed confidence set is A(x). The true value of the parameter a is either in this set, or is not, and we will usually never know. However, by the law of large numbers, if we were to repeat the confidence experiment over and over, the proportion of sets that contain a would converge to

P[a ∈ A(X) | a] ≥ 1 - r.

This is the precise meaning of the term confidence.
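
The long-run interpretation can be checked by simulation. The following is a minimal sketch, not taken from the text, under the assumption that the data consist of a single observation X from the exponential distribution with scale parameter b, so that X / b has the standard exponential distribution whatever the value of b. The function names and the values of true_b, r, and reps are illustrative only.

```python
import math
import random


def exponential_interval(x, r):
    """Equal-tailed 1 - r interval for the scale b from one observation x.

    Assumes X / b is standard exponential, so that
    P[-ln(1 - r/2) <= X / b <= -ln(r/2)] = 1 - r.
    """
    lower = x / (-math.log(r / 2))
    upper = x / (-math.log(1 - r / 2))
    return lower, upper


def coverage_proportion(true_b, r, reps, seed=0):
    """Fraction of simulated intervals that contain the true scale."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = rng.expovariate(1 / true_b)  # expovariate takes the rate 1 / b
        lower, upper = exponential_interval(x, r)
        if lower <= true_b <= upper:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # The printed proportion should be close to 1 - r = 0.95.
    print(coverage_proportion(true_b=3.0, r=0.05, reps=100_000))
```

Any single interval either contains true_b or does not; only the proportion over many repetitions is tied to the confidence level.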

Next, note that the quality of a confidence set, as an estimator of a, is based on two factors: the confidence level and the size of the set; a good estimate has small size (and hence gives tight bounds on a) and large confidence. However, for a given X, there is usually a tradeoff between confidence level and size: increasing the confidence level comes only at the expense of increasing the size of the set. Finally, note that in general, the size of the set is a random variable, although in some special cases it is a constant.

In many cases, we are interested in estimating a particular real parameter b = b(a). For example, if a is a vector, b might be one of the coordinates of a; the other coordinates would be nuisance parameters in this context. In this case, our confidence set frequently has the form

A(X) = {a ∈ A: L(X) ≤ b ≤ U(X)}

where L(X) and U(X) are statistics. In this case [L(X), U(X)] is called a (two-sided) confidence interval for b. If the confidence set has the form

A(X) = {a ∈ A: L(X) ≤ b}

then L(X) is called a confidence lower bound for b. If the confidence set has the form

A(X) = {a ∈ A: b ≤ U(X)}

then U(X) is called a confidence upper bound for b.

If we can construct a confidence interval for a parameter, then we can easily construct a confidence interval for any monotone function of the parameter.

Mathematical Exercise 1. Suppose that [L, U] is a 1 - r level confidence interval for b and suppose that g is a function defined on the parameter space A.

If g is increasing, show that [g(L), g(U)] is a 1 - r level confidence interval for g(b).
If g is decreasing, show that [g(U), g(L)] is a 1 - r level confidence interval for g(b).
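
As an illustration of the two cases (not a solution to the exercise), the endpoint mapping can be sketched as follows. The function transform_interval and the numerical interval are hypothetical, and the caller is assumed to know whether g is increasing or decreasing.

```python
import math


def transform_interval(lower, upper, g, increasing=True):
    """Map a confidence interval [lower, upper] for b to one for g(b).

    Assumes g is monotone on [lower, upper]; the caller states the direction.
    """
    if increasing:
        return g(lower), g(upper)
    # A decreasing g reverses the order of the endpoints.
    return g(upper), g(lower)


if __name__ == "__main__":
    interval_b = (2.0, 5.0)  # made-up interval for a positive parameter b
    # Interval for ln b (g increasing).
    print(transform_interval(*interval_b, math.log))
    # Interval for 1 / b (g decreasing).
    print(transform_interval(*interval_b, lambda b: 1 / b, increasing=False))
```

The confidence level is unchanged because the event L ≤ b ≤ U is the same event as the transformed one when g is strictly monotone.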

Mathematical Exercise 2. Suppose that L is a 1 - r₁ level confidence lower bound for a and that U is a 1 - r₂ level confidence upper bound for a. Show that if r = r₁ + r₂ < 1 then [L, U] is a conservative 1 - r level confidence interval for a. Hint: Use Bonferroni's inequality.

Pivotal Variables

You might think that it should be very difficult to construct confidence sets for a parameter a. However, in many important special cases, confidence sets can be constructed easily from certain random variables known as pivotal variables.

A pivotal variable for a is a random variable V(X, a) that is a function of the observation variable X and the parameter a, but whose distribution does not depend on a. Suppose that V(X, a) takes values in a set T. If we know the distribution of the pivot variable, then for a given r, we can hopefully find B ⊆ T (that does not depend on a) such that

P[V(X, a) ∈ B | a] = 1 - r.

It then follows that a 1 - r confidence set for the parameter is given by

A(X) = {a ∈ A: V(X, a) ∈ B}.

In many cases, we have a real parameter a of interest, and the real-valued pivot variable V(x, a) is a monotone function of a for fixed x. Then, the confidence set is an interval:

Mathematical Exercise 3. Show that if V(x, a) is monotone in a for each x and B = [v₁, v₂], then the confidence set is an interval of the form

[L(X, v₁), U(X, v₂)].

There are many ways to choose the numbers v₁ and v₂ above in the construction of the confidence set; the optimal choice is the one that minimizes the length of the confidence set in some sense. Now for r in (0, 1), let v(r) denote the quantile of order r for the pivotal variable V(X, a) (again, this quantile does not depend on a).

Mathematical Exercise 4. Suppose that 0 < p < 1. Show that v₁ = v((1 - p)r) and v₂ = v(1 - pr) satisfy the conditions for the construction of the confidence set.

The particular choice p = 1 / 2 corresponds to equal-tailed confidence intervals; it is the most commonly used case, and is frequently (but not always) an optimal choice. Again, for given data, there is a general tradeoff between the confidence level and the size of the confidence set.
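
The effect of p on the cutoffs can be explored numerically. The sketch below is illustrative only: it assumes, purely for the sake of example, that the pivot has the standard normal distribution, and uses NormalDist.inv_cdf from the Python standard library as the quantile function v. The helper name pivot_cutoffs is hypothetical.

```python
from statistics import NormalDist


def pivot_cutoffs(quantile, r, p=0.5):
    """Return (v1, v2) with v1 = v((1 - p) r) and v2 = v(1 - p r).

    `quantile` is the quantile function of the pivotal variable.
    The probability between the cutoffs is (1 - p r) - (1 - p) r = 1 - r.
    """
    return quantile((1 - p) * r), quantile(1 - p * r)


if __name__ == "__main__":
    # Assumption for illustration only: a standard normal pivot.
    v = NormalDist().inv_cdf
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        v1, v2 = pivot_cutoffs(v, r=0.05, p=p)
        print(f"p = {p:.1f}: v1 = {v1:.3f}, v2 = {v2:.3f}, v2 - v1 = {v2 - v1:.3f}")
```

For this symmetric pivot the distance v₂ - v₁ is smallest at p = 1 / 2. For a skewed pivot distribution the equal-tailed choice need not give the shortest interval.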

Mathematical Exercise 5. Let A(X) denote the confidence set using v₁ and v₂ of the previous exercise. Show that for fixed p and X, A(X) is decreasing in r and hence increasing in 1 - r (in the sense of the subset relation).

Pivotal variables are far from unique; the challenge is to find a pivotal quantity whose distribution is known and which gives tight bounds on the parameter.

Mathematical Exercise 6. Suppose that V is a pivotal variable for a. If u is a function defined on the range of V and u involves no unknown parameters, show that U = u(V) is also a pivotal variable for a.

Location-Scale Families

In the case of location-scale families of distributions, we can easily find pivotal variables. Suppose that U is a real-valued random variable with density function g and no unknown parameters. Let

X = μ + dU where μ is in R and d > 0.

Recall that the density function of X is given by

f(x | μ, d) = g[(x - μ) / d] / d

and the corresponding family of distributions is called the location-scale family associated with the distribution of U. Now suppose that X₁, X₂, ..., X_n is a random sample of size n from the distribution of X. Recall that the sample mean and sample variance are defined, respectively, by

M = (1 / n) Σ_{i = 1, ..., n} X_i
S² = [1 / (n - 1)] Σ_{i = 1, ..., n} (X_i - M)²

Mathematical Exercise 7. Suppose that d is known and μ unknown. Show that (M - μ) / d is pivotal for μ.

Mathematical Exercise 8. Suppose that μ and d are unknown. Show that (M - μ) / S is pivotal for μ.

Mathematical Exercise 9. Suppose that μ is known and d unknown. Show that (M - μ) / d is pivotal for d.

Mathematical Exercise 10. Suppose that μ and d are unknown. Show that S / d is pivotal for d.
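
When the quantiles of such a pivot are not available in closed form, they can be approximated by simulation, because the pivot (M - μ) / S has the same distribution as the sample mean of the U_i divided by their sample standard deviation. The following is a rough Monte Carlo sketch, not a method from the text. It assumes, for illustration, that U has the standard Laplace distribution; the function names, sample size, and parameter values are all made up.

```python
import random
import statistics


def pivot_quantiles(sample_u, n, r, reps=20_000, seed=0):
    """Monte Carlo estimates of the r/2 and 1 - r/2 quantiles of (M - mu) / S.

    Since X_i = mu + d U_i, the pivot equals mean(U) / stdev(U), so it can be
    simulated from the distribution of U alone.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        u = [sample_u(rng) for _ in range(n)]
        values.append(statistics.fmean(u) / statistics.stdev(u))
    values.sort()
    lo_index = int(reps * r / 2)
    hi_index = reps - 1 - lo_index
    return values[lo_index], values[hi_index]


def location_interval(data, v1, v2):
    """Invert v1 <= (M - mu) / S <= v2 to get M - v2 S <= mu <= M - v1 S."""
    m = statistics.fmean(data)
    s = statistics.stdev(data)
    return m - v2 * s, m - v1 * s


def laplace(rng):
    """Standard Laplace variate as a difference of two standard exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


if __name__ == "__main__":
    v1, v2 = pivot_quantiles(laplace, n=10, r=0.05)
    rng = random.Random(1)
    data = [4.0 + 2.0 * laplace(rng) for _ in range(10)]  # mu = 4, d = 2 (made up)
    print(location_interval(data, v1, v2))
```

The interval is only approximately of level 1 - r, since the cutoffs are themselves estimates; increasing reps reduces that error.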

The most important location-scale family is the normal family of distributions. The problem of estimating the parameters for this family is considered in the next two sections. In this section, we will explore a few other miscellaneous problems.

The Exponential Distribution

Suppose that X₁, X₂, ..., X_n is a random sample of size n from the exponential distribution with scale parameter b > 0.

Mathematical Exercise 11. Show that 2nM / b has the chi-square distribution with 2n degrees of freedom, and hence is a pivotal variable for b.

Note that the variable in Exercise 11 is a multiple of the variable in Exercise 9 (with μ = 0). Now for p in (0, 1), let v_p denote the quantile of order p for the chi-square distribution with 2n degrees of freedom.

Mathematical Exercise 12. Use the pivotal variable in the previous exercise to show that a 1 - r level confidence interval, confidence upper bound, and confidence lower bound for b are, respectively, as follows:

[2nM / v_{1 - r/2}, 2nM / v_{r/2}]
2nM / v_r
2nM / v_{1 - r}
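
These formulas can be evaluated with a short program. The sketch below is one possible implementation using only the Python standard library. It relies on the closed form of the chi-square distribution function for an even number of degrees of freedom and finds quantiles by bisection. The function names and the data are hypothetical.

```python
import math


def chi2_cdf_even(x, n):
    """CDF at x of the chi-square distribution with 2n degrees of freedom.

    For even degrees of freedom the CDF has the closed form
    1 - exp(-x/2) * sum_{k<n} (x/2)^k / k!.
    """
    if x <= 0:
        return 0.0
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return 1 - math.exp(-half) * total


def chi2_quantile_even(p, n, tol=1e-10):
    """Quantile of order p for chi-square with 2n degrees of freedom, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < p:  # expand until the quantile is bracketed
        hi *= 2
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exponential_scale_interval(data, r):
    """Equal-tailed 1 - r confidence interval for the scale parameter b."""
    n = len(data)
    total = 2 * sum(data)  # equals 2 n M
    return (total / chi2_quantile_even(1 - r / 2, n),
            total / chi2_quantile_even(r / 2, n))


if __name__ == "__main__":
    sample = [1.2, 0.4, 3.1, 2.2, 0.9]  # made-up data
    print(exponential_scale_interval(sample, r=0.05))
```

The one-sided bounds follow from the same quantile function: the upper bound is 2 * sum(data) / chi2_quantile_even(r, n) and the lower bound is 2 * sum(data) / chi2_quantile_even(1 - r, n).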