
1. Introduction


The Basic Sampling Model

Suppose that we have a population D of N objects. The population could be a deck of cards, a set of people, an urn full of balls, or any number of other collections. In many cases, we simply label the objects from 1 to N, so that D = {1, 2, ..., N}. In other cases (such as the card experiment), it may be more natural to label the objects with vectors. In any case, D is a subset of R^k for some k.

Our basic experiment consists of selecting n objects from the population D at random and recording the sequence of objects chosen:

X = (X1, X2, ..., Xn), where Xi in D is the i'th object chosen.

If the sampling is with replacement, the sample size n can be any positive integer. In this case, the sample space S is

S = D^n = {(x1, x2, ..., xn): x1, x2, ..., xn in D}.

If the sampling is without replacement, the sample size n can be no larger than the population size N. In this case, the sample space S consists of all permutations of size n chosen from D:

S = D_n = {(x1, x2, ..., xn): x1, x2, ..., xn in D are distinct}.

Mathematical Exercise 1. Show that

  1. #(D^n) = N^n (ordered samples with replacement).
  2. #(D_n) = (N)_n = N(N - 1) ··· (N - n + 1) (ordered samples without replacement).
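The two counts can be checked by brute force for a small population. The following is a minimal sketch, not part of the original text; the function name `ordered_samples` and the choice N = 4, n = 2 are illustrative assumptions.

```python
from itertools import permutations, product
from math import perm

def ordered_samples(population, n, replacement):
    """Return the list of all ordered samples of size n from population."""
    if replacement:
        # All sequences of length n: repeats are allowed.
        return list(product(population, repeat=n))
    # All permutations of size n: entries must be distinct.
    return list(permutations(population, n))

D = range(1, 5)   # illustrative population {1, 2, 3, 4}, so N = 4
n = 2

with_repl = ordered_samples(D, n, replacement=True)
without_repl = ordered_samples(D, n, replacement=False)

print(len(with_repl), len(D) ** n)          # 16 16, i.e. N to the power n
print(len(without_repl), perm(len(D), n))   # 12 12, i.e. N(N - 1)
```

Enumeration is only practical for small N and n, but it gives a quick sanity check of both formulas.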

With either type of sampling, we assume that the samples are equally likely and thus that the outcome variable X is uniformly distributed on S; this is the meaning of the phrase random sample:

P(X in A) = #(A) / #(S) for A subset S.
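Because the distribution is uniform, computing a probability reduces to counting. As an illustrative sketch (the helper `probability` and the event chosen here are assumptions, not part of the text), the probability that the first object drawn has a larger label than the second can be computed exactly for both types of sampling:

```python
from fractions import Fraction
from itertools import permutations, product

def probability(event, sample_space):
    """P(X in A) = #(A) / #(S) under the uniform distribution on S."""
    outcomes = list(sample_space)
    favorable = sum(1 for x in outcomes if event(x))
    return Fraction(favorable, len(outcomes))

D = range(1, 7)   # population {1, ..., 6}, sample size n = 2

def first_exceeds_second(x):
    return x[0] > x[1]

p_with = probability(first_exceeds_second, product(D, repeat=2))
p_without = probability(first_exceeds_second, permutations(D, 2))
print(p_with, p_without)   # 5/12 1/2
```

The same event has different probabilities under the two sampling schemes because the sample spaces differ: ties are possible only when sampling with replacement.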

Examples and Special Cases

We will be particularly interested in the following special models:

  1. A dichotomous population consists of two types of objects. For example, we might have an urn with balls that are either red or green, a batch of electronic components that are either good or defective, a population of people who are either male or female, or a population of animals that are either tagged or untagged.
  2. More generally, a multi-type population consists of objects of k different types. For example, a group of voters might consist of democrats, republicans, and independents, or an urn could have balls of several different colors.
  3. A standard deck of cards can be modeled by D = {1, 2, ..., 13} × {0, 1, 2, 3}, where the first coordinate encodes the denomination (ace, 2-10, jack, queen, king) and where the second coordinate encodes the suit (clubs, diamonds, hearts, spades). The general card experiment consists of drawing n cards at random and without replacement from the deck D. Thus, the i'th card is Xi = (Yi, Zi) where Yi is the denomination and Zi is the suit. The special case n = 5 is the poker experiment and the special case n = 13 is the bridge experiment.
  4. Rolling n fair, six-sided dice is equivalent to choosing a random sample of size n with replacement from the population D = {1, 2, 3, 4, 5, 6}. Generally, selecting a random sample of size n with replacement from D = {1, 2, ..., N} is equivalent to rolling n fair, N-sided dice.
  5. Suppose that we select n persons at random and record their birthdays. If we assume that birthdays are uniformly distributed throughout the year, and if we ignore leap years, then this experiment is equivalent to selecting a sample of size n with replacement from D = {1, 2, ..., 365}. Similarly, we could record birth months or birth weeks.
  6. Suppose that we distribute n distinct balls into N distinct cells at random. This experiment also fits the basic model, where D is the population of cells and Xi is the cell containing the i'th ball. Sampling with replacement means that a cell may contain more than one ball; sampling without replacement means that a cell may contain at most one ball.
  7. Suppose that when we purchase a certain product (bubble gum or cereal, for example), we receive a coupon (a baseball card or small toy, for example), which is equally likely to be any one of N types. We can think of this experiment as sampling with replacement from the population of coupon types; Xi is the coupon that we receive on the i'th purchase.
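Several of these special models can be simulated with Python's standard `random` module. This is a sketch under stated assumptions: the seed and the sample sizes (5 dice, 23 birthdays) are arbitrary choices made for illustration. `sample` draws without replacement and returns the objects in selection order; `choices` draws with replacement.

```python
import random
from itertools import product

rng = random.Random(2024)   # arbitrary seed, only for reproducibility

# Card model: D = {1, ..., 13} x {0, 1, 2, 3}, as (denomination, suit) pairs.
deck = list(product(range(1, 14), range(4)))

poker_hand = rng.sample(deck, 5)               # without replacement, n = 5
dice = rng.choices(range(1, 7), k=5)           # with replacement: 5 fair dice
birthdays = rng.choices(range(1, 366), k=23)   # with replacement: 23 birthdays

print(poker_hand)
print(dice)
print(birthdays)
```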

The Exchangeable Property

Let us return to the general model of selecting n objects at random from the population D, either with or without replacement.

Mathematical Exercise 2. Show that any permutation of (X1, X2, ..., Xn) has the same distribution as (X1, X2, ..., Xn) itself (namely the uniform distribution on the appropriate sample space S).

A sequence of random variables with the property in the last exercise is said to be exchangeable. Although this property is very simple to understand, both intuitively and mathematically, it is nonetheless very important. We will use the exchangeable property often in this chapter.

Mathematical Exercise 3. Show that any sequence of m of the n outcome variables is uniformly distributed on the appropriate sample space:

  1. D^m (all sequences of length m) if the sampling is with replacement.
  2. D_m (all permutations of size m) if the sampling is without replacement.

In particular, for either sampling method, Xi is uniformly distributed on D for each i.
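The claim that each Xi is uniformly distributed on D, even without replacement, can be verified by exact enumeration for a small case. The sketch below is illustrative only; the helper `marginal` and the values N = 4, n = 3 are assumptions made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product

def marginal(sample_space, i):
    """Exact distribution of coordinate i under the uniform distribution."""
    outcomes = list(sample_space)
    counts = Counter(x[i] for x in outcomes)
    return {value: Fraction(c, len(outcomes)) for value, c in counts.items()}

D = range(1, 5)   # N = 4
n = 3

for i in range(n):
    with_repl = marginal(product(D, repeat=n), i)
    without_repl = marginal(permutations(D, n), i)
    # Every value of D should receive probability 1/N in both cases.
    print(i + 1, with_repl == without_repl, set(without_repl.values()))
```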

Mathematical Exercise 4. Show that if the sampling is with replacement, X1, X2, ..., Xn are independent.

Thus, when the sampling is with replacement, the sample variables form a random sample from the uniform distribution, in the technical sense.

Mathematical Exercise 5. Show that if the sampling is without replacement, then the conditional distribution of a sequence of m of the outcome variables given a sequence of j other outcome variables is the uniform distribution on the set of permutations of size m chosen from the population when the j known objects are removed (of course, m + j cannot exceed n).

In particular, Xi and Xj are dependent for any distinct i and j when the sampling is without replacement.
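The contrast between the two sampling schemes can be seen by comparing the joint distribution of two outcome variables with the product of their marginals. This is a brute-force sketch for a small case; the function name `is_independent_pair` and the values N = 4, n = 3 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations, product

def is_independent_pair(sample_space, i, j):
    """Check P(Xi = a, Xj = b) == P(Xi = a) P(Xj = b) for all a, b."""
    outcomes = list(sample_space)
    total = len(outcomes)
    values = sorted({x[i] for x in outcomes} | {x[j] for x in outcomes})
    for a in values:
        p_a = Fraction(sum(1 for x in outcomes if x[i] == a), total)
        for b in values:
            p_b = Fraction(sum(1 for x in outcomes if x[j] == b), total)
            joint = Fraction(
                sum(1 for x in outcomes if x[i] == a and x[j] == b), total
            )
            if joint != p_a * p_b:
                return False
    return True

D = range(1, 5)
n = 3
print(is_independent_pair(product(D, repeat=n), 0, 2))    # True
print(is_independent_pair(permutations(D, n), 0, 2))      # False
```

Without replacement, the event that two draws produce the same object has probability 0, while the product of the marginals is positive, so independence fails.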

The Unordered Sample

In many cases, particularly when the sampling is without replacement, the order in which the objects are chosen is not important; all that matters is the (unordered) set of objects:

W = {X1, X2, ..., Xn}.

Suppose first that the sampling is without replacement. In this case, W takes values in the set of combinations of size n chosen from D:

T = {{x1, x2, ..., xn}: x1, x2, ..., xn in D are distinct}.

Mathematical Exercise 6. Show that #(T) = C(N, n).

Mathematical Exercise 7. Show that W is uniformly distributed over T:

P(W in B) = #(B) / #(T) = #(B) / C(N, n) for B subset T.

Hint: For any combination of size n from D, there are n! permutations of size n.
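The hint can be illustrated by grouping the ordered samples according to the set of objects they contain. The sketch below uses the arbitrary values N = 5 and n = 3; each combination should arise from exactly n! ordered samples, which is why W is uniform.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial

D = range(1, 6)   # N = 5
n = 3

# Map each ordered sample to the unordered set of objects it contains.
counts = Counter(frozenset(x) for x in permutations(D, n))

print(len(counts), comb(len(D), n))           # 10 10
print(set(counts.values()), factorial(n))     # {6} 6
```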

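If the sampling is with replacement, the same object may be chosen more than once, so W is best regarded as a multiset (an unordered collection in which repeated elements are allowed). In this case, W takes values in the collection of multisets of size n chosen from D: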

T = {{x1, x2, ..., xn}: x1, x2, ..., xn in D}.

Mathematical Exercise 8. Show that #(T) = C(N + n - 1, n).

Mathematical Exercise 9. Show that W is not uniformly distributed on T.
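A small enumeration makes both statements concrete. In this sketch the unordered sample is represented as a sorted tuple so that repeated objects are retained; that representation, and the values N = 3 and n = 2, are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import comb

D = range(1, 4)   # N = 3
n = 2

# Sorting an ordered sample discards the order but keeps repeated objects.
counts = Counter(tuple(sorted(x)) for x in product(D, repeat=n))

print(len(counts), comb(len(D) + n - 1, n))   # 6 6
for unordered, c in sorted(counts.items()):
    print(unordered, c)   # (1, 1) arises once, (1, 2) arises twice, ...
```

An unordered sample with two distinct objects arises from two ordered samples, while one with a repeated object arises from only one, so the unordered outcomes are not equally likely.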

Computational Exercises

Mathematical Exercise 10. Suppose that a sample of size 2 is chosen from the population {1, 2, 3, 4, 5, 6}. Explicitly list all

  1. Ordered samples, with replacement.
  2. Ordered samples, without replacement.
  3. Unordered samples, with replacement.
  4. Unordered samples, without replacement.

Mathematical Exercise 11. In the card experiment with n = 5 cards (poker), show that there are

  1. 311,875,200 ordered hands
  2. 2,598,960 unordered hands

Mathematical Exercise 12. In the card experiment with n = 13 cards (bridge), show that there are

  1. 3,954,242,643,911,239,680,000 ordered hands
  2. 635,013,559,600 unordered hands
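The counts in the last two exercises are permutation and combination numbers for a 52-card deck, and they can be reproduced with the standard library functions `math.perm` and `math.comb` (available in Python 3.8 and later). A minimal sketch:

```python
from math import comb, perm

DECK_SIZE = 52

for name, n in (("poker", 5), ("bridge", 13)):
    ordered = perm(DECK_SIZE, n)      # 52 * 51 * ... * (52 - n + 1)
    unordered = comb(DECK_SIZE, n)    # ordered count divided by n!
    print(f"{name}: {ordered:,} ordered hands, {unordered:,} unordered hands")
```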

Simulation Exercise 13. In the card experiment, set n = 3. Run the simulation 5 times and on each run, list all of the (ordered) sequences of cards that would give the same unordered hand as the one you observed.

Mathematical Exercise 14. In the card experiment, show that

  1. Yi is uniformly distributed on {1, 2, ..., 13} for each i.
  2. Zi is uniformly distributed on {0, 1, 2, 3} for each i.

Mathematical Exercise 15. In the card experiment, show that Yi and Zj are independent for any i and j.

Mathematical Exercise 16. In the card experiment, show that (Y1, Y2) and (Z1, Z2) are dependent. Compare this result with the previous exercise.

Mathematical Exercise 17. Suppose that a sequence of 5 cards is dealt.

  1. Find the probability that the third card is a spade.
  2. Find the probability that the second and fourth cards are queens.
  3. Find the conditional probability that the second card is a heart given that the fifth card is a heart.
  4. Find the probability that the third card is a queen and the fourth card is a heart.

Simulation Exercise 18. Run the card experiment 500 times, updating after each run. Compute the relative frequency corresponding to each probability in the previous exercise.
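If the card applet is not available, the relative frequencies can be estimated with a short simulation. The sketch below is one possible stand-in, not the original applet. It assumes the encoding implied by the order in which the card model lists its values: denominations ace through king as 1 to 13, and suits clubs, diamonds, hearts, spades as 0 to 3. The function name `simulate_deals` and the seed are illustrative.

```python
import random
from itertools import product

# Assumed encoding (see the lead-in): queen = 12, hearts = 2, spades = 3.
QUEEN, HEARTS, SPADES = 12, 2, 3

def simulate_deals(runs, seed=0):
    """Deal 5 cards `runs` times; return relative frequencies of four events."""
    rng = random.Random(seed)
    deck = list(product(range(1, 14), range(4)))   # (denomination, suit)
    third_spade = queens_2_4 = heart_5 = heart_2_and_5 = queen_3_heart_4 = 0
    for _ in range(runs):
        hand = rng.sample(deck, 5)   # ordered sample without replacement
        third_spade += hand[2][1] == SPADES
        queens_2_4 += hand[1][0] == QUEEN and hand[3][0] == QUEEN
        heart_5 += hand[4][1] == HEARTS
        heart_2_and_5 += hand[1][1] == HEARTS and hand[4][1] == HEARTS
        queen_3_heart_4 += hand[2][0] == QUEEN and hand[3][1] == HEARTS
    return {
        "third card is a spade": third_spade / runs,
        "second and fourth cards are queens": queens_2_4 / runs,
        # Conditional relative frequency; None if the fifth card was never a heart.
        "second is a heart given fifth is a heart":
            heart_2_and_5 / heart_5 if heart_5 else None,
        "third is a queen and fourth is a heart": queen_3_heart_4 / runs,
    }

for event, freq in simulate_deals(500).items():
    print(f"{event}: {freq}")
```

With only 500 runs the estimates of the rarer events are noisy; increasing `runs` brings the relative frequencies closer to the exact probabilities.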

Mathematical Exercise 19. Find the probability that a bridge hand will contain no card of denomination 10, jack, queen, king, or ace. Such a hand is called a Yarborough, in honor of the Earl of Yarborough.

The Key Problem

Suppose that a person has n keys, only one of which opens a certain door. The person tries the keys at random. Let N denote the number of the trial on which the person finds the correct key (in this section N is a random variable, not the population size).

Mathematical Exercise 20. Suppose that unsuccessful keys are discarded (the rational thing to do, of course). Show that

  1. P(N = i) = 1 / n for i = 1, 2, ..., n. Thus, N has the uniform distribution on {1, 2, ..., n}.
  2. E(N) = (n + 1) / 2.
  3. var(N) = (n^2 - 1) / 12.
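When unsuccessful keys are discarded, the keys are tried in a uniformly random order, so the results can be checked by enumerating all orderings for a small number of keys. This sketch assumes n = 5 and labels the correct key as key 1; both choices, and the function name, are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def key_trial_distribution(n):
    """Exact distribution of the trial on which key 1 (the correct key) is
    tried, when the n keys are tried in a uniformly random order."""
    orders = list(permutations(range(1, n + 1)))
    dist = {}
    for i in range(1, n + 1):
        hits = sum(1 for order in orders if order[i - 1] == 1)
        dist[i] = Fraction(hits, len(orders))
    return dist

n = 5
dist = key_trial_distribution(n)
mean = sum(i * p for i, p in dist.items())
var = sum(i * i * p for i, p in dist.items()) - mean ** 2

print(dist)                           # each trial number has probability 1/5
print(mean, Fraction(n + 1, 2))       # 3 3
print(var, Fraction(n * n - 1, 12))   # 2 2
```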

Mathematical Exercise 21. Suppose that unsuccessful keys are not discarded (perhaps the person has had a bit too much to drink). Show that

  1. P(N = i) = [(n - 1) / n]^(i - 1) (1 / n) for i = 1, 2, ... Thus, N has the geometric distribution on {1, 2, ...} with parameter 1 / n.
  2. E(N) = n.
  3. var(N) = n(n - 1).
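The geometric formulas can be checked numerically by summing the series far enough that the remaining tail is negligible. This is a floating-point sketch, assuming n = 5 and a cutoff of 2000 terms; both values are arbitrary choices for illustration.

```python
from math import isclose

def geometric_pmf(i, n):
    """P(N = i): i - 1 failures, each with probability (n - 1) / n,
    followed by one success with probability 1 / n."""
    return ((n - 1) / n) ** (i - 1) * (1 / n)

n = 5
terms = range(1, 2001)   # the tail beyond 2000 trials is negligible for n = 5

total = sum(geometric_pmf(i, n) for i in terms)
mean = sum(i * geometric_pmf(i, n) for i in terms)
var = sum(i * i * geometric_pmf(i, n) for i in terms) - mean ** 2

print(isclose(total, 1.0, rel_tol=1e-9))         # True
print(isclose(mean, n, rel_tol=1e-9))            # True
print(isclose(var, n * (n - 1), rel_tol=1e-9))   # True
```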