
6. The Multinomial Distribution


Multinomial Trials

A multinomial trials process is a sequence of independent, identically distributed random variables

U1, U2, ...,

each taking k possible values. Thus, the multinomial trials process is a simple generalization of the Bernoulli trials process (which corresponds to k = 2). For simplicity, we will denote the outcomes by the integers 1, 2, ..., k, and we will denote the common density function of the trial variables by

pi = P(Uj = i) for i = 1, 2, ..., k (and for any j).

Of course pi > 0 for each i and p1 + p2 + ··· + pk = 1.
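As a concrete illustration, a multinomial trials process can be simulated with a few lines of Python. This is only a sketch using the standard library; the function name and parameters are ours, not part of the text.

```python
import random

def multinomial_trials(n, p, seed=None):
    # Simulate U_1, ..., U_n: independent, identically distributed trials,
    # each taking the value i in {1, ..., k} with probability p[i - 1].
    rng = random.Random(seed)
    return rng.choices(range(1, len(p) + 1), weights=p, k=n)

# e.g. 10 trials with k = 3 outcomes
trials = multinomial_trials(10, [0.2, 0.3, 0.5], seed=6)
```

Each entry of `trials` is one of the outcomes 1, 2, 3, drawn with the given probabilities.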

As with our discussion of the binomial distribution, we are interested in the variables that count the number of times each outcome occurred. Thus, let

Zi = #{j ∈ {1, 2, ..., n}: Uj = i} for i = 1, 2, ..., k

(for simplicity, we are suppressing the dependence on n). Note that

Z1 + Z2 + ··· + Zk = n,

so if we know the values of k - 1 of the counting variables, we can find the value of the remaining counting variable. As with any counting variable, we can express Zi as a sum of indicator variables:

Mathematical Exercise 1. Show that Zi = Ii1 + Ii2 + ··· + Iin where Iij = 1 if Uj = i and Iij = 0 otherwise.
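The indicator-variable representation in Exercise 1 translates directly into code. A minimal sketch (the function name is ours):

```python
def counts(trials, k):
    # Z_i = I_i1 + I_i2 + ... + I_in, where the indicator I_ij is 1
    # exactly when trial j gives outcome i, and 0 otherwise
    return [sum(1 for u in trials if u == i) for i in range(1, k + 1)]

z = counts([1, 2, 2, 3, 1, 1], 3)   # z == [3, 2, 1]
```

Note that the counting variables always sum to n (here 6), as the text observes.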

Distributions

Basic arguments using independence and combinatorics can be used to derive the joint, marginal, and conditional densities of the counting variables. In particular, recall the definition of the multinomial coefficient

C(n; j1, j2, ..., jk) = n! / (j1! j2! ··· jk!) for nonnegative integers j1, j2, ..., jk with j1 + j2 + ··· + jk = n.

Mathematical Exercise 2. Show that for nonnegative integers j1, j2, ..., jk with j1 + j2 + ··· + jk = n,

P(Z1 = j1, Z2 = j2, ..., Zk = jk) = C(n; j1, j2, ..., jk) p1^j1 p2^j2 ··· pk^jk.

The distribution of (Z1, Z2, ..., Zk) is called the multinomial distribution with parameters n and p1, p2, ..., pk.

We also say that (Z1, Z2, ..., Zk-1) has this distribution (recall that the values of k - 1 of the counting variables determine the value of the remaining variable). Usually, it is clear from context which meaning of the term multinomial distribution is intended. Again, the ordinary binomial distribution corresponds to k = 2.
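The joint density in Exercise 2 is easy to evaluate numerically. Below is a sketch using only the Python standard library; the function name is illustrative.

```python
from math import factorial, prod

def multinomial_pmf(j, p):
    # P(Z_1 = j_1, ..., Z_k = j_k) = C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k,
    # where n = j_1 + ... + j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

# sanity check: two fair coin flips, one of each outcome: 2 * 0.5 * 0.5 = 0.5
p_half = multinomial_pmf([1, 1], [0.5, 0.5])
```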

Mathematical Exercise 3. Show that Zi has the binomial distribution with parameters n and pi:

P(Zi = j) = C(n, j) pi^j (1 - pi)^(n - j) for j = 0, 1, ..., n.
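The marginal result in Exercise 3 can be checked numerically by summing the joint density over the other counts. A sketch for k = 3 (function names are ours):

```python
from itertools import product
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def marginal_pmf(i, j, n, p):
    # P(Z_i = j), obtained by summing the joint pmf over the other counts
    k = len(p)
    return sum(multinomial_pmf(list(t), p)
               for t in product(range(n + 1), repeat=k)
               if sum(t) == n and t[i] == j)

# marginal_pmf(i, j, n, p) should match comb(n, j) * p[i]**j * (1 - p[i])**(n - j)
```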

The multinomial distribution is preserved when the counting variables are combined. Specifically, suppose that A1, A2, ..., Am is a partition of the index set {1, 2, ..., k} into nonempty subsets. For each j, let Wj denote the sum of Zi over i in Aj, and let qj denote the sum of pi over i in Aj.

Mathematical Exercise 4. Show that (W1, W2, ..., Wm) has the multinomial distribution with parameters n and q1, q2, ..., qm.
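For a small case, the combining result of Exercise 4 can be verified exactly by enumeration: with k = 3, merging outcomes 2 and 3 gives a count that is binomial with the merged probability. A sketch (names are ours):

```python
from itertools import product
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def merged_pmf(w, n, p):
    # P(Z_2 + Z_3 = w) for k = 3, by summing the joint pmf
    return sum(multinomial_pmf([j1, j2, j3], p)
               for j1, j2, j3 in product(range(n + 1), repeat=3)
               if j1 + j2 + j3 == n and j2 + j3 == w)

n, p = 5, [0.2, 0.3, 0.5]
q = p[1] + p[2]   # the merged cell probability
# merged_pmf(w, n, p) should equal comb(n, w) * q**w * (1 - q)**(n - w)
```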

The multinomial distribution is also preserved when some of the counting variables are observed. Specifically, suppose that A, B is a partition of the index set {1, 2, ..., k} into nonempty subsets. Suppose that we observe Zj = zj for j in B. Let z denote the sum of zj over j in B, and let p denote the sum of pi over i in A.

Mathematical Exercise 5. Show that the conditional distribution of Zi, i in A given Zj = zj, j in B is multinomial with parameters n - z and pi / p for i in A.
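The conditioning result of Exercise 5 can likewise be checked by enumeration in a small case: with k = 3, given Z3 = z3, the count Z1 should be binomial with parameters n - z3 and p1 / (p1 + p2). A sketch (names are ours):

```python
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def cond_pmf(j1, z3, n, p):
    # P(Z_1 = j1 | Z_3 = z3) for k = 3, computed from the joint pmf
    pz3 = sum(multinomial_pmf([a, n - z3 - a, z3], p)
              for a in range(n - z3 + 1))
    return multinomial_pmf([j1, n - z3 - j1, z3], p) / pz3

n, p, z3 = 5, [0.2, 0.3, 0.5], 2
pA = p[0] + p[1]   # total probability of the unobserved outcomes
```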

Combinations of the basic results in Exercises 4 and 5 can be used to compute any marginal or conditional distributions of the counting variables.

Simulation Exercise 6. In the dice experiment, select the number of aces. For each die distribution, start with a single die and add dice one at a time, noting the shape of the density function. When you get to 10 dice, run the simulation with an update frequency of 10. Note the apparent convergence of the relative frequency function to the density function.

Moments

We will compute the mean, variance, covariance, and correlation of the counting variables. Results from the binomial distribution and the representation in terms of indicator variables are the main tools.

Mathematical Exercise 7. Show that

  1. E(Zi) = npi.
  2. var(Zi) = npi(1 - pi).

Mathematical Exercise 8. Show that for distinct i and j,

  1. cov(Zi, Zj) = -n pi pj.
  2. cor(Zi, Zj) = -[pi pj / ((1 - pi)(1 - pj))]^(1/2).

From Exercise 8, note that the number of times outcome i occurs and the number of times outcome j occurs are negatively correlated, but the correlation does not depend on n or k. Does this seem reasonable?
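The moment formulas in Exercises 7 and 8 can be confirmed exactly in a small case by computing the moments directly from the joint density. A sketch using only the Python standard library (variable names are ours):

```python
from itertools import product
from math import factorial, prod, sqrt, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

n, p = 4, [0.2, 0.3, 0.5]
supp = [t for t in product(range(n + 1), repeat=3) if sum(t) == n]
pm = {t: multinomial_pmf(list(t), p) for t in supp}

# moments computed directly from the joint density
mean = [sum(t[i] * pm[t] for t in supp) for i in range(3)]
var = [sum((t[i] - mean[i])**2 * pm[t] for t in supp) for i in range(3)]
cov01 = sum((t[0] - mean[0]) * (t[1] - mean[1]) * pm[t] for t in supp)
cor01 = cov01 / sqrt(var[0] * var[1])
```

These should agree with E(Zi) = n pi, var(Zi) = n pi (1 - pi), cov(Zi, Zj) = -n pi pj, and the correlation formula of Exercise 8.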

Mathematical Exercise 9. Use the result of Exercise 8 to show that if k = 2, then the number of times outcome 1 occurs and the number of times outcome 2 occurs are perfectly negatively correlated (correlation -1). Does this seem reasonable?

Simulation Exercise 10. In the dice experiment, select the number of aces. For each die distribution, start with a single die and add dice one at a time, noting the shape and location of the mean/standard deviation bar. When you get to 10 dice, run the simulation with an update frequency of 10. Note the apparent convergence of the empirical moments to the true moments.

Computational Problems

Mathematical Exercise 11. Suppose that we roll 10 fair dice. Find the probability that

  1. scores 1 and 6 occur once each and the other scores occur twice each.
  2. scores 2 and 4 occur 3 times each.
  3. there are 4 even scores and 6 odd scores.
  4. scores 1 and 3 occur twice each given that score 2 occurs once and score 5 occurs three times.

Mathematical Exercise 12. Suppose that we roll 4 ace-six flat dice (faces 1 and 6 have probability 1/4 each; faces 2, 3, 4, and 5 have probability 1/8 each). Find the joint density function of the number of times each score occurs.

Simulation Exercise 13. In the dice experiment, select 4 ace-six flat dice. Run the experiment 500 times, updating after each run. Compute the joint relative frequency function of the number of times each score occurs. Compare the relative frequency function with the true density function.

Mathematical Exercise 14. Suppose that we roll 20 ace-six flat dice. Find the covariance and correlation of the number of 1's and the number of 2's.

Simulation Exercise 15. In the dice experiment, select 20 ace-six flat dice. Run the experiment 500 times, updating after each run. Compute the empirical covariance and correlation of the number of 1's and the number of 2's. Compare the results with the theoretical results computed in Exercise 14.