
6. The Multinomial Distribution


Multinomial Trials

A multinomial trials process is a sequence of independent, identically distributed random variables

U1, U2, ...,

each taking k possible values. Thus, the multinomial trials process is a simple generalization of the Bernoulli trials process (which corresponds to k = 2). For simplicity, we will denote the outcomes by the integers 1, 2, ..., k, and we will denote the common density function of the trial variables by

pi = P(Uj = i) for i = 1, 2, ..., k (and for any j).

Of course pi > 0 for each i and p1 + p2 + ··· + pk = 1.
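As a concrete illustration, a multinomial trials process can be simulated with a few lines of Python. This is only a sketch using the standard library; the function name and parameters are ours, not part of the text.

```python
import random

def multinomial_trials(n, p, seed=None):
    # Simulate U_1, ..., U_n: independent, identically distributed trials,
    # each taking the value i in {1, ..., k} with probability p[i - 1].
    rng = random.Random(seed)
    return rng.choices(range(1, len(p) + 1), weights=p, k=n)

# e.g. 10 trials with k = 3 outcomes
trials = multinomial_trials(10, [0.2, 0.3, 0.5], seed=6)
```

Each entry of `trials` is one of the outcomes 1, 2, 3, drawn with the given probabilities.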

As with our discussion of the binomial distribution, we are interested in the variables that count the number of times each outcome occurred. Thus, let

Zi = #{j ∈ {1, 2, ..., n}: Uj = i} for i = 1, 2, ..., k

(for simplicity, we are suppressing the dependence on n). Note that

Z1 + Z2 + ··· + Zk = n,

so if we know the values of k - 1 of the counting variables, we can find the value of the remaining counting variable. As with any counting variable, we can express Zi as a sum of indicator variables:

Mathematical Exercise 1. Show that Zi = Ii1 + Ii2 + ··· + Iin where Iij = 1 if Uj = i and Iij = 0 otherwise.
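The indicator-variable representation in Exercise 1 translates directly into code. A minimal sketch (the function name is ours):

```python
def counts(trials, k):
    # Z_i = I_i1 + I_i2 + ... + I_in, where the indicator I_ij is 1
    # exactly when trial j gives outcome i, and 0 otherwise
    return [sum(1 for u in trials if u == i) for i in range(1, k + 1)]

z = counts([1, 2, 2, 3, 1, 1], 3)   # z == [3, 2, 1]
```

Note that the counting variables always sum to n (here 6), as the text observes.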

Distributions

Basic arguments using independence and combinatorics can be used to derive the joint, marginal, and conditional densities of the counting variables. In particular, recall the definition of the multinomial coefficient

C(n; j1, j2, ..., jk) = n! / (j1! j2! ··· jk!) for nonnegative integers j1, j2, ..., jk with j1 + j2 + ··· + jk = n.

Mathematical Exercise 2. Show that for nonnegative integers j1, j2, ..., jk with j1 + j2 + ··· + jk = n,

P(Z1 = j1, Z2 = j2, ..., Zk = jk) = C(n; j1, j2, ..., jk) p1^j1 p2^j2 ··· pk^jk.

The distribution of (Z1, Z2, ..., Zk) is called the multinomial distribution with parameters n and p1, p2, ..., pk.

We also say that (Z1, Z2, ..., Zk-1) has this distribution (recall that the values of k - 1 of the counting variables determine the value of the remaining variable). Usually, it is clear from context which meaning of the term multinomial distribution is intended. Again, the ordinary binomial distribution corresponds to k = 2.
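The joint density in Exercise 2 is easy to evaluate numerically. Below is a sketch using only the Python standard library; the function name is illustrative.

```python
from math import factorial, prod

def multinomial_pmf(j, p):
    # P(Z_1 = j_1, ..., Z_k = j_k) = C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k,
    # where n = j_1 + ... + j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

# sanity check: two fair coin flips, one of each outcome: 2 * 0.5 * 0.5 = 0.5
p_half = multinomial_pmf([1, 1], [0.5, 0.5])
```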

Mathematical Exercise 3. Show that Zi has the binomial distribution with parameters n and pi:

P(Zi = j) = C(n, j) pi^j (1 - pi)^(n - j) for j = 0, 1, ..., n.
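The marginal result in Exercise 3 can be checked numerically by summing the joint density over the other counts. A sketch for k = 3 (function names are ours):

```python
from itertools import product
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def marginal_pmf(i, j, n, p):
    # P(Z_i = j), obtained by summing the joint pmf over the other counts
    k = len(p)
    return sum(multinomial_pmf(list(t), p)
               for t in product(range(n + 1), repeat=k)
               if sum(t) == n and t[i] == j)

# marginal_pmf(i, j, n, p) should match comb(n, j) * p[i]**j * (1 - p[i])**(n - j)
```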

The multinomial distribution is preserved when the counting variables are combined. Specifically, suppose that A1, A2, ..., Am is a partition of the index set {1, 2, ..., k} into nonempty subsets. For each j, let Wj denote the sum of Zi over i in Aj, and let qj denote the sum of pi over i in Aj.

Mathematical Exercise 4. Show that (W1, W2, ..., Wm) has the multinomial distribution with parameters n and q1, q2, ..., qm.
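For a small case, the combining result of Exercise 4 can be verified exactly by enumeration: with k = 3, merging outcomes 2 and 3 gives a count that is binomial with the merged probability. A sketch (names are ours):

```python
from itertools import product
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def merged_pmf(w, n, p):
    # P(Z_2 + Z_3 = w) for k = 3, by summing the joint pmf
    return sum(multinomial_pmf([j1, j2, j3], p)
               for j1, j2, j3 in product(range(n + 1), repeat=3)
               if j1 + j2 + j3 == n and j2 + j3 == w)

n, p = 5, [0.2, 0.3, 0.5]
q = p[1] + p[2]   # the merged cell probability
# merged_pmf(w, n, p) should equal comb(n, w) * q**w * (1 - q)**(n - w)
```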

The multinomial distribution is also preserved when some of the counting variables are observed. Specifically, suppose that A, B is a partition of the index set {1, 2, ..., k} into nonempty subsets. Suppose that we observe Zj = zj for j in B. Let z denote the sum of zj over j in B, and let p denote the sum of pi over i in A.

Mathematical Exercise 5. Show that the conditional distribution of Zi, i in A given Zj = zj, j in B is multinomial with parameters n - z and pi / p for i in A.
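The conditioning result of Exercise 5 can likewise be checked by enumeration in a small case: with k = 3, given Z3 = z3, the count Z1 should be binomial with parameters n - z3 and p1 / (p1 + p2). A sketch (names are ours):

```python
from math import comb, factorial, prod, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

def cond_pmf(j1, z3, n, p):
    # P(Z_1 = j1 | Z_3 = z3) for k = 3, computed from the joint pmf
    pz3 = sum(multinomial_pmf([a, n - z3 - a, z3], p)
              for a in range(n - z3 + 1))
    return multinomial_pmf([j1, n - z3 - j1, z3], p) / pz3

n, p, z3 = 5, [0.2, 0.3, 0.5], 2
pA = p[0] + p[1]   # total probability of the unobserved outcomes
```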

Combinations of the basic results in Exercises 4 and 5 can be used to compute any marginal or conditional distributions of the counting variables.

Simulation Exercise 6. In the dice experiment, select the number of aces. For each die distribution, start with a single die and add dice one at a time, noting the shape of the density function. When you get to 10 dice, run the simulation with an update frequency of 10. Note the apparent convergence of the relative frequency function to the density function.

Moments

We will compute the mean, variance, covariance, and correlation of the counting variables. Results from the binomial distribution and the representation in terms of indicator variables are the main tools.

Mathematical Exercise 7. Show that

  1. E(Zi) = npi.
  2. var(Zi) = npi(1 - pi).

Mathematical Exercise 8. Show that for distinct i and j,

  1. cov(Zi, Zj) = -n pi pj.
  2. cor(Zi, Zj) = -[pi pj / ((1 - pi)(1 - pj))]^(1/2).

From Exercise 8, note that the number of times outcome i occurs and the number of times outcome j occurs are negatively correlated, but the correlation does not depend on n or k. Does this seem reasonable?
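The moment formulas in Exercises 7 and 8 can be confirmed exactly in a small case by computing the moments directly from the joint density. A sketch using only the Python standard library (variable names are ours):

```python
from itertools import product
from math import factorial, prod, sqrt, isclose

def multinomial_pmf(j, p):
    # joint pmf C(n; j_1, ..., j_k) p_1^j_1 ... p_k^j_k
    n = sum(j)
    coef = factorial(n)
    for ji in j:
        coef //= factorial(ji)
    return coef * prod(pi ** ji for pi, ji in zip(p, j))

n, p = 4, [0.2, 0.3, 0.5]
supp = [t for t in product(range(n + 1), repeat=3) if sum(t) == n]
pm = {t: multinomial_pmf(list(t), p) for t in supp}

# moments computed directly from the joint density
mean = [sum(t[i] * pm[t] for t in supp) for i in range(3)]
var = [sum((t[i] - mean[i])**2 * pm[t] for t in supp) for i in range(3)]
cov01 = sum((t[0] - mean[0]) * (t[1] - mean[1]) * pm[t] for t in supp)
cor01 = cov01 / sqrt(var[0] * var[1])
```

These should agree with E(Zi) = n pi, var(Zi) = n pi (1 - pi), cov(Zi, Zj) = -n pi pj, and the correlation formula of Exercise 8.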

Mathematical Exercise 9. Use the result of Exercise 8 to show that if k = 2, then the number of times outcome 1 occurs and the number of times outcome 2 occurs are perfectly negatively correlated (correlation -1). Does this seem reasonable?

Simulation Exercise 10. In the dice experiment, select the number of aces. For each die distribution, start with a single die and add dice one at a time, noting the shape and location of the mean/standard deviation bar. When you get to 10 dice, run the simulation with an update frequency of 10. Note the apparent convergence of the empirical moments to the true moments.

Computational Problems

Mathematical Exercise 11. Suppose that we roll 10 fair dice. Find the probability that

  1. scores 1 and 6 occur once each and the other scores occur twice each.
  2. scores 2 and 4 occur 3 times each.
  3. there are 4 even scores and 6 odd scores.
  4. scores 1 and 3 occur twice each given that score 2 occurs once and score 5 occurs three times.

Mathematical Exercise 12. Suppose that we roll 4 ace-six flat dice (faces 1 and 6 have probability 1/4 each; faces 2, 3, 4, and 5 have probability 1/8 each). Find the joint density function of the number of times each score occurs.

Simulation Exercise 13. In the dice experiment, select 4 ace-six flat dice. Run the experiment 500 times, updating after each run. Compute the joint relative frequency function of the number of times each score occurs. Compare the relative frequency function with the true density function.

Mathematical Exercise 14. Suppose that we roll 20 ace-six flat dice. Find the covariance and correlation of the number of 1's and the number of 2's.

Simulation Exercise 15. In the dice experiment, select 20 ace-six flat dice. Run the experiment 500 times, updating after each run. Compute the empirical covariance and correlation of the number of 1's and the number of 2's. Compare the results with the theoretical results computed in Exercise 14.