The Multinomial Distribution
A multinomial trials process is a sequence of independent, identically distributed random variables $U_1, U_2, \ldots$, each taking $k$ possible values. Thus, the multinomial trials process is a simple generalization of the Bernoulli trials process (which corresponds to $k = 2$). For simplicity, we will denote the outcomes by the integers $1, 2, \ldots, k$, and we will denote the common density function of the trial variables by
$$p_i = P(U_j = i) \quad \text{for } i = 1, 2, \ldots, k$$
(the density is the same for every $j$). Of course $p_i > 0$ for each $i$ and $p_1 + p_2 + \cdots + p_k = 1$.
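To make the setup concrete, here is a minimal simulation sketch in Python (assuming NumPy is available; the probabilities and the number of trials are illustrative choices, not taken from the text):

```python
import numpy as np

# Generate a multinomial trials process with k = 3 outcomes labeled 1, 2, 3
# and illustrative density p = (1/2, 1/3, 1/6).
rng = np.random.default_rng(seed=1)
p = [1/2, 1/3, 1/6]
n = 20

# Each trial U_j independently takes the value i with probability p_i.
trials = rng.choice([1, 2, 3], size=n, p=p)
print(trials)
```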
As with our discussion of the binomial distribution, we are interested in the variables that count the number of times each outcome occurred. Thus, let
$$Z_i = \#\{j \in \{1, 2, \ldots, n\} : U_j = i\} \quad \text{for } i = 1, 2, \ldots, k$$
(for simplicity, we are suppressing the dependence on $n$). Note that
$$Z_1 + Z_2 + \cdots + Z_k = n,$$
so if we know the values of $k - 1$ of the counting variables, we can find the value of the remaining counting variable. As with any counting variable, we can express $Z_i$ as a sum of indicator variables:

1. Show that $Z_i = I_{i1} + I_{i2} + \cdots + I_{in}$, where $I_{ij} = 1$ if $U_j = i$ and $I_{ij} = 0$ otherwise.
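As a quick computational check of the identity in Exercise 1 (a sketch, not part of the exercises; the parameters are the same illustrative ones as above):

```python
import numpy as np

# Z_i as a sum of indicators: I_ij = 1 exactly when trial j lands on outcome i.
rng = np.random.default_rng(seed=1)
p = [1/2, 1/3, 1/6]
n = 20
trials = rng.choice([1, 2, 3], size=n, p=p)

counts = []
for i in [1, 2, 3]:
    indicators = (trials == i).astype(int)  # I_i1, ..., I_in
    counts.append(indicators.sum())         # Z_i
print(counts, sum(counts))                  # the Z_i sum to n
```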
Basic arguments using independence and combinatorics can be used to derive the joint, marginal, and conditional densities of the counting variables. In particular, recall the definition of the multinomial coefficient: for nonnegative integers $j_1, j_2, \ldots, j_k$ with $j_1 + j_2 + \cdots + j_k = n$,
$$C(n; j_1, j_2, \ldots, j_k) = \frac{n!}{j_1! \, j_2! \cdots j_k!}.$$

2. Show that for nonnegative integers $j_1, j_2, \ldots, j_k$ with $j_1 + j_2 + \cdots + j_k = n$,
$$P(Z_1 = j_1, Z_2 = j_2, \ldots, Z_k = j_k) = C(n; j_1, j_2, \ldots, j_k) \, p_1^{j_1} p_2^{j_2} \cdots p_k^{j_k}.$$
The distribution of $(Z_1, Z_2, \ldots, Z_k)$ is called the multinomial distribution with parameters $n$ and $(p_1, p_2, \ldots, p_k)$. We also say that $(Z_1, Z_2, \ldots, Z_{k-1})$ has this distribution (recall that the values of $k - 1$ of the counting variables determine the value of the remaining variable). Usually, it is clear from context which meaning of the term multinomial distribution is intended. Again, the ordinary binomial distribution corresponds to $k = 2$.
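The joint density in Exercise 2 is easy to check numerically. Here is a sketch (assuming NumPy; the counts and probabilities are illustrative):

```python
import math
import numpy as np

def multinomial_pmf(counts, p):
    # The density of Exercise 2: C(n; j1, ..., jk) p1^j1 ... pk^jk.
    n = sum(counts)
    coeff = math.factorial(n)
    for j in counts:
        coeff //= math.factorial(j)
    prob = float(coeff)
    for j, pi in zip(counts, p):
        prob *= pi ** j
    return prob

p = [1/2, 1/3, 1/6]
counts = (3, 2, 1)   # j1 + j2 + j3 = n = 6

rng = np.random.default_rng(seed=2)
samples = rng.multinomial(6, p, size=200_000)
empirical = np.mean(np.all(samples == counts, axis=1))
print(multinomial_pmf(counts, p), empirical)   # the two should be close
```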
3. Show that $Z_i$ has the binomial distribution with parameters $n$ and $p_i$:
$$P(Z_i = j) = C(n, j) \, p_i^j (1 - p_i)^{n-j} \quad \text{for } j = 0, 1, \ldots, n.$$
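A simulation sketch of Exercise 3 (assuming NumPy and SciPy; parameters illustrative): the first coordinate of a multinomial sample should match the binomial density.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(seed=3)
n, p = 10, [1/2, 1/3, 1/6]
samples = rng.multinomial(n, p, size=100_000)

z1 = samples[:, 0]   # the count Z_1
for j in range(n + 1):
    # empirical relative frequency vs. binomial(n, p_1) density
    print(j, np.mean(z1 == j), binom.pmf(j, n, p[0]))
```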
The multinomial distribution is preserved when the counting variables are combined. Specifically, suppose that $A_1, A_2, \ldots, A_m$ is a partition of the index set $\{1, 2, \ldots, k\}$ into nonempty subsets. For each $j$, let $W_j$ denote the sum of $Z_i$ over $i \in A_j$, and let $q_j$ denote the sum of $p_i$ over $i \in A_j$.
4. Show that $(W_1, W_2, \ldots, W_m)$ has the multinomial distribution with parameters $n$ and $(q_1, q_2, \ldots, q_m)$.
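A sketch illustrating Exercise 4 (assuming NumPy and SciPy; the partition and parameters are illustrative): with $k = 3$, $p = (1/2, 1/3, 1/6)$, and the partition $A_1 = \{1\}$, $A_2 = \{2, 3\}$, the pair $(W_1, W_2)$ should be multinomial with parameters $n$ and $(1/2, 1/2)$, i.e. $W_1$ should be binomial$(n, 1/2)$.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(seed=4)
n, p = 12, [1/2, 1/3, 1/6]
samples = rng.multinomial(n, p, size=100_000)

w1 = samples[:, 0]                   # W_1 = Z_1
w2 = samples[:, 1] + samples[:, 2]   # W_2 = Z_2 + Z_3 = n - W_1
for j in range(n + 1):
    print(j, np.mean(w1 == j), binom.pmf(j, n, 0.5))
```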
The multinomial distribution is also preserved when some of the counting variables are observed. Specifically, suppose that $(A, B)$ is a partition of the index set $\{1, 2, \ldots, k\}$ into nonempty subsets. Suppose that we observe $Z_j = z_j$ for $j \in B$. Let $z$ denote the sum of $z_j$ over $j \in B$, and let $p$ denote the sum of $p_i$ over $i \in A$.
5. Show that the conditional distribution of $(Z_i : i \in A)$ given $(Z_j = z_j : j \in B)$ is multinomial with parameters $n - z$ and $(p_i / p : i \in A)$.
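A simulation sketch of Exercise 5 (assuming NumPy and SciPy; parameters illustrative): with $k = 3$, $A = \{1, 2\}$, $B = \{3\}$, and the observation $Z_3 = z$, the count $Z_1$ should be conditionally binomial$(n - z, p_1 / p)$ where $p = p_1 + p_2$.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(seed=5)
n, probs = 10, [1/2, 1/3, 1/6]
z = 2   # the observed value of Z_3
samples = rng.multinomial(n, probs, size=500_000)

conditioned = samples[samples[:, 2] == z]   # keep only runs with Z_3 = z
p = probs[0] + probs[1]
for j in range(n - z + 1):
    print(j, np.mean(conditioned[:, 0] == j), binom.pmf(j, n - z, probs[0] / p))
```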
Combinations of the basic results in Exercises 4 and 5 can be used to compute any marginal or conditional distribution of the counting variables.
6. In the
dice experiment, select the number of aces. For each die distribution, start with a single die
and add dice one at a time, noting the shape of the density function. When you get to 10
dice, run the simulation with an update frequency of 10. Note the apparent convergence of
the relative frequency function to the density function.
We will compute the mean, variance, covariance, and correlation of the counting variables. Results from the binomial distribution and the representation in terms of indicator variables are the main tools.
7. Show that $E(Z_i) = n p_i$ and $\operatorname{var}(Z_i) = n p_i (1 - p_i)$.
8. Show that for distinct $i$ and $j$,
$$\operatorname{cov}(Z_i, Z_j) = -n p_i p_j, \qquad \operatorname{cor}(Z_i, Z_j) = -\sqrt{\frac{p_i p_j}{(1 - p_i)(1 - p_j)}}.$$

From Exercise 8, note that the number of times outcome $i$ occurs and the number of times outcome $j$ occurs are negatively correlated, but the correlation does not depend on $n$ or $k$. Does this seem reasonable?
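These moment formulas are easy to check by simulation. A sketch (assuming NumPy; parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=6)
n, p = 10, np.array([1/2, 1/3, 1/6])
samples = rng.multinomial(n, p, size=200_000)

print("mean:", samples.mean(axis=0), "vs", n * p)
print("var: ", samples.var(axis=0), "vs", n * p * (1 - p))

cov = np.cov(samples[:, 0], samples[:, 1])[0, 1]
cor = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print("cov(Z1, Z2):", cov, "vs", -n * p[0] * p[1])
print("cor(Z1, Z2):", cor, "vs",
      -np.sqrt(p[0] * p[1] / ((1 - p[0]) * (1 - p[1]))))
```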
9. Use the result of Exercise 8 to show that if $k = 2$, then the number of times outcome 1 occurs and the number of times outcome 2 occurs are perfectly negatively correlated. Does this seem reasonable?
10. In the
dice experiment, select the number of aces. For each die distribution, start with a single
die and add dice one at a time, noting the shape and location of the mean/standard
deviation bar. When you get to 10 dice, run the simulation with an update frequency of 10.
Note the apparent convergence of the empirical moments to the true moments.
11. Suppose
that we roll 10 fair dice. Find the probability that
12. Suppose
that we roll 4 ace-six flat dice (faces 1 and 6 have probability 1/4 each; faces 2, 3, 4,
and 5 have probability 1/8 each). Find the joint density function of the number of times
each score occurs.
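For a numerical check of Exercise 12, SciPy's multinomial distribution can evaluate the joint density directly. A sketch (the particular vector of counts is an illustrative choice):

```python
from scipy.stats import multinomial

# Joint density of the six counts for n = 4 ace-six flat dice.
p = [1/4, 1/8, 1/8, 1/8, 1/8, 1/4]
dist = multinomial(4, p)

# For example, the probability of rolling one 1, one 2, one 5, and one 6:
print(dist.pmf([1, 1, 0, 0, 1, 1]))
```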
13. In the dice experiment, select 4 ace-six flats. Run the experiment 500 times, updating after each run. Compute the joint relative frequency function of the number of times each score occurs. Compare the relative frequency function with the true density function.
14. Suppose
that we roll 20 ace-six flat dice. Find the covariance and correlation of the number of
1's and the number of 2's.
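Using the formulas of Exercise 8 with $n = 20$, $p_1 = 1/4$ (the probability of a 1) and $p_2 = 1/8$ (the probability of a 2), the answer is a one-line computation; here is a sketch:

```python
import math

n, p1, p2 = 20, 1/4, 1/8
cov = -n * p1 * p2                                  # = -0.625
cor = -math.sqrt(p1 * p2 / ((1 - p1) * (1 - p2)))   # ~ -0.218
print(cov, cor)
```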
15. In the dice experiment, select 20 ace-six flat dice. Run the experiment 500 times, updating after each run. Compute the empirical covariance and correlation of the number of 1's and the number of 2's. Compare the results with the theoretical results computed in Exercise 14.