Virtual Laboratories > Expected Value

1. Definitions and Properties


Expected value is one of the most important concepts in probability. The expected value of a real-valued random variable gives the center of the distribution of the variable, in a special sense. Additionally, by computing expected values of various real transformations of a general random variable, we can extract a number of interesting characteristics of the distribution of the variable, including measures of spread, symmetry, and correlation.

Definitions

As usual, we start with a random experiment that has a sample space and a probability measure P. Suppose that X is a random variable for the experiment, taking values in a subset S of R.

If X has a discrete distribution with density function f then the expected value of X is defined by

E(X) = sum_{x in S} x f(x).

If X has a continuous distribution with density function f then the expected value of X is defined by

E(X) = integral_S x f(x) dx.

Finally, suppose that X has a mixed distribution, with partial discrete density g on D and partial continuous density h on C, where D and C are disjoint, D is countable, and S = D union C. The expected value of X is defined by

E(X) = sum_{x in D} x g(x) + integral_C x h(x) dx.

In any case, the expected value of X may not exist because the sum or the integral may not converge. The expected value of X is also called the mean of the distribution of X and is frequently denoted µ.
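The discrete definition translates directly into code. The sketch below (in Python, with a fair-die density chosen purely for illustration) evaluates E(X) as the weighted sum over S:

```python
# A minimal sketch of the discrete definition E(X) = sum over S of x f(x).
# The density here, that of a fair die score, is an illustrative choice.
f = {x: 1 / 6 for x in range(1, 7)}  # density function f on S = {1, ..., 6}

mean = sum(x * p for x, p in f.items())
print(mean)  # 3.5 (up to floating-point rounding)
```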

Interpretation

The mean is the center of the probability distribution of X in a special way. Indeed, if we think of the distribution as a mass distribution, then the mean is the center of mass as defined in physics. Please recall the other measures of the center of a distribution that we have studied: a mode is any value of x that maximizes f(x). A median is any value of x that satisfies

P(X < x) <= 1/2, P(X <= x) >= 1/2.

To understand expected value in a probabilistic way, suppose that we create a new, compound experiment by repeating the basic experiment over and over again. This gives a sequence of independent random variables,

X1, X2, X3 ...

each with the same distribution as X. In statistical terms, we are sampling from the distribution of X. The average value, or sample mean, after n runs is

Mn = (X1 + X2 + ··· + Xn) / n

The average value Mn converges to the expected value µ as n converges to infinity. The precise statement of this is the law of large numbers, one of the fundamental theorems of probability.
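This convergence can be watched in a quick simulation. The sketch below (seed and run count are illustrative choices) rolls a fair die repeatedly and computes the sample mean, which should be near the distribution mean 3.5:

```python
import random

# A simulation sketch of the law of large numbers: the sample mean M_n of
# repeated fair die rolls should approach the distribution mean mu = 3.5.
random.seed(1)
n = 100_000
m_n = sum(random.randint(1, 6) for _ in range(n)) / n
print(m_n)  # near 3.5
```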

Examples and Special Cases

Mathematical Exercise 1. A constant c can be thought of as a random variable that takes only the value c with probability 1. The corresponding distribution is sometimes called point mass at c. Show that

E(c) = c.

Mathematical Exercise 2. Let I be an indicator random variable (that is, a variable that takes only the values 0 and 1). Show that

E(I) = P(I = 1).

In particular, if IA is the indicator of an event A, then E(IA) = P(A), so in a sense, expected value subsumes probability. For a book that takes expected value, rather than probability, as the fundamental starting concept, see Probability via Expectation, by Peter Whittle.

Mathematical Exercise 3. Suppose that X is uniformly distributed on a finite subset S of R. Show that E(X) is the arithmetic average of the numbers in S.

Mathematical Exercise 4. The score on a fair die is uniformly distributed on {1, 2, 3, 4, 5, 6}. Find the expected score.

Simulation Exercise 5. In the dice experiment, select one fair die. Run the experiment 1000 times, updating every 10 runs, and note the apparent convergence of the sample mean to the distribution mean.

Mathematical Exercise 6. Find the expected score for an ace-six flat die. The density function is

f(1) = 1/4, f(2) = f(3) = f(4) = f(5) = 1/8, f(6) = 1/4

Simulation Exercise 7. In the dice experiment, select one ace-six flat die. Run the experiment 1000 times, updating every 10 runs, and note the apparent convergence of the sample mean to the distribution mean.

Mathematical Exercise 8. Suppose that Y has density function f(n) = p(1 - p)^(n - 1) for n = 1, 2, ..., where 0 < p < 1 is a parameter. This defines the geometric distribution with parameter p. Show that

E(Y) = 1 / p.
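The result in Exercise 8 can be checked numerically by summing a long truncation of the series defining E(Y). The parameter value p = 0.3 and the truncation at 1000 terms are illustrative choices; the neglected tail is negligible.

```python
# Numerically checking E(Y) = 1/p for the geometric density
# f(n) = p (1 - p)^(n - 1), by truncating the series for E(Y).
p = 0.3
approx = sum(n * p * (1 - p) ** (n - 1) for n in range(1, 1000))
print(approx)  # approximately 1/p = 3.333...
```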

Mathematical Exercise 9. Suppose that N has density function f(n) = exp(-t) t^n / n! for n = 0, 1, ..., where t > 0 is a parameter. This defines the Poisson distribution with parameter t. Show that

E(N) = t.

Mathematical Exercise 10. Suppose that X is uniformly distributed on an interval (a, b) of R. Show that the mean is the midpoint of the interval:

E(X) = (a + b) / 2

Mathematical Exercise 11. Suppose that X has density f(x) = 12x^2(1 - x) for 0 < x < 1.

  1. Find E(X).
  2. Find the mode of X.
  3. Find the median of X.
  4. Sketch the graph of f and show the location of the mean, median, and mode on the x-axis.

Mathematical Exercise 12. Suppose that X has the density function f(x) = a / x^(a + 1) for x > 1, where a > 0 is a parameter. This defines the Pareto distribution with shape parameter a. Show that

  1. E(X) = infinity if 0 < a <= 1
  2. E(X) = a / (a - 1) if a > 1.
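Part (b) of Exercise 12 can be checked numerically. The sketch below approximates the integral of x f(x) = a x^(-a) by a midpoint rule on [1, B]; the shape parameter a = 3, the cutoff B, and the step count are illustrative choices.

```python
# A numeric check of E(X) = a/(a-1) for the Pareto density
# f(x) = a / x^(a+1) with a > 1, via a midpoint rule for the
# integral of x f(x) = a x^(-a) over [1, B].
a = 3.0
B, n = 1000.0, 200_000
h = (B - 1.0) / n
approx_mean = sum(a * (1.0 + (i + 0.5) * h) ** (-a) for i in range(n)) * h
print(approx_mean)  # approximately a/(a-1) = 1.5
```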

Simulation Exercise 13. In the random variable experiment, select the Pareto distribution. For the following values of the shape parameter a, run the experiment 1000 times updating every 10 runs. Note the behavior of the empirical mean.

  1. a = 1
  2. a = 2
  3. a = 3

Mathematical Exercise 14. Suppose that T has density f(t) = r exp(-rt) for t > 0, where r > 0 is a parameter. This defines the exponential distribution with rate parameter r.

  1. Show that E(T) = 1 / r.
  2. Show that the mode of T is 0.
  3. Show that the median of T is ln(2) / r.
  4. Sketch the graph of f and show the location of the mean, median, and mode on the t-axis.

Simulation Exercise 15. In the random variable experiment, select the gamma distribution. Set k = 1 to get the exponential distribution. Vary r with the scroll bar and note the position of the mean relative to the graph of the density function. Now with r = 2, run the experiment 1000 times updating every 10 runs. Note the apparent convergence of the sample mean to the distribution mean.

Mathematical Exercise 16. Suppose that X has density f(x) = 1 / [pi(1 + x^2)] for x in R. This defines the Cauchy distribution (named after Augustin Cauchy), a member of the family of t-distributions.

  1. Sketch the graph of f.
  2. Show that E(X) does not exist.
  3. Find the median of X.
  4. Find the mode of X.

Simulation Exercise 17. In the random variable experiment, select the Student t distribution. Set n = 1 to get the Cauchy distribution. Run the simulation 1000 times, updating every 10 runs. Note the behavior of the empirical mean.

Mathematical Exercise 18. Suppose that Z has density f(z) = exp(-z^2 / 2) / (2 pi)^(1/2) for z in R. This defines the standard normal distribution.

  1. Show that E(Z) = 0.
  2. Sketch the graph of f and show E(Z) on the z-axis.

Simulation Exercise 19. In the random variable experiment, select the normal distribution (the default parameter values give the standard normal distribution). Run the simulation 1000 times, updating every 10 runs, and note the apparent convergence of the empirical mean to the true mean.

Change of Variables Theorem

The expected value of a real-valued random variable gives the center of the distribution of the variable. This idea is much more powerful than might first appear. By finding expected values of various functions of a general random variable, we can measure many interesting features of its distribution.

Thus, suppose that X is a random variable taking values in a general set S, and suppose that r is a function from S into R. Then r(X) is a real-valued random variable and we would like to compute E[r(X)]. However, to compute this expected value from the definition would require that we know the density function of the transformed variable r(X) (a difficult problem, in general). Fortunately, there is a much better way, given by the change of variables theorem for expected value.

Mathematical Exercise 20. Show that if X has a discrete distribution with density function f then

E[r(X)] = sum_{x in S} r(x) f(x).

Similarly, if X has a continuous distribution with density function f then

E[r(X)] = integral_S r(x) f(x) dx.

Mathematical Exercise 21. Prove the version of the change of variables theorem when X is continuous and r is discrete (i.e., r has countable range).

Mathematical Exercise 22. Suppose that X is uniformly distributed on (-1, 3).

  1. Find the density of X^2.
  2. Find E(X^2) using the density function in (a).
  3. Find E(X^2) using the change of variables theorem.
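Part (c) of Exercise 22 can be checked numerically. The sketch below computes the change of variables integral E(X^2) = (1/4) times the integral of x^2 over (-1, 3), which equals 7/3, and compares it with a simulation (seed and sample size are illustrative choices).

```python
import random

# Checking the change of variables theorem for X uniform on (-1, 3)
# and r(x) = x^2: compare the exact integral with a simulated mean.
random.seed(2)
exact = (3 ** 3 - (-1) ** 3) / 3 / 4   # (1/4) * [x^3 / 3] from -1 to 3
n = 200_000
sim = sum(random.uniform(-1, 3) ** 2 for _ in range(n)) / n
print(exact, sim)  # both near 7/3 = 2.333...
```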

Mathematical Exercise 23. Suppose that X has density function f(x) = x^2 / 60 for x in {-2, -1, 1, 2, 3, 4, 5}.

  1. Find E(X).
  2. Find the density of X^2.
  3. Find E(X^2) using the density in (b).
  4. Find E(X^2) using the change of variables theorem.

Mathematical Exercise 24. Suppose that X has density function f(x) = 12x^2(1 - x) for 0 < x < 1. Find

  1. E(1/X)
  2. E(X^(1/2))

Mathematical Exercise 25. Suppose that (X, Y) has density function f(x, y) = 2(x + y) for 0 < x < y < 1. Find

  1. E(X)
  2. E(Y)
  3. E(X^2 Y)
  4. E(X^2 + Y^2)

Mathematical Exercise 26. Suppose that X is uniformly distributed on the interval [a, b], and that g is a continuous function from [a, b] into R. Show that E[g(X)] is the average value of g on [a, b], as defined in calculus.

Basic Properties

The exercises below give basic properties of expected value. These properties are true in general, but restrict your proofs to the discrete and continuous cases separately; the change of variables theorem is the main tool you will need. In these exercises X and Y are real-valued random variables for an experiment, c is a constant, and we assume that the indicated expected values exist.

Mathematical Exercise 27. Show that E(X + Y) = E(X) + E(Y)

Mathematical Exercise 28. Show that E(cX) = cE(X).

Thus, as a consequence of the last two exercises,

E(aX + bY) = aE(X) + bE(Y)

for constants a and b; in words, expected value is a linear operation.

Mathematical Exercise 29. Show that if X >= 0 (with probability 1) then E(X) >= 0.

Mathematical Exercise 30. Show that if X <= Y (with probability 1) then E(X) <= E(Y).

Mathematical Exercise 31. Show that |E(X)| <= E(|X|).

The results of these exercises are so basic that it is important to understand them on an intuitive level. Indeed, these properties are in some sense implied by the interpretation of expected value given in the law of large numbers.

Mathematical Exercise 32. Suppose that X and Y are independent. Show that

E(XY) = E(X)E(Y)

The last exercise shows that independent random variables are uncorrelated.
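The product rule of Exercise 32 is easy to see in simulation. The sketch below uses two independent fair die scores, so E(XY) should be near 3.5 * 3.5 = 12.25 (seed and sample size are illustrative choices).

```python
import random

# A simulation sketch of E(XY) = E(X)E(Y) for independent X and Y,
# where X and Y are independent fair die scores.
random.seed(3)
n = 200_000
mean_xy = sum(random.randint(1, 6) * random.randint(1, 6) for _ in range(n)) / n
print(mean_xy)  # near 3.5 * 3.5 = 12.25
```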

Mathematical Exercise 33. A pair of fair dice are thrown, and the scores (X1, X2) recorded. Find the expected value of

  1. Y = X1 + X2.
  2. Z = X1X2.
  3. U = min{X1, X2}
  4. V = max{X1, X2}.

Mathematical Exercise 34. Suppose that E(X) = 5 and E(Y) = -2. Find E(3X + 4Y - 7).

Mathematical Exercise 35. Suppose that X and Y are independent, and that E(X) = 5, E(Y) = -2. Find

E[(3X - 4)(2Y + 7)]

Mathematical Exercise 36. Suppose that there are 5 duck hunters, each a perfect shot. A flock of 10 ducks fly over, and each hunter selects one duck at random and shoots. Find the expected number of ducks killed. Hint: Express the number of ducks killed as a sum of indicator random variables.

For a more complete analysis of the duck hunter problem, see The Number of Distinct Sample Values in the chapter on Finite Sampling Models.
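The duck hunter problem can also be checked by simulation: each of the 5 hunters independently picks one of the 10 ducks, and the number killed is the number of distinct ducks chosen. The indicator argument gives the expected value 10 (1 - (9/10)^5), approximately 4.095. Seed and run count below are illustrative choices.

```python
import random

# A simulation sketch of the duck hunter problem: 5 perfect shots,
# each aimed at one of 10 ducks chosen uniformly at random.
random.seed(4)
runs = 100_000
total = sum(len({random.randrange(10) for _ in range(5)}) for _ in range(runs))
print(total / runs)  # near 10 * (1 - 0.9**5) = 4.0951
```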

Moments

If X is a random variable, a is a real number, and n > 0, then the nth moment of X about a is defined to be

E[(X - a)^n].

The moments about 0 are simply referred to as moments. The moments about µ = E(X) are the central moments. The second central moment is particularly important and is studied in detail in the section on variance. In some cases, if we know all of the moments of X, we can determine the entire distribution of X. This idea is explored in the section on generating functions.

Mathematical Exercise 37. Suppose that X is uniformly distributed on an interval (a, b). Find a general formula for the moments of X.

Mathematical Exercise 38. Suppose that X has density f(x) = 12x^2(1 - x), 0 < x < 1. Find a general formula for the moments of X.

Mathematical Exercise 39. Suppose that X has a continuous distribution with density f that is symmetric about a:

f(a + t) = f(a - t) for any t

Show that if E(X) exists, then E(X) = a.

Nonnegative Variables

Mathematical Exercise 40. Let X be a nonnegative random variable for an experiment, either discrete or continuous. Show that

E(X) = integral_{x > 0} P(X > x) dx.

Hint: In the representation above, express P(X > x) in terms of the density of X, as a sum in the discrete case or an integral in the continuous case. Then interchange the integral and the sum (in the discrete case) or the two integrals (in the continuous case).
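The tail formula of Exercise 40 is easy to verify numerically for a specific distribution. The sketch below uses the exponential distribution with rate r, relying on the standard tail formula P(X > x) = exp(-rx); the cutoff B and step count are illustrative choices.

```python
import math

# A numeric check of E(X) = integral over x > 0 of P(X > x) dx for the
# exponential distribution with rate r, where P(X > x) = exp(-r x).
r = 2.0
B, n = 20.0, 200_000
h = B / n
tail_integral = sum(math.exp(-r * (i + 0.5) * h) for i in range(n)) * h
print(tail_integral)  # approximately 1/r = 0.5
```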

Mathematical Exercise 41. Prove Markov's inequality (named after Andrei Markov): If X is a nonnegative random variable, then for t > 0,

P(X >= t) <= E(X) / t.

Hint: Let I_t denote the indicator variable of the event {X >= t}. Show that t I_t <= X. Then take expected values through the inequality.

Mathematical Exercise 42. Use the result of Exercise 40 to prove the change of variables formula when the random vector X has a continuous distribution and r is nonnegative.

Mathematical Exercise 43. Use the result of Exercise 40 to show that if X is nonnegative and E(X) = 0 then P(X = 0) = 1.

The following result is similar to Exercise 40, but is specialized to nonnegative integer valued variables:

Mathematical Exercise 44. Suppose that N is a discrete random variable that takes values in the set of nonnegative integers. Show that

E(N) = sum_{n = 0, 1, ...} P(N > n) = sum_{n = 1, 2, ...} P(N >= n).

Hint: In the first representation, express P(N > n) as a sum in terms of the density function of N. Then interchange the two sums. The second representation can be obtained from the first by a change of variables in the summation index.
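The discrete tail formula of Exercise 44 can also be checked numerically. The sketch below uses the geometric distribution, relying on the standard tail formula P(Y >= n) = (1 - p)^(n - 1); the parameter p = 0.25 and the truncation at 500 terms are illustrative choices.

```python
# A numeric check of E(N) = sum over n >= 1 of P(N >= n) for the
# geometric distribution, where P(Y >= n) = (1 - p)^(n - 1).
p = 0.25
tail_sum = sum((1 - p) ** (n - 1) for n in range(1, 500))
print(tail_sum)  # approximately 1/p = 4.0
```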

Mathematical Exercise 45. Suppose that X has the density function f(x) = r exp(-rx) for x > 0, where r > 0 is a parameter. This defines the exponential distribution with rate parameter r.

  1. Find E(X) using the definition.
  2. Find E(X) using the formula in Exercise 40.
  3. Compute both sides of Markov's inequality.

Mathematical Exercise 46. Suppose that Y has density function g(n) = (1 - p)^(n - 1) p for n = 1, 2, ..., where 0 < p < 1 is a parameter. This defines the geometric distribution with parameter p.

  1. Find E(Y) using the definition.
  2. Find E(Y) using the formula in Exercise 44.
  3. Compute both sides of Markov's inequality.

A General Definition

The result in Exercise 40 can be used as the basis of a general formulation of expected value that works for discrete, continuous, or even mixed distributions. First, the result in Exercise 40 is taken as the definition of E(X) if X is nonnegative.

Next, for a real number x, we define the positive and negative parts of x as follows: x+ = max{x, 0} and x- = max{-x, 0}.

Mathematical Exercise 47. Show that

  1. x+ >= 0, x- >= 0.
  2. x = x+ - x-.
  3. |x| = x+ + x-.

Finally, if X is a random variable, then X+ and X-, the positive and negative parts of X, are nonnegative random variables. Thus, assuming that E(X+) or E(X-) (or both) is finite, we can define

E(X) = E(X+) - E(X-)

Jensen's Inequality

Our next sequence of exercises will establish an important inequality known as Jensen's inequality, named for Johan Jensen. First we need a definition. A real-valued function g defined on an interval S of R is said to be convex on S if for each x0 in S, there exist numbers a and b (that may depend on x0) such that

ax0 + b = g(x0), ax + b <= g(x) for x in S.

Mathematical Exercise 48. Interpret the definition of convex function geometrically. The line y = ax + b is called a supporting line at x0.

You may be more familiar with convexity in terms of the following theorem from calculus:

Mathematical Exercise 49. Show that g is convex on S if g has a continuous, non-negative second derivative on S. Hint: Show that the tangent line at x0 is a supporting line at x0.

Mathematical Exercise 50. Prove Jensen's inequality: If X takes values in an interval S and g is convex on S, then

E[g(X)] >= g[E(X)]

Hint: In the definition of convexity given above, let x0 = E(X) and replace x with X. Then take expected values through the inequality.

Mathematical Exercise 51. Suppose that X has density function f(x) = a / x^(a + 1) for x > 1, where a > 1 is a parameter. This defines the Pareto distribution with shape parameter a.

  1. Find E(X) using the formula in Exercise 40.
  2. Find E(1/X).
  3. Show that g(x) = 1/x is convex on (0, infinity).
  4. Verify Jensen's inequality by comparing the results of parts (a) and (b).
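A numeric version of the check in Exercise 51 is sketched below with the illustrative choice a = 2, for which E(X) = a/(a-1) = 2 and E(1/X) = a/(a+1) = 2/3. Jensen's inequality then reads E[1/X] = 2/3 >= 1/E(X) = 1/2. The midpoint-rule cutoff B and step count are also illustrative choices.

```python
# A numeric check of Jensen's inequality for g(x) = 1/x and the Pareto
# distribution with a = 2: (1/x) f(x) = a x^(-(a+2)), integrated over [1, B].
a = 2.0
B, n = 1000.0, 200_000
h = (B - 1.0) / n
e_inv_x = sum(a * (1.0 + (i + 0.5) * h) ** (-(a + 2)) for i in range(n)) * h
print(e_inv_x)  # approximately a/(a+1) = 2/3, which exceeds g(E(X)) = 1/2
```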

Jensen's inequality extends easily to higher dimensions. The 2-dimensional version is particularly important, because it will be used to derive several special inequalities in the next section. First, a subset S of R2 is convex if

u, v in S and p in [0, 1] implies (1 - p)u + pv in S.

Next, a real-valued function g on S is said to be convex if for each (x0, y0) in S, there exist numbers a, b, and c (depending on (x0, y0)) such that

ax0 + by0 + c = g(x0, y0), ax + by + c <= g(x, y) for (x, y) in S.

Mathematical Exercise 52. Interpret the definitions of convex set and convex function geometrically. The plane z = ax + by + c is called a supporting plane at (x0, y0).

From calculus, g is convex on S if g has continuous second derivatives on S and a nonnegative definite second derivative matrix:

gxx >= 0, gyy >= 0, gxx gyy - gxy^2 >= 0 on S.

Mathematical Exercise 53. Prove Jensen's inequality: If (X, Y) takes values in a convex set S and g is convex on S then

E[g(X, Y)] >= g[E(X), E(Y)].

Hint: In the definition of convexity, let x0 = E(X), y0 = E(Y), and replace x with X, y with Y. Then take expected values through the inequality.

Mathematical Exercise 54. Suppose that (X, Y) has density function f(x, y) = 2(x + y) for 0 < x < y < 1.

  1. Show that g(x, y) = x^2 + y^2 is convex on the domain of f.
  2. Compute E(X^2 + Y^2).
  3. Compute [E(X)]^2 + [E(Y)]^2.
  4. Verify Jensen's inequality by comparing (b) and (c).

In both the one and two-dimensional cases, a function g is concave if the inequality in the definition is reversed. Jensen's inequality also reverses.

Mathematical Exercise 55. Suppose that x1, x2, ..., xn are positive numbers. Show that the arithmetic mean is at least as large as the geometric mean:

(x1 x2 ··· xn)^(1/n) <= (x1 + x2 + ··· + xn) / n.

Hint: Let X be uniformly distributed on {x1, x2, ..., xn} and let g(x) = ln(x).
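The arithmetic mean-geometric mean inequality of Exercise 55 is easy to check numerically on arbitrary positive inputs. The seed, count, and range below are illustrative choices.

```python
import math
import random

# A numeric check of the AM-GM inequality on randomly chosen positives.
random.seed(5)
xs = [random.uniform(0.1, 10.0) for _ in range(20)]
gm = math.exp(sum(math.log(x) for x in xs) / len(xs))  # geometric mean
am = sum(xs) / len(xs)                                 # arithmetic mean
print(gm, am)  # gm <= am
```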

Conditional Expected Value

The expected value of a random variable X is based, of course, on the probability measure P for the experiment. This probability measure could be a conditional probability measure, conditioned on a given event B for the experiment (with P(B) > 0). The usual notation is E(X | B), and this expected value is computed by the definitions given at the beginning of this page, except that the conditional density f(x | B) replaces the ordinary density f(x). It is very important to realize that, except for notation, no new concepts are involved. The results we have established for expected value in general have analogues for these conditional expected values.

Mathematical Exercise 56. Suppose that X has the density function f(x) = r exp(-rx) for x > 0, where r > 0 is a parameter. This defines the exponential distribution with rate parameter r. For fixed t > 0, find

E(X | X > t).
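Exercise 56 can be explored by simulation: estimate E(X | X > t) by averaging only the samples that exceed t. By the memoryless property of the exponential distribution, the result should be near t + 1/r. The seed, parameter values, and sample size below are illustrative choices.

```python
import random

# A simulation sketch of E(X | X > t) for an exponential variable with
# rate r: average only the samples that exceed t.
random.seed(6)
r, t = 2.0, 1.0
cond = [x for x in (random.expovariate(r) for _ in range(400_000)) if x > t]
cond_mean = sum(cond) / len(cond)
print(cond_mean)  # near t + 1/r = 1.5
```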

Mathematical Exercise 57. Suppose that Y has density function g(n) = (1 - p)^(n - 1) p for n = 1, 2, ..., where 0 < p < 1 is a parameter. This defines the geometric distribution with parameter p. Find

E(Y | Y is even).

Mathematical Exercise 58. Suppose that (X, Y) has density function f(x, y) = x + y for 0 < x < 1, 0 < y < 1. Find

E(XY | Y > X).

More generally, the conditional expected value of a random variable, given the value of another random variable, is a very important topic that is treated in a separate section.