
4. Optimal Strategies


A Condition for Optimality

Recall that the stopping rule for red and black is to continue playing until the gambler is ruined or his fortune reaches the target fortune a. Thus, the gambler's strategy is to decide how much to bet on each trial before he must stop. Suppose that we have a class of strategies that correspond to certain valid fortunes and bets:

A: set of fortunes, Bx: set of valid bets for x ∈ A.

For example, sometimes (as with timid play) we might want to restrict the fortunes to integers from 0 to a; other times (as with bold play) we might want to use the interval [0, 1] as the fortune space. As for the bets, we will always assume that the gambler cannot bet what he does not have and does not bet more than he needs in order to reach the target. Thus, we at least have the conditions

x ∈ A, y ∈ Bx implies 0 ≤ y ≤ min{x, a - x}.

Moreover, we always restrict our attention to strategies for which the stopping time N is finite.

A strategy with win probability function V is optimal if for any other strategy with win probability function U we have

U(x) ≤ V(x) for x ∈ A.

Mathematical Exercise 1. Show that if there exists an optimal strategy, then the optimal win probability function is unique.

However, there may not exist an optimal strategy or there may be several optimal strategies. Moreover, the optimality question depends on the value of the trial win probability p, in addition to the structure of fortunes and bets.

Suppose now that S is a strategy with win probability function V. Our goal in this section is to show that if

pV(x + y) + qV(x - y) ≤ V(x) for x ∈ A, y ∈ Bx,

then S is optimal.
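To make the condition concrete, here is a minimal Python sketch (an illustration only, not part of the development) that checks the condition in the fair case p = 1 / 2 on the integer fortune space, using the familiar gambler's ruin fact that the timid play win probability is then V(x) = x / a:

  p = q = 0.5
  a = 16

  def V(x):
      return x / a          # timid play win probability in the fair case (gambler's ruin)

  print(all(p * V(x + y) + q * V(x - y) <= V(x) + 1e-12
            for x in range(a + 1) for y in range(min(x, a - x) + 1)))   # True

In this case the condition holds with equality for every fortune and bet; this is the content of Exercise 8 below.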

Mathematical Exercise 2. Consider the following strategy: if the initial fortune is x in A, we pick y in Bx and then bet y on the first trial; thereafter we follow strategy S. Condition on the outcome of the first trial to show that the win probability function for this new strategy is

U(x) = pV(x + y) + qV(x - y).

Thus, the theorem we are trying to prove in this section can be restated as follows: If S is optimal with respect to the class of strategies in Exercise 2, then S is optimal over all strategies.

Suppose now that the optimality condition holds, and let T be an arbitrary strategy with win probability function U. The random variable V(Xn) can be interpreted as the probability of winning if the gambler follows strategy T for the first n trials and strategy S thereafter.

Mathematical Exercise 3. Condition on the outcome of trial n to show that

E[V(Xn) | X0 = x] = E[pV(Xn-1 + Yn) + qV(Xn-1 - Yn) | X0 = x], where Xn-1 is the fortune after n - 1 trials and Yn is the amount bet on trial n.

Mathematical Exercise 4. Use the result of Exercise 3 and the optimality condition to show that for n = 1, 2, ...

E[V(Xn) | X0 = x] ≤ E[V(Xn-1) | X0 = x].

Mathematical Exercise 5. Use the result of Exercise 4 to show that

E[V(Xn) | X0 = x] ≤ V(x) for n = 1, 2, ...

Mathematical Exercise 6. Take the limit as n increases to infinity in Exercise 5 to show that

E[V(XN) | X0 = x] ≤ V(x), where N is the stopping time for strategy T. (Recall that N is finite and that Xn = XN for n ≥ N, so the bounded convergence theorem applies.)

Mathematical Exercise 7. Show that E[V(XN) | X0 = x] = U(x). (Recall that XN takes only the values 0 and a, and that V(0) = 0 and V(a) = 1.)

Finally from Exercises 6 and 7 we have shown that strategy S is indeed optimal:

U(x) ≤ V(x) for x ∈ A.

Favorable Trials with a Minimum Bet

Suppose now that p ≥ 1 / 2 so that the trials are favorable (or at least not unfair) to the gambler. We will show that if the house requires that all bets be multiples of a basic unit (which of course real gambling houses do), then an optimal strategy is for the gambler to play timidly, making the minimum bet on each trial until he must stop.

First, let us assume that all bets must be multiples of a basic unit, which we might as well assume is $1. Thus the sets of valid fortunes and bets are

A = {0, 1, ..., a}, Bx = {0, 1, ..., min{x, a - x}}.

Let f denote the win probability function for timid play. To show that timid play is optimal, we just need to verify that the condition for optimality holds, which in this case becomes

pf(x + y) + qf(x - y) ≤ f(x) for x in A, y in Bx.

Mathematical Exercise 8. Show that the optimality condition holds if p = 1 / 2. (Recall that f(x) = x / a in this case.)

Mathematical Exercise 9. If p > 1 / 2, recall that f(x) = [1 - (q / p)^x] / [1 - (q / p)^a]. Show that the condition for optimality is equivalent to

p(q / p)^(x + y) + q(q / p)^(x - y) ≥ (q / p)^x.

Mathematical Exercise 10. Show that the inequality in the last exercise is equivalent to

pq(p^y - q^y)(p^(y - 1) - q^(y - 1)) ≥ 0.

The inequality in the last exercise holds when p > 1 / 2: for y = 0 the left side is 0, and for y ≥ 1 both factors p^y - q^y and p^(y - 1) - q^(y - 1) are nonnegative since p > q. Hence timid play is optimal when the trials are favorable.
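As a numerical sanity check (an illustration only), the following sketch computes f from the standard gambler's ruin formula and verifies the optimality condition over all valid fortunes and bets; the values a = 16 and p = 0.55 are arbitrary choices:

  a, p = 16, 0.55
  q = 1 - p

  def f(x):
      # gambler's ruin: timid play win probability when p != 1/2
      return (1 - (q / p) ** x) / (1 - (q / p) ** a)

  print(all(p * f(x + y) + q * f(x - y) <= f(x) + 1e-12
            for x in range(a + 1) for y in range(min(x, a - x) + 1)))   # True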

Simulation Exercise 11. In the red and black game set a = 16, x = 8, and p = 0.45. Define the strategy of your choice and play 100 games. Compare your relative frequency of wins with the probability of winning with timid play.
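For readers who want to try the exercise outside the applet, here is a minimal simulation sketch; the strategy shown (bet half of the allowed maximum, but at least one unit) is just an arbitrary choice, and the same simulator can be reused for Exercise 24 by plugging in a bold strategy:

  import random

  def play(x, a, p, strategy):
      # play one game of red and black; return True if the target a is reached
      while 0 < x < a:
          y = strategy(x, a)                    # amount bet on this trial
          x += y if random.random() < p else -y
      return x == a

  def my_strategy(x, a):
      # arbitrary choice: half of the maximum allowed bet, but at least one unit
      return max(1, min(x, a - x) // 2)

  a, x, p, n = 16, 8, 0.45, 100
  wins = sum(play(x, a, p, my_strategy) for _ in range(n))
  q = 1 - p
  timid = (1 - (q / p) ** x) / (1 - (q / p) ** a)    # timid play win probability
  print(wins / n, timid)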

Favorable Trials without a Minimum Bet

We will now assume that the house allows arbitrarily small bets and that p > 1 / 2, so that the trials are strictly favorable. In this case it is natural to take the target as the monetary unit, so that the sets of fortunes and bets become

A = [0, 1], Bx = [0, min{x, 1 - x}] for x ∈ A.

We will show that V(x) = 1 for x in (0, 1], where V denotes the optimal win probability function. Our results for timid play will play an important role in our analysis, so we will let f(j; a) denote the probability of reaching an integer target a, starting at the integer j in [0, a], with unit bets.

First, let us fix a rational initial fortune x = k / n in [0, 1].

Mathematical Exercise 12. Let m be a positive integer. Suppose that, starting at x, the gambler bets 1 / (mn) on each trial. Show that this is equivalent to timid play with target mn and initial fortune mk, and hence that the probability of reaching the target 1 is f(mk; mn).

Mathematical Exercise 13. Show that

f(mk; mn) converges to 1 as m increases to infinity.

Mathematical Exercise 14. Use the results of Exercises 12 and 13 to show that

V(x) = 1 if x ∈ (0, 1] is rational.

Mathematical Exercise 15. Use the result of the last exercise and the fact that V is increasing to show that

V(x) = 1 for all x ∈ (0, 1].
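To illustrate the limit in Exercise 13 numerically (the choices p = 0.55 and x = k / n = 1 / 4 below are arbitrary), here is a short sketch:

  p = 0.55
  q = 1 - p

  def f(j, a):
      # gambler's ruin: probability of reaching a from j with unit bets, p != 1/2
      return (1 - (q / p) ** j) / (1 - (q / p) ** a)

  k, n = 1, 4                        # initial fortune x = k / n = 1/4
  for m in (1, 2, 5, 10, 50, 100):
      print(m, f(m * k, m * n))      # the values increase toward 1 as the bet 1 / (m n) shrinks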

Unfair Trials

We will now assume that p ≤ 1 / 2 so that the trials are unfair, or at least not favorable. We will show that bold play is optimal. As before, we will take the target fortune as the basic monetary unit and allow any valid fraction of this unit as a bet. Thus, the sets of fortunes and bets are

A = [0, 1], Bx = [0, min{x, 1 - x}] for x ∈ A.

Let F denote the win probability function for bold play. To show that bold play is optimal, we just need to verify the general condition for optimality.

Mathematical Exercise 16. Show that the optimality condition is equivalent to

D(r, s) = F[(r + s) / 2] - pF(s) - qF(r) ≥ 0 for 0 ≤ r ≤ s ≤ 1.

Mathematical Exercise 17. Use the continuity of F to show that it suffices to prove the inequality in Exercise 16 when r and s are binary rationals.

We will now use induction on m to show that the inequality in Exercise 16 holds when r and s are binary rational with rank m or less, where m = 0, 1, ...

Mathematical Exercise 18. Show that the inequality in Exercise 16 holds when r and s have rank 0; that is, show that the inequality holds for

  1. r = 0, s = 0,
  2. r = 0, s = 1,
  3. r = 1, s = 1.

Suppose now that the inequality in Exercise 16 holds when r and s have rank m or less, for a particular m. Suppose that r and s have rank m + 1 or less. We will show that the inequality holds in each of the four cases

  a. r ≤ s ≤ 1 / 2
  b. 1 / 2 ≤ r ≤ s
  c. r ≤ (r + s) / 2 ≤ 1 / 2 ≤ s
  d. r ≤ 1 / 2 ≤ (r + s) / 2 ≤ s

The basic functional equation for F, derived in the section on bold play, will be the main tool: F(x) = pF(2x) for 0 ≤ x ≤ 1 / 2, and F(x) = p + qF(2x - 1) for 1 / 2 ≤ x ≤ 1.

Mathematical Exercise 19. Show that in case (a), D(r, s) = pD(2r, 2s)

Mathematical Exercise 20. Show that in case (b), D(r, s) = qD(2r - 1, 2s - 1)

Mathematical Exercise 21. For case (c), fill in the details of the following steps:

  1. D(r, s) = pF(r + s) - p[p + qF(2s - 1)] - q[pF(2r)]
  2. 1 / 2 ≤ r + s ≤ 1 so F(r + s) = p + qF(2r + 2s - 1)
  3. 0 ≤ r + s - 1 / 2 ≤ 1 / 2 so F(r + s - 1 / 2) = pF(2r + 2s - 1)
  4. D(r, s) = q[F(r + s - 1 / 2) - pF(2s - 1) - pF(2r)]
  5. If 2s - 1 ≤ 2r then D(r, s) = q(q - p)F(2s - 1) + qD(2s - 1, 2r)
  6. If 2r ≤ 2s - 1 then D(r, s) = q(q - p)F(2r) + qD(2r, 2s - 1)

Mathematical Exercise 22. For case (d), fill in the details of the following steps:

  1. D(r, s) = [p + qF(r + s - 1)] - p[p + qF(2s - 1)] - q[pF(2r)]
  2. 0 ≤ r + s - 1 ≤ 1 / 2 so F(r + s - 1) = pF(2r + 2s - 2)
  3. 1 / 2 ≤ r + s - 1 / 2 ≤ 1 so F(r + s - 1 / 2) = p + qF(2r + 2s - 2)
  4. D(r, s) = p(q - p) + p[F(r + s - 1 / 2) - qF(2s - 1) - qF(2r)]
  5. If 2s - 1 ≤ 2r then D(r, s) = p(q - p)[1 - F(2r)] + pD(2s - 1, 2r)
  6. If 2r ≤ 2s - 1 then D(r, s) = p(q - p)[1 - F(2s - 1)] + pD(2r, 2s - 1)

Mathematical Exercise 23. Use the induction hypothesis and the results of the previous exercises to finish the proof that bold play is optimal when the trials are unfair.
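As an exact-arithmetic sanity check (an illustration only, not part of the proof), the sketch below computes F for binary rationals directly from the bold play functional equation, verifies the inequality of Exercise 16 for all binary rationals of rank 6 or less, and verifies the identity in step 5 of Exercise 21 for one pair in case (c); the choice p = 2 / 5 is arbitrary, subject to p ≤ 1 / 2:

  from fractions import Fraction as Fr

  p = Fr(2, 5)                            # any p <= 1/2; q = 1 - p
  q = 1 - p

  def F(x):
      # bold play win probability at a binary rational x, via the functional equation
      if x <= 0: return Fr(0)
      if x >= 1: return Fr(1)
      return p * F(2 * x) if x <= Fr(1, 2) else p + q * F(2 * x - 1)

  def D(r, s):
      return F((r + s) / 2) - p * F(s) - q * F(r)

  pts = [Fr(k, 64) for k in range(65)]    # all binary rationals of rank 6 or less
  print(all(D(r, s) >= 0 for r in pts for s in pts if r <= s))              # True

  r, s = Fr(3, 16), Fr(9, 16)             # a pair in case (c), with 2s - 1 <= 2r
  print(D(r, s) == q * (q - p) * F(2 * s - 1) + q * D(2 * s - 1, 2 * r))    # True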

Simulation Exercise 24. In the red and black game, set a = 16, x = 8, and p = 0.45. Define the strategy of your choice and play 100 games. Compare your relative frequency of wins with the probability of winning with bold play.