The Gambler’s Ruin is a classic problem in probability that goes as follows: Say
we have two gamblers A and B, who are competing against each other in a game.
Each gambler initially starts with some amount of money, and at each round of
the game, one player either wins a dollar from the other player, or loses a
dollar to the other player. They play as many independent rounds as they need
until one player runs out of money (gets ruined).
To begin formulating this problem, let’s define some variables:
Let player A start with i dollars and player B starts with (N−i)
dollars. Therefore N, the total amount of dollars in the game, is constant and
does NOT change throughout the game (make sure to convince yourself of this
before moving on).
Let player A have probability p of winning any given round and player B have
probability q=1−p of winning any given round. In a fair game p=q=0.5.
Given that player A has i dollars at the start of an arbitrary round, let us define x_i as the probability that player A wins the entire game (all N dollars). This is a convenient notation because if player A loses that first round, then the probability that player A wins the game from the next round would be x_{i-1}, because they lost 1 dollar in the previous round (if they won the first round, it would instead be x_{i+1}). This is the probability that we want to solve for. That is, we want to know at any point in time what x_i is, given the starting state defined by the variables p, i, and N.
As a side note, here are two definitions and a theorem that we will need later.
Mutually Exclusive: Two events E_0 and E_1 are mutually exclusive if they cannot occur at the same time. That is, E_0 ∩ E_1 = ∅.
Exhaustive: Given a sample space S, events E_0, E_1, …, E_n are exhaustive if

$$\bigcup_{i=0}^{n} E_i = S$$

Theorem (Law of Total Probability): Suppose E_0, E_1, …, E_n are mutually exclusive and exhaustive events in a sample space S. Then for any event A ⊆ S it must be that

$$P(A) = \sum_{i=0}^{n} P(A \mid E_i)\,P(E_i)$$
Now, back to the problem at hand. Previously it was established that we want to solve for x_i, the probability that player A wins the game starting with i dollars. First, notice that we also have two boundary conditions, x_0 and x_N. x_0, the probability that player A wins when they have 0 dollars, is 0 (because they have already lost). x_N, the probability that player A wins when they have N dollars, is 1 (because they have already won).
This is progress, but now let’s dig deeper and define two events:
W: The event that A wins a round.
L: The event that A loses a round.
Notice that events W and L are mutually exclusive and exhaustive. Mutually exclusive because if A wins a round, then they cannot also lose that round. Exhaustive because player A can either win or lose a round, nothing more and nothing less. Therefore we can apply the law of total probability (defined above) to get the following equation for x_i:
$$x_i = P(x_i \mid W)\,P(W) + P(x_i \mid L)\,P(L) = x_{i+1}\,p + x_{i-1}\,q$$
Let's break this down a bit. We can interpret P(x_i | W) as "the probability that player A wins the whole game when starting with i dollars (x_i), given that player A wins the first round (event W)". Therefore P(x_i | W) = x_{i+1}, and likewise P(x_i | L) = x_{i-1}.
As a reminder, p is the probability that player A wins any round and q = 1 − p is the probability that player A will lose a single round (or the probability that player B wins a round). Typically p = q = 0.5 in a fair game.
Equations of this form are called difference equations (not to be confused
with differential equations). I will cover two ways we can solve this equation:
A clever and more brute force method, and a more principled method that involves
guessing a solution to the difference equation.
Method 1: Clever Brute Force
First, notice that because p + q = 1, we can write x_i = (p + q)x_i = x_{i+1}p + x_{i-1}q and rearrange the difference equation into the following form:

$$x_{i+1} - x_i = \frac{q}{p}\left(x_i - x_{i-1}\right)$$
This form is still a bit confusing, so let’s plug in a couple values to see if we
can find any sort of pattern that arises from this equation. First let’s try i=1:
$$x_2 - x_1 = \frac{q}{p}\left(x_1 - x_0\right) = \frac{q}{p}\,x_1$$
Now, let’s try i=2:
$$x_3 - x_2 = \frac{q}{p}\left(x_2 - x_1\right) = \frac{q}{p}\left(\frac{q}{p}\,x_1\right) = \left(\frac{q}{p}\right)^2 x_1$$
Immediately we can see a pattern arise. Feel free to try i = 3 on your own, but it turns out this equation generalizes to:
$$x_{i+1} - x_i = \left(\frac{q}{p}\right)^i x_1$$
This will be an important equation for later. For now, we still need one more piece of the puzzle before things start coming together. Notice one more thing: since x_0 = 0, we can write x_i as a telescoping sum and apply the geometric series formula:

$$x_i = x_i - x_0 = \sum_{k=0}^{i-1}\left(x_{k+1} - x_k\right) = x_1 \sum_{k=0}^{i-1}\left(\frac{q}{p}\right)^k = x_1\,\frac{1 - (q/p)^i}{1 - (q/p)}$$

Although, notice that the last step is actually not defined in the case that p = q = 0.5, since the denominator becomes 0. Thus, we need to go back to the original series and see that in the case that p = q = 0.5 every term of the sum is 1, so we get the following (feel free to verify this on your own):

$$x_i = \begin{cases} x_1\,\dfrac{1 - (q/p)^i}{1 - (q/p)} & \text{if } p \neq q \\[1.5ex] x_1\, i & \text{if } p = q \end{cases}$$
Dealing with just the p ≠ q case first, let's try to figure out x_1. To do so, remember that we know the solutions at the boundaries, so let's plug in the case in which i = N:

$$x_N = 1 = x_1\,\frac{1 - (q/p)^N}{1 - (q/p)} \implies x_1 = \frac{1 - (q/p)}{1 - (q/p)^N}$$

Substituting x_1 back in then gives:

$$x_i = \frac{1 - (q/p)^i}{1 - (q/p)^N}$$
This is the final solution to the case where p ≠ q. To handle the case in which p = q = 0.5, plugging i = N into x_N = x_1 N = 1 gives x_1 = 1/N. Plugging this back into the equation above we get x_i = i/N, and thus the solution is:
$$x_i = \begin{cases} \dfrac{1 - (q/p)^i}{1 - (q/p)^N} & \text{if } p \neq q \\[1.5ex] \dfrac{i}{N} & \text{if } p = q \end{cases}$$
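As a quick sanity check, this closed form respects both boundary conditions:

$$x_0 = \frac{1 - (q/p)^0}{1 - (q/p)^N} = 0, \qquad x_N = \frac{1 - (q/p)^N}{1 - (q/p)^N} = 1,$$

and likewise x_0 = 0/N = 0 and x_N = N/N = 1 in the fair case.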
Method 2: Principled Difference Equation Solution
Another way we can solve for x_i from the form x_i = x_{i+1}p + x_{i-1}q is to notice that this equation is a pretty standard difference equation (the discrete analog of a differential equation). Thus, we can use textbook ideas from differential equations to solve for x_i.
To do so, let's guess a possible solution, and first try one of the form x_i = x^i. Then:

$$x^i = x^{i+1}p + x^{i-1}q$$

Dividing through by x^{i-1} gives the characteristic equation p x^2 − x + q = 0, which we can solve with the quadratic formula.
Simplifying a bit, notice that 1 − 4pq = 1 − 4p(1 − p) = 1 − 4p + 4p^2 = (2p − 1)^2. Therefore:
$$x = \frac{1 \pm \sqrt{1 - 4pq}}{2p} = \frac{1 \pm (2p - 1)}{2p}$$
This has two solutions, 1 and q/p. Therefore, we should take a linear combination of these roots. This gives a general solution of:
$$x_i = C_0 + C_1\left(\frac{q}{p}\right)^i$$
For some constants C_0 and C_1. Well, to solve for C_0 and C_1 we need to plug in our boundary conditions x_0 = 0 and x_N = 1. Starting with x_0 = 0 we obtain:

$$x_0 = C_0 + C_1\left(\frac{q}{p}\right)^0 = C_0 + C_1 = 0 \implies C_1 = -C_0$$

Plugging in x_N = 1 then gives:

$$x_N = C_0 - C_0\left(\frac{q}{p}\right)^N = 1 \implies C_0 = \frac{1}{1 - (q/p)^N}, \quad C_1 = \frac{-1}{1 - (q/p)^N}$$

Notice that in this case, p cannot equal q or else we get a divide by zero, which is bad. Therefore, we will have to treat the case in which p equals q separately later. For now, when p ≠ q we get:

$$x_i = C_0 + C_1\left(\frac{q}{p}\right)^i = \frac{1 - (q/p)^i}{1 - (q/p)^N}$$
Now handling the case in which p = q, notice that the roots of the characteristic equation become 1 and 1. That is, we have repeated roots. What shall we do in this situation? Well, the classic theory of differential (and difference) equations tells us to consider a solution of the form x_i = C_0 + C_1 i. Doing so and plugging in our boundary cases gives:
$$x_0 = C_0 + C_1(0) = 0 \implies C_0 = 0$$
And finally:
$$x_N = C_0 + C_1(N) = C_1 N = 1 \implies C_1 = \frac{1}{N}$$
Therefore our solution for the case when p = q is x_i = (1/N) i = i/N, and the final solution becomes:
$$x_i = \begin{cases} \dfrac{1 - (q/p)^i}{1 - (q/p)^N} & \text{if } p \neq q \\[1.5ex] \dfrac{i}{N} & \text{if } p = q \end{cases}$$
And that’s it for the formal derivation of the solution to this problem. In both
methods for solving the problem we came to the same solution.
Analysis
Now that we have a closed-form solution for x_i, the probability that player A wins the entire game when starting with i of the N total dollars and winning each independent round with probability p, let's see how different values of i, p, and N impact player A's chances of winning the game.
To do so, we need to first write some code to compute the closed-form solution. Below
is that implementation. This handles some edge cases in which numerical stability is
a problem, though quickly breaks if N is set too large due to overflow.
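A minimal sketch of one way to implement this is below. The true_probability name matches how the function is referred to later on; the exact edge-case handling here is illustrative rather than the original code.

```python
import numpy as np

def true_probability(i: int, N: int, p: float) -> float:
    """Closed-form probability that player A wins the game.

    x_i = (1 - (q/p)^i) / (1 - (q/p)^N)   if p != q
    x_i = i / N                           if p == q
    """
    # Boundary cases: player A has already lost or already won.
    if i <= 0:
        return 0.0
    if i >= N:
        return 1.0

    q = 1.0 - p
    # When p and q are numerically equal, fall back to i / N to avoid
    # dividing by zero.
    if np.isclose(p, q):
        return i / N

    r = q / p
    # Note: r**N overflows (or underflows) for very large N, which is the
    # overflow issue mentioned above.
    return (1.0 - r**i) / (1.0 - r**N)
```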
Now let’s visualize different values of p, i, and N (the code to do so can
be found on my GitHub here):
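The original plotting code lives on my GitHub, but a rough sketch of the kind of loop that generates these curves (with illustrative values of N and p, not necessarily the ones used in the figure) looks like this:

```python
import matplotlib.pyplot as plt

# Sweep player A's starting amount i for a few values of p at a fixed N.
N = 100
for p in (0.45, 0.50, 0.55):
    probs = [true_probability(i, N, p) for i in range(N + 1)]
    plt.plot(range(N + 1), probs, label=f"p = {p:.2f}")

plt.xlabel("i (player A's starting dollars)")
plt.ylabel("Probability that player A wins the game")
plt.legend()
plt.show()
```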
Looking at the above, it seems the more money in the game (N) the sharper the
cutoff between the probabilities. This makes a lot of sense because winning by
luck is less likely when there is more money in the game since more rounds
are played. That is, given even the slightest edge in p you’re almost guaranteed
to win if N is large and i ≈ N/2.
Monte Carlo Simulation
Just for fun, another thing we can do to validate our closed-form solution is to compare it against a bunch of simulated games. You would expect that, as the number of simulated games approaches infinity, the ratio of games that player A won would approach our analytical solution (Figure 3).
To do this, we need to first write code to simulate the playing of a game given
the following variables:
p, the probability that player A wins any independent round.
i, the amount of money player A starts with.
N, the total amount of money in the game.
Below is the code to do that:
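A minimal sketch of such a simulation is below; the simulate_game name is illustrative, and it returns the full history of player A's money so that we can reuse it later for plotting.

```python
import numpy as np

def simulate_game(i: int, N: int, p: float, rng=None):
    """Simulate one game of Gambler's Ruin, returning the history of player
    A's money. The game ends at 0 (A is ruined) or N (A wins everything)."""
    rng = np.random.default_rng() if rng is None else rng
    money = i
    history = [money]
    while 0 < money < N:
        # Player A wins a dollar with probability p, otherwise loses one.
        money += 1 if rng.random() < p else -1
        history.append(money)
    return history
```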
Now that we have a way to simulate a single game, let's simulate some number of games and estimate x_i as the ratio of games that player A won to the total number of games played. Therefore, the analog of the true_probability() function for estimating the probability based on Monte Carlo simulations would be:
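A sketch of that estimator (the monte_carlo_probability name and n_sims default are illustrative) might look like:

```python
def monte_carlo_probability(i: int, N: int, p: float, n_sims: int = 1000) -> float:
    """Estimate x_i as the fraction of n_sims simulated games won by player A."""
    wins = sum(simulate_game(i, N, p)[-1] == N for _ in range(n_sims))
    return wins / n_sims
```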
Now that we have both a closed-form and estimated solution to this problem, let’s
compare the results of both using 1000 Monte Carlo simulations.
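For example, with one illustrative choice of parameters (not necessarily the exact settings used for the comparison):

```python
i, N, p, n_sims = 10, 20, 0.55, 1000

print(f"closed-form : {true_probability(i, N, p):.4f}")
print(f"monte carlo : {monte_carlo_probability(i, N, p, n_sims=n_sims):.4f}")
```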
The results look really good! In general, the estimated and true probabilities give very similar results (which is expected). Of course, as we increase n_sims the estimated probability should converge to the true one.
Since we keep track of the history of each game, one last thing we can do is plot those histories. Doing so with 100 games and the above settings gives:
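A sketch of the plotting loop, using the same illustrative settings as above:

```python
import matplotlib.pyplot as plt

# Overlay the money trajectories of 100 simulated games.
for _ in range(100):
    plt.plot(simulate_game(i=10, N=20, p=0.55), alpha=0.3, color="tab:blue")

plt.xlabel("Round")
plt.ylabel("Player A's dollars")
plt.show()
```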
And this is pretty cool: you can see the random-walk behavior clearly.
In summary, we explored the Gambler’s Ruin problem, a classic
problem in probability theory where two players gamble until one player runs out
of money. We derived a closed-form solution for the probability of one player
winning the entire game, using both a clever brute force method and a more
principled approach based on difference equations. This solution reveals how the
initial stake, total money in play, and the probability of winning each round
affect the overall chances of winning. Next, we implemented the game in
Python, and ran a Monte Carlo simulation to experimentally validate the closed
form solution. In doing so, we found that the closed-form solution and the
Monte Carlo simulation provided similar results that converged upon increasing
the number of simulated games. Finally, we ran an analysis to show how different
values of initial parameters impacted the probability of one player winning, and
also showed figures demonstrating the random-walk nature of these games.
[{"id":1,"title":"Lecture 7: Gambler's Ruin and Random Variables | Statistics 110","authors":["Blitzstein, J."],"year":2013,"url":"https://www.youtube.com/watch?v=PNrqCdslGi4"},{"id":2,"title":"Gambler’s Ruin Problem","authors":["Sigman, K."],"year":2009,"url":"http://www.columbia.edu/~ks20/stochastic-I/stochastic-I-GRP.pdf"}]
References
Blitzstein, J. (2013). Lecture 7: Gambler's Ruin and Random Variables | Statistics 110.
Link