# Mega Millions, Multiple Winners, and Expectations

The Mega Millions lottery is a popular number-picking lottery game in the US. It exists in 45 states (including D.C.), and is played by millions of people every week. Lotteries are well known for having negative expected values, meaning that players lose (on average) more than they win. This should be expected, given that lotteries (and gambling in general) are profit-seeking enterprises.

The potentially large jackpots of Mega Millions (the jackpot is pari-mutuel) can push the game into a region of positive EV, though. This is counter-balanced by the fact that duplicate tickets will result in splitting the winnings equally among the winners, driving down the available EV. This post explores what kind of impact that has on the game.

### Rules of the Game

I’m getting the rules from the Georgia Lottery site, and they are fairly straightforward. You pick 5 non-repeating numbers from 1 to 75 (inclusive). Then, you pick a 6th number (the Mega Ball) from 1 to 15 (inclusive). The order of the numbers does not matter. The payout table on that site shows what you win based on how many base numbers you match, along with whether or not you matched the Mega Ball.

Each ticket (1 play) costs \$1. For an additional \$1, you can try for the ‘Megaplier’ (gotta love the terrible names), which is a random number that increases the prizes for everything but the jackpot by some multiple (2x through 5x).

The game has 258,890,850 different ticket combinations. Ignoring the Mega Ball, there are 17,259,390 5-ball combinations. Of the 258,890,850 total, 241,288,446 will be losing combinations each game.
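These counts can be checked with a few lines of standard-library Python. This is only a sketch: the `WINNING_TIERS` list and the `ways` helper are names made up for this check, and the tiers are the nine prize levels implied by the rules above (the jackpot plus the eight smaller prizes).

```python
from math import comb

# Prize tiers as (white balls matched, Mega Ball matched?).
# Hypothetical helper table for this check only.
WINNING_TIERS = [(5, True), (5, False), (4, True), (4, False),
                 (3, True), (3, False), (2, True), (1, True), (0, True)]

def ways(white_matched, mega_matched):
    """Tickets matching exactly this many white balls and hitting/missing the Mega Ball."""
    # choose which drawn white balls we hit, then fill the rest from the 70 undrawn ones
    white = comb(5, white_matched) * comb(70, 5 - white_matched)
    # 1 winning Mega Ball, 14 losing ones
    mega = 1 if mega_matched else 14
    return white * mega

five_ball = comb(75, 5)      # 5-ball combinations, ignoring the Mega Ball
total = five_ball * 15       # every possible ticket
winning = sum(ways(w, m) for w, m in WINNING_TIERS)
losing = total - winning

print(five_ball, total, losing)
```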

### Expected Values

The EVs are simple to calculate, given the probabilities of each win scenario. I calculated the EVs for a range of jackpot values, from \$10 million to \$600 million (around the largest jackpot in recent history). The variations of EV as a function of jackpot size are shown in the following plot:

The break-even points are \$212 million for the basic game, and \$343 million when using the megaplier. If you are going to play (don’t), then definitely do not get the megaplier.

I was surprised at the amount of positive EV space in the plots. I thought the game would be worse for longer. The probability of getting the jackpot is about 3.86e-9, and the jackpot needs to contribute about 0.81 EV to break even (-0.81 being the EV for all non-jackpot scenarios). That gives a jackpot value of \$210 million to give us the required EV.
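The break-even arithmetic can be sketched without SciPy. The prize table and Megaplier odds below are copied from the code listing at the end of this post; the helper names (`prob`, `break_even_jackpot`) are my own, and the sketch assumes no jackpot splitting and that the Megaplier does not multiply the jackpot. The results should land close to the break-even figures quoted above.

```python
from math import comb

TOTAL = comb(75, 5) * 15  # every possible ticket

# (white balls matched, Mega Ball matched, prize in dollars), from the
# `scenarios` table in the listing at the end of the post.
PRIZES = [(5, False, 1e6), (4, True, 5000), (4, False, 500), (3, True, 50),
          (3, False, 5), (2, True, 5), (1, True, 2), (0, True, 1)]
# (multiplier, probability), from `mult_probs` in the same listing.
MEGAPLIER = [(5, 1 / 2.5), (4, 1 / 5), (3, 1 / 3.75), (2, 1 / 7.5)]

def prob(white, mega):
    """Probability of matching exactly `white` balls and hitting/missing the Mega Ball."""
    ways = comb(5, white) * comb(70, 5 - white) * (1 if mega else 14)
    return ways / TOTAL

def break_even_jackpot(cost, multiplier=1.0):
    """Jackpot at which expected winnings equal the ticket cost (no splitting)."""
    small_prizes = sum(prob(w, m) * amt * multiplier for w, m, amt in PRIZES)
    # small_prizes + jackpot / TOTAL == cost, solved for the jackpot
    return (cost - small_prizes) * TOTAL

# Non-jackpot prizes scale linearly with the multiplier, so its mean is enough.
mean_multiplier = sum(m * p for m, p in MEGAPLIER)
basic = break_even_jackpot(cost=1)
megaplier = break_even_jackpot(cost=2, multiplier=mean_multiplier)
print(f"basic: ${basic / 1e6:.0f}M, Megaplier: ${megaplier / 1e6:.0f}M")
```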

What this doesn’t take into account is the number of other people playing. There is always a chance of there being more than 1 winner. How does that chance affect the EV? To figure that out, we need some idea of how many tickets are bought.

### Ticket Sales vs. Published Jackpot

Fortunately for us, data exists of ticket sales vs. expected jackpot value (what the public sees on all the billboards). I gathered data from this site for all the dates after the rule change (since Oct 22, 2013). The following two plots show the variation of ticket sales and jackpots by date, and scattered on a log-log scale:

You can see the steady increases in jackpots that are followed by a winner/drop down to a base level.

The Pearson correlation between the tickets and the jackpot is about 0.81, which seems pretty reasonable. Ticket sales obviously jump when there is more money on the line. Unfortunately, so do the chances of more than 1 person winning. Assuming that we are a jackpot winner, the chance of someone else winning the jackpot is found from the binomial distribution, with one trial per other ticket sold. The expected number of winners (other than ourselves) as a function of the number of tickets sold is shown below:

We expect there to be 1 additional winner (thereby splitting the jackpot in half) at around 250 million tickets sold, which is roughly one ticket for every possible combination.
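A minimal sketch of that binomial reasoning, assuming every other ticket is an independent, uniformly random pick (real players favor some numbers, so this is an idealization). The function names are mine, not from the original analysis.

```python
from math import comb, expm1, log1p

# chance that one other ticket matches our winning numbers
P_JACKPOT = 1 / (comb(75, 5) * 15)

def expected_other_winners(tickets_sold):
    """Mean of Binomial(tickets_sold - 1, P_JACKPOT)."""
    return (tickets_sold - 1) * P_JACKPOT

def prob_at_least_one_other(tickets_sold):
    """1 - (1 - p)^(n - 1), computed stably for a tiny p."""
    return -expm1((tickets_sold - 1) * log1p(-P_JACKPOT))

for millions in (20, 100, 250, 500):
    n = millions * 1_000_000
    print(millions,
          round(expected_other_winners(n), 3),
          round(prob_at_least_one_other(n), 3))
```

Note that an expected count of 1 other winner does not mean a split is certain: under these assumptions the chance of at least one other winner at that point is a bit under two in three.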

We want the EV of a jackpot, so we’ll solve for the number of tickets sold given a jackpot to estimate the expectations of splitting the jackpot among many people. Using SciPy’s curve_fit function, I fit a simple exponential equation to the data:
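Reading the form off the `fitfun` definition in the code listing at the end of this post, the fitted curve appears to be the following, with both quantities on a log scale. The symbols $T$ (tickets sold, in millions) and $J$ (advertised jackpot, in millions of dollars) are my own labels:

$$\log_{10} T = a + b \cdot 10^{\,c \log_{10} J + d}$$

The fitted values listed there are $a \approx 1.160$, $b \approx 0.0568$, $c \approx 1.113$ and $d \approx -1.711$.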

The fit is fine for our purposes, and it will let us move on to the final step of getting the EV with multiple winners possible.

### EVs With Potential Multiple Winners

Here is the graph of EVs for all our scenarios now. The multiple winners lines use the EVs of split jackpots as ticket sales are expected to increase. The extrapolation lines show what might happen if the curve fit I did was valid past its bounds.

Not surprisingly, the possibility of splitting the jackpot works against the player. With the megaplier, the break-even point moves out to a \$443 million jackpot, and not long after that the EV dips back into negative territory. The extrapolation shows that, as the jackpot climbs, so many people buy tickets that the extra prize money is cancelled out by the higher chance of sharing it. I guess don’t tell anyone you read this post!

For the basic game, the EV is maximized around a \$509 million jackpot, at about \$1.50 in expected winnings per \$1 ticket (a net gain of roughly \$0.50). Getting the megaplier makes the game almost completely negative, in terms of EV.
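Here is a standard-library sketch of the split-jackpot EV for the basic game. The fit parameters and prize table are copied from the code listing at the end of this post. Instead of that listing's truncated sum over up to 50 co-winners, this sketch swaps in a closed-form identity for a binomial $K$ with $m$ trials, $E[1/(K+1)] = \frac{1 - (1-p)^{m+1}}{(m+1)p}$, so the function names and structure here are my own, not the original method.

```python
from math import comb, expm1, log1p, log10

TOTAL = comb(75, 5) * 15
P_JACKPOT = 1 / TOTAL
# Fitted parameters copied from the listing at the end of the post.
FIT = (1.16035426, 0.0567717, 1.11329293, -1.71121703)
# (white balls matched, Mega Ball ways, prize) for the non-jackpot tiers.
PRIZES = [(5, 14, 1e6), (4, 1, 5000), (4, 14, 500), (3, 1, 50),
          (3, 14, 5), (2, 1, 5), (1, 1, 2), (0, 1, 1)]
# expected winnings per ticket from everything except the jackpot
SMALL = sum(comb(5, w) * comb(70, 5 - w) * mw * amt
            for w, mw, amt in PRIZES) / TOTAL

def tickets_sold(jackpot):
    """Fitted ticket sales for an advertised jackpot in dollars."""
    a, b, c, d = FIT
    log_millions = a + b * 10 ** (c * log10(jackpot / 1e6) + d)
    return 10 ** log_millions * 1e6

def expected_share(jackpot, n_tickets):
    """E[jackpot / (K + 1)], K ~ Binomial(n_tickets - 1, P_JACKPOT)."""
    # with m = n_tickets - 1 other tickets, m + 1 is just n_tickets
    return jackpot * -expm1(n_tickets * log1p(-P_JACKPOT)) / (n_tickets * P_JACKPOT)

def split_ev(jackpot, cost=1):
    """Per-ticket EV of the basic game when the jackpot may be shared."""
    share = expected_share(jackpot, tickets_sold(jackpot))
    return SMALL + P_JACKPOT * share - cost

for millions in (212, 350, 509):
    print(millions, round(split_ev(millions * 1e6), 2))
```

The closed form treats the fitted (non-integer) ticket count as continuous, which avoids rounding; the trade-off is that it hides the per-winner terms that the truncated sum makes explicit.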

### Conclusions

I was surprised that there is a positive EV zone for the player at all, at least in a jackpot range that has been hit in the past. If it has been talked about, I missed it entirely. I didn’t go out and read up on anything for this since the calculations are simple enough, anyway. I wasn’t surprised at the negative impact that multiple winners have. Overall, it was a fun exercise!

The real question, I think, is whether it is rational to buy when the jackpot is in the positive EV zone. If you’re only making \$0.50 per ticket, it hardly seems worth it. You’d have to buy a lot of tickets to expect a reasonable payout. Also, that payout EV only comes from the jackpot. EVs in other gambling games tend to be balanced around a “pay 1 to win 2, 48% of the time” scenario. In the lottery, you’re more likely to accrue massive negatives before that 1 in 259 million chance hits. Since positive EV jackpots hit about once every 3 months, it’ll take a very long time to reach the ‘long run’.

Another problem is game-theoretic. If everyone knows that there is a decrease in EV because of other players, how does that affect ticket sales? Do people skip buying? Does that mean we stick around and buy, expecting fewer players? Do they all think that, too?

### Python Code

Here I’m including some of the Python code used to do the analysis, for anyone wanting to replicate it or play around with it.

```python
from scipy import stats, special
import numpy as np

# define some objects to do probability calculations
# a lottery like this one is a hypergeometric activity:
# 75 balls in the pool, 5 of them drawn, and we pick 5
a = stats.hypergeom(M=75, n=5, N=5)
# a simple indicator-style function: probability of hitting (1)
# or missing (anything else) the 1-in-15 mega ball
b = lambda x: 1/15 if x == 1 else 14/15

# the number of 5 balls correct, mega ball, and the prize
scenarios = [[5, 0, 1e6],
             [4, 1, 5000],
             [4, 0, 500],
             [3, 1, 50],
             [3, 0, 5],
             [2, 1, 5],
             [1, 1, 2],
             [0, 1, 1]]

base_cost = 1
mult_cost = 1

# probabilities of each multiplier ball, as [multiplier, probability]
mult_probs = [[y, 1/x] for x, y in zip([2.5, 5, 3.75, 7.5], [5, 4, 3, 2])]

# base expected value (all the EV but the jackpot)
total_ps = 1 - a.pmf(5)*b(1)
win_ps = 0
EV = 0
play_cost = base_cost
for nb, e, amt in scenarios:
    prob = a.pmf(nb) * b(e)
    val = prob * (amt - play_cost)
    EV += val
    win_ps += prob
lose_ps = total_ps - win_ps
EV += lose_ps*(-play_cost)

# multiplier expected value
total_ps = 1 - a.pmf(5)*b(1)
win_ps = 0
mEV = 0
play_cost = base_cost + mult_cost
for nb, e, amt in scenarios:
    prob = a.pmf(nb) * b(e)
    val = prob * (sum((amt*m - play_cost)*pm for m, pm in mult_probs))
    mEV += val
    win_ps += prob
lose_ps = total_ps - win_ps
mEV += lose_ps*(-play_cost)

# get the total EV with jackpots
jackpots = np.logspace(7, 8.6, 200)
jprob = a.pmf(5)*b(1)
EVs = EV + jprob*(jackpots - base_cost)
# the jackpot itself is not multiplied, but the ticket costs more
mEVs = mEV + jprob*(jackpots - (base_cost + mult_cost))

# from that point, plot jackpots vs the EVs and mEVs

# The fitted function is of the form (for log millions data):
def fitfun(x, a, b, c, d):
    return a + b*10**(x*c + d)

# the solved parameters:
params = [1.16035426, 0.0567717, 1.11329293, -1.71121703]

# redo the EV chart as a function of jackpot,
# but include the EV from a jackpot pot split

# probability of someone getting the same ticket as us
p_same = 1/(special.comb(75, 5, exact=True)*15)

# jackpots in log10(millions of dollars)
jackpots = np.linspace(1, 2.8, 300)
# solve for the tickets sold
tickets = fitfun(jackpots, *params)
full_tickets = (10**tickets)*1e6
# probability of jackpot
jprob = a.pmf(5)*b(1)
full_jack = (10**jackpots)*1e6
# jackpot value without considering tickets
EVs1 = EV + jprob*(full_jack - base_cost)

# Get the expected value of the jackpot (up to 50 simultaneous wins)
# I don't divide the jackpot by the expected number of winners since
# that number isn't an integer, and I want to account for the step in
# whole winners
def jev(full, nt, v=50):
    # the fit returns a float, and binom expects an integer n,
    # so round to a whole number of other tickets
    n_others = int(round(nt)) - 1
    ev = 0
    for k in range(v):
        ev += full/(k+1) * stats.binom.pmf(k, n=n_others, p=p_same)
    return ev

# Get the EVs with that jackpot being split
j_evs = np.array([jev(jack, tick) for jack, tick in zip(full_jack, full_tickets)])

# the EVs for shared winners (same cost accounting as EVs1)
EVs2 = EV + jprob*(j_evs - base_cost)
```