## What is a flush?

You have a flush in poker when holding 5 cards of the same suit. Flushes come in three flavors:

• Flush (the standard kind): hold 5 cards of the same suit that are not in a series
🃔🃗🃘🃚🃞
• Straight Flush: hold 5 cards of the same suit ordered in a series
🃔🃕🃖🃗🃘
• Royal Flush: hold 5 cards of the same suit ordered in a series from 10 up to Ace
🃚🃛🃝🃞🃑

Together, these hands are 3 out of the 5 most powerful hands in Poker.
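As a rough illustration, the three flavors can be told apart with a few lines of Python. This is a minimal sketch rather than anything from the simulation later in this article: the function name `classify_flush`, the `(rank, suit)` tuple format, and the `'T'` shorthand for ten are all assumptions made for the example.

```python
RANKS = '23456789TJQKA'  # assumed rank order, low to high; 'T' stands for ten

def classify_flush(hand):
    """Return the flush flavor of a 5-card hand, or None if it is not a flush.

    `hand` is a list of five (rank, suit) tuples, e.g. ('K', '♣').
    """
    if len({suit for _, suit in hand}) != 1:
        return None  # more than one suit: not a flush of any kind
    idx = sorted(RANKS.index(rank) for rank, _ in hand)
    if idx == [8, 9, 10, 11, 12]:
        return 'Royal Flush'  # T J Q K A
    is_run = idx == list(range(idx[0], idx[0] + 5))  # five consecutive ranks
    is_wheel = idx == [0, 1, 2, 3, 12]               # A 2 3 4 5, ace playing low
    if is_run or is_wheel:
        return 'Straight Flush'
    return 'Flush'

# The three example hands shown above, all in clubs
print(classify_flush([('4', '♣'), ('7', '♣'), ('8', '♣'), ('T', '♣'), ('K', '♣')]))  # Flush
print(classify_flush([('4', '♣'), ('5', '♣'), ('6', '♣'), ('7', '♣'), ('8', '♣')]))  # Straight Flush
print(classify_flush([('T', '♣'), ('J', '♣'), ('Q', '♣'), ('K', '♣'), ('A', '♣')]))  # Royal Flush
```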

## What are the odds of getting a flush? The Mathematical solution

😴 Not interested in the math? Skip ahead to the Monte Carlo solution below.

The likelihood of drawing a standard flush is calculated in three steps:

1. Determine the number of ways to draw any 5 cards of the same suit,
2. then subtract the number of Straight and Royal Flushes,
3. then divide all that by the number of possible ways to draw any 5 cards.

Mathematically, that looks like:

$$\frac{{}_{CardsInSuit} \mathrm{ C }_{CardsInFlush} * 4 - (StraightFlushesInDeck + RoyalFlushesInDeck)}{{}_{CardsInDeck} \mathrm{ C }_{CardsInHand}}$$

Let's break that down step by step.

### 1. Determine the number of ways to draw any 5 cards of the same suit

This is a combination problem. Rather than explain how combinations work, I will point you to Khan Academy's excellent video on deriving the combination formula. Applying that formula, we have:

\begin{align} {}_{CardsInSuit} \mathrm{ C }_{CardsInFlush} &= \frac{13!}{5!(13-5)!} = \frac{13!}{5!(8)!}\\ &= \frac{13*12*11*10*9*8*7*6*5*4*3*2*1}{5*4*3*2*1*(8*7*6*5*4*3*2*1)}\\ &= \frac{13*12*11*10*9}{5*4*3*2*1}\\ &= \frac{154440}{120}\\ &= 1287 \end{align}

So there are $1287$ ways to draw 5 cards from a single suit. Since we have four suits, there are $1287 * 4 = 5148$ same-suit hands in a deck, counting all three flavors of flush. This updates our master equation to:

$$\frac{5148 - (StraightFlushesInDeck + RoyalFlushesInDeck)}{{}_{CardsInDeck} \mathrm{ C }_{CardsInHand}}$$
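If you would rather not trust the factorial arithmetic, the count can be double-checked in Python. The sketch below compares the standard library's `math.comb` against a brute-force listing of every 5-rank subset of one suit. The `RANKS` string, with `'T'` for ten, is an assumption made for the example.

```python
from itertools import combinations
from math import comb

RANKS = '23456789TJQKA'  # the 13 ranks of one suit; 'T' stands for ten

formula_count = comb(13, 5)                                  # 13! / (5! * 8!)
brute_force_count = sum(1 for _ in combinations(RANKS, 5))   # list every 5-rank subset

print(formula_count, brute_force_count)  # 1287 1287
print(formula_count * 4)                 # 5148 across the four suits
```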

### 2. Subtract the number of Straight and Royal Flushes

Recall that a Straight Flush is 5 cards of the same suit ordered in a series. The lowest card of such a series can be anything from an Ace (playing low) up to a 9, so there are 9 Straight Flushes per suit, as shown with the clubs below:

1. 🃑🃒🃓🃔🃕
2. 🃒🃓🃔🃕🃖
3. 🃓🃔🃕🃖🃗
4. 🃔🃕🃖🃗🃘
5. 🃕🃖🃗🃘🃙
6. 🃖🃗🃘🃙🃚
7. 🃗🃘🃙🃚🃛
8. 🃘🃙🃚🃛🃝
9. 🃙🃚🃛🃝🃞

Since there are four suits, there are $9 * 4 = 36$ Straight Flushes.

Finally, since an Ace can play as the lowest or the highest card, it can also sit at the top of a straight, after the King, to make a Royal Flush. This can only happen in four possible ways, again, once for each suit:

1. 🃚🃛🃝🃞🃑
2. 🃊🃋🃍🃎🃁
3. 🂪🂫🂭🂮🂡
4. 🂺🂻🂽🂾🂱

So the Straight and Royal Flushes add up to $36 + 4 = 40$ hands. This updates our master equation to:

$$\frac{5148 - 40}{{}_{CardsInDeck} \mathrm{ C }_{CardsInHand}} = \frac{5108}{{}_{CardsInDeck} \mathrm{ C }_{CardsInHand}}$$
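The counts of 36 and 4 can also be reproduced by sliding a 5-card window along the ranks of one suit. The sketch below is only an illustration: it writes the Ace at both ends of the rank string so that it can play low or high, and the variable names are made up for the example.

```python
from math import comb

# Ace appears at both ends so it can start A2345 or finish TJQKA; 'T' stands for ten
RANKS_ACE_BOTH_ENDS = 'A23456789TJQKA'

# Every run of 5 consecutive ranks within one suit
windows = [RANKS_ACE_BOTH_ENDS[i:i + 5] for i in range(len(RANKS_ACE_BOTH_ENDS) - 4)]
royal = [w for w in windows if w == 'TJQKA']
straight = [w for w in windows if w != 'TJQKA']

straight_flushes_in_deck = len(straight) * 4  # 9 per suit
royal_flushes_in_deck = len(royal) * 4        # 1 per suit
standard_flushes = comb(13, 5) * 4 - (straight_flushes_in_deck + royal_flushes_in_deck)

print(straight)  # the 9 straight-flush rank runs, from A2345 to 9TJQK
print(straight_flushes_in_deck, royal_flushes_in_deck, standard_flushes)  # 36 4 5108
```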

It's not looking nearly as scary as before!

### 3. Divide by the number of possible ways to draw any 5 cards

We're getting close! This last calculation is similar to the one in step 1, except that it counts every 5-card combination across the entire deck (rather than within a single suit). This works out to:

\begin{align} {}_{CardsInDeck} \mathrm{ C }_{CardsInHand} &= \frac{52!}{5!(52-5)!} = \frac{52!}{5!(47)!}\\ &= \frac{52*51*50*49*48}{5*4*3*2*1}\\ &= 2,598,960 \end{align}

So finally, by updating the master equation one last time, the probability of getting a flush on your next poker hand is:

$$\frac{5108}{2,598,960} = 0.001965 ≈ \frac{1}{509}$$

So in the long run, about one out of every 509 hands is a flush, which is roughly 0.1965% of all hands.
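The whole calculation fits in a few lines of Python. This is a small sketch that uses exact fractions from the standard library, so no rounding creeps in until the final conversion to a decimal. The variable names are made up for the example.

```python
from fractions import Fraction
from math import comb

standard_flushes = comb(13, 5) * 4 - (36 + 4)  # 5148 - 40 = 5108
all_hands = comb(52, 5)                        # 2,598,960
p = Fraction(standard_flushes, all_hands)

print(p)             # 1277/649740, the fraction in lowest terms
print(float(p))      # about 0.001965
print(round(1 / p))  # 509, i.e. roughly one hand in 509
```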

## What are the odds of getting a flush? A Monte Carlo solution

To estimate the probability of getting a flush programmatically, you deal a random 5-card hand, check whether it is a flush, and repeat that experiment many thousands of times over. In practice this is called a Monte Carlo simulation.

Side note: When I began learning about Monte Carlo simulations I thought they were a sort of "magical", mysteriously complex thing... mostly because their name sounds so exotic. Don't be fooled. "Monte Carlo" is just a fancy name for a simulation that relies on repeated random sampling. They can be quite elementary.

Even so, simulations are kind of magical because you can use them to brute force a solution out of a complex system even when a mathematical model of that system is hard to come by. Say, for example, you don't have a firm understanding of the combination math we went through above, which produced the exact answer to the question "What are the odds of getting a flush?" We can instead simulate many hands of this card game and estimate that probability to a high degree of certainty. Here it is:

```python
# Python 3
from collections import namedtuple
from random import shuffle

import matplotlib.pyplot as plt  # pandas plotting needs matplotlib installed
import pandas as pd

# Exact probability of a standard flush, as a percentage (about 0.1965%)
mathematically_derived_flush_probability = 5108 / 2598960 * 100

#%% What is the likelihood of getting a flush? Monte Carlo derivation

Card = namedtuple("Card", "suit, rank")


class Deck:
    suits = '♦♥♠♣'
    ranks = '23456789TJQKA'  # 13 ranks, low to high; 'T' stands for ten

    def __init__(self):
        # Build all 52 cards, then shuffle them in place
        self.cards = [Card(suit, rank) for suit in self.suits for rank in self.ranks]
        shuffle(self.cards)

    def deal(self, amount):
        # Remove `amount` cards from the end of the deck and return them as a hand
        return tuple(self.cards.pop() for _ in range(amount))


def is_straight(hand):
    # True if the five ranks are consecutive; the ace may play low (A2345) or high (TJQKA)
    idx = sorted(Deck.ranks.index(card.rank) for card in hand)
    return idx == list(range(idx[0], idx[0] + 5)) or idx == [0, 1, 2, 3, 12]


hand_count = 0
flush_count = 0
flush_cutoff = 150  # Increase this number to run the simulation over more hands.
column_names = ['hand_count', 'flush_count', 'flush_probability', 'estimation_error']
rows = []  # one row per hand; DataFrame.append was removed in pandas 2.0, so collect rows first

while flush_count < flush_cutoff:
    deck = Deck()
    while len(deck.cards) >= 5:  # ten 5-card hands per shuffled deck
        hand_count += 1
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        same_suit = len(set(card.suit for card in hand)) == 1
        # Exclude straight and royal flushes so the count matches the 5108 standard flushes above
        if same_suit and not is_straight(hand):
            # print(f"Yay, it's a Flush: {hand}")
            flush_count += 1
        monte_carlo_derived_flush_probability = flush_count / hand_count * 100
        estimation_error = (monte_carlo_derived_flush_probability - mathematically_derived_flush_probability) / mathematically_derived_flush_probability * 100
        rows.append([hand_count, flush_count, monte_carlo_derived_flush_probability, estimation_error])

hand_results = pd.DataFrame(rows, columns=column_names)

#%% Analyze results
# Show how each consecutive hand helps us estimate the flush probability
hand_results.plot.line('hand_count', 'flush_probability').axhline(y=mathematically_derived_flush_probability, color='r')

# As the number of hands (experiments) increases, our estimation of the actual probability gets better.
# Below the error gets closer to 0 percent as the number of hands increases.
hand_results.plot.line('hand_count', 'estimation_error').axhline(y=0, color='black')

plt.show()  # needed when running as a plain script rather than in an interactive cell

#%% Memory usage
print("Memory used to store all %s runs: %s megabytes" % (len(hand_results), round(hand_results.memory_usage(index=True, deep=True).sum() / 1000000, 1)))
```

To check that our simulation arrives at the correct answer, we can compare its output to the known probability of getting a flush. The first plot generated by the code above does this for a run of roughly 80,000 hands.

In that plot, our simulated flush_probability (in blue) approaches the mathematically derived probability of 0.1965% (the horizontal line).

Similarly, the second plot shows the estimation_error between the simulated probability and the mathematically derived value. The estimate is more than 100% off in the early runs of the simulation but gradually settles to within 5% of the true value.

If you were to run the simulation for, say, twice the number of hands, you would see the simulated lines move closer still to the horizontal reference lines in both charts, showing that the simulated answer converges toward the mathematically derived one.

## To simulate or not to simulate?

Finally, you might wonder,

"If I can generate a precise answer to a problem by simulating it, then why bother with all the complicated math in the first place?"

In our example, we could run the simulation over enough hands to get a precise answer with a high degree of confidence. However, if you are running a simulation because you don't know the answer (which is often the case), then you need to answer another question,

"How long do I run the simulation to be confident I have the right answer?"

The answer to that seems simple:

"Run it for a long time."

Eventually your estimated outputs should converge to a single value, such that outputs from additional simulations don't drastically change from prior runs. The problem here is that in some cases, depending on the complexity of the system you're simulating, seemingly convergent output may be a temporary phenomenon. That is, if you ran a hundred thousand more simulations, you might begin to see your outputs diverge from what you thought was your stable answer. In a different scenario, despite having run tens of millions of simulations, it could happen that an output still hasn't converged. Do you have the time to program and run the simulation? Or would a mathematical approximation get you there sooner?
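For a simple yes/no experiment like the flush, one common rule of thumb puts a rough number on "a long time". The sketch below is not part of the simulation in this article, and its function names are made up for the example. It assumes every hand is an independent trial and uses the normal approximation to the binomial distribution, under which the standard error of the estimate is $\sqrt{p(1-p)/n}$. Neither assumption is guaranteed for a more complex system, and in a real study the unknown $p$ would have to be replaced by the running estimate.

```python
from math import ceil, sqrt

def hands_needed(p, relative_error, z=1.96):
    """Approximate number of independent hands needed so that the estimate of p
    lands within +/- relative_error of p about 95% of the time (z = 1.96)."""
    # Solve z * sqrt(p * (1 - p) / n) <= relative_error * p for n
    return ceil((z / relative_error) ** 2 * (1 - p) / p)

def relative_standard_error(p, n):
    """Standard error of the estimate after n hands, as a fraction of p."""
    return sqrt((1 - p) / (p * n))

p = 5108 / 2598960  # the exact flush probability derived earlier

print(hands_needed(p, 0.05))               # hands for +/- 5% at about 95% confidence
print(hands_needed(p, 0.01))               # hands for +/- 1%
print(relative_standard_error(p, 80_000))  # typical relative error after 80,000 hands
```

Under these assumptions, the flush estimate needs on the order of 780,000 hands to stay within 5% of the true value with about 95% confidence, and 25 times as many to stay within 1%. The cost grows with the square of the precision you ask for.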

There is yet another concern:

"What is the cost?"

Consumer computers are relatively cheap today, but 30 years ago they cost $4,000 to $9,000 in 2019 dollars. In comparison, a TI-89 only cost $215 (again, in 2019 dollars). So if you were asking this question back in 1990 and you were good with probability math, you could have saved at least $3,800 by using a TI-89. Cost is just as important today: simulating self-driving cars and protein folding can quickly burn through many millions of dollars.

Finally, mission-critical applications may require both a simulation and a mathematical model so that the results of one approach can be cross-checked against the other. A tidy example of this is when Matt Parker of StandUpMaths calculated the odds of landing on any property in the game of Monopoly by simulation and confirmed those results with Hannah Fry's mathematical model of the same game.

If you liked this article, feel like you need to politely challenge it, or would like to hire me to build a Monte Carlo simulation, then let me know what you're thinking in the forum below.