I must be allowed to add some explanatory remarks to bring the subject home to reason—to that sluggish reason, which supinely takes opinions on trust, and obstinately supports them to spare itself the labour of thinking.
For reference, here is a list of the rules of probability:
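Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
Special addition rule (for mutually exclusive events): P(A or B) = P(A) + P(B)
Subtraction rule: P(not A) = 1 - P(A)
Multiplication rule: P(A and B) = P(A) x P(B given A)
Special multiplication rule (for independent events): P(A and B) = P(A) x P(B)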
It is important to bear in mind the circumstances in which the
special addition rule and the special multiplication rule can
be used. Most mistakes in probabilistic reasoning occur
because someone assumes that events are independent when they
are not (or vice versa), or because someone assumes that
events are mutually exclusive when they are not (or vice
versa).
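The rules can be checked by brute-force counting. The following is a minimal Python sketch, assuming two fair six-sided dice; the events A, B and C and the helper `prob` are illustrative choices made for this sketch and do not come from the text above.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    favourable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favourable, len(OUTCOMES))

# Illustrative events (assumptions chosen for this sketch).
A = lambda o: o[0] == 6   # first die shows 6
B = lambda o: o[1] == 6   # second die shows 6
C = lambda o: o[0] == 1   # first die shows 1, so C excludes A

p_a_or_b = prob(lambda o: A(o) or B(o))
p_a_and_b = prob(lambda o: A(o) and B(o))

# Addition rule: A and B overlap on (6, 6), so the overlap is subtracted.
assert p_a_or_b == prob(A) + prob(B) - p_a_and_b

# Special addition rule: fine for A and C, which cannot both happen.
assert prob(lambda o: A(o) or C(o)) == prob(A) + prob(C)

# Special multiplication rule: fine for A and B, since the dice are independent.
assert p_a_and_b == prob(A) * prob(B)

# Subtraction rule.
assert prob(lambda o: not A(o)) == 1 - prob(A)

print(p_a_or_b)   # 11/36, not 1/6 + 1/6 = 12/36
```

Applying the special addition rule to A and B would give 12/36, which overcounts the outcome (6, 6) that belongs to both events.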
Here is the original problem which led Pascal
and Fermat to develop probability theory:
The gambler's problem
Suppose you roll a single die four times; what is the
probability of rolling at least one 6? The gambler reasoned
that since the chance of a 6 in each roll is 1/6, the chance
of a 6 in 4 rolls is 4 x 1/6 = 2/3. Now suppose you roll
a pair of dice 24 times; what is the probability of rolling at
least one double 6? The gambler reasoned that since the chance
of a double 6 in one roll is 1/36, the chance of a double 6 in
24 rolls is 24 x 1/36 = 2/3. In other words, the gambler
expected to win each bet 2/3 of the time. His problem was
that he seemed to lose more often with the second bet than the
first. He was at a loss to explain this, so he asked his
friend Pascal for an answer.
What are the
mistakes in the gambler's reasoning? What are the true
probabilities of winning each bet?
In both his calculations, the gambler used the special
addition rule. This overlooks the fact that the outcomes he
is dealing with are not mutually exclusive; he may get a 6 (or
a double 6) in more than one roll. To calculate the true
probabilities, we need to use the full addition rule.
In fact, the simplest way to proceed is to calculate the
probability of getting no 6 in four throws. Since the
throws are independent, we can use the special
multiplication rule, which tells us that the probability of
getting no 6 in four throws is (5/6)^4 = 0.482. Since the
probability of getting at least one 6 covers all the other
outcomes, we can use the subtraction rule to calculate this
probability; it is 1 - 0.482 = 0.518.
Similarly, we can calculate the probability of getting no
double 6 in 24 throws. Since the probability of not getting a
double 6 in one throw is 35/36, and the throws are
independent, the probability of getting no double 6 in 24
throws is (35/36)^24 = 0.509. Again, by the subtraction
rule, the probability of getting at least one double 6 is
1 - 0.509 = 0.491. This is lower than the probability of
getting at least one 6 in four throws, as the gambler had
noticed.
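The two calculations can be reproduced in a few lines. This is a sketch in Python, assuming fair dice and independent throws as in the text; the helper name `at_least_one` is made up for the illustration. It also cross-checks the first bet by listing every possible sequence of four throws.

```python
from fractions import Fraction
from itertools import product

def at_least_one(p_single, n):
    """P(at least one success in n independent trials) = 1 - P(no success)."""
    return 1 - (1 - p_single) ** n

p_six = at_least_one(Fraction(1, 6), 4)            # first bet
p_double_six = at_least_one(Fraction(1, 36), 24)   # second bet

# Cross-check the first bet by listing all 6**4 = 1296 equally likely sequences.
sequences = list(product(range(1, 7), repeat=4))
with_a_six = sum(1 for s in sequences if 6 in s)
assert p_six == Fraction(with_a_six, len(sequences))

print(f"gambler's figure for both bets:     {float(Fraction(2, 3)):.3f}")
print(f"at least one 6 in 4 throws:         {float(p_six):.3f}")
print(f"at least one double 6 in 24 throws: {float(p_double_six):.3f}")
```

The first bet wins slightly more than half the time and the second slightly less than half the time, which is why the gambler came out ahead on one and behind on the other.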
§1. Examples and fallacies
A fallacy is a mistake in reasoning. The following examples
each contain some reasoning about probabilities, some of which
is correct and some of which is mistaken. See if you can spot
any mistake, and decide whether each piece of reasoning is
correct or a fallacy before you read the answer that follows it.
Fred is playing roulette in a Macau casino. The
roulette wheel has 36 numbers (ignoring the zero), of which
half are red and half are black. Fred reasons as follows: In
the last ten spins, all the winning numbers have been red.
But on average, only half the winning numbers are red. So to
even things out, there must be more black numbers than red
numbers coming up. So I stand a better chance of winning if I
bet on black.
Yes, the reasoning is fallacious. The outcomes of the previous
ten spins of the wheel can have no effect on the motion of the
wheel and the ball; past outcomes can't affect future
outcomes. In other words, the outcomes are independent. The
probability of a black number is still 1/2 on each spin,
irrespective of what has come before. For obvious reasons,
this kind of mistake is called the gambler's fallacy.
It is a very common mistake.
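A simulation makes the independence visible. The sketch below is written in Python and assumes a fair wheel with the zero ignored, so that each spin is red or black with probability 1/2; the function name, its parameters and the seed are choices made for this illustration. It looks only at spins that directly follow a run of reds and counts how often they come up black.

```python
import random

def black_after_red_streak(streak=10, spins=1_000_000, seed=0):
    """Estimate P(black | the previous `streak` spins were all red).

    Assumes each spin is red or black with probability 1/2,
    independently of every earlier spin.
    """
    rng = random.Random(seed)
    run = 0            # length of the current run of consecutive reds
    opportunities = 0  # spins that directly follow `streak` or more reds
    blacks = 0         # how many of those spins came up black
    for _ in range(spins):
        is_red = rng.random() < 0.5
        if run >= streak:
            opportunities += 1
            if not is_red:
                blacks += 1
        run = run + 1 if is_red else 0
    return blacks / opportunities if opportunities else float("nan")

print(black_after_red_streak())
```

The estimate should come out close to 0.5 however long a streak you ask for, although long streaks are rare and so need many spins to give a steady estimate.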
The chance of the Mark Six numbers being exactly the
same two days in a row is extremely small. So to maximize my
chances of winning today, I should not choose yesterday's
winning numbers.
If you judged this reasoning to be correct, look again; it is
in fact fallacious. Ask yourself: What is the chance of winning
if I choose yesterday's numbers? What is the chance of winning
if I don't choose yesterday's numbers?
Yes, the reasoning is fallacious. It is another instance of the gambler's fallacy, since the winning numbers today are
entirely independent of the winning numbers yesterday. This
version of the gambler's fallacy is very tempting, as it is
true that the probability of the same numbers coming up two
days in a row is very small--about one in 10 million. So the
chance of winning if you pick yesterday's numbers is only one
in 10 million. But notice that the chance of winning is the
same whatever numbers you choose; for any set of six
numbers you choose, the chance of getting them all correct is
one in 10 million. In fact, an argument can be made that you are better off
picking yesterday's numbers. That is because if two people
win, they have to share the prize. Since people tend to avoid
yesterday's numbers (as well as "unlucky" numbers like 13), if
you choose yesterday's numbers you are less likely to have to
share your prize!
Suppose I am at the Pokfield Road bus terminus, waiting
for the number 23 bus to leave for North Point. The number 23
leaves from here every 8 minutes. So the longer I wait for
the bus, the higher the probability that it will leave in the
next minute.
Yes, this reasoning is correct. Superficially, this case
resembles the prior two examples, but the difference here is
that the chance of a bus coming in a given minute is not
independent of what happened in the previous minutes,
since the buses are timed to be a fixed number of minutes
apart. When you first arrive at the bus stop, the bus has an
equal chance of coming in each of the next 8 minutes, so the
probability of it coming within one minute is 1/8. If it
doesn't come in the first minute, then it must come in one of
the following 7 minutes, so the probability of it coming
within one minute goes up to 1/7. If it doesn't come in the
first two minutes, the probability of it coming within one
minute goes up to 1/6. And so on.
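The rising sequence 1/8, 1/7, 1/6 is a conditional probability: the chance that the bus leaves in the next minute, given that it has not left yet. Here is a small Python sketch of that calculation, under the same assumption as the text that the departure is equally likely to fall in each of the 8 one-minute slots after you arrive; the function name and its parameters are invented for the illustration.

```python
from fractions import Fraction

def leaves_in_next_minute(waited, interval=8):
    """P(bus leaves in the next minute | it has not left in `waited` minutes).

    Assumes the departure is equally likely to fall in each of the
    `interval` one-minute slots after you arrive.
    """
    if not 0 <= waited < interval:
        raise ValueError("waited must satisfy 0 <= waited < interval")
    p_slot = Fraction(1, interval)    # P(leaves in any one given slot)
    p_not_yet = 1 - waited * p_slot   # P(has not left during the wait so far)
    return p_slot / p_not_yet         # conditional probability

for minutes in range(8):
    print(minutes, leaves_in_next_minute(minutes))
```

After seven minutes of waiting the probability reaches 1, since only one slot is left.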
A city has a crackdown on speeding drivers, and the
number of traffic fatalities falls by 12%. The local
government claims that the increased enforcement has saved
lives. But the crackdown was started because of a sudden
increase in traffic fatalities the prior year. After an
unusually high value, the number of deaths is likely to fall
the following year anyway. So there is no reason to think
that the crackdown caused the decrease in fatalities.
Yes, this reasoning is correct. In general, it is more
probable that you will get a number of traffic fatalities that
is close to average than one which is far higher than average.
This is true whatever the number was for the prior year,
since presumably the numbers of traffic fatalities in
different years are largely independent. So for this year, it
is likely that the number of traffic fatalities will be not
too far from average, in which case it will be lower than last
year's unusually high value. Statisticians call this
phenomenon regression to the mean.
If you judged this reasoning to be fallacious, look again; it
is in fact correct. This case is quite tricky. It
looks at first glance like a version of the gambler's fallacy,
since the number of traffic fatalities in one year is
presumably independent of the number of fatalities the prior
year. But in fact it is not a fallacy; the very fact that
this year's fatalities are independent of last year's means
that this year's rate is likely to be lower than last year's.
It might help to think about the following analogy: Suppose
you roll five dice, and you get four sixes (which is an
unusually high number of sixes--you can calculate the
probability). What is the chance of getting four or more
sixes on the next roll? What is the chance of getting fewer
than four sixes? Think about how that applies to
this case.
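The dice probabilities in the analogy can be worked out exactly. The following Python sketch assumes five fair, independent dice; the helper name `at_least_k_sixes` is hypothetical. It adds up the chance of each possible number of sixes from four upwards.

```python
from fractions import Fraction
from math import comb

def at_least_k_sixes(k, dice=5):
    """P(at least k sixes when rolling `dice` fair, independent dice)."""
    p = Fraction(1, 6)
    return sum(
        comb(dice, j) * p**j * (1 - p) ** (dice - j)   # exactly j sixes
        for j in range(k, dice + 1)
    )

p_high = at_least_k_sixes(4)
print(p_high, 1 - p_high)   # 13/3888 and 3875/3888
```

Four or more sixes turns up only about 3 times in 1000 rolls, so after one such roll the next roll will almost certainly show fewer sixes, even though the dice have no memory.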
"BALTIMORE (AP) A Maryland woman this week gave birth to
triplets for the second time in less than two years, defying
odds of about one in 50 million, hospital officials said."
The reasoning here is that since only about one birth in seven
thousand is of triplets, the odds of having two sets of
triplets in a row are about one in 7000^2, which is roughly one in 50 million.
Yes, this reasoning is almost certainly fallacious. The odds
of one in 50 million were calculated using the special
multiplication rule, which only applies if the two events are
independent. But giving birth to triplets the second time may
well not be independent of giving birth to triplets the first
time; the woman may have a biological predisposition for
multiple births. In that case, the probability of the second
set of triplets--the probability of a woman giving birth to
triplets given that she has already had triplets--may
be considerably greater than 1/7000. The chance of having two
sets of triplets is obtained by multiplying this probability
by the chance of having the first set of triplets (1/7000).
This chance is still very small, but it may be nowhere near as
small as reported.
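To see how much the independence assumption matters, compare the two multiplication rules side by side. In the Python sketch below, the figure of 1/7000 comes from the text, but the conditional probability of 1/100 is purely hypothetical: it is a made-up value chosen only to show the size of the effect, not an estimate of the real figure.

```python
from fractions import Fraction

p_first = Fraction(1, 7000)   # from the text: about one birth in 7000 is triplets

# Special multiplication rule: valid only if the two births are independent.
p_independent = p_first * p_first

# General multiplication rule: P(first) x P(second given first).
# HYPOTHETICAL value, not from the source; it stands in for an unknown
# probability that may be considerably greater than 1/7000.
P_SECOND_GIVEN_FIRST = Fraction(1, 100)
p_dependent = p_first * P_SECOND_GIVEN_FIRST

print(f"assuming independence:        1 in {1 / p_independent}")
print(f"with the hypothetical 1/100:  1 in {1 / p_dependent}")
```

Under this invented assumption the odds change from one in 49 million to one in 700 thousand, still a rare event but seventy times more likely than reported.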