Thursday, August 26, 2010

One son on Tuesday: a probabilistic puzzle

John Baez, a savior of the Earth, discusses an interesting puzzle that was sent to him by Greg Egan:
A few months ago I read about a very simple but fun probability puzzle. Someone tells you:

“I have two children. At least one is a boy born on a Tuesday. [And if it were not the case, I would have told you.] What is the probability I have two boys?”
Try to solve it yourself. John Baez mentions that one might think the information about Tuesday is irrelevant, because the day of the week is independent of the sex and we only care about the latter.

So you would think that there are 4 equally represented groups of 2-kid families, namely boy-boy, boy-girl, girl-boy, and girl-girl families where the two hyphenated words refer to the younger and older kid, respectively. Only the girl-girl families are eliminated, and 1 of the remaining 3 groups is a two-boy family, so the conditional probability is 1/3.

However, that's a wrong result. The information about the Tuesday actually does matter. Here's why:

Correct solution

In all families with exactly 2 children, one may label the children as the "younger" and "older" one, even if the difference is just in seconds.

Each kid may be born on any day and have any sex, so there are 14 equally likely possibilities for each child. The two children are independent (forget that the phenomenon of twins tends to increase the same-day pairs), so there are 14 x 14 possibilities for two kids. Each of these 14 x 14 possibilities is equally likely. So each combination accounts for 1/196 of the world's families with exactly 2 kids.




Among the 196 types of families, how many contain at least one Tuesday son? Well, in 14 of them, the younger kid is a Tuesday son (the older one may be anything chosen from the 14 possibilities). In another 14 (the younger can be anything), the older one is a Tuesday son. However, I have counted the families with two Tuesday sons twice. So there are 14+14-1 = 27 possibilities among the 196 for which the condition "at least one kid is a Tuesday son" is satisfied.

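This fraction, 27/196, is the probability of the assumption, which is one part of the calculation of the conditional probability. We need the other part, too. Among these 27 types, how many have two boys? The same counting restricted to boys gives 7+7-1 = 13: the younger kid is a Tuesday son and the older one is any of the 7 kinds of boy, or vice versa, and the family with two Tuesday sons must not be counted twice. So 13/196 of all families satisfy the condition and have two boys, and the result is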
P = 13/27
as the fraction of the families satisfying the condition that have two boys. Note that it is just slightly less than 1/2 = 13.5/27, i.e. much more than 1/3. I had to highlight the result because almost no one reads the full article and almost no one notices that the right result is neither 1/3 nor 1/2.
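The counting above is small enough to check by brute force. Here is a minimal sketch in Python that lists all 196 ordered (younger, older) families and applies the condition; the names `families`, `has_tuesday_son` and `two_boys` are just my labels for this illustration, and the uniform, independent 14 x 14 model is the assumption stated above.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# 14 equally likely (sex, day) types for a single child
CHILD_TYPES = list(product(SEXES, DAYS))

# 196 equally likely ordered (younger, older) families
families = list(product(CHILD_TYPES, repeat=2))

def has_tuesday_son(family):
    """True if at least one of the two kids is a boy born on Tuesday."""
    return ("boy", "Tue") in family

def two_boys(family):
    """True if both kids are boys, whatever their days of birth."""
    return all(sex == "boy" for sex, _ in family)

# Families that satisfy the condition, and those among them with two boys
conditioned = [f for f in families if has_tuesday_son(f)]
favourable = [f for f in conditioned if two_boys(f)]

probability = Fraction(len(favourable), len(conditioned))
print(len(families), len(conditioned), len(favourable), probability)
# prints: 196 27 13 13/27
```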

Indeed, the large difference between the right result and 1/3 appears because one de facto identifies one of the sons by mentioning that he is the kid born on Tuesday. If there were infinitely many days in a week and you took any family with at least one Tuesday son, the "Tuesday" information would identify this kid completely (two Tuesday kids would be infinitesimally unlikely). The question "what is the probability of 2 sons" would then reduce to the question "what is the probability that the other, equally specific kid - the non-Tuesday kid - is male", which is of course 1/2.
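The approach to 1/2 can be made explicit by repeating the same inclusion-exclusion count for a hypothetical week of n days. The sketch below is only an illustration: the function name `two_boy_probability` and the parameter `days_per_week` are mine, and it assumes the same uniform, independent model as before.

```python
from fractions import Fraction

def two_boy_probability(days_per_week):
    """P(two boys | at least one son born on one given day) for an n-day week."""
    n = days_per_week
    child_types = 2 * n                 # (sex, day) types per child
    conditioned = 2 * child_types - 1   # at least one "special day" son
    favourable = 2 * n - 1              # two boys, at least one on the special day
    return Fraction(favourable, conditioned)

for n in (1, 7, 365, 10**6):
    p = two_boy_probability(n)
    print(n, p, float(p))
```

For n = 1 the day carries no information and the function returns 1/3; for n = 7 it returns 13/27; as n grows, (2n-1)/(4n-1) tends to 1/2.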

I will discuss this "identification" and reasons why the result is close to 1/2 at the very end.

Indistinguishable kids' bound states

With kids that would satisfy the Bose or Fermi statistics, the counting would be different but equally straightforward. Instead of 14 x 14 = 196 possibilities, one has 14 x 15 / 2 = 105 for bosons (the symmetric triangle) and 14 x 13 / 2 = 91 (the antisymmetric triangle) for fermions. Among the 105 or 91 options, how many of them contain at least one Tuesday son? Well, in these two cases, we can't say which of them is older and younger: they're identical.

So the number of states with at least 1 Tuesday son is 14 for the bosons - we can just add the other particle in any of the 14 one-particle states - or 13 for the fermions - we can also add the second creation operator, but with another Tuesday son, the state will vanish because of Pauli's exclusion principle.

Among these 14 or 13 states respectively, for bosons and fermions, 7 or 6 are two-son states, respectively. So the odds are 7/14 = 1/2 for the bosons and 6/13 for the fermions. Note that the bosons literally saturate the 1/2 bound while the fermions are just slightly below it.
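The bosonic and fermionic counts can be checked the same way, under the assumption made above that every distinct two-particle state is equally likely. In this sketch, unordered pairs with repetition stand in for the bosons and unordered pairs without repetition for the fermions; the helper `count` and the label `TUESDAY_SON` are mine.

```python
from itertools import combinations, combinations_with_replacement, product

# 14 one-particle states: (sex, day index), with day 1 taken to be Tuesday
CHILD_TYPES = list(product(("boy", "girl"), range(7)))
TUESDAY_SON = ("boy", 1)

def count(pairs):
    """Return (all states, states with a Tuesday son, two-son states among those)."""
    pairs = list(pairs)
    conditioned = [p for p in pairs if TUESDAY_SON in p]
    favourable = [p for p in conditioned if all(sex == "boy" for sex, _ in p)]
    return len(pairs), len(conditioned), len(favourable)

# Bosons: unordered pairs, the same one-particle state may be occupied twice
bosons = count(combinations_with_replacement(CHILD_TYPES, 2))
# Fermions: unordered pairs of two different one-particle states
fermions = count(combinations(CHILD_TYPES, 2))

print(bosons)    # (105, 14, 7)
print(fermions)  # (91, 13, 6)
```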

Why not one third?

Finally, I want to comment on "why the information about Tuesday matters". If we sum up the probabilities for the problems where the son is born on Sunday, Monday... and up to Saturday, shouldn't we get the same result? And by symmetry, the result must be equal for all 7 days, so doesn't each term have to be 1/3?

The answer is that we can't add the probabilities in this way because the "at least one Monday son" etc. are assumptions, not propositions conditioned by these assumptions, and they're not disjoint. At any rate, the calculation is nonlinear because the conditional probabilities have the probability of the assumption in the denominator rather than the numerator, so you can't simply add the probabilities in any way.
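That the seven assumptions are not disjoint is easy to see numerically. The following sketch, again assuming the uniform 196-family model, adds up the probabilities of "at least one son born on day d" over the seven days and compares the sum with the probability of "at least one son"; the helper names `prob`, `has_son_on` and `has_son` are mine.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
# 196 equally likely ordered families of (sex, day) pairs
families = list(product(product(("boy", "girl"), DAYS), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on a family."""
    return Fraction(sum(1 for f in families if event(f)), len(families))

def has_son_on(day):
    """Predicate: at least one kid is a boy born on the given day."""
    return lambda family: ("boy", day) in family

def has_son(family):
    """Predicate: at least one kid is a boy, born on any day."""
    return any(sex == "boy" for sex, _ in family)

per_day = [prob(has_son_on(d)) for d in DAYS]
sum_of_conditions = sum(per_day)
any_son = prob(has_son)

print(per_day[0], sum_of_conditions, any_son)
# prints: 27/196 27/28 3/4
```

The seven assumptions add up to 27/28, more than the 3/4 probability of having at least one son, because a family with two sons born on different days satisfies two of them at once.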

The word "term" in the question above is therefore incorrect.

How and why 1/3 gets enhanced to nearly 1/2

If you were only told that "one of the kids is a boy", the mixed families would be overrepresented relative to the two-boy families by a 2-to-1 ratio because boy-girl and girl-boy families are each as likely as boy-boy families; again, the notation for the kids is younger-older.

However, if you're told that "one of the kids is a Tuesday boy", this overrepresentation almost disappears. Why? Because 1/7 of the boy-girl and girl-boy families have a Tuesday boy. But approximately 2/7 (exactly 13/49) of the boy-boy families have at least one Tuesday boy because each of these two boys has a chance to be born on Tuesday.

In this way, the boy-boy families (nearly) compensate for the factor of two by which they were underrepresented relative to the mixed families.
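The compensation can be written as a short Bayes-style computation. The symbols are my shorthand, not anything from the original puzzle: T stands for "at least one Tuesday boy", BB for the boy-boy families (prior weight 1/4) and "mixed" for the boy-girl and girl-boy families together (prior weight 1/2).

```latex
\begin{align*}
P(T \mid BB) &= 1 - \left(\tfrac{6}{7}\right)^{2} = \tfrac{13}{49},
\qquad
P(T \mid \text{mixed}) = \tfrac{1}{7}, \\[4pt]
P(BB \mid T)
 &= \frac{\tfrac{1}{4}\cdot\tfrac{13}{49}}
         {\tfrac{1}{4}\cdot\tfrac{13}{49} + \tfrac{1}{2}\cdot\tfrac{1}{7}}
  = \frac{13/196}{13/196 + 14/196}
  = \frac{13}{27}.
\end{align*}
```

So the 2-to-1 ratio of mixed to boy-boy families becomes 14-to-13 once the Tuesday condition is imposed.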



Bonus: this puzzle and crackpot Sean Carroll's misunderstanding of logic

This logical puzzle is actually a very precise pedagogical example showing what's wrong with the thinking of various people about the arrow of time. Some people - those who say that the information about Tuesday doesn't matter and who typically end up with the result 1/3 - think that
Prob(cond,any_day) = Prob(cond,Monday) + ... + Prob(cond,Sunday)
where "cond" is an extra condition. So if we make a statement about a specific object and if this statement doesn't prefer any day of the week, then adding the information about "its" day of the week doesn't matter. It only reduces the probability by a factor of 7 if the probability is day-blind.

That's right for "conclusions" or "outcomes". However, the error that these people are making is that they think that this "additive" counting of the probabilities also holds for the probabilities of assumptions, i.e. probabilities of conditions in the conditional probability. But no such linearity exists there. Conditions (and initial states) don't follow the same maths as the outcomes (and final states)!

There is no condition-outcome or past-future symmetry in mathematical logic! That's why it matters for the probabilities whether the information about Tuesday is specified even though there is nothing special about Tuesday.