The specific case taken in Novella's blog post involves a study that posed the following problem:
Participants are given information about a hypothetical woman named Linda:
- (E) Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.
After reading the description of the target E, the participants were asked to estimate the probability that each of a number of statements about E was true. Three of the statements were:
- (T) Linda is a bank teller.
- (F) Linda is active in the feminist movement.
- (T ∧ F) Linda is a bank teller and is active in the feminist movement.
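The conjunction rule itself is uncontroversial: under any assignment of probabilities, P(T ∧ F) can never exceed P(T), because every outcome counted in the conjunction is also counted in the conjunct. A minimal sketch, using an invented joint distribution (the numbers are illustrative only, not from the study):

```python
# Joint distribution over (is_bank_teller, is_feminist).
# The probabilities below are made up for illustration.
p = {
    (True, True): 0.05,   # bank teller and feminist
    (True, False): 0.02,  # bank teller, not feminist
    (False, True): 0.60,  # feminist, not bank teller
    (False, False): 0.33, # neither
}

p_T = sum(v for (t, f), v in p.items() if t)              # P(T)
p_T_and_F = sum(v for (t, f), v in p.items() if t and f)  # P(T and F)

print(round(p_T, 2), round(p_T_and_F, 2))

# Holds no matter what numbers go in the table above.
assert p_T_and_F <= p_T
```

Swap in any other joint distribution and the final assertion still holds; that is all the conjunction rule says.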
It's unfortunate that the researchers did not follow up with interviews investigating the participants' thought processes (at least I couldn't find where they had). They simply concluded that the participants were guilty of the conjunction fallacy and blamed it on a reliance on "intuition". But I suspect that if the participants were asked straightforwardly about the conjunction fallacy - "Is both A and B being true more probable than A alone being true?" - almost everyone would come up with the right answer. So there is likely more going on here than a simple failure to understand the conjunction fallacy.
What's going on, I think, is that the Linda Problem is actually a poorly formed question in probability. Likelihoods have meaning only in the context of an implied probabilistic experiment. We understand what "there is a 50% chance it will rain tomorrow" means because we supply for ourselves the implied probabilistic context: "Given days with meteorological conditions like today, half the time the next day is rainy and half the time it is not." The question about tomorrow's weather is really a continuation of the experiment and we estimate the probability based on prior outcomes.
But once a probabilistic experiment has occurred, the probability of its actual outcome goes to 1 and the probability of every other outcome goes to 0. Once the dice are rolled and come up 7, the likelihood that the outcome of that roll was 7 is 1, and the likelihood that it was anything else is 0.
Now consider the statement T, "Linda is a bank teller." There is no probabilistic context here. Linda is what she is and is nothing else, like a dice roll that has already happened. So the probability that Linda is a bank teller is either 1 or 0 depending on whether she actually is a bank teller. Same with her being a feminist, and same with the conjunction of her being both a bank teller and a feminist. They are all either 1 or 0.
Of course, if Linda is not a bank teller, then the probability of T is 0 and T ∧ F is 0. But if she is a bank teller but not a feminist, then the probability of T is 1, F is 0, and T ∧ F is 0. So in that sense it is strictly true that the third statement can never be more probable than the first.
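The degenerate reading can be made explicit. Once the facts about Linda are fixed, each statement's "probability" is just 1 or 0, and enumerating the four possible worlds shows there is no world in which T ∧ F holds while T fails - which is the purely logical point the grading actually relies on. A sketch:

```python
from itertools import product

# Enumerate every possible "world": Linda is or isn't a bank teller (T),
# and is or isn't a feminist (F). After the fact, each statement's
# probability is degenerate: 1 if true in that world, else 0.
for T, F in product([True, False], repeat=2):
    p_T = 1 if T else 0
    p_TF = 1 if (T and F) else 0
    # In no world is the conjunction "more probable" than the conjunct.
    assert p_TF <= p_T
```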
But this is a degenerate use of "likelihood." Likelihood adds nothing to the analysis, which is strictly logical. And in fact the participants are not "failed" for estimating probabilities incorrectly, but for the alleged failure to perceive the logical necessity of the conditional "If T ∧ F then T". So there is a bit of a bait and switch going on.
People generally approach test questions in good faith. They assume the questions are well-formed, and when they aren't, they provide their own context in an attempt to interpret the question as well-formed. In this case, being explicitly told that the question is probabilistic - and intuiting that any probabilistic question requires a probabilistic context within which the concept of likelihood makes sense - they supply the probabilistic context that is not provided by the question. Really they should say that the problem is degenerate and the likelihood of each statement is either 1 or 0, but we can't say which.
To come up with any other numbers requires a probabilistic background against which to generate a non-degenerate likelihood. This is where the much-maligned "intuition" comes in. Forced to generate their own background, the participants likely tell themselves something like "If I were trying to find Linda, would I be more likely to find her starting with the general population of bank tellers, or focusing on the population that is both bank tellers and feminists?" And they reasonably conclude the latter is the better choice. Formally, what they are doing is saying "Given a random draw from the population of bank tellers, or from the population of people who are both bank tellers and feminists, I'm more likely to come up with Linda by drawing from the latter." And they are right about that.
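That "random draw" reading can be simulated directly. The population sizes and the share of Linda-like people in each group below are invented purely for illustration; the only point is the comparison between the two draws:

```python
import random

random.seed(0)

# Hypothetical populations. Linda-like profiles are rare among bank
# tellers generally, but common among feminist bank tellers.
bank_tellers = ["linda-like"] * 50 + ["other"] * 9950       # 0.5% Linda-like
feminist_tellers = ["linda-like"] * 50 + ["other"] * 450    # 10% Linda-like

def hit_rate(pop, trials=100_000):
    """Estimated chance that one random draw from pop is Linda-like."""
    return sum(random.choice(pop) == "linda-like" for _ in range(trials)) / trials

rate_tellers = hit_rate(bank_tellers)
rate_feminist_tellers = hit_rate(feminist_tellers)
print(rate_tellers, rate_feminist_tellers)  # roughly 0.005 vs roughly 0.10
```

Under this experiment - the one the participants plausibly supplied for themselves - drawing from the narrower population really is the better way to find Linda.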
An intelligent participant would be understandably irritated to be told later that he got the question wrong because he doesn't understand that if Linda is a bank teller and a feminist, then she is a bank teller. What the experimenters did was bait (or force) the participants into treating the question as a probabilistic one (requiring a probabilistic context), then grade them as though they had been asked a logical question about conjunction.