Berkson's paradox
Tendency to misinterpret statistical experiments involving conditional probabilities
From Wikipedia, the free encyclopedia
Berkson's paradox, also known as Berkson's bias, collider bias, or Berkson's fallacy, is a result in conditional probability and statistics which is often found to be counterintuitive, and hence a veridical paradox. It is a complicating factor arising in statistical tests of proportions. Specifically, it arises when there is an ascertainment bias inherent in a study design. The effect is related to the explaining away phenomenon in Bayesian networks, and conditioning on a collider in graphical models.
This article includes a list of references, related reading, or external links, but its sources remain unclear because it lacks inline citations. (March 2023)

- Top: a graph where talent and attractiveness are uncorrelated in the population.
- Bottom: The same graph truncated to only include celebrities (where a person must be both talented and attractive, in some combination, to have become a celebrity). Someone sampling this population may wrongly infer that talent is negatively correlated with attractiveness.
This paradox is often illustrated using scenarios from the fields of medical statistics or biostatistics,[1] as in the original description of the problem by Joseph Berkson.
The most common example of Berkson's paradox is a false observation of a negative correlation between two desirable traits, i.e., that members of a population who have one desirable trait tend to lack a second. Berkson's paradox occurs when this observation appears true although in reality the two properties are unrelated (or even positively correlated), because members of the population in whom both are absent are not equally observed. For example, a person may observe from their experience that attractive celebrities tend to be untalented, and talented celebrities unattractive; but people who are neither particularly talented nor attractive do not become celebrities, so they are not part of the observer's sample.
Examples
Original illustration
Berkson's original illustration involves a retrospective study examining a risk factor for a disease in a statistical sample from a hospital in-patient population. Because samples are taken from a hospital in-patient population, rather than from the general public, this can result in a spurious negative association between the disease and the risk factor.[2]
For example, if the risk factor is diabetes and the disease is cholecystitis, a hospital patient without diabetes is more likely to have cholecystitis than a member of the general population, since the patient must have had some non-diabetes (possibly cholecystitis-causing) reason to enter the hospital in the first place. That result will be obtained regardless of whether there is any association between diabetes and cholecystitis in the general population.
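The hospital effect can be checked with exact arithmetic under a deliberately simple model. The sketch below assumes three mutually independent conditions (diabetes, cholecystitis, and one unspecified "other" condition) and assumes a person is hospitalized exactly when they have at least one of them. The prevalences and this admission rule are illustrative assumptions, not figures from Berkson's study.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prevalences in the general population (assumptions for illustration).
P_DIABETES = Fraction(1, 10)
P_CHOLECYSTITIS = Fraction(1, 20)
P_OTHER = Fraction(1, 20)


def prob(event, given=lambda d, c, o: True):
    """Exact P(event | given) by enumerating all 8 combinations of conditions."""
    num = Fraction(0)
    den = Fraction(0)
    for d, c, o in product([True, False], repeat=3):
        # The three conditions are assumed independent, so weights multiply.
        weight = ((P_DIABETES if d else 1 - P_DIABETES)
                  * (P_CHOLECYSTITIS if c else 1 - P_CHOLECYSTITIS)
                  * (P_OTHER if o else 1 - P_OTHER))
        if given(d, c, o):
            den += weight
            if event(d, c, o):
                num += weight
    return num / den


def in_hospital(d, c, o):
    # Assumed admission rule: any one condition is enough to be hospitalized.
    return d or c or o


def has_cholecystitis(d, c, o):
    return c


general = prob(has_cholecystitis)
hospital_without_diabetes = prob(
    has_cholecystitis, lambda d, c, o: in_hospital(d, c, o) and not d)
hospital_with_diabetes = prob(
    has_cholecystitis, lambda d, c, o: in_hospital(d, c, o) and d)

print(general, hospital_without_diabetes, hospital_with_diabetes)
```

Under these assumptions, in-patients without diabetes have cholecystitis far more often than the general population, while in-patients with diabetes have it at the general-population rate, so the hospital sample shows a negative association that does not exist outside it.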
Dating pool example

An example presented by mathematician Jordan Ellenberg is that of a dating pool, measured on axes of niceness and handsomeness. A person might conclude from their own dating experience that "the handsome ones tend not to be nice, and the nice ones tend not to be handsome".[3]
Suppose Alice will only date men whose niceness plus handsomeness exceeds some threshold. Then nicer men do not have to be as handsome to qualify for her dating pool. So, among the men that Alice dates, the nicer ones are less handsome on average (and vice versa), even if these traits are uncorrelated in the general population.
This does not mean that men in the dating pool compare unfavorably with men in the population. On the contrary, the selection criterion for the pool means that Alice has high standards. The average nice man that she dates is actually more handsome than the average man in the population (since even among nice men, the ugliest portion of the population is skipped). Berkson's negative correlation is an effect that arises within the dating pool: the rude men that Alice dates must have been even more handsome to qualify, and the ugly men even nicer.
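A small simulation makes the dating-pool effect concrete. The sketch below assumes niceness and handsomeness are independent standard normal scores and that Alice's rule is a fixed threshold on their sum; the function names, sample size, threshold and seed are all illustrative choices.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_dating_pool(n=20000, threshold=1.0, seed=0):
    rng = random.Random(seed)
    # Each man is a (niceness, handsomeness) pair drawn independently.
    men = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Alice only dates men whose niceness plus handsomeness exceeds the threshold.
    pool = [(nice, handsome) for nice, handsome in men
            if nice + handsome > threshold]
    # Handsomeness of the men in the pool who are nicer than the population mean.
    nice_in_pool = [handsome for nice, handsome in pool if nice > 0]
    return {
        "r_population": pearson(*zip(*men)),
        "r_pool": pearson(*zip(*pool)),
        "mean_handsome_population": sum(h for _, h in men) / n,
        "mean_handsome_nice_in_pool": sum(nice_in_pool) / len(nice_in_pool),
    }


result = simulate_dating_pool()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the correlation is near zero in the whole population and clearly negative inside the pool, while the nice men in the pool are still more handsome on average than the population as a whole.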
Quantitative example
As a quantitative example, suppose a collector has 1000 postage stamps, of which 300 are pretty and 100 are rare, with 30 being both pretty and rare. 10% of all his stamps are rare and 10% of his pretty stamps are rare, so prettiness tells nothing about rarity. He puts the 370 stamps which are pretty or rare on display. Just over 27% of the stamps on display are rare (100/370), but still only 10% (30/300) of the pretty stamps are rare (and 100% of the 70 not-pretty stamps on display are rare). If an observer only considers stamps on display, they will observe a spurious negative relationship between prettiness and rarity as a result of the selection bias (that is, not-prettiness strongly indicates rarity in the display, but not in the total collection).
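The stamp arithmetic can be reproduced with exact fractions. The sketch below uses only the counts given in the example; the variable names are illustrative.

```python
from fractions import Fraction

# Counts taken from the example.
total, pretty, rare, both = 1000, 300, 100, 30

# In the whole collection, rarity is independent of prettiness.
p_rare = Fraction(rare, total)
p_rare_given_pretty = Fraction(both, pretty)

# Only stamps that are pretty or rare go on display (inclusion-exclusion).
displayed = pretty + rare - both
not_pretty_displayed = displayed - pretty

# Within the display, rarity now depends strongly on prettiness.
p_rare_on_display = Fraction(rare, displayed)
p_rare_given_pretty_on_display = Fraction(both, pretty)
p_rare_given_not_pretty_on_display = Fraction(rare - both, not_pretty_displayed)

print(displayed, p_rare_on_display, float(p_rare_on_display))
print(p_rare_given_pretty_on_display, p_rare_given_not_pretty_on_display)
```

Every pretty stamp is on display, so the share of pretty stamps that are rare is unchanged by the selection; only the not-pretty group is distorted, because its non-rare members were left out.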
Statement
Two independent events become conditionally dependent given that at least one of them occurs. Symbolically:
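- If $0 < P(A) < 1$, $0 < P(B) < 1$, and $P(A \mid B) = P(A)$, then $P(A \mid B, A \cup B) = P(A)$ and hence $P(A \mid A \cup B) > P(A)$.
Proof: Note that $P(A \mid B, A \cup B) = P(A \mid B) = P(A)$ and $P(A \mid A \cup B) = P(A)/P(A \cup B)$ which, together with $P(A) > 0$ and $P(A \cup B) < 1$ (so $1/P(A \cup B) > 1$), implies that $P(A \mid A \cup B) > P(A)$.
One can see this in tabular form as follows: every cell except ~A & ~B is an outcome where at least one event occurs (and ~A means "not A").

|    | A      | ~A      |
|----|--------|---------|
| B  | A & B  | ~A & B  |
| ~B | A & ~B | ~A & ~B |

For instance, if one has a sample of $100$, and both $A$ and $B$ occur independently half the time ($P(A) = P(B) = 1/2$), one obtains:

|    | A  | ~A |
|----|----|----|
| B  | 25 | 25 |
| ~B | 25 | 25 |

So in $75$ outcomes, either $A$ or $B$ occurs, of which $50$ have $A$ occurring. By comparing the conditional probability of $A$ to the unconditional probability of $A$:

$$P(A \mid A \cup B) = \frac{50}{75} = \frac{2}{3} > P(A) = \frac{50}{100} = \frac{1}{2}$$

We see that the probability of $A$ is higher ($2/3$) in the subset of outcomes where ($A$ or $B$) occurs, than in the overall population ($1/2$). On the other hand, the probability of $A$ given both $B$ and ($A$ or $B$) is simply the unconditional probability of $A$, $P(A)$, since $A$ is independent of $B$. In the numerical example, we have conditioned on being in the top row:

|    | A  | ~A |
|----|----|----|
| B  | 25 | 25 |

Here the probability of $A$ is $25/50 = 1/2$.
Berkson's paradox arises because the conditional probability of $A$ given $B$ within the three-cell subset equals the conditional probability in the overall population, but the unconditional probability within the subset is inflated relative to the unconditional probability in the overall population, hence, within the subset, the presence of $B$ decreases the conditional probability of $A$ (back to its overall unconditional probability):

$$P(A \mid B, A \cup B) = P(A \mid B) = P(A)$$
$$P(A \mid A \cup B) > P(A)$$

Because the effect of conditioning on $(A \cup B)$ derives from the relative size of $P(A \mid A \cup B)$ and $P(A)$, the effect is particularly large when $A$ is rare ($P(A) \ll 1$) but very strongly correlated to $B$ ($P(A \mid B) \approx 1$). For example, consider the case below where N is very large:

|    | A | ~A |
|----|---|----|
| B  | 1 | 0  |
| ~B | 0 | N  |

For the case without conditioning on $(A \cup B)$ we have

$$P(A) = \frac{1}{N + 1}$$
$$P(A \mid B) = 1$$

So A occurs rarely, unless B is present, when A occurs always. Thus B is dramatically increasing the likelihood of A.
For the case with conditioning on $(A \cup B)$ we have

$$P(A \mid A \cup B) = 1$$
$$P(A \mid B, A \cup B) = 1$$

Now A occurs always, whether B is present or not. So B has no impact on the likelihood of A. Thus we see that for highly correlated data a huge positive correlation of B on A can be effectively removed when one conditions on $(A \cup B)$.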
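The probabilities discussed in this section can be checked with exact arithmetic on a 2×2 table of counts for events A and B. The sketch below is illustrative: the helper name `berkson_probabilities` is hypothetical, and the counts are assumptions chosen to match the two cases described (A and B each occurring independently half the time in a sample of 100, and a rare A that occurs only together with B, with N set to one million).

```python
from fractions import Fraction


def berkson_probabilities(a_and_b, not_a_and_b, a_and_not_b, neither):
    """Exact probabilities of A from the four cell counts of a 2x2 table."""
    total = a_and_b + not_a_and_b + a_and_not_b + neither
    at_least_one = a_and_b + not_a_and_b + a_and_not_b  # outcomes in A or B
    b_total = a_and_b + not_a_and_b
    return {
        "P(A)": Fraction(a_and_b + a_and_not_b, total),
        "P(A|B)": Fraction(a_and_b, b_total),
        "P(A|A or B)": Fraction(a_and_b + a_and_not_b, at_least_one),
        # B already implies (A or B), so this equals P(A|B).
        "P(A|B, A or B)": Fraction(a_and_b, b_total),
    }


# Independent case: A and B each occur half the time in a sample of 100.
even = berkson_probabilities(25, 25, 25, 25)

# Strongly correlated case: A is rare and occurs only together with B.
N = 10**6
skewed = berkson_probabilities(1, 0, 0, N)

for label, table in (("even", even), ("skewed", skewed)):
    print(label, {name: str(value) for name, value in table.items()})
```

In the independent case, conditioning on (A or B) raises the probability of A while leaving the probability of A given B unchanged, which is the spurious negative dependence. In the strongly correlated case, the same conditioning drives both probabilities to 1, hiding the real dependence of A on B.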