Digital Humanities

Fairness Is Mathematically Impossible

A 2016 proof shows algorithmic fairness definitions cannot coexist. Tech giants promised the impossible—here's why nobody admits it.

Hylē Editorial

Engineers at Google and Amazon, and officials in the US courts, all promised to make their algorithms 'fair.' A 2016 mathematical proof shows they were promising something that cannot exist. Nobody told the public.

In 2016, a landmark paper by computational social scientists delivered a result that should have upended the entire field of algorithmic fairness: when base rates differ between demographic groups, it is mathematically impossible to satisfy three commonsense definitions of fairness simultaneously. Yet seven years later, companies continue selling "fair" AI systems, courts deploy "bias-free" risk assessments, and policymakers draft regulations premised on the false assumption that perfect fairness is achievable if we just try harder.

The Impossibility Theorem isn't obscure mathematics published in a forgotten journal. The result was presented at the Innovations in Theoretical Computer Science conference (ITCS 2017), a premier theory venue, and has been cited over 1,800 times. Yet ask any tech CEO testifying before Congress about algorithmic bias, and you will hear variations of the same pledge: "We are committed to building fair systems." They are promising the impossible, and they know it.

Three Definitions of Fairness

To understand why fairness crashes into mathematical walls, we must first understand what computer scientists mean by "fair." The research community has converged on three major definitions, each intuitively appealing and each impossible to reconcile with the others.

Demographic Parity

Demographic parity demands that positive outcomes be distributed equally across groups. If 30% of loan applicants are women, then roughly 30% of approved loans should go to women. This definition treats fairness as statistical equality of results.
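As a sketch in Python (the loan data here is invented for illustration, not from any real system), demographic parity reduces to comparing per-group selection rates:

```python
# Minimal sketch: checking demographic parity on hypothetical decisions.
# Demographic parity holds if the selection rate (fraction of applicants
# receiving the positive outcome) is equal across groups.

def selection_rates(decisions):
    """decisions: list of (group, approved) pairs -> approval rate per group."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(ok)
    return {g: approved[g] / totals[g] for g in totals}

# Hypothetical loan decisions: (group, approved?)
decisions = [("women", True), ("women", False), ("women", True), ("women", True),
             ("men", True), ("men", True), ("men", False), ("men", False)]

rates = selection_rates(decisions)
print(rates)  # {'women': 0.75, 'men': 0.5} -> parity violated
```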

[!INSIGHT] Demographic parity is the most intuitive definition for non-technical stakeholders. It maps directly onto civil rights concepts of "disparate impact": if an algorithm produces vastly different outcomes for protected groups, something must be wrong.

The problem? Demographic parity ignores whether groups might legitimately differ in relevant qualifications. If one demographic group has historically been denied educational opportunities, forcing equal approval rates might require approving unqualified applicants from that group while rejecting qualified applicants from others—an outcome many would consider deeply unfair.

Equal Opportunity

Equal opportunity takes a different approach. Rather than demanding equal outcomes, it demands equal accuracy: among truly qualified individuals, the algorithm should approve them at equal rates regardless of group membership. A qualified woman should have the same approval probability as a qualified man.

This definition aligns with meritocratic ideals. It says: let outcomes differ, but ensure that the algorithm isn't adding additional barriers beyond existing inequalities. If 60% of male applicants and 40% of female applicants are genuinely creditworthy, then approval rates should reflect that 60/40 split—but among the creditworthy individuals in each group, approval rates should be identical.
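A minimal sketch, again with made-up numbers: equal opportunity compares approval rates among the qualified only, so it can hold even when base rates and overall approval rates differ between groups:

```python
# Minimal sketch: equal opportunity compares true positive rates --
# among the genuinely qualified, what fraction does the model approve?

def true_positive_rate(records):
    """records: list of (qualified, approved); TPR over qualified applicants only."""
    approved_among_qualified = [approved for qualified, approved in records if qualified]
    return sum(approved_among_qualified) / len(approved_among_qualified)

# Hypothetical applicants per group: (qualified?, approved?)
men   = [(True, True), (True, True), (True, False), (True, False), (False, False)]
women = [(True, True), (True, False), (False, False), (False, False), (False, False)]

# TPR is 0.5 for both groups, so equal opportunity holds -- even though
# overall approval rates differ (2/5 vs 1/5), reflecting different base rates.
print(true_positive_rate(men), true_positive_rate(women))  # 0.5 0.5
```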

*"Equal opportunity is what most people intuitively mean when they say 'fair.' It doesn't guarantee equal outcomes, but it guarantees that the algorithm isn't discriminating among equally qualified people."*
Alexandra Chouldechova, Carnegie Mellon University

Calibration

Calibration requires that prediction scores mean the same thing across groups. If an algorithm assigns someone a 70% risk score, that person should have a 70% probability of the predicted outcome—regardless of whether they belong to a majority or minority group.

This matters enormously for high-stakes decisions. Imagine a criminal risk assessment tool that assigns Black defendants a score of "high risk" 40% more often than white defendants, but the "high risk" label is equally accurate for both groups. Under calibration, this system would be considered fair—the scores mean the same thing, even if they're applied more frequently to one group.

Calibration protects against a specific harm: the algorithm lying to you about your actual risk. A miscalibrated system might systematically underestimate risk for one group while overestimating it for another, leading to unjust outcomes even if raw approval rates look equal.
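A minimal check, using hypothetical risk scores: calibration asks whether, within each group, the observed outcome rate among people assigned a given score actually matches that score:

```python
# Minimal sketch: a score is calibrated within groups if, among people
# assigned score s in each group, the observed outcome rate is (about) s.

def calibration_table(records):
    """records: list of (group, score, outcome) -> {(group, score): observed rate}."""
    counts, hits = {}, {}
    for group, score, outcome in records:
        key = (group, score)
        counts[key] = counts.get(key, 0) + 1
        hits[key] = hits.get(key, 0) + int(outcome)
    return {key: hits[key] / counts[key] for key in counts}

# Hypothetical risk scores: (group, predicted risk, reoffended?)
records = [
    ("A", 0.7, True), ("A", 0.7, True), ("A", 0.7, True), ("A", 0.7, False),
    ("B", 0.7, True), ("B", 0.7, True), ("B", 0.7, False), ("B", 0.7, False),
]
print(calibration_table(records))
# Group A's 0.7 scores reoffend at 0.75, group B's at 0.5: the same
# score means different things in each group, so calibration fails.
```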

The Impossibility Proof

Here is where mathematics delivers its brutal verdict. In their 2016 paper "Inherent Trade-Offs in the Fair Determination of Risk Scores," researchers Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan proved that when base rates differ between groups—that is, when the actual prevalence of the predicted outcome varies by demographic—these definitions of fairness cannot simultaneously hold, outside of degenerate cases such as a perfect predictor.

[!INSIGHT] The proof requires only elementary algebra. No advanced mathematics is needed to understand why the theorem holds. Yet its implications are profound: fairness, as commonly understood, is structurally impossible in any system where groups have different base rates.
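One elementary route to the result, following Chouldechova's 2017 analysis cited in the sources, rests on a single identity linking the confusion-matrix quantities to a group's base rate $p$ (the prevalence of the predicted outcome in that group):

```latex
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\mathrm{TPR}
```

If two groups share the same true positive rate (equal opportunity) and the same positive predictive value (a calibration-style guarantee), but differ in base rate $p$, the identity forces their false positive rates apart; equalizing the false positive rates instead forces the predictive values apart. Equality on all fronts is possible only when base rates match or prediction is perfect.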

Consider a concrete example. Suppose you're building a hiring algorithm for software engineers. In your applicant pool, 80% of male candidates and 60% of female candidates have the requisite technical skills—a gap that might reflect historical educational inequalities, not innate differences.

To satisfy demographic parity, you'd need to hire the same fraction of applicants from each group, which, given the skills gap in the pool, would require either rejecting qualified men or accepting unqualified women.

To satisfy equal opportunity, you'd need to ensure that among qualified candidates, acceptance rates are equal. But since there are more qualified men than women in this pool, your overall hires will skew male.

To satisfy calibration, your prediction scores must mean the same thing for both groups. But when base rates differ, calibration and equal opportunity cannot coexist: no classifier can simultaneously achieve equal true positive rates, equal false positive rates, and equal positive predictive values across groups with different underlying base rates.
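The tension can be made numeric with the article's own base rates (80% vs. 60% qualified); the error rates used here are invented for illustration:

```python
# Sketch of the trade-off using the hiring example's base rates:
# 80% of male and 60% of female applicants are qualified.
# Suppose the classifier achieves the SAME error rates for both groups:
# TPR (qualified approved) = 0.9, FPR (unqualified approved) = 0.2.

def ppv(base_rate, tpr, fpr):
    """Positive predictive value: P(qualified | approved)."""
    true_pos = base_rate * tpr          # mass of correct approvals
    false_pos = (1 - base_rate) * fpr   # mass of incorrect approvals
    return true_pos / (true_pos + false_pos)

ppv_men = ppv(0.80, 0.9, 0.2)
ppv_women = ppv(0.60, 0.9, 0.2)
print(round(ppv_men, 3), round(ppv_women, 3))  # 0.947 0.871

# Equal TPR and FPR, yet an "approved" verdict is more reliable for men
# than for women: calibration/predictive parity fails. Equalizing PPV
# instead would force the error rates apart. With unequal base rates,
# something must give.
```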

*"Different definitions of fairness are often in tension with each other, and satisfying one often comes at the expense of violating another. This is not a problem that can be fixed with more data or better algorithms."*
Jon Kleinberg, Cornell University

Why the Silence Matters

The Impossibility Theorem has been public knowledge since 2016. Yet a 2023 survey of AI ethics statements from Fortune 500 companies found that 73% promised to deliver "fair" or "unbiased" algorithms without acknowledging that perfect fairness is mathematically unattainable. None of the companies surveyed publicly disclosed which definition of fairness they had chosen—or that choosing one definition necessarily meant violating others.

This silence serves commercial interests. Acknowledging the impossibility theorem would require companies to make explicit trade-offs: "We've chosen to optimize for equal opportunity, which means our system may produce different outcome rates across demographic groups." That's a difficult message to sell to regulators, activists, and customers who expect algorithms to simply be "fair."

[!NOTE] The European Union's AI Act, passed in 2024, mandates that high-risk AI systems meet "appropriate" fairness standards but provides no guidance on which definition to prioritize or how to handle the inevitable trade-offs. This regulatory vagueness allows companies to claim compliance while obscuring the impossible choices they've made.

The courts have been equally evasive. The COMPAS risk assessment system, used in Wisconsin and other states for sentencing decisions, was challenged in 2016 for racial bias. ProPublica's investigation showed that the system's error rates were unequal: Black defendants who did not reoffend were nearly twice as likely as white defendants to be falsely labeled high risk. The company's defense? The system satisfied calibration and predictive parity—a given risk score corresponded to the same probability of reoffending regardless of race. Both sides were right; they were simply using incompatible definitions of fairness.

Living With Impossibility

If perfect fairness is impossible, what should we do? The researchers who proved the theorem don't argue that we should abandon algorithmic fairness entirely. Instead, they argue for transparency about trade-offs.

When a company deploys a hiring algorithm, it should publicly state: "We have chosen to prioritize equal opportunity over demographic parity. This means our system may hire qualified candidates at equal rates across groups, but overall outcomes may differ due to differences in the applicant pool. We believe this trade-off best serves our goals of meritocratic evaluation while avoiding algorithmic discrimination."

This approach treats fairness not as a binary achieved state but as a series of value judgments that must be made visible. Different stakeholders might legitimately disagree about which fairness definition should dominate in any given context. A civil rights organization might prioritize demographic parity to address historical injustices, while a meritocracy-focused employer might prioritize equal opportunity.

[!INSIGHT] The impossibility theorem doesn't make algorithms useless—it makes them political. Every algorithmic system encodes choices about which values to prioritize. The ethical failure isn't the impossibility of perfect fairness; it's the failure to acknowledge that these choices exist and to involve affected communities in making them.

The Honest Path Forward

The technology industry has spent nearly a decade pretending that algorithmic fairness is achievable if we just collect more data, train better models, and hire more ethics researchers. The Impossibility Theorem proves this is a fantasy. Every fair algorithm is fair according to some definition and unfair according to others.

This doesn't mean we should abandon algorithmic decision-making. In many contexts, even imperfect algorithms outperform human judgment, which is riddled with cognitive biases that vary unpredictably across individuals. But it does mean we must abandon the fantasy of the perfectly fair algorithm and replace it with honest, transparent negotiations about which values we prioritize and which harms we're willing to accept.

Key Takeaway The Impossibility Theorem proves that when groups have different base rates, no algorithm can satisfy all commonsense definitions of fairness. The tech industry's silence about this mathematical fact isn't ignorance—it's a choice to sell impossible promises rather than have difficult conversations about trade-offs. True algorithmic accountability begins with admitting that perfect fairness is mathematically impossible, and that every "fair" system represents a choice about which definition of fairness matters most.

Sources: Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. Proceedings of the 8th Innovations in Theoretical Computer Science Conference (ITCS). Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), 153-163. ProPublica (2016). Machine Bias: There's software used across the country to predict future criminals. And it's biased against blacks. European Commission (2024). The EU Artificial Intelligence Act.
