When Algorithms Discriminate: The Bias Hidden in the Data
Amazon's AI penalized 'women's chess club.' COMPAS falsely flagged Black defendants. Your 'neutral' algorithm is laundering historical prejudice at scale.

Amazon built an AI hiring tool trained on 10 years of resumes. It learned to penalize resumes that included the word 'women's' — as in 'women's chess club.' They scrapped it in 2018. Most companies haven't.
In 2016, ProPublica analyzed COMPAS, an algorithm used by U.S. courts to predict recidivism risk. The findings were damning: the system falsely flagged Black defendants as high-risk at nearly twice the rate of white defendants (44.9% vs. 22.9%), while white defendants who reoffended were mislabeled as low-risk 47.7% of the time. The algorithm didn't ask for race. It didn't need to — the data spoke for itself.
The myth of algorithmic neutrality rests on a seductive premise: if we remove sensitive attributes like race, gender, or religion from our datasets, the resulting models cannot discriminate. This assumption is catastrophically wrong.
Proxy Variables: The Backdoor to Bias
When an algorithm is trained on historical data, it learns patterns — correlations that may reflect systemic inequalities rather than meaningful predictive signals. Consider the following proxy pathways:
- Zip Code → Race: In the United States, residential segregation means that postal codes are highly correlated with racial demographics. An algorithm that excludes race but includes location will effectively infer race.
- Name → Gender: Research from 2022 showed that resume-screening algorithms can infer gender from first names with over 95% accuracy, even when gender is explicitly removed from the dataset.
- Educational Institution → Socioeconomic Status: Elite university attendance correlates strongly with family income, creating a class-based proxy that disadvantages first-generation applicants.
The mathematical formalization is straightforward. Let $\hat{Y}$ be the model's prediction, $S$ be a protected attribute (e.g., race), and $X$ be the observed features. Even if we train on $X$ alone:
$$P(\hat{Y} | X) \approx P(\hat{Y} | S) \text{ when } I(X; S) \text{ is high}$$
Where $I(X; S)$ represents the mutual information between observed features and protected attributes. When this mutual information is substantial — and in real-world data, it almost always is — removing $S$ does nothing to break the causal chain.
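This leakage can be flagged before training by estimating the mutual information between a candidate proxy feature and the protected attribute directly from co-occurrence counts. A minimal sketch in Python (the zip-code values below are toy data, not real demographics):

```python
from collections import Counter
from math import log2

def mutual_information(xs, ss):
    """Estimate I(X; S) in bits from paired observations of a
    feature X and a protected attribute S."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    ps = Counter(ss)            # marginal counts of S
    pxs = Counter(zip(xs, ss))  # joint counts of (X, S)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (ps[s] / n)))
        for (x, s), c in pxs.items()
    )

# Toy data: zip code almost determines group membership.
zips   = ["10001", "10001", "10001", "60601", "60601", "60601"]
groups = ["A",     "A",     "A",     "B",     "B",     "A"]
print(round(mutual_information(zips, groups), 3))  # 0.459
```

A value near zero means the feature carries little information about the protected attribute; values approaching the entropy of $S$ mean the feature is effectively a proxy, and dropping $S$ alone will not help.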
[!INSIGHT] Algorithmic fairness researchers call this "fairness through unawareness," and it has been mathematically proven ineffective since 2011. Yet 73% of corporate AI ethics guidelines still rely on this approach.
The COMPAS Case Study: A Forensic Analysis
Northpointe (now Equivant) developed COMPAS using 137 questions designed to assess defendant risk. The algorithm was proprietary, its training data opaque. ProPublica's reverse-engineering revealed the core problem: the model was calibrated on a criminal justice system with documented racial disparities.
False Positive Rate by Race (COMPAS):
| Race | Falsely Labeled High-Risk | Correctly Labeled Low-Risk |
|---|---|---|
| Black | 44.9% | 55.1% |
| White | 22.9% | 77.1% |
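The rates in the table are ordinary confusion-matrix ratios. A minimal sketch (the counts below are hypothetical, chosen only to reproduce the table's percentages, not ProPublica's raw data):

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN): the share of people who did not reoffend
    but were flagged high-risk anyway."""
    return fp / (fp + tn)

# Hypothetical per-1000 counts matching the table's rates.
print(round(false_positive_rate(449, 551), 3))  # Black defendants: 0.449
print(round(false_positive_rate(229, 771), 3))  # White defendants: 0.229
```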
The algorithm wasn't "broken" — it was optimizing exactly what it was trained to optimize. Historical arrest data reflects policing patterns, not just criminal behavior. If police patrol Black neighborhoods more heavily, arrest records will show higher Black arrest rates. The algorithm learns: "Black defendants = higher risk." The circle closes.
*"The algorithm is not biased. The data is biased. And the data is biased because the world is biased. You cannot fix the world by pretending the data doesn't reflect it."*
Medical AI: When Bias Means Life or Death
In 2019, researchers published a landmark study in Science examining a commercial healthcare risk-prediction algorithm used on 200 million Americans annually. The algorithm exhibited dramatic racial bias: at the same risk score, Black patients were considerably sicker than white patients, with 48,770 additional chronic conditions undetected among the Black patient population.
The Root Cause: Cost as a Proxy for Health
The algorithm was trained to predict future healthcare costs. The assumption seemed reasonable: sicker patients incur higher costs. But this ignored a fundamental truth about American healthcare:
$$\text{Healthcare Access} \neq \text{Health Need}$$
Black patients face systemic barriers to accessing care — lack of insurance, transportation, trust in the medical system. Less access means less spending, even when health needs are greater. The algorithm learned: "Lower spending = healthier patient." Black patients received lower risk scores despite greater illness burden.
Algorithmic Bias in Dermatology AI: A 2020 analysis of machine learning systems for skin cancer detection found that training datasets were overwhelmingly composed of images from white patients. When tested on darker skin tones:
- Accuracy dropped by 34-52% for darker skin types (Fitzpatrick types V-VI)
- Melanoma detection rates fell below 50% for Black patients, compared to 87% for white patients
[!NOTE] The Fitzpatrick scale classifies skin types I-VI based on melanin content and reaction to UV exposure. Most dermatology AI systems are trained on datasets where Type I-II skin constitutes over 80% of images, despite these types representing a minority of the global population.
The Amazon Hiring Tool: Pattern Recognition Gone Wrong
Amazon's failed recruitment AI offers a masterclass in how machine learning can weaponize historical prejudice. The system was trained on resumes submitted over a decade — a period during which the tech industry was overwhelmingly male and hiring practices systematically favored men.
The algorithm noticed that male candidates were hired more often. It then searched for patterns that differentiated male from female applicants:
- Language patterns: Verbs like "executed" and "captained" appeared more frequently in male candidates' resumes and correlated with hiring success
- Educational signals: Women's colleges were downweighted
- Explicit gender markers: The word "women's" (as in "women's chess club captain") became a negative predictor
Amazon's engineers discovered the bias and attempted to correct it by neutralizing explicit gender terms. But the model simply found new proxies — the mutual information between resume features and gender remained too high.
[!INSIGHT] The Amazon case illustrates a fundamental principle: if the historical outcome you're predicting reflects bias, your model will learn bias. Training on "merit-based" hiring outcomes from a non-meritocratic system produces a model that automates non-meritocracy.
Breaking the Cycle: Technical and Structural Interventions
Addressing algorithmic bias requires intervention at multiple levels of the machine learning pipeline.
Pre-Processing: Data-Centric Approaches
Resampling and Reweighting: Adjust the training distribution to compensate for historical underrepresentation. If women comprise 20% of historical hires but 50% of qualified applicants, upweight female-positive examples by 2.5x.
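The arithmetic behind that 2.5x figure is just the ratio of target share to observed share. A minimal sketch (function name and group labels are illustrative, not from any particular library):

```python
def group_weights(observed_share, target_share):
    """Per-example weight for each group so that the reweighted
    training data matches the target distribution:
    weight = target share / observed share."""
    return {g: target_share[g] / observed_share[g] for g in observed_share}

# The article's example: women are 20% of historical positive
# examples but 50% of qualified applicants.
weights = group_weights(
    observed_share={"women": 0.20, "men": 0.80},
    target_share={"women": 0.50, "men": 0.50},
)
print(weights)  # {'women': 2.5, 'men': 0.625}
```

These weights would then be passed to the training loss (most libraries accept a `sample_weight` argument), so underrepresented positive examples count more per occurrence.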
Fair Representation Learning: Transform input features into a new representation that preserves predictive power while minimizing mutual information with protected attributes:
$$\min_{\theta} \mathcal{L}_{pred}(f_{\theta}(X), Y) + \lambda \cdot I(f_{\theta}(X); S)$$
Where $\lambda$ controls the trade-off between prediction accuracy and fairness.
In-Processing: Algorithmic Constraints
Adversarial Debiasing: Train a predictor and an adversary simultaneously. The predictor tries to predict the outcome; the adversary tries to predict the protected attribute from the predictor's output. The predictor learns to be accurate while making protected attributes unpredictable.
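In the notation used earlier, this is a minimax problem (the adversary $g_{\phi}$ and its loss $\mathcal{L}_{adv}$ are our notation for the sketch, not from a specific paper):
$$\min_{\theta} \max_{\phi} \; \mathcal{L}_{pred}(f_{\theta}(X), Y) - \lambda \cdot \mathcal{L}_{adv}(g_{\phi}(f_{\theta}(X)), S)$$
The predictor $f_{\theta}$ minimizes its own loss while maximizing the adversary's, driving the mutual information between its outputs and $S$ toward zero.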
Fairness Constraints: Impose mathematical constraints such as Demographic Parity (equal positive prediction rates across groups) or Equalized Odds (equal true positive and false positive rates across groups):
$$P(\hat{Y}=1 | S=0, Y=y) = P(\hat{Y}=1 | S=1, Y=y) \quad \forall y \in \{0, 1\}$$
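Both constraints can be audited after the fact by comparing group-conditional rates. A minimal sketch on toy data (hypothetical predictions, not COMPAS):

```python
def group_rate(preds, labels, groups, g, y=None):
    """P(ŷ=1 | group=g), or P(ŷ=1 | group=g, label=y) when y is given."""
    sel = [p for p, l, gr in zip(preds, labels, groups)
           if gr == g and (y is None or l == y)]
    return sum(sel) / len(sel)

# Toy predictions for two groups "a" and "b".
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Demographic parity gap: difference in positive-prediction rates.
dp_gap = abs(group_rate(preds, labels, groups, "a")
             - group_rate(preds, labels, groups, "b"))

# Equalized-odds gaps: TPR and FPR differences across groups.
tpr_gap = abs(group_rate(preds, labels, groups, "a", y=1)
              - group_rate(preds, labels, groups, "b", y=1))
fpr_gap = abs(group_rate(preds, labels, groups, "a", y=0)
              - group_rate(preds, labels, groups, "b", y=0))
print(dp_gap, tpr_gap, fpr_gap)  # 0.5 0.5 0.5
```

A gap of zero on all labels $y$ is exactly the equalized-odds condition above; demographic parity drops the conditioning on $y$.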
[!NOTE] There is no universal fairness metric. Demographic parity, equalized odds, and calibration are mutually incompatible in most real-world scenarios (a result proved by Chouldechova in 2017, often called the impossibility theorem). Organizations must choose which fairness properties matter most for their context.
Post-Processing: Threshold Adjustment
The simplest intervention requires no retraining: apply different decision thresholds for different groups to equalize outcomes. If Black defendants have a higher false positive rate at threshold 0.5, raise the threshold for Black defendants until parity is achieved.
This approach is transparent, auditable, and immediately actionable. It is also controversial — some argue it constitutes "reverse discrimination." The counterargument: if the underlying model is already discriminatory, adjusting thresholds merely corrects for existing bias.
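The threshold search itself is simple: sweep candidate cutoffs per group until each group's false positive rate falls at or below a common target. A minimal sketch (scores and the 0.01 grid are illustrative assumptions):

```python
def fpr_at(scores, labels, threshold):
    """False positive rate when flagging every score >= threshold."""
    flagged_negatives = [s >= threshold
                         for s, l in zip(scores, labels) if l == 0]
    return sum(flagged_negatives) / len(flagged_negatives)

def threshold_for_fpr(scores, labels, target_fpr):
    """Smallest threshold on a 0.01 grid whose FPR is <= the target."""
    for t in [i / 100 for i in range(101)]:
        if fpr_at(scores, labels, t) <= target_fpr:
            return t
    return 1.0

# Hypothetical risk scores: group b's non-reoffenders score higher,
# so a shared cutoff of 0.5 would flag them more often.
scores_a = [0.2, 0.4, 0.6, 0.8]; labels_a = [0, 0, 1, 1]
scores_b = [0.4, 0.6, 0.7, 0.9]; labels_b = [0, 0, 1, 1]

# Per-group thresholds that equalize FPR at the same target.
print(threshold_for_fpr(scores_a, labels_a, 0.0))  # 0.41
print(threshold_for_fpr(scores_b, labels_b, 0.0))  # 0.61
```

The group whose non-reoffenders are scored higher ends up with the higher cutoff, which is precisely the correction described above.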
The Governance Gap: Why Self-Regulation Fails
In 2023, the EU AI Act became the first comprehensive regulation to classify hiring and criminal justice algorithms as "high-risk," mandating bias audits and transparency. The United States has no equivalent federal legislation.
Corporate AI ethics boards have proven largely ineffective:
- Google dissolved its AI ethics board in 2019 after internal controversy
- Amazon's algorithmic fairness research team was gutted by layoffs in 2023
- Facebook's Oversight Board has no authority over algorithmic systems
The fundamental conflict of interest is inescapable: companies optimizing for profit cannot simultaneously self-regulate against bias that may be embedded in their core products.
The Path Forward
Algorithmic bias is not a bug — it is a feature of systems trained on historical data. The data reflects the world as it was, not as it should be. When we deploy these systems without intervention, we are not being neutral; we are actively choosing to encode past discrimination into future decisions.
The technical solutions exist. The political will does not. Until regulators mandate algorithmic audits, until companies are held liable for discriminatory AI, and until the public demands transparency, the machines will continue to learn from our worst instincts.
Sources: ProPublica (2016) "Machine Bias"; Obermeyer et al. (2019) "Dissecting Racial Bias in an Algorithm Used to Manage the Health of Millions"; Dastin (2018) "Amazon Scraps Secret AI Recruiting Tool"; Chouldechova (2017) "Fair Prediction with Disparate Impact"; Buolamwini & Gebru (2018) "Gender Shades"; Barocas et al. (2023) "Fairness and Machine Learning"