
How to Conduct Bias Testing for AI Systems


Quick Reference
  • What: Identify, measure, and mitigate systematic unfairness in AI systems
  • When: During development, before deployment, and ongoing in production
  • Key metrics: Demographic parity, equal opportunity, calibration
  • Tools: Use our Bias Detection Tool for analysis

Purpose

This guide provides practical steps for identifying, measuring, and mitigating bias in AI/ML systems deployed in government contexts.


What is AI Bias?

AI bias occurs when a system produces systematically unfair outcomes that disadvantage certain groups. In government contexts, this can lead to:

  • Unequal service delivery
  • Discriminatory decision-making
  • Erosion of public trust
  • Legal and compliance issues

Types of Bias

| Type | Description | Example |
|------|-------------|---------|
| Historical Bias | Training data reflects past discrimination | Hiring model trained on historically biased decisions |
| Representation Bias | Underrepresentation of groups in training data | Facial recognition less accurate for minority groups |
| Measurement Bias | Proxy variables correlate with protected attributes | Using postcode as a proxy for socioeconomic status |
| Aggregation Bias | One model fails across different subgroups | Healthcare model optimized for the majority population |
| Evaluation Bias | Testing doesn't cover all groups equally | Performance metrics only measured on the majority group |

Step 1: Define Protected Attributes

Common Protected Attributes in Australian Context

| Attribute | Legislation | Considerations |
|-----------|-------------|----------------|
| Age | Age Discrimination Act | Service eligibility, employment |
| Disability | Disability Discrimination Act | Accessibility, accommodation |
| Race/Ethnicity | Racial Discrimination Act | Cultural sensitivity, language |
| Sex/Gender | Sex Discrimination Act | Service access, employment |
| Religion | Various state laws | Service delivery, scheduling |
| Geographic location | - | Urban/rural service equity |
| Indigenous status | Various | Culturally appropriate services |

Action Items

  • List all protected attributes relevant to your use case
  • Identify which attributes are available in your data
  • Determine proxy variables that may correlate with protected attributes (a simple association check is sketched after this list)
  • Document justification for any attribute use
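
One way to support the proxy-variable action item is a simple association check between candidate features and a protected attribute. The sketch below is a minimal illustration using Cramér's V; the column names and the 0.3 flagging threshold are assumptions for illustration, not prescribed values.

import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(x, y):
    """Cramér's V association between two categorical columns (0 = none, 1 = perfect)."""
    contingency = pd.crosstab(x, y)
    chi2, _, _, _ = chi2_contingency(contingency)
    n = contingency.values.sum()
    r, k = contingency.shape
    return np.sqrt((chi2 / n) / (min(r, k) - 1))

def flag_candidate_proxies(df, protected_attr, candidate_cols, threshold=0.3):
    """Flag features whose association with the protected attribute exceeds the threshold."""
    flagged = {}
    for col in candidate_cols:
        v = cramers_v(df[col], df[protected_attr])
        if v >= threshold:
            flagged[col] = round(v, 3)
    return flagged

# Hypothetical usage with placeholder column names:
# flag_candidate_proxies(df, 'ethnicity', ['postcode', 'language_at_home'])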

Step 2: Understand Your Data

Data Audit Checklist

□ What is the source of your training data?
□ What time period does the data cover?
□ What are the demographics of people in the data?
□ Are all relevant groups adequately represented?
□ Were any groups systematically excluded?
□ Does the data reflect historical discrimination?
□ What proxy variables exist in the data?

Representation Analysis

Calculate the representation of each group in your data:

# Example: Check group representation
import pandas as pd

def check_representation(df, attribute):
    """Check representation of groups in dataset."""
    counts = df[attribute].value_counts()
    percentages = df[attribute].value_counts(normalize=True) * 100

    print(f"\nRepresentation by {attribute}:")
    for group in counts.index:
        print(f"  {group}: {counts[group]:,} ({percentages[group]:.1f}%)")

    # Flag underrepresented groups (< 5% or < 100 samples)
    underrepresented = percentages[percentages < 5].index.tolist()
    small_samples = counts[counts < 100].index.tolist()

    if underrepresented or small_samples:
        print(f"\n  WARNING: Underrepresented groups: {underrepresented + small_samples}")

    return counts, percentages

Historical Bias Check

  • Review historical outcomes by group (a short sketch follows this list)
  • Identify any systematic patterns of disadvantage
  • Consider whether historical patterns should be replicated
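
To support the first two checks above, the sketch below tabulates historical outcome rates by group and year so that persistent gaps are visible before training. It assumes a pandas DataFrame with a year column, a binary outcome column and a protected attribute column; all column names are placeholders.

import pandas as pd

def historical_outcome_rates(df, outcome_col, protected_attr, year_col='year'):
    """Tabulate historical outcome rates by group and year to surface persistent gaps."""
    rates = (
        df.groupby([year_col, protected_attr])[outcome_col]
          .mean()
          .unstack(protected_attr)
    )
    # A column that is consistently lower than the others suggests a systematic
    # pattern of disadvantage that the model should not simply learn to reproduce.
    return rates

# Hypothetical usage:
# historical_outcome_rates(df, outcome_col='approved', protected_attr='age_band')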

Step 3: Select Fairness Metrics

Common Fairness Metrics

| Metric | Definition | Use When |
|--------|------------|----------|
| Demographic Parity | Equal positive prediction rates across groups | Equal access is the priority |
| Equalized Odds | Equal true positive rate (TPR) and false positive rate (FPR) across groups | Accuracy matters for all groups |
| Equal Opportunity | Equal TPR across groups | Focus on not missing positive cases |
| Predictive Parity | Equal positive predictive value (PPV) across groups | Consistent precision needed |
| Calibration | Predicted probabilities match actual rates | Predictions are used as risk scores |

Choosing the Right Metric

Decision Framework:

1. Is equal access the primary concern?
   → Use Demographic Parity

2. Is minimizing false negatives critical?
   → Use Equal Opportunity

3. Are false positives and negatives equally harmful?
   → Use Equalized Odds

4. Are predictions used as risk scores?
   → Use Calibration

5. Multiple concerns?
   → Use multiple metrics, document trade-offs

Important Note

It is mathematically impossible to satisfy all fairness metrics simultaneously, except in special cases such as identical base rates across groups or a perfect classifier. For example, when base rates differ, an imperfect model cannot achieve equalized odds and predictive parity at the same time. Document which metrics you prioritize and why.


Step 4: Implement Bias Testing

Pre-Training Testing

Test your data before model training:

import pandas as pd
from scipy import stats

def test_label_bias(df, label_col, protected_attr):
    """Test if labels differ significantly across groups."""
    groups = df[protected_attr].unique()
    rates = {}

    for group in groups:
        group_data = df[df[protected_attr] == group]
        rates[group] = group_data[label_col].mean()

    print(f"\nLabel rates by {protected_attr}:")
    for group, rate in rates.items():
        print(f"  {group}: {rate:.3f}")

    # Chi-square test for significant differences
    contingency = pd.crosstab(df[protected_attr], df[label_col])
    chi2, p_value, _, _ = stats.chi2_contingency(contingency)

    print(f"\nChi-square test: chi2={chi2:.2f}, p-value={p_value:.4f}")
    if p_value < 0.05:
        print("  WARNING: Significant difference in labels across groups")

    return rates, p_value

Post-Training Testing

Test model predictions:

from sklearn.metrics import confusion_matrix

def calculate_fairness_metrics(y_true, y_pred, groups):
    """Calculate key fairness metrics by group."""
    results = {}

    for group in groups.unique():
        mask = groups == group
        y_true_group = y_true[mask]
        y_pred_group = y_pred[mask]

        # labels=[0, 1] forces a 2x2 matrix even if a group contains only one class
        tn, fp, fn, tp = confusion_matrix(y_true_group, y_pred_group, labels=[0, 1]).ravel()

        results[group] = {
            'positive_rate': (tp + fp) / (tp + fp + tn + fn),  # Demographic parity
            'tpr': tp / (tp + fn) if (tp + fn) > 0 else 0,     # Equal opportunity
            'fpr': fp / (fp + tn) if (fp + tn) > 0 else 0,     # Equalized odds
            'ppv': tp / (tp + fp) if (tp + fp) > 0 else 0,     # Predictive parity
            'n': len(y_true_group)
        }

    return results

def print_fairness_report(metrics):
    """Print formatted fairness report."""
    print("\nFairness Metrics by Group:")
    print("-" * 70)
    print(f"{'Group':<15} {'Pos Rate':>10} {'TPR':>10} {'FPR':>10} {'PPV':>10} {'N':>10}")
    print("-" * 70)

    for group, m in metrics.items():
        print(f"{group:<15} {m['positive_rate']:>10.3f} {m['tpr']:>10.3f} "
              f"{m['fpr']:>10.3f} {m['ppv']:>10.3f} {m['n']:>10,}")

    # Calculate disparities
    groups = list(metrics.keys())
    if len(groups) >= 2:
        print("\nDisparities (ratio of min/max):")
        for metric in ['positive_rate', 'tpr', 'fpr', 'ppv']:
            # Keep zero values so that a group with a zero rate is not hidden
            values = [m[metric] for m in metrics.values()]
            if max(values) > 0:
                disparity = min(values) / max(values)
                status = "PASS" if disparity >= 0.8 else "FAIL"
                print(f"  {metric}: {disparity:.3f} ({status})")
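
The two helpers above might be called like this on a small held-out set; the data values and the 'age_band' attribute are placeholders invented for illustration.

import pandas as pd

# Hypothetical test data: true labels, model predictions and group membership
test_df = pd.DataFrame({
    'y_true':   [1, 0, 1, 1, 0, 0, 1, 0],
    'y_pred':   [1, 0, 1, 0, 0, 1, 1, 0],
    'age_band': ['under_35', 'under_35', 'under_35', 'under_35',
                 '35_plus', '35_plus', '35_plus', '35_plus'],
})

metrics = calculate_fairness_metrics(
    test_df['y_true'], test_df['y_pred'], test_df['age_band'])
print_fairness_report(metrics)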

Threshold Testing

The "80% rule" is a common fairness benchmark:

A metric passes if: min_group_rate / max_group_rate >= 0.80

Example:
- Group A positive rate: 0.40
- Group B positive rate: 0.35
- Ratio: 0.35 / 0.40 = 0.875 ✓ PASS

Step 5: Mitigate Identified Bias

Pre-Processing Techniques

| Technique | Description | Pros | Cons |
|-----------|-------------|------|------|
| Resampling | Balance representation | Simple | May reduce data |
| Reweighting | Weight samples by group | Preserves data | May not fully address bias |
| Fair representation | Transform features | Removes bias | Complex, may lose information |
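
As an illustration of the reweighting technique in the table above, the sketch below derives sample weights that make each combination of group and label contribute as if group and label were independent (in the spirit of the reweighing approach of Kamiran and Calders). It assumes a pandas DataFrame with a protected attribute column and a binary label column; the column names are placeholders.

import pandas as pd

def reweighing_weights(df, protected_attr, label_col):
    """Weight each row by expected / observed joint frequency of (group, label)."""
    p_group = df[protected_attr].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([protected_attr, label_col]).size() / len(df)

    def weight(row):
        expected = p_group[row[protected_attr]] * p_label[row[label_col]]
        observed = p_joint[(row[protected_attr], row[label_col])]
        return expected / observed

    return df.apply(weight, axis=1)

# Hypothetical usage with an estimator that accepts sample weights:
# weights = reweighing_weights(train_df, 'gender', 'approved')
# model.fit(X_train, y_train, sample_weight=weights)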

In-Processing Techniques

| Technique | Description | Use Case |
|-----------|-------------|----------|
| Fairness constraints | Add fairness to objective | When training your own model |
| Adversarial debiasing | Train discriminator to remove bias | Complex models |
| Fair regularization | Penalize unfair predictions | Fine-tuning models |
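
If you use Fairlearn (see Resources), fairness constraints can be added through its reductions API. The sketch below is a minimal, hedged example assuming Fairlearn and scikit-learn are installed and that X_train, y_train and a sensitive-feature series have already been prepared; it is one option among several, not the only way to apply in-processing mitigation.

from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.linear_model import LogisticRegression

# Wrap a standard estimator in a demographic parity constraint
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),
    constraints=DemographicParity(),
)

# sensitive_features is the protected attribute column (e.g. a pandas Series)
mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
y_pred_fair = mitigator.predict(X_test)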

Post-Processing Techniques

| Technique | Description | Use Case |
|-----------|-------------|----------|
| Threshold adjustment | Different thresholds per group | Quick fix, interpretable |
| Calibration | Adjust predicted probabilities | Risk scores |
| Reject option | Don't decide near threshold | High-stakes decisions |

Example: Threshold Adjustment

import numpy as np

def find_fair_thresholds(y_prob, groups, target_rate=0.5):
    """Find per-group thresholds that give each group the same positive (selection) rate."""
    thresholds = {}

    # Find the threshold for each group that achieves the target positive rate
    # (target_rate=0.5 means roughly half of each group receives a positive prediction)

    for group in groups.unique():
        mask = groups == group
        y_prob_group = y_prob[mask]

        # Find threshold that gives target positive rate
        sorted_probs = np.sort(y_prob_group)[::-1]
        target_idx = int(len(sorted_probs) * target_rate)
        thresholds[group] = sorted_probs[min(target_idx, len(sorted_probs)-1)]

    return thresholds

def apply_fair_thresholds(y_prob, groups, thresholds):
    """Apply group-specific thresholds."""
    y_pred = np.zeros(len(y_prob))

    for group, threshold in thresholds.items():
        mask = groups == group
        y_pred[mask] = (y_prob[mask] >= threshold).astype(int)

    return y_pred
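
A brief usage sketch: fit the thresholds on a validation set, then reuse them for new predictions. The variable names and the 0.3 target rate are illustrative assumptions.

# y_prob_val / y_prob_new are probability arrays; groups_val / groups_new are pandas Series
thresholds = find_fair_thresholds(y_prob_val, groups_val, target_rate=0.3)
for group, t in thresholds.items():
    print(f"{group}: threshold = {t:.3f}")

y_pred_fair = apply_fair_thresholds(y_prob_new, groups_new, thresholds)

Because the thresholds are estimated on one sample and reused on another, the target rate will only be matched approximately on new data.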

Step 6: Document and Monitor

Bias Testing Documentation

Create a record of your bias testing:

## Bias Testing Record

**Model/System:** [Name]
**Date:** [Date]
**Tester:** [Name]

### Protected Attributes Tested
- [List attributes]

### Metrics Used
- [List metrics and justification]

### Results Summary
| Attribute | Metric | Group A | Group B | Disparity | Status |
|-----------|--------|---------|---------|-----------|--------|
| | | | | | |

### Issues Identified
1. [Issue description]
2. [Issue description]

### Mitigations Applied
1. [Mitigation and rationale]
2. [Mitigation and rationale]

### Residual Risk
[Description of remaining bias risk and justification]

### Approval
- [ ] Technical review
- [ ] Ethics review
- [ ] Business sign-off

Ongoing Monitoring

Set up continuous bias monitoring:

def monitor_bias_production(predictions_log, protected_attr,
                           alert_threshold=0.8):
    """Monitor production predictions for bias drift."""
    # Calculate recent metrics
    recent_metrics = calculate_fairness_metrics(
        predictions_log['actual'],
        predictions_log['predicted'],
        predictions_log[protected_attr]
    )

    # Check for disparity in positive prediction rates across groups
    positive_rates = [m['positive_rate'] for m in recent_metrics.values()]
    disparity = min(positive_rates) / max(positive_rates) if max(positive_rates) > 0 else 0.0

    if disparity < alert_threshold:
        # send_alert is a placeholder for your team's alerting mechanism
        send_alert(f"Bias alert: {protected_attr} disparity = {disparity:.3f}")

    return disparity
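
In practice the monitor might run as a scheduled job over the most recent window of logged predictions. The sketch below assumes the log is a pandas DataFrame with a 'timestamp' column alongside 'actual', 'predicted' and the protected attribute; the 30-day window and column names are assumptions.

import pandas as pd

def run_scheduled_bias_check(predictions_log, protected_attr, window_days=30):
    """Run the bias monitor over the most recent window of production predictions."""
    cutoff = predictions_log['timestamp'].max() - pd.Timedelta(days=window_days)
    recent = predictions_log[predictions_log['timestamp'] >= cutoff]
    return monitor_bias_production(recent, protected_attr)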

Quick Reference: Bias Testing Checklist

Before Training

  • Identify protected attributes
  • Audit training data representation
  • Check for historical bias in labels
  • Document proxy variables

During Development

  • Select appropriate fairness metrics
  • Test model predictions across groups
  • Calculate disparity ratios
  • Apply 80% rule threshold

If Bias Detected

  • Identify root cause (data vs model vs both)
  • Select appropriate mitigation technique
  • Re-test after mitigation
  • Document residual risk

Before Deployment

  • Complete bias testing record
  • Obtain ethics review sign-off
  • Set up production monitoring
  • Plan regular bias audits

In Production

  • Monitor fairness metrics continuously
  • Set alerting thresholds
  • Schedule periodic re-assessment
  • Collect feedback from affected groups

Resources

Tools and Libraries

  • Fairlearn (Microsoft): Fairness assessment and mitigation
  • AI Fairness 360 (IBM): Comprehensive fairness toolkit
  • What-If Tool (Google): Interactive fairness exploration
  • Aequitas: Bias and fairness audit toolkit
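
As a pointer for the Fairlearn entry above, its MetricFrame utility computes per-group metrics similar to the hand-rolled helpers in Step 4. The sketch below assumes Fairlearn and scikit-learn are installed and that y_test, y_pred and a sensitive-feature series already exist.

from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import recall_score

# Per-group selection rate (demographic parity) and recall/TPR (equal opportunity)
frame = MetricFrame(
    metrics={'selection_rate': selection_rate, 'tpr': recall_score},
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_test,
)
print(frame.by_group)
print(frame.ratio())  # min/max ratio per metric, comparable to the 80% rule in Step 4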

Further Reading

  • AHRC: Human Rights and Technology Final Report
  • OECD Principles on AI
  • Australian Government AI Ethics Framework