How to Conduct Bias Testing for AI Systems¶
Ready to Use
- What: Identify, measure, and mitigate systematic unfairness in AI systems
- When: During development, before deployment, and ongoing in production
- Key metrics: Demographic parity, equal opportunity, calibration
- Tools: Use our Bias Detection Tool for analysis
Purpose¶
This guide provides practical steps for identifying, measuring, and mitigating bias in AI/ML systems deployed in government contexts.
What is AI Bias?¶
AI bias occurs when a system produces systematically unfair outcomes that disadvantage certain groups. In government contexts, this can lead to:
- Unequal service delivery
- Discriminatory decision-making
- Erosion of public trust
- Legal and compliance issues
Types of Bias¶
| Type | Description | Example |
|---|---|---|
| Historical Bias | Training data reflects past discrimination | Hiring model trained on historically biased decisions |
| Representation Bias | Underrepresentation of groups in training data | Facial recognition less accurate for minority groups |
| Measurement Bias | Proxy variables correlate with protected attributes | Using postcode as proxy for socioeconomic status |
| Aggregation Bias | A single model is applied to subgroups with different underlying patterns | Healthcare model optimized for majority population |
| Evaluation Bias | Testing doesn't cover all groups equally | Performance metrics only measured on majority group |
Step 1: Define Protected Attributes¶
Common Protected Attributes in Australian Context¶
| Attribute | Legislation | Considerations |
|---|---|---|
| Age | Age Discrimination Act | Service eligibility, employment |
| Disability | Disability Discrimination Act | Accessibility, accommodation |
| Race/Ethnicity | Racial Discrimination Act | Cultural sensitivity, language |
| Sex/Gender | Sex Discrimination Act | Service access, employment |
| Religion | Various state laws | Service delivery, scheduling |
| Geographic location | - | Urban/rural service equity |
| Indigenous status | Various | Culturally appropriate services |
Action Items¶
- List all protected attributes relevant to your use case
- Identify which attributes are available in your data
- Determine proxy variables that may correlate with protected attributes (see the sketch after this list)
- Document justification for any attribute use
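One practical way to surface proxy variables is to test how strongly each candidate feature is associated with a protected attribute. The sketch below uses Cramér's V for categorical columns; the function names, column names, and the 0.3 flagging threshold are illustrative assumptions, not prescribed values.

```python
import numpy as np
import pandas as pd
from scipy import stats

def cramers_v(df, feature, protected_attr):
    """Cramér's V between a candidate proxy feature and a protected attribute
    (0 = no association, 1 = perfect association)."""
    contingency = pd.crosstab(df[feature], df[protected_attr])
    chi2, _, _, _ = stats.chi2_contingency(contingency)
    n = contingency.to_numpy().sum()
    r, k = contingency.shape
    return np.sqrt(chi2 / (n * (min(r, k) - 1)))

def flag_proxies(df, candidate_features, protected_attr, threshold=0.3):
    """Flag features whose association with the protected attribute exceeds
    a chosen threshold (0.3 here is an illustrative cut-off)."""
    scores = {f: cramers_v(df, f, protected_attr) for f in candidate_features}
    return {f: round(v, 3) for f, v in scores.items() if v >= threshold}

# Illustrative usage (column names are hypothetical):
# flag_proxies(df, ['postcode', 'occupation'], 'ethnicity')
```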
Step 2: Understand Your Data¶
Data Audit Checklist¶
□ What is the source of your training data?
□ What time period does the data cover?
□ What are the demographics of people in the data?
□ Are all relevant groups adequately represented?
□ Were any groups systematically excluded?
□ Does the data reflect historical discrimination?
□ What proxy variables exist in the data?
Representation Analysis¶
Calculate the representation of each group in your data:
```python
# Example: check group representation
import pandas as pd

def check_representation(df, attribute):
    """Check representation of groups in a dataset."""
    counts = df[attribute].value_counts()
    percentages = df[attribute].value_counts(normalize=True) * 100
    print(f"\nRepresentation by {attribute}:")
    for group in counts.index:
        print(f"  {group}: {counts[group]:,} ({percentages[group]:.1f}%)")
    # Flag underrepresented groups (< 5% of records or < 100 samples)
    underrepresented = percentages[percentages < 5].index.tolist()
    small_samples = counts[counts < 100].index.tolist()
    flagged = sorted(set(underrepresented + small_samples), key=str)
    if flagged:
        print(f"\n  WARNING: Underrepresented groups: {flagged}")
    return counts, percentages
```
Historical Bias Check¶
- Review historical outcomes by group (a minimal sketch follows this list)
- Identify any systematic patterns of disadvantage
- Consider whether historical patterns should be replicated
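A simple starting point for this review is to tabulate historical outcome rates by group, and by year where dates are available, so systematic gaps and trends become visible. This is a minimal sketch; the column names are placeholders for your own data.

```python
import pandas as pd

def historical_outcome_rates(df, outcome_col, protected_attr, year_col=None):
    """Summarise historical outcome rates by group, optionally broken down
    by year, to reveal systematic patterns of disadvantage."""
    if year_col is not None:
        return df.groupby([year_col, protected_attr])[outcome_col].mean().unstack()
    return df.groupby(protected_attr)[outcome_col].mean()

# Illustrative usage (column names are hypothetical):
# historical_outcome_rates(df, 'approved', 'gender', year_col='application_year')
```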
Step 3: Select Fairness Metrics¶
Common Fairness Metrics¶
| Metric | Definition | Use When |
|---|---|---|
| Demographic Parity | Equal positive prediction rates across groups | Equal access is priority |
| Equalized Odds | Equal TPR and FPR across groups | Accuracy matters for all groups |
| Equal Opportunity | Equal TPR across groups | Focus on not missing positive cases |
| Predictive Parity | Equal PPV across groups | Consistent precision needed |
| Calibration | Predicted probabilities match actual rates | Risk scores used |
Choosing the Right Metric¶
Decision Framework:
1. Is equal access the primary concern?
→ Use Demographic Parity
2. Is minimizing false negatives critical?
→ Use Equal Opportunity
3. Are false positives and negatives equally harmful?
→ Use Equalized Odds
4. Are predictions used as risk scores?
→ Use Calibration
5. Multiple concerns?
→ Use multiple metrics, document trade-offs
Important Note¶
It is mathematically impossible to satisfy all fairness metrics simultaneously, except in special cases such as equal base rates across groups or a perfectly accurate model. Document which metrics you prioritize and why.
Step 4: Implement Bias Testing¶
Pre-Training Testing¶
Test your data before model training:
```python
import pandas as pd
from scipy import stats

def test_label_bias(df, label_col, protected_attr):
    """Test whether label rates differ significantly across groups."""
    groups = df[protected_attr].unique()
    rates = {}
    for group in groups:
        group_data = df[df[protected_attr] == group]
        rates[group] = group_data[label_col].mean()
    print(f"\nLabel rates by {protected_attr}:")
    for group, rate in rates.items():
        print(f"  {group}: {rate:.3f}")
    # Chi-square test for a significant association between group and label
    contingency = pd.crosstab(df[protected_attr], df[label_col])
    chi2, p_value, _, _ = stats.chi2_contingency(contingency)
    print(f"\nChi-square test: chi2={chi2:.2f}, p-value={p_value:.4f}")
    if p_value < 0.05:
        print("  WARNING: Significant difference in labels across groups")
    return rates, p_value
```
Post-Training Testing¶
Test model predictions:
```python
from sklearn.metrics import confusion_matrix

def calculate_fairness_metrics(y_true, y_pred, groups):
    """Calculate key fairness metrics by group."""
    results = {}
    for group in groups.unique():
        mask = groups == group
        y_true_group = y_true[mask]
        y_pred_group = y_pred[mask]
        # labels=[0, 1] keeps a 2x2 matrix even if a group contains only one class
        tn, fp, fn, tp = confusion_matrix(y_true_group, y_pred_group, labels=[0, 1]).ravel()
        results[group] = {
            'positive_rate': (tp + fp) / (tp + fp + tn + fn),   # Demographic parity
            'tpr': tp / (tp + fn) if (tp + fn) > 0 else 0,      # Equal opportunity
            'fpr': fp / (fp + tn) if (fp + tn) > 0 else 0,      # Equalized odds
            'ppv': tp / (tp + fp) if (tp + fp) > 0 else 0,      # Predictive parity
            'n': len(y_true_group)
        }
    return results

def print_fairness_report(metrics):
    """Print a formatted fairness report."""
    print("\nFairness Metrics by Group:")
    print("-" * 70)
    print(f"{'Group':<15} {'Pos Rate':>10} {'TPR':>10} {'FPR':>10} {'PPV':>10} {'N':>10}")
    print("-" * 70)
    for group, m in metrics.items():
        print(f"{group:<15} {m['positive_rate']:>10.3f} {m['tpr']:>10.3f} "
              f"{m['fpr']:>10.3f} {m['ppv']:>10.3f} {m['n']:>10,}")
    # Calculate disparities across groups
    groups = list(metrics.keys())
    if len(groups) >= 2:
        print("\nDisparities (ratio of min/max):")
        for metric in ['positive_rate', 'tpr', 'fpr', 'ppv']:
            # Zero values are excluded to avoid division by zero
            values = [m[metric] for m in metrics.values() if m[metric] > 0]
            if values:
                disparity = min(values) / max(values)
                status = "PASS" if disparity >= 0.8 else "FAIL"
                print(f"  {metric}: {disparity:.3f} ({status})")
```
Threshold Testing¶
The "80% rule" is a common fairness benchmark:
A metric passes if: min_group_rate / max_group_rate >= 0.80
Example:
- Group A positive rate: 0.40
- Group B positive rate: 0.35
- Ratio: 0.35 / 0.40 = 0.875 ✓ PASS
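If it helps, the rule can be wrapped in a small helper that takes per-group rates (for example, the positive rates produced by the Step 4 code). This is a minimal sketch; the function name is illustrative and the 0.8 default mirrors the rule above.

```python
def passes_four_fifths(rates, threshold=0.8):
    """Apply the 80% rule to a mapping of group -> rate.

    Returns the min/max disparity ratio and whether it meets the threshold.
    """
    values = [v for v in rates.values() if v > 0]
    disparity = min(values) / max(values) if values else 0.0
    return disparity, disparity >= threshold

# Example from above:
# passes_four_fifths({'Group A': 0.40, 'Group B': 0.35})  # -> (0.875, True)
```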
Step 5: Mitigate Identified Bias¶
Pre-Processing Techniques¶
| Technique | Description | Pros | Cons |
|---|---|---|---|
| Resampling | Balance representation | Simple | May reduce data |
| Reweighting | Weight samples by group | Preserves data | May not fully address bias |
| Fair representation | Learn transformed features with less group information | Can reduce encoded bias | Complex, may lose information |
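To illustrate the reweighting row, the sketch below computes sample weights so that each (group, label) combination contributes as if group membership and label were independent (a standard reweighing scheme); the weights can then be passed to any estimator that accepts `sample_weight`. Column names are placeholders.

```python
import pandas as pd

def reweighing_weights(df, label_col, protected_attr):
    """Compute sample weights so each (group, label) cell is weighted as if
    group membership and label were statistically independent."""
    n = len(df)
    p_group = df[protected_attr].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([protected_attr, label_col]).size() / n
    # Weight = expected probability under independence / observed probability
    return df.apply(
        lambda row: (p_group[row[protected_attr]] * p_label[row[label_col]])
        / p_joint[(row[protected_attr], row[label_col])],
        axis=1,
    )

# Illustrative usage (column names are hypothetical):
# weights = reweighing_weights(df, 'outcome', 'gender')
# model.fit(X, y, sample_weight=weights)
```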
In-Processing Techniques¶
| Technique | Description | Use Case |
|---|---|---|
| Fairness constraints | Add fairness to objective | When training your own model |
| Adversarial debiasing | Train discriminator to remove bias | Complex models |
| Fair regularization | Penalize unfair predictions | Fine-tuning models |
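For the fairness-constraints row, libraries such as Fairlearn (listed under Resources) support reduction-based training. The sketch below is indicative only: it assumes Fairlearn and scikit-learn are installed, and the exact class and parameter names should be checked against the current Fairlearn documentation.

```python
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.linear_model import LogisticRegression

# Wrap a base estimator with a demographic parity constraint
# (Fairlearn's reduction approach to in-processing mitigation).
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),
    constraints=DemographicParity(),
)

# Illustrative usage (X_train, y_train, sensitive_train are placeholders):
# mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
# y_pred = mitigator.predict(X_test)
```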
Post-Processing Techniques¶
| Technique | Description | Use Case |
|---|---|---|
| Threshold adjustment | Different thresholds per group | Quick fix, interpretable |
| Calibration | Adjust predicted probabilities | Risk scores |
| Reject option | Don't decide near threshold | High-stakes decisions |
Example: Threshold Adjustment¶
```python
import numpy as np

def find_fair_thresholds(y_prob, groups, target_rate=0.5):
    """Find per-group thresholds that equalize the positive prediction rate
    (demographic parity) at approximately target_rate."""
    thresholds = {}
    for group in groups.unique():
        mask = np.asarray(groups == group)
        y_prob_group = np.asarray(y_prob)[mask]
        # Sort scores descending and pick the threshold that labels
        # roughly target_rate of this group as positive
        sorted_probs = np.sort(y_prob_group)[::-1]
        target_idx = int(len(sorted_probs) * target_rate)
        thresholds[group] = sorted_probs[min(target_idx, len(sorted_probs) - 1)]
    return thresholds

def apply_fair_thresholds(y_prob, groups, thresholds):
    """Apply group-specific thresholds to predicted probabilities."""
    y_pred = np.zeros(len(y_prob), dtype=int)
    for group, threshold in thresholds.items():
        mask = np.asarray(groups == group)
        y_pred[mask] = (np.asarray(y_prob)[mask] >= threshold).astype(int)
    return y_pred
```
Step 6: Document and Monitor¶
Bias Testing Documentation¶
Create a record of your bias testing:
```markdown
## Bias Testing Record

**Model/System:** [Name]
**Date:** [Date]
**Tester:** [Name]

### Protected Attributes Tested
- [List attributes]

### Metrics Used
- [List metrics and justification]

### Results Summary

| Attribute | Metric | Group A | Group B | Disparity | Status |
|-----------|--------|---------|---------|-----------|--------|
|           |        |         |         |           |        |

### Issues Identified
1. [Issue description]
2. [Issue description]

### Mitigations Applied
1. [Mitigation and rationale]
2. [Mitigation and rationale]

### Residual Risk
[Description of remaining bias risk and justification]

### Approval
- [ ] Technical review
- [ ] Ethics review
- [ ] Business sign-off
```
Ongoing Monitoring¶
Set up continuous bias monitoring:
```python
def monitor_bias_production(predictions_log, protected_attr, alert_threshold=0.8):
    """Monitor production predictions for bias drift."""
    # Calculate fairness metrics on recent predictions
    recent_metrics = calculate_fairness_metrics(
        predictions_log['actual'],
        predictions_log['predicted'],
        predictions_log[protected_attr]
    )
    # Check the disparity in positive prediction rates (80% rule)
    positive_rates = [m['positive_rate'] for m in recent_metrics.values()]
    disparity = min(positive_rates) / max(positive_rates)
    if disparity < alert_threshold:
        # send_alert is a placeholder for your alerting/notification hook
        send_alert(f"Bias alert: {protected_attr} disparity = {disparity:.3f}")
    return disparity
```
Quick Reference: Bias Testing Checklist¶
Before Training¶
- Identify protected attributes
- Audit training data representation
- Check for historical bias in labels
- Document proxy variables
During Development¶
- Select appropriate fairness metrics
- Test model predictions across groups
- Calculate disparity ratios
- Apply 80% rule threshold
If Bias Detected¶
- Identify root cause (data vs model vs both)
- Select appropriate mitigation technique
- Re-test after mitigation
- Document residual risk
Before Deployment¶
- Complete bias testing record
- Obtain ethics review sign-off
- Set up production monitoring
- Plan regular bias audits
In Production¶
- Monitor fairness metrics continuously
- Set alerting thresholds
- Schedule periodic re-assessment
- Collect feedback from affected groups
Resources¶
Tools and Libraries¶
- Fairlearn (Microsoft): Fairness assessment and mitigation
- AI Fairness 360 (IBM): Comprehensive fairness toolkit
- What-If Tool (Google): Interactive fairness exploration
- Aequitas: Bias and fairness audit toolkit
Further Reading¶
- AHRC: Human Rights and Technology Final Report
- OECD Principles on AI
- Australian Government AI Ethics Framework