
Forbidden Questions

Questions About AI That Vendors Won't Answer
"The vendor demo worked perfectly." That's the last time it will.
Questions They Hope You Won't Ask
  • What customers have left you, and why?
  • What's your worst failure case?
  • Can we see internal metrics, not marketing ones?
  • What's the total cost of ownership?

The Fundamental Questions

1. How does it actually work?

The forbidden version: "Not the marketing explanation—the technical reality that would let us evaluate it."

What you'll hear instead: - "It uses advanced machine learning" - "Our proprietary algorithms..." - "State-of-the-art AI technology" - "It's too complex to explain simply"

What to probe: - What's the actual model architecture? - What are the key assumptions? - How are decisions made—specifically? - Can we inspect the logic for a given decision? - What would an independent expert say about this approach?

Why it matters: "Trust us, it works" is not sufficient for government decisions. If you can't explain it, you can't defend it. If you can't inspect it, you can't govern it.


2. What can't it do?

The forbidden version: "What are the genuine limitations that the vendor isn't mentioning?"

What you'll hear instead: - "It can handle any use case" - "It adapts to your needs" - "With enough data, anything is possible"

What to probe: - Where does the model perform poorly? - What types of cases does it fail on? - What has it been tested on vs. assumed to work for? - What would break it? - What does "works" even mean in edge cases?

Why it matters: Every AI system has boundaries. Knowing them prevents deploying systems into territory they can't handle. The vendor won't volunteer this; you must extract it.


3. What's the actual accuracy?

The forbidden version: "Not the cherry-picked benchmark—the real-world performance on cases like ours."

What you'll hear instead: - "97% accuracy" - "Best-in-class performance" - "Outperforms human experts"

What to probe: - Accuracy on what test set? - How was that test set constructed? - What's performance on edge cases? - What's accuracy for specific subpopulations? - Is the test set representative of our actual use? - What's the accuracy when it matters most (high-stakes decisions)?

Why it matters: Accuracy metrics are often measured in conditions nothing like real deployment. High average accuracy can mask terrible performance for specific groups or cases.
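One way to make this concrete is to ask for disaggregated accuracy rather than a single headline number. Below is a minimal sketch using invented predictions and a hypothetical urban/rural split; the point is that an 80% overall figure can coexist with 0% accuracy for a small group.

# Hypothetical (group, predicted, actual) records; substitute real model output.
records = [
    ("urban", 1, 1), ("urban", 0, 0), ("urban", 1, 1), ("urban", 0, 0),
    ("urban", 1, 1), ("urban", 0, 0), ("urban", 1, 1), ("urban", 0, 0),
    ("rural", 1, 0), ("rural", 0, 1),
]

def accuracy(rows):
    return sum(pred == actual for _, pred, actual in rows) / len(rows)

print(f"overall: {accuracy(records):.0%}")  # 80% -- looks fine
for group in sorted({g for g, _, _ in records}):
    subset = [r for r in records if r[0] == group]
    print(f"{group}: {accuracy(subset):.0%} (n={len(subset)})")  # rural: 0%

Asking the vendor to produce exactly this breakdown, on your data, is a reasonable test of whether their "97%" means anything for you.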


4. What happens when it's wrong?

The forbidden version: "Not 'errors are rare'—what specifically happens to people when the model fails?"

What you'll hear instead: - "Error rates are within acceptable bounds" - "Human oversight catches errors" - "There's an appeals process"

What to probe: - What does a false positive mean for the affected person? - What does a false negative mean? - How long before errors are caught? - What's the remediation process? - Who bears the cost of errors—the system or the citizen?

Why it matters: A 5% error rate sounds technical. Apply it to 100,000 decisions and "5,000 people wrongly denied benefits" sounds like what it is. Errors aren't statistics; they're experiences.


The Bias Questions

5. Is the model biased?

The forbidden version: "Don't tell me you removed demographic variables—tell me you tested for disparate impact."

What you'll hear instead: - "We don't use protected characteristics" - "The model is objective" - "We followed fairness best practices"

What to probe: - Have you tested outcomes across demographic groups? - What proxies for protected characteristics might exist? - What disparate impact exists in training data? - What definition of fairness are you using (there are many, conflicting ones)? - Who decided what "fair" means here, and were affected groups consulted?

Why it matters: Removing demographic variables doesn't remove bias. Bias can encode through proxies, historical patterns, and structural inequities. Only testing for disparate impact reveals it.
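Claims about removed variables can be checked with a simple outcome test. A minimal sketch, using hypothetical decisions and group labels, of one disparate-impact screen to ask for (one fairness definition among many, not a complete audit):

# Hypothetical (group, approved) decisions; substitute a real outcome extract.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

def approval_rate(group):
    outcomes = [approved for g, approved in decisions if g == group]
    return sum(outcomes) / len(outcomes)

rates = {g: approval_rate(g) for g in sorted({g for g, _ in decisions})}
for group, rate in rates.items():
    print(f"{group}: approval rate {rate:.0%}")

# One common screen: the ratio of the lowest approval rate to the highest.
# Values well below 1.0 warrant an explanation, whatever variables were "removed".
ratio = min(rates.values()) / max(rates.values())
print(f"disparate impact ratio: {ratio:.2f}")  # 0.33 here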


6. Whose values are encoded?

The forbidden version: "When trade-offs were made in model design, who decided what was acceptable?"

What you'll hear instead: - "We optimized for performance" - "Industry best practices guided design" - "The model is technically neutral"

What to probe: - When accuracy conflicts with fairness, how is the trade-off made? - When efficiency conflicts with individual review, who decides? - Whose definition of "good outcome" is the model optimizing for? - Were affected communities involved in defining objectives?

Why it matters: AI systems embed values in their design. These values are often implicit and unexamined. The question isn't whether values are present—it's whose values.


The Vendor Questions

7. What are you not telling us?

The forbidden version: "What would you tell us if you weren't trying to win this contract?"

What you'll hear instead: - "We've been fully transparent" - "All relevant information is in the proposal" - "We're committed to partnership"

What to probe: - What challenges have other customers faced? - What implementation has failed, and why? - What does your competitor's solution do better? - What would make this not the right fit for us? - What concerns would you have if you were in our position?

Why it matters: Vendors have incentives to emphasize strengths and minimize weaknesses. The information asymmetry is real. Probing for what's not being said helps level the playing field.


8. What does "AI" mean in your solution?

The forbidden version: "Is this actually machine learning, or is it a rules engine you're calling AI?"

What you'll hear instead: - "Our AI-powered solution..." - "Advanced artificial intelligence..." - "Cutting-edge machine learning..."

What to probe: - What specific ML techniques are used? - What's the role of hand-coded rules? - How much of the "AI" is statistical models vs. heuristics? - Would an ML expert recognize this as AI? - Is "AI" the marketing wrapper for something simpler?

Why it matters: "AI" is a marketing term. Sometimes what's sold as AI is glorified if-then-else logic. That's not inherently bad, but calling it AI creates misaligned expectations.


9. What happens when the model needs to be updated?

The forbidden version: "Who controls the model after deployment, and what does it cost?"

What you'll hear instead: - "We provide ongoing support" - "The model continuously improves" - "Updates are included in the contract"

What to probe: - Who can modify the model—us or vendor only? - What's the process and timeline for updates? - What does updating cost (money, time, disruption)? - How do we know if the model is drifting/degrading? - Can we take the model in-house if the vendor relationship ends?

Why it matters: Models degrade over time as the world changes. If you can't update the model—or if updating is costly and slow—you're locked into degrading performance.
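"How do we know if the model is drifting?" has a concrete answer you can ask the vendor to demonstrate. Here is a minimal sketch of one common check, the population stability index, comparing this month's score distribution against the validation-era one; the scores and the thresholds in the final comment are illustrative assumptions, not a standard.

import math
import random

random.seed(0)
baseline = [random.betavariate(2, 5) for _ in range(1000)]  # validation-era scores
current = [random.betavariate(3, 3) for _ in range(1000)]   # scores seen this month

def psi(expected, observed, bins=10):
    """Population stability index between two samples of scores in [0, 1)."""
    edges = [i / bins for i in range(bins + 1)]
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        e = max(sum(lo <= s < hi for s in expected) / len(expected), 1e-6)
        o = max(sum(lo <= s < hi for s in observed) / len(observed), 1e-6)
        total += (o - e) * math.log(o / e)
    return total

print(f"PSI = {psi(baseline, current):.3f}")
# Rough heuristic often used in practice: under 0.1 is stable, above about 0.2
# is a shift worth investigating. The thresholds are convention, not law.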


10. What's the exit strategy?

The forbidden version: "If we need to stop using this, can we?"

What you'll hear instead: - "Our customers stay with us for years" - "The solution is designed for long-term partnership" - "We're committed to your success"

What to probe: - Can we export our data in usable format? - Can we transfer the model or knowledge to another system? - What's the actual cost and timeline of switching? - What happens to institutional knowledge if the contract ends? - Are we building dependency or capability?

Why it matters: Vendor relationships can become traps. Understanding exit before entry is essential. If you can't leave, you're not a partner—you're a hostage.


The Technical Depth Questions

11. What's in the training data?

The forbidden version: "Not 'representative data'—what specific data, from what sources, with what limitations?"

What you'll hear instead: - "We use industry-leading datasets" - "Data is carefully curated" - "Training data is representative"

What to probe: - What are the actual data sources? - From what time periods? What geographies? What populations? - What was excluded, and why? - How was labeling done, and by whom? - What biases exist in training data selection?

Why it matters: The model learns from its training data. If you don't know what's in the data, you don't know what the model has learned—including what biases it has absorbed.
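A practical way to test "representative": ask for a row-level extract, or at least field-level tallies, and count what is actually there. A minimal sketch with hypothetical fields and rows:

from collections import Counter

# Hypothetical stand-in rows; in practice, request an extract of the real data.
training_rows = [
    {"source": "agency_records", "year": 2016, "region": "metro"},
    {"source": "agency_records", "year": 2017, "region": "metro"},
    {"source": "agency_records", "year": 2017, "region": "metro"},
    {"source": "third_party", "year": 2019, "region": "regional"},
]

for field in ("source", "year", "region"):
    counts = Counter(row[field] for row in training_rows)
    print(field, dict(counts))
# Gaps show up immediately: one third-party row, nothing after 2019, and
# regional areas barely represented.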


12. How was the model validated?

The forbidden version: "Not 'extensive testing'—what specific validation, and by whom?"

What you'll hear instead: - "Rigorous validation process" - "Tested by leading researchers" - "Peer-reviewed methodology"

What to probe: - What specific validation was done? - Was validation internal or independent? - What datasets were used for validation? - Were validation datasets truly independent from training? - Has it been validated in contexts like ours?

Why it matters: Internal validation is necessary but insufficient. Validation on convenient data doesn't guarantee performance on your data. Independent validation in realistic conditions is the gold standard.
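One basic check worth running yourself, if you can get record identifiers: confirm the validation set really is independent of the training set. A minimal sketch with hypothetical IDs:

# Hypothetical record identifiers; in practice, request ID lists for both sets.
training_ids = {"c-1001", "c-1002", "c-1003", "c-1004"}
validation_ids = {"c-1003", "c-2001", "c-2002"}

overlap = training_ids & validation_ids
if overlap:
    print(f"leakage: {len(overlap)} validation record(s) also used in training: {sorted(overlap)}")
else:
    print("no overlap between training and validation identifiers")
# Overlap inflates reported accuracy: the model is "tested" on answers it has already seen.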


13. Can we audit this?

The forbidden version: "Not 'auditing is available'—can we actually inspect what matters?"

What you'll hear instead: - "We support auditability" - "Full transparency is provided" - "Documentation is available"

What to probe: - Can we examine the model itself, not just inputs/outputs? - Can we access training data for review? - Can we bring in independent auditors with full access? - What's proprietary and off-limits? - What audit rights are in the contract, specifically?

Why it matters: Auditability isn't binary. Vendors may offer "transparency" that's actually just input/output logging. True auditability requires access to model internals and training data.


14. What's the explanation for a given decision?

The forbidden version: "Not 'the model considers these factors'—why did this specific citizen get this specific decision?"

What you'll hear instead: - "The model provides explanations" - "Transparency is built in" - "Users can see relevant factors"

What to probe: - Can we explain a specific decision to the affected person? - Are explanations faithful to actual model logic, or post-hoc rationalizations? - Can explanations be understood by non-experts? - Are explanations sufficient for legal/appeal purposes? - Would the explanation satisfy a tribunal?

Why it matters: People affected by AI decisions have rights to understand why. "The algorithm decided" isn't an acceptable explanation for government decisions. Explainability must be real, not cosmetic.
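For a simple linear scoring model, a faithful decision-specific explanation can be read directly off the weights; that is the benchmark to hold vendor "explanations" against. The weights, features, and threshold below are hypothetical:

# Hypothetical weights, features and threshold for a linear scoring model.
weights = {"income_gap": -0.8, "months_unemployed": 0.5, "dependants": 0.3}
threshold = 1.0
applicant = {"income_gap": 0.9, "months_unemployed": 4.0, "dependants": 2.0}

contributions = {f: weights[f] * applicant[f] for f in weights}
score = sum(contributions.values())
decision = "approve" if score >= threshold else "refer for manual review"

print(f"decision: {decision} (score {score:.2f}, threshold {threshold})")
for factor, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {factor}: contributed {value:+.2f}")
# For models where contributions cannot be read off like this, ask whether the
# "explanation" comes from the model itself or is generated after the fact.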


15. What happens in adversarial conditions?

The forbidden version: "If someone tries to manipulate the model, how hard is that?"

What you'll hear instead: - "Security is a priority" - "Robust to manipulation" - "We've considered adversarial scenarios"

What to probe: - How hard is it to game the model? - What would an adversary need to know to exploit it? - Have you tested for adversarial attacks? - What happens if training data is poisoned? - How would you detect manipulation?

Why it matters: Once an AI system makes decisions with consequences, someone will try to game it. Understanding adversarial robustness is part of understanding what you're deploying.
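"How hard is it to game?" can be asked concretely: what is the smallest change to an applicant-controlled field that flips a decision? A minimal sketch against a hypothetical scoring rule; the model, fields, and threshold are all assumptions:

# Hypothetical eligibility score; in practice, ask the vendor to run this analysis.
def model_score(app):
    return 0.6 * app["declared_hours"] / 40 + 0.4 * (1 - app["declared_income"] / 100_000)

THRESHOLD = 0.55
applicant = {"declared_hours": 20, "declared_income": 60_000}
print("initial score:", round(model_score(applicant), 3),
      "eligible:", model_score(applicant) >= THRESHOLD)

# How little does declared income need to drop before the decision flips?
for reduction in range(0, 60_001, 1_000):
    probe = dict(applicant, declared_income=applicant["declared_income"] - reduction)
    if model_score(probe) >= THRESHOLD:
        print(f"decision flips after under-declaring income by ${reduction:,}")
        break
# If a small, hard-to-verify change flips the outcome, the system invites gaming;
# that is a design question, not just a security one.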


The Questions They Hope You Won't Ask

  • "Can we see the contract failure clauses?" They're usually weak.
  • "What customers have left you, and why?" The truth hurts.
  • "Can we talk to your unhappy customers?" Only happy ones are offered.
  • "What's your worst failure case?" They'd rather not say.
  • "How many government projects have you failed?" They've spun it as success.
  • "Can we see internal performance metrics, not marketing ones?" The marketing versions look better.
  • "What happens when your company is acquired?" It's a real risk they minimize.
  • "Can we negotiate IP ownership of improvements?" They want to keep it.
  • "What's the total cost of ownership, including hidden costs?" Much higher than the proposal.

Before You Trust the Model

Ask yourself:

  1. Do I understand how it actually works?
  2. Do I know where it fails?
  3. Have I seen performance on data like mine?
  4. Have I tested for bias, not just heard claims about it?
  5. Can I explain a decision to an affected citizen?
  6. Can I audit it properly if needed?
  7. Can I exit this relationship if necessary?

If you can't confidently answer "yes" to all of these, you don't trust the model—you're trusting the vendor.

That's not the same thing.


"Any vendor who won't answer these questions is telling you something by not answering. Listen to that."