DESK · THEORY
WorkflowBeginner · June 4, 2026 · 7 min read

Why AI gave you a bad answer, and how to fix it

You didn't break it. You gave it nothing to work with.

You've had this experience. You type something into Claude or ChatGPT. You get back a clean, detailed, fluent response. You read it and feel vaguely cheated. It's either generic, wrong in a specific way you can't immediately articulate, confidently wrong on a fact you can verify, or it answered a question you weren't actually asking.

The instinct is to blame the model. That instinct is almost always wrong.

Bad output almost always lives in the prompt, not the model. There are five failure modes that account for the overwhelming majority of bad answers. Each one has a one-line fix. Know these five and you'll spend a fraction of the time you currently spend rewriting, second-guessing, or writing off the tool entirely.


The five failure modes

1. Generic, could-be-anyone output

Symptom. The answer is technically correct but could have been written for any company in any industry at any stage. It's competent. It's useless. You wouldn't actually send it to anyone.

Why it happened. The model had no information about you, your company, your situation, or who the output is for. When you give it nothing, it writes for the statistical average person asking the average version of your question. That is not you.

The fix. Add who you are, the business context, and the audience before you state the task. Tell the model your industry, your company size, who's reading the output, and what decision it feeds. This is the single biggest unlock in AI output quality, and most people skip it entirely.

Give it context covers exactly what to include and hands you a reusable block you fill out once.


2. It answered the wrong question

Symptom. The output is competent. It might even be impressive. But it's not what you needed. You asked for a summary and got an analysis. You wanted three bullet points and got a strategic memo. You needed the specific and got the general.

Why it happened. Your task was vague, so the model guessed your goal. It guesses fluently. It always guesses toward the most common version of whatever you asked, which is rarely the version you meant.

The fix. State the specific deliverable and the goal it serves. Not "help me with our pricing" but "list three pricing models worth considering for a bootstrapped $4M SaaS business, with one tradeoff for each, in a table." The task and the format together define a much smaller solution space. The model fills that space instead of picking its own.

How to write a prompt that actually works breaks down the five-part briefing structure that closes this gap.


3. It made up a fact, citation, or number

Symptom. A confident claim in the output turns out to be wrong. A statistic you can't source. A citation that doesn't exist. A competitor capability that isn't accurate. The model said it with the same tone it uses when it's right.

Why it happened. Language models predict plausible text, not verified truth. They're very good at generating text that sounds like it belongs next to the other text in the output. A meaningful share of outputs can include fabricated details, especially for specific figures, citations, and recent events. The model has no internal experience of "I'm not sure about this one." It produces the most probable next word regardless.

The fix. For any output that carries load-bearing facts (numbers, statistics, named citations, specific competitor claims), ask for sources and acknowledge uncertainty: "list any claims you're not confident about" or "flag anything you'd want me to verify." Then verify. For a quick overview of what's happening when this occurs, hallucination explains the mechanics in plain language. For the habit of checking outputs systematically, how to verify AI output gives you a three-step process.


4. It forgot something you said earlier, or drifted

Symptom. You set a constraint at the start of a long conversation: "keep everything under 150 words" or "we're not pursuing enterprise customers." An hour later, deep in the same chat, the model violates the constraint. It's not ignoring you. It literally can't see as far back as it used to.

Why it happened. Every AI conversation has a context window: a finite amount of text the model can hold in view at once. As a conversation grows, earlier instructions lose salience. They're still technically in the window but they're competing with everything that came after. The constraint you set in message three is a long way from the model's current attention in message forty.

The fix. Two options. Start a fresh chat and re-ground: paste the key constraints into the opening of the new conversation. Or restate the constraints mid-thread before the next request. For anything you reuse across sessions, put it in a Claude Project: your standing instructions load automatically at the start of every conversation so nothing erodes.


5. Wrong shape, format, length, or tone

Symptom. You needed three bullets. You got a 600-word essay. You needed formal. You got casual. You needed a table. You got prose. The content might be fine. The shape is wrong for where it needs to go.

Why it happened. You didn't specify. The default output shape for most models is verbose, essay-like, and comprehensive. That's the statistically safe default when the model has no instructions to the contrary. It's almost never what a CEO needs to actually use the output.

The fix. Specify format, length, tone, and what to avoid. Add one line: "three bullets, no more" or "one paragraph, direct, no hedging." The highest-leverage version: paste a short example of the shape you want. A two-sentence example does more than three paragraphs of format description. If you paste how you'd write it yourself, the model calibrates to that shape immediately.


The meta-fix

Knowing the five failure modes is useful. The moment they pay off is when you get a bad answer and you don't delete everything.

When the output is close but off, append one correction to the same chat. "Good structure. Cut to three bullets and warm up the tone." "The analysis is right but you answered the wrong question. I need the deliverable to be a recommendation, not a breakdown." One line of correction is cheaper than a new brief, and it keeps the context the model already has.

Restarting throws away everything the model learned about your situation in the current conversation. Don't do it unless the output is completely off-base.

The diagnostic question before you correct: which of the five is this? That tells you the one-line fix.

One of those five is almost always the answer.


When you're ready to go deeper

This is a diagnostic guide. For each failure mode, the linked articles carry the full method.

If failure modes one and two show up regularly (generic output, wrong question), the briefing structure linked above is the place to start. It's a single template you fill out once and reuse.

If failure mode four is a constant problem (constraints drifting, context resetting), a Claude Project is the structural fix. Projects let your standing context persist across every conversation so you stop rebuilding it.

If fabricated facts are showing up in your outputs, build the verification habit before you increase your reliance on AI output for anything consequential. The verification process linked above takes about twenty minutes to learn and will change how you use the tool.

The model is good. Most CEOs are just handing it bad briefs. Five failure modes. Five one-line fixes. Diagnose, correct, and keep going.

The Thursday 3

Get three workflows like this every Thursday

The Thursday 3 is a free weekly email. Three workflows that put you in the top 1% of CEOs. 90-second read. Every card links back to a step-by-step guide like this one.

Get the newsletter →
The Desk Theory books

The architecture behind this workflow.

Two operator manuals for the same job, run two ways: OpenCLAW for the always-on harness, Claude Code for the focused-work CLI. Pick one, or get the bundle for $149.

Browse the books · $99 each

Want one workflow like this taken apart end-to-end every week? The Tuesday Pro Deep Dive · $39/mo.