Prompting advice ages like fish. The magic phrases of 2023 — “think step by step,” “you are an expert” — were patches for weaknesses that newer models patched themselves. The tip lists got longer while the half-life got shorter. What survived is smaller and more boring: three levers that work because of what a language model is, not because of how any particular release was trained.
A model predicts text given context. It follows that everything you control lives in the context you provide. The three levers are just the three kinds of context that matter: what it needs to know, what it must respect, and what good looks like.
Lever one: context it can’t guess
The model has read more than any person alive, and it knows nothing about your situation. Not your customer, your codebase, your margins, your tone, or what you already tried. Every detail you leave out, it fills with the statistically average version — and the average is what generic output looks like.
The fix is mechanical, not clever. Before asking, paste in:
- The actual material. The email thread you’re replying to, the contract clause in dispute, the data you’re analyzing. Never describe a document you could paste.
- The situation. Who’s involved, what happened, what you’re trying to get done, what’s already been ruled out.
- The audience. A reply to a frustrated customer, a note to your accountant, and a post for strangers each want different words. Say which.
A useful self-check: could a smart temp worker do the task from your prompt alone, with no follow-up questions? If they’d have to ask “wait, which customer? what did we promise them?”, the model has the same gaps — it just won’t ask. It will answer anyway, fluently, from the average.
There’s a ceiling worth knowing about. Context windows are large now, but attention isn’t uniform — burying one critical sentence inside two hundred pasted pages still degrades results. Paste what’s relevant, not what’s available. Relevance is your filter to apply; the model can’t know which page actually matters to you.
Lever two: constraints you actually have
Left unconstrained, the model optimizes for plausible and agreeable. Constraints are how you make it optimize for useful. Most working prompts are mostly constraints:
- Format. “Three bullet points,” “a table with columns X and Y,” “subject line plus 120-word body,” “valid JSON matching this shape.”
- Length. Models pad by default. “Under 100 words” does more for quality than any phrase about being an expert.
- Scope. “Only address the billing question; don’t mention the upcoming migration.” Unscoped, the model helpfully mentions everything.
- Honesty rails. “If the information isn’t in the document I pasted, say ‘not found’ instead of guessing.” This single line converts a hallucination risk into a checkable output.
- Process. “Ask me up to three clarifying questions before drafting.” Modern models will actually use this permission — but only if granted.
Constraints do a second, quieter job: they make failure visible. An unconstrained answer fails vaguely — it’s just sort of mediocre, and you can’t say why. A constrained answer fails legibly: the table has a wrong column, the summary ignored the scope line. Legible failure is fixable failure; you adjust one constraint and run it again instead of starting over.
State constraints positively where you can. “Write in plain English a non-lawyer can follow” beats a list of ten banned jargon words, because it tells the model what to aim at rather than what to orbit.
Lever three: one good example
For anything with a format, a voice, or a house style, one worked example beats paragraphs of description. This is the oldest result in the field — models imitate patterns far better than they follow abstract instructions — and it has survived every architecture change since.
In practice:
- Voice: “Here’s a reply I sent last month that sounds like me: [paste]. Match that voice for this new reply.”
- Structure: “Here’s last quarter’s update: [paste]. Same structure for this quarter, using these numbers: [paste].”
- Judgment: “Here are two leads, one I marked qualified and one I didn’t, with my one-line reasons. Classify the next ten the same way.”
One representative example usually moves quality more than any other single change. Two or three help when the pattern has real variety. Past a handful, returns fall fast — and a bad example hurts more than no example, because the model will faithfully reproduce whatever your sample actually demonstrates, including the flaws you stopped noticing. Choose the example you wish were typical of your output, not the first one you find.
What didn’t survive
It’s worth being explicit about what this list excludes, because the excluded advice still circulates:
- Role-play incantations. “You are a world-class expert” measurably mattered on weaker models. On current ones the effect is mostly placebo — useful only insofar as it smuggles in real context, and you can just provide the context.
- Politeness and threats. Neither tipping, pleading, nor menace reliably changes quality. The model has no stake; it has a context window.
- Secret keywords. Every viral magic phrase has decayed within months as models changed. The three levers decayed not at all, because they’re not exploits — they’re information.
A simpler way to say it: tricks target the model, and models keep changing. The levers target the task — what’s needed, what’s required, what good looks like — and your task doesn’t care which model is running this quarter.
A worked example
Weak prompt: “Write a follow-up email to a client about the delayed project.”
Strong prompt: “Below is my last email to the client and their reply. Write my follow-up. Context: the delay is on our side (a supplier slipped), the new delivery date is March 14, and this client has already absorbed one delay in January, so the tone should own the slip without groveling. Constraints: under 150 words, no exclamation marks, end with one concrete next step plus the date. Here’s an apology email I sent another client last year that sounds like me: [paste].”
Same task, same model. The second prompt isn’t smarter — it’s just complete: context the model couldn’t guess, constraints the task genuinely had, one example of what good sounds like. That completeness, not vocabulary, is the entire skill.
When the output is still wrong
The levers raise the floor; they don’t abolish bad outputs. When a well-fed prompt still misses, resist the urge to argue with the draft in ten follow-up messages. Diagnose which lever failed: did it lack a fact (context), ignore a requirement you never actually stated (constraints), or produce the wrong kind of thing (example)? Fix that one input and regenerate. Editing the prompt beats litigating the output — the second draft from a corrected prompt is almost always better than the fifth patch on a misfired one.
Prompting, in the end, is a writing skill, and the writing it most resembles is delegation: the brief you’d give a capable stranger. People who delegate well prompted well on day one, and they’ll still prompt well on whatever ships next year.