Module 1.2 - Prompting that ships | Agentic AI Engineering Bootcamp

Async

Module 1.2 - Prompting that ships

Prompting in production is not about clever wording that coaxes a good answer once. It is about making outputs deterministic enough that the rest of your system can depend on them. A prompt that works brilliantly on three inputs and unpredictably on the fourth is not shippable. The goal is not the best possible single answer - it is the most reliable behaviour across all the inputs you will actually see.

Technique one: structured outputs

This is the big one - the technique that turns a chatbot into a component. Instead of letting the model reply in free prose, you require it to return data in a schema you defined, typically JSON matching a specific shape. Now downstream code can rely on that shape: it can read `result.category` and `result.confidence` without parsing English. Modern APIs support this directly (constrained or structured outputs) so the model is forced to conform to your schema rather than merely asked to. The mental shift: you are not getting an answer - you are getting a typed object your system can build on.

Technique two: few-shot

Showing beats telling. Instead of describing the behaviour you want in the abstract, you include two or three examples of exactly the input-output pairs you expect, and the model matches the pattern. Few-shot is often the fastest way to fix a model that "almost" does what you want: give it examples of the edge cases it is fumbling, and it aligns. The craft is choosing examples that cover the behaviour you care about - especially the tricky cases - rather than three easy ones that teach nothing.

Technique three: decomposition

Some tasks a model handles poorly in one shot but reliably in steps. Decomposition means breaking a hard ask into a sequence the model does well - extract the facts, then reason over them, then format - rather than demanding all three at once. This trades a few extra calls (and tokens, which you will learn to budget) for a large gain in reliability. The judgment is knowing which tasks need it: if one prompt is flaky no matter how you word it, decomposition is usually the answer.

Determinism as the through-goal

Notice the thread: all three techniques exist to make output predictable. Structured outputs constrain the shape, few-shot constrains the behaviour, decomposition constrains the difficulty of each step. You test them live against your own endpoint so you feel the difference between "usually returns something usable" and "reliably returns exactly what my code expects."

Watch out

Common mistakes

Optimizing wording on a handful of inputs and calling it done.
Using free-text output where structured output belongs.
Cramming a hard multi-part task into one prompt.
Picking easy few-shot examples that do not cover your real failure cases.

Go deeper (optional)