Module 1.5 - Choosing the right model
Not every call deserves your most powerful model. Model selection is the discipline of matching the model to the job across three axes that trade off against one another: capability, cost, and latency. Reasoning models are more capable but slower and more expensive; fast models are cheaper and quicker but weaker. There is no best model, only the right model for what a given task needs.
How to actually choose
Start from the task, not the model. A simple classification or extraction often runs perfectly on a small, fast, cheap model - and paying for a frontier reasoning model there is pure waste. A genuinely hard multi-step reasoning task may need the strong model, and skimping there costs you quality that no prompting can recover. The skill is reading which axis a task is bound by.
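One way to make "start from the task" concrete is to write down what the task needs and pick the cheapest model that meets it. The sketch below is illustrative only: the model names, capability scores, prices, and latencies are invented placeholders, and `choose_model` is a hypothetical helper rather than part of any library. In practice the numbers would come from your own evaluations and your provider's price list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProfile:
    name: str
    capability: int            # hypothetical 1-5 score from your own evaluations
    cost_per_1k_tokens: float  # hypothetical price, in dollars
    p50_latency_s: float       # hypothetical median latency, in seconds


# Hypothetical catalogue: every figure here is made up for illustration.
CATALOGUE = [
    ModelProfile("small-fast", capability=2, cost_per_1k_tokens=0.0005, p50_latency_s=0.4),
    ModelProfile("mid-tier", capability=3, cost_per_1k_tokens=0.003, p50_latency_s=1.0),
    ModelProfile("frontier-reasoning", capability=5, cost_per_1k_tokens=0.03, p50_latency_s=6.0),
]


def choose_model(min_capability: int, max_latency_s: float,
                 catalogue=CATALOGUE) -> Optional[ModelProfile]:
    """Return the cheapest model that satisfies the task's needs, or None."""
    candidates = [
        m for m in catalogue
        if m.capability >= min_capability and m.p50_latency_s <= max_latency_s
    ]
    if not candidates:
        # No model fits: the task's requirements have to be relaxed somewhere.
        return None
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


# A simple extraction task: low capability bar, tight latency budget.
print(choose_model(min_capability=2, max_latency_s=1.0))
# A hard multi-step reasoning task: high capability bar, latency can wait.
print(choose_model(min_capability=5, max_latency_s=10.0))
```

When the function returns `None`, the task is bound by two axes at once, and that is the signal to renegotiate the requirement (a looser latency budget, or a simpler task) rather than to reach for a bigger model.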
The mixed-model system
Most production systems do not pick one model - they use several deliberately: the strong model for the hard reasoning step, the cheap fast model for the high-volume simple steps. Your decomposition from Module 1.2 sets this up nicely: once a task is broken into steps, you can route each step to the model it actually needs. This is a senior pattern and it reads extremely well in an interview.
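A minimal sketch of that routing idea, assuming each decomposed step has already been labelled by whether it needs heavy reasoning. The step names and the two model identifiers are hypothetical placeholders, not real model names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    needs_reasoning: bool  # assumed to be decided during task decomposition


# Hypothetical model identifiers; substitute whatever your provider offers.
ROUTES = {
    True: "strong-reasoning-model",
    False: "small-fast-model",
}


def route(steps):
    """Map each step name to the model tier it should run on."""
    return {step.name: ROUTES[step.needs_reasoning] for step in steps}


# Hypothetical support-ticket pipeline: two simple steps, one hard one.
pipeline = [
    Step("extract_fields", needs_reasoning=False),
    Step("classify_intent", needs_reasoning=False),
    Step("plan_resolution", needs_reasoning=True),
]

plan = route(pipeline)
for step_name, model in plan.items():
    print(f"{step_name} -> {model}")
```

Keeping the routing table in one place means that when a cheaper model becomes good enough for a step, the change is a one-line edit rather than a hunt through the codebase.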
Doing it live
You practise by swapping models on the same endpoint and watching the tradeoffs move: same prompt, different model, and quality, cost, and latency all shift in front of you. Seeing that a cheaper model handles your task just as well - or clearly does not - turns model selection from guesswork into evidence.
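The comparison can be sketched as a small harness that runs the same test cases through each model and reports quality, cost, and latency side by side. Everything here is an assumption for illustration: `fake_small` and `fake_large` are stand-ins for real API calls, the per-call prices are invented, and exact-match accuracy is only one possible quality measure.

```python
import time


def compare_models(models, cases):
    """Run every case through every model.

    models: dict of name -> (call_fn, price_per_call)
    cases:  list of (prompt, expected_label)
    """
    report = {}
    for name, (call_fn, price_per_call) in models.items():
        correct = 0
        start = time.perf_counter()
        for prompt, expected in cases:
            if call_fn(prompt).strip().lower() == expected:
                correct += 1
        elapsed = time.perf_counter() - start
        report[name] = {
            "accuracy": correct / len(cases),
            "total_cost": price_per_call * len(cases),
            "avg_latency_s": elapsed / len(cases),
        }
    return report


# Stand-ins for real model calls; replace with calls to your own endpoint.
def fake_small(prompt):
    return "positive" if "love" in prompt else "negative"


def fake_large(prompt):
    return "positive" if ("love" in prompt or "great" in prompt) else "negative"


CASES = [
    ("I love this", "positive"),
    ("This is great", "positive"),
    ("Terrible service", "negative"),
    ("I love it, truly", "positive"),
]

# Hypothetical flat per-call prices, in dollars.
MODELS = {
    "small": (fake_small, 0.001),
    "large": (fake_large, 0.02),
}

report = compare_models(MODELS, CASES)
for name, stats in report.items():
    print(name, stats)
```

With the stand-ins above, the small model misses one of the four cases while the large one gets all four at twenty times the assumed price. Whether that gap matters is a judgment about your task, which is the decision the harness exists to inform.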
Watch out
Common mistakes
- Defaulting to the most powerful model everywhere.
- Choosing by reputation rather than by testing on your task.
- Ignoring latency until users complain.
- Never revisiting the choice as models and prices change.
Go deeper (optional)