Live

RAG evals

You cannot improve retrieval that you cannot measure. This lesson teaches you to evaluate a RAG system as two separable problems: is retrieval finding the right context, and is the model using that context faithfully?

Measuring them separately tells you where a bad answer came from (wrong chunks retrieved, or right chunks ignored), which is the difference between fixing your chunking and fixing your prompt. This is the on-ramp to the full evals masterclass in Week 4, applied to retrieval specifically.
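To make the split concrete, here is a minimal Python sketch of scoring the two problems separately and using the scores to attribute a bad answer. Everything in it is an assumption for illustration rather than the lesson's own tooling: the `EvalCase` fields, the thresholds, and the sample data are hypothetical, and the word-overlap `support_score` is a deliberately crude stand-in for a proper faithfulness check such as an LLM judge.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    # All field names are hypothetical, chosen for this illustration.
    question: str
    relevant_ids: set[str]    # hand-labelled chunk IDs that contain the answer
    retrieved_ids: list[str]  # what the retriever returned, best first
    retrieved_text: str       # concatenated text of the retrieved chunks
    answer: str               # what the model generated


def recall_at_k(case: EvalCase, k: int) -> float:
    """Retrieval metric: fraction of labelled relevant chunks in the top k."""
    if not case.relevant_ids:
        return 0.0
    top_k = set(case.retrieved_ids[:k])
    return len(top_k & case.relevant_ids) / len(case.relevant_ids)


def support_score(case: EvalCase) -> float:
    """Crude faithfulness proxy: share of answer words found in the context."""
    answer_words = set(case.answer.lower().split())
    context_words = set(case.retrieved_text.lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)


def diagnose(case: EvalCase, k: int = 3,
             min_recall: float = 1.0, min_support: float = 0.8) -> str:
    """Attribute a failure to retrieval first, then to generation."""
    if recall_at_k(case, k) < min_recall:
        return "retrieval"   # wrong chunks retrieved: look at chunking/retriever
    if support_score(case) < min_support:
        return "generation"  # right chunks ignored: look at the prompt
    return "ok"


# Made-up sample cases, one per outcome.
context = "refunds are issued within 14 days of purchase"
missed = EvalCase("How long do refunds take?", {"c2"},
                  ["c7", "c9", "c4"], "shipping is free over 50 dollars",
                  "refunds take 30 business days")
ignored = EvalCase("How long do refunds take?", {"c2"},
                   ["c2", "c5"], context, "refunds take 30 business days")
grounded = EvalCase("How long do refunds take?", {"c2"},
                    ["c2", "c5"], context, "refunds are issued within 14 days")

for case in (missed, ignored, grounded):
    print(diagnose(case))
```

Checking retrieval first is a design choice in this sketch: if the relevant chunk never reached the model, a low faithfulness score says little about the prompt, so the failure is attributed to retrieval before generation is examined.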

Go deeper (optional)