Week 4 (Thursday, shared with the Engineering cohort): Evals, the TRACE loop

7 lessons

What you keep

How to evaluate an AI feature like a leader, using TRACE. Error analysis is product work, and you own the front of this loop.

You ship

Your product, evaluated, with a clear read on where it stands before the sprint.

Lessons

Live

The vibe-check trap

"It looked good when I tried it" is not evaluation, and vibes do not survive change.

Live

TRACE: Trace and Read (error analysis is your job)

Capture real interactions, read them one by one, and journal what went wrong. This is product work, not engineering.

Live

TRACE: Analyze (decide what matters, fix the obvious)

Cluster failures by frequency, fix the cheap ones, and judge each case pass/fail instead of assigning vague scores.

Live

What good evals tooling looks like (so you can lead it)

Codify and Enforce are engineering, but you can recognise good checks and hold a team to them.

Deep dive

Build a simple must-pass checklist for your product

Turn top failures into binary pass/fail cases, and re-run the list every time you change the product.

Deep dive

How to brief an engineer to build the evals you need

Hand over traces and must-pass cases, not a request for generic quality metrics.

Assignment

Run TRACE on your product

Traces read, failures ranked, must-pass checklist built, at least one fix shipped.

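
For readers who want to see what "failures ranked" and a "must-pass checklist" might look like once an engineer codifies them, here is a minimal Python sketch. Everything in it is a hypothetical illustration and not part of the course material: the failure tags, the `MustPassCase` structure, the `run_checklist` helper, and the `fake_product` stand-in are all assumptions. It only shows the shape of the idea: count journaled failures by tag, then re-run a list of binary pass/fail cases whenever the product changes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable

# Hypothetical journal from reading traces one by one: (trace id, failure tag).
journal = [
    ("t1", "wrong_tone"),
    ("t2", "missing_citation"),
    ("t3", "wrong_tone"),
    ("t4", "hallucinated_price"),
    ("t5", "wrong_tone"),
]


def rank_failures(entries):
    """Count failure tags and return them most frequent first."""
    return Counter(tag for _, tag in entries).most_common()


@dataclass
class MustPassCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # binary verdict: True = pass, False = fail


def run_checklist(cases, product):
    """Send each case's prompt to the product and record pass/fail by name."""
    return {c.name: bool(c.check(product(c.prompt))) for c in cases}


def fake_product(prompt: str) -> str:
    """Stand-in for the real product (assumption); returns a canned answer."""
    return "Our refund window is 30 days. Source: help center."


cases = [
    MustPassCase("cites a source", "What is the refund window?",
                 lambda out: "Source:" in out),
    MustPassCase("no invented price", "What is the refund window?",
                 lambda out: "$" not in out),
]

ranked = rank_failures(journal)
results = run_checklist(cases, fake_product)
print(ranked)
print(results)
```

In this sketch each check returns only True or False, which mirrors the pass/fail judgement described in the lessons, and the same `cases` list can be re-run unchanged after every product change.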