Week 4 (Thursday, shared with the Engineering cohort): Evals, the TRACE loop

What you keep

How to evaluate an AI feature like a leader, using TRACE. Error analysis is product work, and you own the front of this loop.

You ship

Your product, evaluated, with a clear read on where it stands before the sprint.

Lessons

Live

"It looked good when I tried it" is not evaluation, and vibes do not survive change.

Live

Capture real interactions, read them one by one, and journal what went wrong. This is product work, not engineering.

Live

Cluster failures by frequency, fix the cheap ones, and judge pass/fail, not vague scores.
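
As a rough illustration of ranking failures by frequency, here is a minimal Python sketch. The journal entries, the failure labels, and the `rank_failures` helper are all hypothetical and invented for this example; in practice the labels would come from your own notes on real traces.

```python
from collections import Counter

# Hypothetical journal: one entry per trace read by hand.
# "failure" is None when nothing went wrong in that trace.
journal = [
    {"trace_id": "t1", "failure": "made up a policy"},
    {"trace_id": "t2", "failure": "ignored the question"},
    {"trace_id": "t3", "failure": "made up a policy"},
    {"trace_id": "t4", "failure": None},
    {"trace_id": "t5", "failure": "wrong tone"},
    {"trace_id": "t6", "failure": "made up a policy"},
    {"trace_id": "t7", "failure": "ignored the question"},
]


def rank_failures(entries):
    """Count each failure label and return (label, count) pairs, most frequent first."""
    counts = Counter(e["failure"] for e in entries if e["failure"])
    return counts.most_common()


ranked = rank_failures(journal)
for label, count in ranked:
    print(f"{count} x {label}")
```

The ranking only says which failures are common. Which ones are cheap to fix is still a judgment call for the team.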

Live

Codify and Enforce are engineering, but you can recognise good checks and hold a team to them.

Deep dive

Turn top failures into binary pass/fail cases, and re-run the list every time you change the product.
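
One possible shape for such a list, sketched in Python. Everything here is an assumption made for illustration: `MustPassCase`, `run_checklist`, and `fake_product` are hypothetical names, and `fake_product` stands in for whatever actually calls your AI feature.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MustPassCase:
    name: str                     # short description of the behaviour required
    prompt: str                   # input taken from a real failure
    check: Callable[[str], bool]  # returns True (pass) or False (fail), nothing in between


def run_checklist(product, cases):
    """Run every case against the product and return {case name: passed}."""
    return {case.name: bool(case.check(product(case.prompt))) for case in cases}


def fake_product(prompt: str) -> str:
    # Hypothetical stand-in for the real feature.
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days."
    return "I am not sure."


cases = [
    MustPassCase("states the refund window", "What is your refund policy?",
                 lambda out: "30 days" in out),
    MustPassCase("admits when it does not know", "Where is my order?",
                 lambda out: "not sure" in out.lower()),
    MustPassCase("gives a support contact", "How do I contact support?",
                 lambda out: "@" in out),
]

results = run_checklist(fake_product, cases)
for name, passed in results.items():
    print("PASS" if passed else "FAIL", name)
```

Re-running the same `cases` list after every product change is what makes the checklist useful: a case that flips from pass to fail shows exactly what the change broke.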

Deep dive

Hand engineers your traces and must-pass cases, not a request for generic quality metrics.

Assignment

Traces read, failures ranked, must-pass checklist built, at least one fix shipped.
