Build a Visual QA Agent That Checks Screenshots

Build a QA agent that captures screenshots of your web app with Playwright, sends them to Claude's vision capability, and evaluates them against plain-English acceptance criteria to catch visual regressions.

June 26, 2026
Aki Wijesundara

Why Automated Visual QA Keeps Failing

Automated testing catches logic bugs but misses visual regressions: a button that shifted 20px, a dropdown that renders behind another element, a page that loads fine in Chrome but breaks on mobile. Traditional screenshot diffing tools compare pixels and flood you with noise from anti-aliasing and font rendering differences.

What you actually want is a test that understands the page the way a human QA reviewer does. That is what a vision-based QA agent gives you: Claude looks at a screenshot and evaluates it against your acceptance criteria in plain English.

The Architecture: Playwright Plus Vision

The pipeline has three steps: capture screenshots with Playwright, send them to Claude with a structured checklist, and report failures. You can run this on every pull request or as a nightly regression job.

import anthropic
import base64
import json
from playwright.sync_api import sync_playwright

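def capture_screenshot(url, output_path):
    """Load `url` in headless Chromium and save a full-page PNG to `output_path`."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Fixed viewport so screenshots are comparable between runs.
        page = browser.new_page(viewport={"width": 1280, "height": 800})
        page.goto(url)
        # Wait until network activity settles so late-loading content is rendered.
        page.wait_for_load_state("networkidle")
        page.screenshot(path=output_path, full_page=True)
        browser.close()
    return output_path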

def check_screenshot(image_path, criteria):
    """Ask Claude to judge the screenshot against each criterion; returns parsed JSON."""
    # The client reads ANTHROPIC_API_KEY from the environment.
    client = anthropic.Anthropic()
    with open(image_path, "rb") as f:
        image_data = base64.standard_b64encode(f.read()).decode("utf-8")

    # One bullet per criterion so the model can address each in turn.
    criteria_text = "\n".join(f"- {c}" for c in criteria)
    prompt = (
        "Check this screenshot against these acceptance criteria:\n"
        f"{criteria_text}\n\n"
        "For each criterion, respond PASS or FAIL with a one-sentence reason. "
        "Return as a JSON array."
    )
    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {"type": "text", "text": prompt}
            ]
        }]
    )
    # Assumes the reply is bare JSON; json.loads raises if the model adds extra text.
    return json.loads(message.content[0].text)
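
The third step, reporting failures, is not shown in the code above. A minimal sketch follows, under two assumptions that the original prompt does not guarantee: the model replies with a JSON array of objects carrying the hypothetical keys `criterion`, `status`, and `reason`, and the reply may include extra text around that array. `parse_results` and `report` are illustrative names, not part of any library.

```python
import json
import re


def parse_results(raw_text):
    """Pull the JSON array out of a model reply, ignoring text around it.

    Takes everything from the first '[' to the last ']'. This is a simple
    heuristic; it fails if the surrounding text contains its own brackets.
    """
    match = re.search(r"\[.*\]", raw_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in model reply")
    return json.loads(match.group(0))


def report(results):
    """Print each failed criterion and return an exit code (0 = all passed)."""
    # Anything other than an explicit PASS counts as a failure.
    failures = [r for r in results if str(r.get("status", "")).upper() != "PASS"]
    for r in failures:
        print(f"FAIL: {r.get('criterion')} ({r.get('reason')})")
    return 1 if failures else 0
```

In a pull-request or nightly job, passing the return value of `report` to `sys.exit` lets a failed criterion fail the run. Treating every non-PASS status as a failure is a deliberately conservative choice: a malformed or ambiguous verdict blocks the build instead of slipping through.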

Writing Acceptance Criteria That Work

The quality of the agent depends entirely on how you write the criteria. Vague inputs produce vague outputs. Be specific about what should be visible, where, and in what state.

Prompt

"Write QA acceptance criteria for a checkout page. Include: cart summary is visible with at least one item, a Place Order button is present and not greyed out, the total price appears near the button, no error messages are visible, and the page does not appear to be loading or blank."

Claude turns these into a structured JSON list your agent can evaluate against every screenshot. Store criteria in version control alongside your tests and update them when the UI changes. You now have a QA spec that is both human-readable and machine-executable.
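
One way to keep criteria in version control is a JSON file keyed by page. The sketch below assumes a hypothetical layout, a mapping from page name to a `url` and a list of `criteria` strings. The file path, the keys, the example URL, and the `load_criteria` helper are all illustrative choices, not something the pipeline requires.

```python
import json
from pathlib import Path

# Hypothetical contents of a spec file such as qa/criteria.json,
# checked in next to the tests. The URL is a placeholder.
EXAMPLE_SPEC = {
    "checkout": {
        "url": "https://staging.example.com/checkout",
        "criteria": [
            "Cart summary is visible with at least one item",
            "A Place Order button is present and not greyed out",
            "The total price appears near the Place Order button",
            "No error messages are visible",
            "The page does not appear to be loading or blank",
        ],
    }
}


def load_criteria(path):
    """Read the spec file and reject pages with a missing URL or empty criteria."""
    spec = json.loads(Path(path).read_text(encoding="utf-8"))
    for page, entry in spec.items():
        if not entry.get("url"):
            raise ValueError(f"{page}: missing url")
        criteria = entry.get("criteria") or []
        if not criteria or not all(isinstance(c, str) and c.strip() for c in criteria):
            raise ValueError(f"{page}: criteria must be a non-empty list of strings")
    return spec
```

Each entry's `url` and `criteria` can then be handed to `capture_screenshot` and `check_screenshot`. Validating at load time means a page with an empty criteria list fails loudly instead of passing with nothing checked.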
