The Testing Academy · Masterclass

Claude Code for QA.

A free Claude Code tutorial for QA engineers and SDETs — learn AI test automation, agentic testing workflows, Playwright MCP, and how to use Claude AI for testing across 46 hands-on chapters.

Host · Pramod Dutta
Track · End-to-End Workflow
Audience · QA / SDET
Project · QA Portfolio + Deploy

The 30-second pitch

Four things you walk away able to do tonight.

flake

Flaky test?

Claude rewrites the wait, replaces waitForTimeout with expect.poll, runs it 20× to prove stability — before you finish your coffee.

spec

Jira ticket vague?

Claude reads the AC, generates Gherkin scenarios + a Playwright skeleton, files the edge-case bugs it spots — straight to QA-942.

repro

Visual bug from support?

Drag the screenshot. Claude infers viewport, route, OS. Reproduces, captures a Playwright trace, attaches it to the bug.

migrate

Migration anxiety?

Cypress → Playwright. 240 specs. Claude converts in one overnight run · you wake to a passing suite + a PR ready to review.

Claude Code is a coding agent that lives in your terminal — reads your repo, drives your browser, writes your tests, runs your suite, files the bug. For a QA, that flips the SDLC.

You stop being the bottleneck on the right and start being the brain on the left. You design the contract · the agent does the toil. Same job title · 10× leverage.

10×
Test authoring throughput when paired with an agent loop.
0
Flaky locators when the agent reads the live DOM via MCP.
Patience to write the boring 80% of test bodies.
1
Tester still in the loop — you. Design, judge, ship.

Where Claude Code sits in your stack.

Think of it as a terminal-native pair tester. It speaks files, shells, and — through MCP servers — browsers, Jira, Confluence, Notion, GitHub, calendars, even your inbox. Your job is to orchestrate, not to type.

Step 1
Read

It scans the repo, the failing log, the screenshot, the Jira ticket.

Step 2
Plan

It proposes a plan in plan-mode — you approve, edit, or redirect.

Step 3
Act

Edits files, runs commands, drives Playwright, posts comments.

Step 4
Verify

Runs the suite. Re-reads diffs. Confirms green before claiming done.

Step 5
Ship

Commits, opens PR, deploys preview. You review one diff, not 40.

Setup · macOS / Linux / WSL

Five lines. One terminal.

# 1. install
npm install -g @anthropic-ai/claude-code

# 2. cd into your QA repo
cd ~/work/qa-portfolio

# 3. start a session
claude

# 4. first thing you say
> /init  → writes CLAUDE.md

# 5. then ask anything
> read the repo and tell me what test framework we use

A session is a long-running REPL: the model can read anything under the launch directory on demand and keeps the conversation in context. Anything you type with no leading slash is a prompt. Anything starting with / is a command. Anything starting with ! is a passthrough shell call.
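The routing rule fits in a few lines. A minimal sketch, assuming a hypothetical classifyInput helper (not part of Claude Code) that looks only at the first non-space character:

```typescript
// Hypothetical helper: mirrors the prompt / command / shell rule described above.
type InputKind = "prompt" | "command" | "shell";

function classifyInput(line: string): InputKind {
  const text = line.trimStart();
  if (text.startsWith("/")) return "command"; // e.g. /init, /compact
  if (text.startsWith("!")) return "shell";   // e.g. !git status
  return "prompt";                            // everything else is free text
}

const examples = [
  "/init",
  "!npx playwright test --headed",
  "read the repo and tell me what test framework we use",
];
const kinds = examples.map(classifyInput);
```

Here kinds comes out as command, shell, prompt, in that order.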

node ≥ 18 · git · ripgrep (rec) · gh cli (rec) · uv / pip
Tip · run inside a git repo for safest diffs

A session has four moving parts.

01 · Context

Working dir

The folder you launched in. The agent will read files here. Treat it like the scope of one feature, not your whole laptop.

02 · Memory

CLAUDE.md

Project rules, conventions, do/don't. Loaded every turn. Global one lives at ~/.claude/CLAUDE.md.

03 · Tools

Read / Edit / Write / Bash

The four primitives. Everything else (Playwright, Jira, GitHub) is built on top via MCP servers.

04 · Modes

Permission · Plan · Worktree

Control what runs automatically. Plan-mode = read-only thinking. Worktree = isolated branch sandbox.

Most QA workflows live entirely inside Read + Bash + a Playwright MCP. You almost never need root, and you almost never need to leave the terminal.

Three ways to talk to the agent.

prompt

Plain English

Free text. The agent decides which tools to use.

> run the login spec and screenshot
  every failure
/command

Slash command

Built-in or custom. Predictable, repeatable.

> /init
> /review
> /compact
!shell

Shell escape

Run any shell command directly, no agent.

> !npx playwright test --headed
> !git status

Rule of thumb · use prompts for thinking, slash commands for workflows you repeat, and shell when you already know the exact command.

The complete cheat-sheet

Every slash command a QA touches.

Built-in · session | Purpose
/help | List every command + skill.
/init | Generate a CLAUDE.md from the repo.
/clear | Wipe conversation, keep cwd.
/compact | Summarise history, free context.
/resume | Resume an old session by id.
/status | Show session state + token budget.
/cost | Spend so far in this session.
/export | Save the conversation to a file.
/bug | File a Claude Code bug report.
/release-notes | Show what's new in your version.
/upgrade | Update CLI to latest.
/login · /logout | Auth handling.

Built-in · config & review | Purpose
/config | Open settings · theme, model, perms.
/model | Switch model mid-session.
/permissions | Manage allowed tools / commands.
/add-dir | Bring another folder into scope.
/memory | Edit CLAUDE.md inline.
/hooks | Inspect configured hooks.
/agents | List + manage subagents.
/mcp | List / debug MCP servers.
/review | Review current diff / PR.
/security-review | OWASP-style audit of changes.
/pr-comments | Pull PR review comments.
/ide · /vim | IDE bridge · vim keybindings.

QA-authored customs to add to .claude/commands/ · /flaky · /smoke · /bug-from-trace · /audit-locators · /gen-pom · /triage-failures · /quarantine · /report-run.

Need the full cheat-sheet?

The complete keyboard shortcuts · slash commands · config & env · skills & agents reference card lives in the Reference appendix at the end of the deck. Jump there anytime — print it, pin it above your desk.


Read · Edit · Write · Bash.
Plus Grep · Glob · WebFetch. Everything else is sugar on top.

> open tests/login.spec.ts, find the assertion
  that checks the toast, and tighten it to verify
  both text and aria-role

// agent will:
  Read(tests/login.spec.ts)
  Edit(tests/login.spec.ts)  // 1 hunk
  Bash(npx playwright test login --reporter=line)
  // reports green ✓

You never call the tools directly — you describe the intent and the agent picks the tool. The cool part: it shows you every tool call before / as it runs, so you stay in the loop.

Read

cat with a brain

Pulls only the slice it needs — line ranges, page ranges, image content.

Edit

exact-string swap

Fails loudly if the target isn't unique. Safer than sed.

Write

new files

Whole-file create / overwrite. Used sparingly.

Bash

your shell

Runs tests, git, curl, anything. Honours permissions.
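The "fails loudly if the target isn't unique" behaviour of Edit is easy to picture in code. A rough sketch, assuming a hypothetical editExact function that imitates the described behaviour (it is not the tool's actual source):

```typescript
// Sketch of an exact-string edit that refuses missing or ambiguous targets.
// Assumption: this imitates the behaviour described for Edit, nothing more.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  return haystack.split(needle).length - 1;
}

function editExact(content: string, oldText: string, newText: string): string {
  const hits = countOccurrences(content, oldText);
  if (hits === 0) throw new Error("target not found");
  if (hits > 1) throw new Error(`target is ambiguous: ${hits} matches`);
  // Function form of replace avoids "$" patterns in newText being interpreted.
  return content.replace(oldText, () => newText);
}

const spec = "await page.waitForTimeout(2000);\nawait expect(toast).toBeVisible();";
const patched = editExact(spec, "await page.waitForTimeout(2000);\n", "");
```

A sed one-liner would silently rewrite every match; here a second match raises an error instead, which is the safety property the card refers to.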

The single file that changes everything

Teach the agent your house rules once.

A CLAUDE.md at the repo root is auto-loaded every turn. Put your test framework, your locator policy, your no-flake rules. The agent obeys it.

# QA conventions — qa-portfolio

## Locators — STRICT
- Prefer getByRole / getByLabel.
- Never use raw .locator('xpath=…').
- Brittle CSS selectors must include a comment why.

## Waits
- No page.waitForTimeout() in committed code.
- Use auto-waiting + expect.poll only.

## Test data
- Generate via faker; never hard-code emails.

## Commits
- Conventional Commits. No co-author trailers.

Three levels of memory cascade — global → project → local. Local overrides project, project overrides global. Lowest line wins.

Scope | Path
Global | ~/.claude/CLAUDE.md
Project | ./CLAUDE.md
Local (gitignored) | ./CLAUDE.local.md
Demo · run /init to bootstrap it.
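The override order can be shown with a toy model. This sketch assumes rules are key/value pairs purely to illustrate precedence; real CLAUDE.md files are free-form markdown, and resolveRules is a hypothetical name:

```typescript
// Toy model of the global -> project -> local cascade.
// Assumption: rules are modelled as key/value pairs only to show precedence.
type Rules = Record<string, string>;

function resolveRules(globalRules: Rules, projectRules: Rules, localRules: Rules): Rules {
  // Later spreads win, so local beats project and project beats global.
  return { ...globalRules, ...projectRules, ...localRules };
}

const effective = resolveRules(
  { commits: "Conventional Commits", waits: "no fixed sleeps" },
  { waits: "auto-wait + expect.poll only", locators: "getByRole first" },
  { locators: "getByTestId allowed on legacy pages" },
);
```

In effective, commits comes from global, waits from project, and locators from local.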

What goes inside CLAUDE.md

Nine sections every QA repo needs.

Treat CLAUDE.md as your team's pair-programming contract. Write it once, every session obeys it. Run /init to scaffold, then edit by hand. Lives at the repo root.

# QA Conventions · qa-portfolio

## 1. Stack
- Framework: Playwright 1.49+
- Test runner: @playwright/test
- Lang: TypeScript strict
- Node: 20.x · pnpm

## 2. Folder layout
- tests/e2e/        · browser specs
- tests/api/        · APIRequestContext
- tests/fixtures/   · shared fixtures
- tests/pom/        · page objects
- tests/data/       · faker builders

## 3. Locators — STRICT
- Prefer getByRole, getByLabel, getByTestId.
- Never raw .locator('xpath=…').
- Brittle CSS must include a "// why" comment.

## 4. Waits
- No page.waitForTimeout.
- Use auto-wait + expect.poll only.
- Retry once, then quarantine.

## 5. Data
- Generate with @faker-js/faker.
- Never hard-code emails / phones / addresses.
- Test users: env-based, not committed.

## 6. Tagging
- @smoke @regression @flaky @wip
- CI runs @smoke on every PR.

## 7. Reporting
- HTML + JSON reporter on CI.
- Attach trace + screenshot on retry.

## 8. Commits
- Conventional Commits.
- No co-author trailers.
- No "🤖 Generated with…" footers.

## 9. Do / Don't
- DO: ask before deleting any spec.
- DO: run the impacted spec after every edit.
- DON'T: edit playwright.config without a plan.
- DON'T: bump deps without a separate PR.

Why each section matters

Stack | Agent picks correct imports / matchers.
Layout | New files land in the right folder.
Locators | Kills the most common flake source.
Waits | No fixed sleeps survive review.
Data | No PII or hard-coded secrets.
Tagging | CI lanes stay predictable.
Reporting | Bug repros come with evidence.
Commits | Clean git history.
Do/Don't | Hard rails on destructive ops.

Three scopes load in cascade · ~/.claude/CLAUDE.md (global) → ./CLAUDE.md (project) → ./CLAUDE.local.md (gitignored personal). Lowest line wins.

Pro tip · use # in any prompt to append a line to CLAUDE.md live.

Hire specialists, not generalists.

A subagent is a separate Claude session spawned for one bounded job. Its output is summarised back — your main context stays clean. Think of them as contractors who clock out when done. Note · the Plan subagent below is the planner specialist — distinct from Plan-mode (⇧Tab toggle, read-only thinking on the main agent).

research

Explore

Read-only code locator. "Where is X defined? What calls Y?" Fast.

> explore: find every place we
  click "Add to cart"
plan

Plan

Architect mode. Designs the implementation plan before any code change.

> plan: add a parallel visual
  regression suite for /pricing
review

code-reviewer

Audits a diff. One line per finding, severity-tagged. No fluff.

> review the last 3 commits
  for race conditions
build

e2e-runner

Owns Playwright. Generates, maintains, quarantines flaky specs.

> e2e: add coverage for
  password-reset happy path
tdd

tdd-guide

Enforces tests-first. Will refuse to write impl before a failing test.

> tdd: implement the new
  promo-code validator
sec

security-reviewer

OWASP top-10 sweep on the diff. Flags secrets, SSRF, injection, XSS.

> security: review the new
  /auth/reset endpoint

Two ways to scale your main agent

Subagent vs Agent Team.

Old model · main agent spawns isolated subagents; results flow upward. New model · main agent acts as team lead, all teammates share a task list, communicate peer-to-peer. Pick the model that fits the job.

Traditional hierarchy (Subagents) · the Main Agent spawns Subagents; each works alone and reports its result upward.
Characteristics
• Hierarchical structure
• Isolated work · no shared state
• Results flow upward only
• Limited cross-agent communication

Collaborative team (Agent Teams) · the Main Agent acts as Lead and assigns tasks to a Shared Task List (centralized · visible to all teammates); Teammates claim tasks and work in parallel.
Characteristics
• Collaborative team structure
• Shared context and visibility
• Peer-to-peer communication
• Better coordination · higher throughput
Fig 1 · Subagents (hierarchy) vs Agent Teams (shared task list + peer comms)

When to pick which

Task shape | Pick | Why
Independent fan-out (review 20 PRs) | Subagents | No shared state needed; isolated runs scale linearly.
Coordinated build (scaffold + tests + docs) | Agent Team | Shared task list keeps teammates from duplicating work.
One-shot codebase locate | Subagent | Single Explore agent · output summarised back.
Long-running migration | Agent Team | Peer-to-peer ack lets agents hand off mid-flight.
Bug-bash across surfaces | Agent Team | Shared task list = no two agents reproducing same bug.
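Why a shared task list prevents duplicate work is clearer with a toy version. The sketch below is illustrative only; TaskBoard and its methods are hypothetical names and say nothing about how Claude Code implements Agent Teams internally:

```typescript
// Illustrative shared task list: each task can be claimed by one teammate only.
// Assumption: names and shape are hypothetical, not Claude Code's internal design.
class TaskBoard {
  private owners = new Map<string, string | null>();

  constructor(tasks: string[]) {
    for (const task of tasks) this.owners.set(task, null);
  }

  // Returns true if the claim succeeded, false if the task is unknown or already taken.
  claim(task: string, teammate: string): boolean {
    if (this.owners.get(task) !== null) return false;
    this.owners.set(task, teammate);
    return true;
  }

  unclaimed(): string[] {
    return Array.from(this.owners.entries())
      .filter(([, owner]) => owner === null)
      .map(([task]) => task);
  }
}

const board = new TaskBoard(["repro cart bug", "repro login bug"]);
const first = board.claim("repro cart bug", "teammate-1");
const second = board.claim("repro cart bug", "teammate-2");
```

The second claim is rejected, so teammate-2 moves on to whatever unclaimed() still lists.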

From contractors → task board → engineering team

The three levels of Claude Code agents.

As coordination needs grow, you climb the ladder · isolated Subagents → managed Agent View → collaborative Agent Teams. Each level adds shared state and inter-agent communication. Pick the lowest level that fits the job.

Level 1: Subagents · Think: contractors you send a brief
• Run inside your session
• Report results back
• Can't talk to each other
• Best for: review, test, docs

Level 2: Agent View · Think: task board with live workers
• Full-screen dashboard
• Dispatch, peek, attach
• Survives terminal closure
• Best for: 3-10 tasks

Level 3: Agent Teams · Think: an actual engineering team
• Lead coordinates teammates
• Teammates talk to each other
• Shared task list
• Best for: dependent tasks

Moving up the levels adds more coordination, ending in full collaboration.
Fig 4 · original by @0x_rody on X · 3 levels of Claude Code agents — pick the lowest that fits the job

When to pick which level

You need | Level | Why
Review a single PR | Subagent (L1) | One bounded ask · contractor you send a brief.
Run 8 lint + a11y + perf sweeps in parallel | Agent View (L2) | Need to dispatch + peek progress without losing terminal.
Build a feature with spec → backend → frontend → tests | Agent Team (L3) | Tasks depend on each other · shared task list keeps state.
Migrate Cypress → Playwright across 12 packages | Agent Team (L3) | Cross-package coordination · agents talk + claim work.
"Hunt this flake" | Subagent (L1) | One spec, one fix · zero need for shared state.

What is a Skill

Skills are playbooks Claude reads on demand.

A Skill is a reusable, versioned collection of instructions, resources, and examples that teaches Claude Code how to complete a specific type of task. Drop one into .claude/skills/<name>/SKILL.md (repo) or ~/.claude/skills/<name>/SKILL.md (personal). Claude auto-discovers and auto-triggers on description match.

📘 expertise

Reusable Expertise

Encapsulate proven workflows and best practices once. Every future session uses them.

🎯 consistency

Consistent Results

Deliver reliable outcomes by following defined steps + context. Same input · same output.

🧩 composable

Composable

Use alone or combine with hooks + plugins for powerful, multi-step automations.

🤝 share

Share & Collaborate

Ship skills via git to your team — or release to the community marketplace.

Skill lifecycle · request → response in 5 steps
1. You request · plain English prompt
2. Claude selects · matches description
3. Skill executes · uses tools + plugins
4. Results produced · files · code · actions
5. Claude responds · summary + next step
Discovery: .claude/skills/<name>/SKILL.md (project) · ~/.claude/skills/ (personal)
Marketplace: npx @qaskills/cli add <skill> · npx skillsadd <owner/repo>
Fig 2 · Skill lifecycle · how Claude finds, loads, and executes a Skill on your behalf

How a Skill runs · 5 steps

Step 1
You request a task

Describe what you need in Claude Code · plain prompt.

Step 2
Claude selects a Skill

Most relevant skill chosen by matching its description field.

Step 3
Skill executes

Claude follows the skill's procedure · uses tools, plugins, files.

Step 4
Results produced

Structured outputs · code, files, or actions as defined.

Step 5
Claude responds

Presents results + continues the conversation.

A real skill · flake-hunter

---
name: flake-hunter
description: Use when a Playwright spec
  fails intermittently. Locates the wait,
  rewrites it to expect-based polling,
  re-runs 10× to confirm stability.
---

## Steps
1. Read the failing spec.
2. Identify any waitForTimeout / sleep.
3. Replace with expect.poll().
4. Run npx playwright test --repeat-each=10.
5. Report flake-rate before vs after.
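Step 5 boils down to simple arithmetic. A small sketch, assuming each repeat is recorded as a pass/fail boolean; flakeRate, flakeReport, and the sample outcomes are hypothetical:

```typescript
// Flake rate = failed runs / total runs, from a list of pass/fail outcomes.
function flakeRate(outcomes: boolean[]): number {
  if (outcomes.length === 0) return 0;
  const failures = outcomes.filter((passed) => !passed).length;
  return failures / outcomes.length;
}

function flakeReport(before: boolean[], after: boolean[]): string {
  const pct = (rate: number): string => `${(rate * 100).toFixed(0)}%`;
  return [
    "run    | flake rate",
    `before | ${pct(flakeRate(before))}`,
    `after  | ${pct(flakeRate(after))}`,
  ].join("\n");
}

// Hypothetical outcomes for 10 repeats: 3 failures before the fix, none after.
const runsBefore = [true, false, true, true, false, true, true, false, true, true];
const runsAfter = new Array<boolean>(10).fill(true);
const summary = flakeReport(runsBefore, runsAfter);
```

In practice the outcomes would come from the --repeat-each run rather than a hard-coded array.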

QA-relevant built-ins

playwright-e2e playwright-api playwright-cli pytest-patterns tdd-workflow e2e selenium-pom security-review vibe-check seo-audit marp (slides)

Trigger live · type "hunt the flake in checkout.spec.ts". Claude matches against the skill's description field and loads it automatically.

Pro tip · best description reads "Use when [condition] · [what it does] · [stop conditions]."

Canonical docs · build your own

Build your own skill in under 10 minutes

A Skill is a markdown file with a recipe.

Frontmatter declares name + description. Body holds the procedure. Drop in ~/.claude/skills/<skill>/SKILL.md (global) or .claude/skills/<skill>/SKILL.md (repo-scoped). Claude auto-discovers and triggers on description match.

---
name: flake-hunter
description: Use when a Playwright spec
  fails intermittently or has any sleep /
  waitForTimeout / networkidle wait. Locates
  the bad wait, rewrites with expect.poll,
  re-runs 20x to confirm stability.
allowed-tools: Read, Edit, Bash, Grep, Glob
---

# Flake Hunter

## When to trigger
- Spec failed retry on CI.
- User says "this is flaky" / "intermittent".
- Code contains waitForTimeout / sleep.

## Procedure
1. Read the spec end-to-end.
2. Grep for: waitForTimeout, sleep,
   networkidle, hard delays.
3. For each hit, replace with expect.poll
   or explicit element wait.
4. Run npx playwright test {file} \
     --repeat-each=20 --workers=1.
5. Report flake-rate before vs after
   as a markdown table.
6. If still flaky > 5%, surface the
   most-likely locator candidate.

## Stop conditions
- 20/20 pass · report success.
- Any locator looks brittle · ask user.
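Step 2 of the procedure is a plain text scan. A minimal sketch of that scan, assuming a hypothetical findHardWaits helper and a pattern list taken from the skill text (extend it for your own codebase):

```typescript
// Sketch of step 2: list the lines in a spec that contain a hard wait.
// Assumption: the pattern list mirrors the skill text and is not exhaustive.
interface WaitHit {
  line: number;  // 1-based line number
  match: string; // which pattern matched
}

const HARD_WAITS = ["waitForTimeout", "sleep(", "networkidle"];

function findHardWaits(source: string): WaitHit[] {
  const hits: WaitHit[] = [];
  source.split("\n").forEach((text, index) => {
    for (const pattern of HARD_WAITS) {
      if (text.includes(pattern)) hits.push({ line: index + 1, match: pattern });
    }
  });
  return hits;
}

const sampleSpec = [
  "await page.goto('/checkout');",
  "await page.waitForTimeout(2000);",
  "await page.waitForLoadState('networkidle');",
  "await expect(page.getByRole('heading')).toBeVisible();",
].join("\n");
const waitHits = findHardWaits(sampleSpec);
```

For the sample spec this flags line 2 (waitForTimeout) and line 3 (networkidle) and leaves the web-first assertion alone.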

Frontmatter fields

Field | Purpose
name | Slug · used in /skills.
description | The trigger phrase. Be vivid: Claude matches on this.
allowed-tools | Whitelist of tools the skill may call.
model | Optional override · sonnet / opus / haiku.
color | Optional · sidebar accent.
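To see how the frontmatter maps to fields, here is a deliberately small reader. It assumes only "key: value" pairs plus indented continuation lines, as in the flake-hunter example; parseFrontmatter is a hypothetical helper and a real YAML parser handles far more:

```typescript
// Minimal frontmatter reader for the SKILL.md shape shown above.
// Assumption: supports "key: value" pairs and indented continuation lines only.
function parseFrontmatter(markdown: string): Record<string, string> {
  const lines = markdown.split("\n");
  const fields: Record<string, string> = {};
  if (lines.length === 0 || lines[0].trim() !== "---") return fields;

  let currentKey: string | null = null;
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // closing delimiter
    const isContinuation = /^\s+\S/.test(line);
    if (isContinuation && currentKey !== null) {
      fields[currentKey] = fields[currentKey] + " " + line.trim(); // fold wrapped text
      continue;
    }
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    currentKey = line.slice(0, colon).trim();
    fields[currentKey] = line.slice(colon + 1).trim();
  }
  return fields;
}

const skill = [
  "---",
  "name: flake-hunter",
  "description: Use when a Playwright spec",
  "  fails intermittently.",
  "allowed-tools: Read, Edit, Bash, Grep, Glob",
  "---",
  "# Flake Hunter",
].join("\n");
const meta = parseFrontmatter(skill);
const tools = meta["allowed-tools"].split(",").map((tool) => tool.trim());
```

The wrapped description is folded into one line, which is the string the matching step works from.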

Scaffold faster · use skill-creator

> /skill-creator
// or, with the skill name:
> use skill-creator to build a
  skill called "locator-auditor"
  that scans tests/ for raw xpath
  and proposes role-based replacements

Reference links

docs | docs.claude.com/en/docs/claude-code/skills
repo | github · anthropics/skills
spec | SKILL.md frontmatter reference
tool | skill-creator · scaffolds new skills
guide | anthropic-skills · skill-creator (built-in)

Don't write skills from scratch · install them

Skill marketplaces · qaskills.sh + skills.sh.

Two community registries let you npx-install pre-built skills into your Claude Code (or Cursor / Copilot / Windsurf). qaskills.sh is QA-only · 450+ testing skills. skills.sh is the broader directory across all domains. Both ship via one command · zero copy-paste.

QA-relevant skills worth installing first day

Skill | Source | What it gives you
playwright-e2e | qaskills.sh | Best-practice Playwright spec authoring with Page Object Model + fixtures.
api-test-generator | qaskills.sh | Generates positive · negative · schema · auth coverage from one curl or OpenAPI URL.
flaky-test-detector | qaskills.sh | Hunts waitForTimeout · networkidle · sleeps · rewrites with expect.poll.
wcag-audit | qaskills.sh | WCAG 2.2 a11y sweep · pulls axe-core · reports per-rule violations.
selenium-to-playwright | qaskills.sh | Mechanical migration helper · POM + waits + selectors translation.
perf-baseline | qaskills.sh | k6 / Lighthouse baseline harness · saves perf budget JSON.
frontend-design | skills.sh · anthropic | Distinctive UI generation · already used inside this masterclass deck.
skill-creator | skills.sh · anthropic | Scaffolds new skills with frontmatter + procedure + examples folder.
security-review | skills.sh · official | OWASP top-10 sweep over a diff · same engine as /security-review.
# 1. land in your test repo
cd ~/work/my-qa-repo && claude

# 2. paste this prompt — claude installs the lot for you
> install these skills into .claude/skills/ and update CLAUDE.md
  to mention each one is available:
    - npx @qaskills/cli add playwright-e2e
    - npx @qaskills/cli add api-test-generator
    - npx @qaskills/cli add flaky-test-detector
    - npx @qaskills/cli add wcag-audit
    - npx skillsadd anthropic/skill-creator
    - npx skillsadd anthropic/security-review
  After install, run /skills and confirm all six show up.
when

Use marketplace skill

Generic, reusable patterns · authoring tests, audits, migrations. Don't reinvent.

when

Write your own

Company-specific rules, internal framework wrappers, secret locator policy.

when

Fork an existing one

Marketplace skill ~90% right · clone to .claude/skills/, tweak the description + procedure, commit.

Pro tip · npx @qaskills/cli list shows everything installed in current repo · pair with /skills inside Claude to verify load order.

Hooks fire around tool calls.

A hook is a shell command the harness runs at a lifecycle event. Use them to auto-format on save, run a smoke test after every write, or block edits to main.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{
          "type": "command",
          "command": "jq -r '.tool_input.file_path' | xargs npx prettier --write"
        }]
      }
    ],
    "SessionStart": [{
      "hooks": [{
        "type": "command",
        "command": "echo '🧪 QA mode ready. Run /init.'"
      }]
    }]
  }
}
Event | Fires
SessionStart | Once when you launch claude.
UserPromptSubmit | Every time you press enter.
PreToolUse | Before any tool runs (can block).
PostToolUse | After each tool call succeeds.
Stop | When the assistant finishes a turn.

QA pattern · PostToolUse on Edit → run the spec that owns the changed file. Feedback loop drops to seconds.
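The QA pattern needs a rule for "the spec that owns the changed file". One possible rule is sketched below; the tests/e2e/ location and the shared base name are assumptions about your repo layout, and owningSpec / hookCommand are hypothetical helpers:

```typescript
// Hypothetical mapping from an edited file to the spec that covers it.
// Assumption: specs live in tests/e2e/ and share the source file's base name.
function owningSpec(changedPath: string): string {
  if (changedPath.endsWith(".spec.ts")) return changedPath; // a spec owns itself
  const fileName = changedPath.split("/").pop() ?? changedPath;
  const baseName = fileName.replace(/\.(ts|tsx)$/, "");
  return `tests/e2e/${baseName}.spec.ts`;
}

// Builds the command a PostToolUse hook script could run for that file.
function hookCommand(changedPath: string): string {
  return `npx playwright test ${owningSpec(changedPath)} --reporter=line`;
}

const commandForCart = hookCommand("src/cart.ts");
```

A repo with a different layout would swap in its own lookup, for example a map from page objects to specs.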

MCP turns Claude into a browser, a Jira, a Notion, a Gmail.

Model Context Protocol is the USB-C of AI tools. A small server exposes verbs (navigate, click, createJiraIssue); Claude calls them like any other tool. For QA, the killer one is the Playwright MCP.

MCP architecture · one protocol · many adapters
QA / SDET (plain prompts) → Claude Code (reasoning + tool calls) → MCP servers (JSON-RPC adapters · navigate · click · query) → Playwright · Jira · GitHub · Notion · …

Why QA loves MCP
• Locators via a11y tree · no XPath
• File bugs straight from a failed test (Atlassian MCP)
• Pull AC from Jira, generate Gherkin in one prompt
• Auto-screenshot + trace + PR comment chain
• ~50 lines to wrap an internal API as a custom MCP
• One adapter, swap the underlying tool freely
• Transports: http (remote) · stdio (local) · sse
• Auth via env vars, never in prompt
• Same protocol works in Cursor, Copilot, Windsurf
Spec · modelcontextprotocol.io
Fig 3 · MCP architecture · QA → Claude Code → MCP servers → Playwright / Jira / GitHub / Notion / your custom API
browser

Playwright MCP

Drive Chromium, Firefox, WebKit. Snapshot the a11y tree, click by role, screenshot, network log.

tickets

Atlassian MCP

Read/write Jira issues, Confluence pages. File bugs straight from a failed test.

repo

GitHub via gh

PRs, issues, checks, releases. No extra MCP needed — uses gh CLI under the hood.

docs

Notion MCP

Pull the test plan, push the run report. Round-trip in one prompt.

comms

Gmail / Calendar

Draft the release email, find the freeze window, never alt-tab.

custom

Your own MCP

Wrap your internal API. Spec is just JSON-RPC. ~50 lines to start.
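"Just JSON-RPC" means each tool invocation is a small JSON message. The sketch below builds one in the JSON-RPC 2.0 shape MCP uses for its tools/call method; the toolCall helper, the id counter, and the sample arguments are illustrative assumptions:

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's "tools/call" method.
// Assumption: the tool name and arguments below are illustrative.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function toolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const navigateRequest = toolCall("browser_navigate", { url: "https://www.saucedemo.com" });
const wire = JSON.stringify(navigateRequest);
```

A custom server's job is then to read messages like wire, run the matching internal API call, and send back a result with the same id.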

Demo · authoring a spec without typing locators

"Open saucedemo.com, log in, add a backpack, screenshot the cart."

claude · qa-portfolio
open saucedemo, login as standard_user, add the Sauce Labs Backpack to cart, screenshot the cart page, then generate a Playwright spec from what you just did

browser_navigate('https://saucedemo.com')
browser_snapshot() · captured a11y tree
browser_fill_form(user='standard_user', pass=•••)
browser_click(role=button, name='Login')
browser_click(role=button, name='Add to cart', within='Sauce Labs Backpack')
browser_click(role=link, name='cart')
browser_take_screenshot(path='cart.png')
Write(tests/cart-add.spec.ts) ✓
done · spec runs green on first try
import { test, expect } from '@playwright/test';

test('standard_user adds backpack to cart', async ({ page }) => {
  await page.goto('https://www.saucedemo.com');
  await page.getByRole('textbox', { name: /user/i }).fill('standard_user');
  await page.getByRole('textbox', { name: /pass/i }).fill('secret_sauce');
  await page.getByRole('button', { name: 'Login' }).click();

  await page.getByRole('button', { name: /add to cart/i }).first().click();
  await page.getByRole('link', { name: /cart/i }).click();

  await expect(page.getByText('Sauce Labs Backpack')).toBeVisible();
});

Notice · every interaction goes through getByRole; only the final assertion uses getByText. No CSS, no XPath. The agent reads the a11y tree, so it picks accessible names by default.

API tests from a single curl.

Paste a curl, an OpenAPI URL, or a Postman export. The agent infers the contract and generates a fixture-based suite — schema validation, negative paths, auth variants, the lot.

> generate API tests for this endpoint with positive,
  negative, schema, and auth coverage. use playwright
  APIRequestContext, group by describe, log only on fail.

curl -X POST https://api.demo.dev/v1/users \
  -H "Authorization: Bearer $T" \
  -H 'Content-Type: application/json' \
  -d '{"email":"a@b.co","plan":"pro"}'
import { test, expect } from '@playwright/test';
import { z } from 'zod';

const User = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  plan: z.enum(['free', 'pro'])
});

test.describe('POST /v1/users', () => {
  test('creates a pro user', async ({ request }) => {
    const r = await request.post('/v1/users', {
      data: { email: 'a@b.co', plan: 'pro' }
    });
    expect(r.status()).toBe(201);
    User.parse(await r.json());
  });

  test('rejects bad email', async ({ request }) => {
    const r = await request.post('/v1/users', {
      data: { email: 'not-an-email', plan: 'pro' }
    });
    expect(r.status()).toBe(400);
  });
});
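If you would rather not add zod, the same contract check can be hand-written. A dependency-free sketch, assuming deliberately loose regexes (they are not full RFC validators); isUser is a hypothetical name:

```typescript
// Dependency-free counterpart to the zod schema above: a hand-written type guard.
// Assumption: the regexes are loose on purpose and only approximate the formats.
interface User {
  id: string;
  email: string;
  plan: "free" | "pro";
}

const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const EMAIL = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const id = v["id"];
  const email = v["email"];
  const plan = v["plan"];
  return (
    typeof id === "string" && UUID.test(id) &&
    typeof email === "string" && EMAIL.test(email) &&
    (plan === "free" || plan === "pro")
  );
}

const goodBody = { id: "123e4567-e89b-12d3-a456-426614174000", email: "a@b.co", plan: "pro" };
const badBody = { id: "123e4567-e89b-12d3-a456-426614174000", email: "not-an-email", plan: "pro" };
```

Inside a test you would assert expect(isUser(await r.json())).toBe(true) in place of the User.parse call.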

Tests from requirements

A Jira ticket in. A test plan out.

> fetch QA-482 from jira, read the acceptance
  criteria, produce: 1) a Gherkin scenarios file,
  2) a Playwright skeleton, 3) a coverage matrix
  mapping each AC to a test id.
Uses · Atlassian MCP + Write tool
Feature: Guest checkout

  Scenario: Apply valid promo
    Given I have a backpack in my cart
    When I apply promo "SAVE10"
    Then the subtotal drops by 10%
    And the promo chip shows "-10%"

  Scenario: Reject expired promo
    Given I have any item in my cart
    When I apply promo "BLACKFRIDAY23"
    Then I see error "This code has expired"

  Scenario Outline: Country-specific tax
    Given I check out from <country>
    Then tax shows <tax>%

    Examples:
      | country | tax |
      | IN      | 18  |
      | US-CA   | 9   |
      | DE      | 19  |
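A Scenario Outline is a template: each Examples row is substituted into the <column> placeholders to produce one concrete scenario. A minimal sketch of that expansion, assuming a hypothetical expandOutline helper (Cucumber-style runners do this for you):

```typescript
// Expands a Scenario Outline by substituting <column> placeholders per example row.
function expandOutline(steps: string[], rows: Array<Record<string, string>>): string[][] {
  return rows.map((row) =>
    steps.map((step) =>
      // Unknown placeholders are left untouched rather than replaced with "undefined".
      step.replace(/<(\w+)>/g, (placeholder: string, column: string) => row[column] ?? placeholder),
    ),
  );
}

const outlineSteps = ["Given I check out from <country>", "Then tax shows <tax>%"];
const exampleRows = [
  { country: "IN", tax: "18" },
  { country: "US-CA", tax: "9" },
  { country: "DE", tax: "19" },
];
const scenarios = expandOutline(outlineSteps, exampleRows);
```

Three rows give three scenarios, which is why the coverage matrix counts the outline as three test ids, not one.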

From a screenshot to a filed bug.

> [screenshot.png attached]
  user reports the price chip overflows on
  the pricing card on mobile. reproduce in
  Chromium @ 390×844, capture a trace,
  attach screenshot, file a Jira bug in
  QA project with steps + expected/actual.

The agent reads the image, drives a 390×844 viewport, captures a Playwright trace, posts everything to Jira as a single bundled bug. You sip coffee.

claude · bug-bash
analyzing screenshot…
▸ identified component: PricingCard / .price__chip
browser_resize(390, 844)
browser_navigate('/pricing')
▸ overflow confirmed → trace.zip (1.2 MB)
createJiraIssue(QA, type=Bug)
QA-941 "Pricing chip overflow @ 390×844"
steps · expected · actual · screenshot · trace
severity: Minor · component: pricing-card
done

Drag a screenshot. Get tests.

Claude is multimodal — it reads images natively. Paste a Figma export, a customer's broken-UI screenshot, a flaky CI run's failure image. It identifies the component, locates it in code, proposes a fix or a test.

design

Figma → test

Drop a Figma frame. Get a visual-regression spec that snapshots the matching live route.

support

User report → repro

Paste customer screenshot. Claude infers viewport, route, OS hints, and reproduces.

ci

CI failure → root cause

Drop a failed-snapshot diff. Claude reads both, explains the visual delta in plain English.

Live · paste any screenshot into the prompt — it just works.

QA reviews code too. Now they have leverage.

Command | Outcome
/review | Severity-tagged findings on the current diff.
/security-review | Auth, SSRF, injection, secret leakage.
/ultrareview | Multi-agent cloud review of the branch / PR.
/caveman-review | One-line-per-finding terse review.

Each finding follows the format path:line: severity: problem · fix. Easy to triage, easy to paste into a PR comment.

tests/login.spec.ts:14: ⚠ medium:
  page.waitForTimeout(2000) introduces
  fixed sleep. fix: use expect.poll()
  on the dashboard heading.

src/auth/middleware.ts:42: 🔴 high:
  token comparison uses ==, vulnerable
  to timing attack. fix: crypto.timingSafeEqual.

playwright.config.ts:8: ◇ low:
  retries: 3 hides flakes. fix: cap at 1
  and quarantine instead.
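Because the format is regular, findings are easy to post-process, for example to fail CI on any high-severity item. A sketch under stated assumptions: input is the single-line "path:line: severity: problem · fix" shape without the emoji markers, and parseFinding is a hypothetical helper:

```typescript
// Parses one finding in the "path:line: severity: problem · fix" shape.
// Assumption: single-line input without the emoji markers used in the sample output.
interface Finding {
  path: string;
  line: number;
  severity: "low" | "medium" | "high";
  problem: string;
  fix: string;
}

const FINDING = /^(.+?):(\d+): (low|medium|high): (.+?) · (.+)$/;

function parseFinding(text: string): Finding | null {
  const m = FINDING.exec(text.trim());
  if (m === null) return null;
  return {
    path: m[1],
    line: Number(m[2]),
    severity: m[3] as Finding["severity"],
    problem: m[4],
    fix: m[5],
  };
}

const parsed = parseFinding(
  "tests/login.spec.ts:14: medium: fixed sleep via waitForTimeout · use expect.poll() on the dashboard heading",
);
```

Lines that do not match the shape return null, so free-form commentary in a review is skipped instead of mis-parsed.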

Run Claude in your pipeline.

The same CLI is headless. Put it on GitHub Actions to auto-triage failing tests, generate PR review comments, or open a Jira when a smoke fails on main.

name: qa-bot
on: { pull_request: { types: [opened, synchronize] } }
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Playwright
        run: npx playwright install --with-deps
      - name: Claude review + smoke
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npx -y @anthropic-ai/claude-code -p \
            "review this PR for QA risk, then run npx playwright test --grep @smoke
             and post a summary as a PR comment"

The -p flag (print / headless mode) runs a single prompt, prints the result to stdout, and exits. Perfect for cron, CI, or wrapping in your own scripts.

Cost guardrails · keep CI bills bounded

budget

Per-job ceiling

Wrap claude -p with a timeout (timeout 600s) and cap the number of agent turns with --max-turns. Fail-fast beats infinite spend.

model

Pick the cheapest model that works

Use Haiku for triage / log-summary jobs. Reserve Sonnet for review jobs. Avoid Opus on every PR.

scope

Diff-only context

Pipe git diff origin/main instead of the full repo. 90% of review jobs need only the diff plus 2-3 related files.

cache

Prompt caching on

Set ENABLE_PROMPT_CACHING_1H=1. Repeated reviewer prompts hit the 1h cache and bill at <10% of full price.

batch

Batch · don't fan out

One Claude job reviewing 5 PRs in sequence beats 5 jobs reviewing 1 PR each — shared system prompt cached.

alarm

Spend alerts

Anthropic console · set a monthly soft + hard limit. Slack webhook on threshold = no surprise invoice.

git diff origin/main | timeout 600s npx -y @anthropic-ai/claude-code -p \
  --model haiku \
  --max-turns 6 \
  "review the changes in this PR for QA risk only.
   focus on: flaky waits, hard-coded data, missing assertions.
   skip style nits. cap report at 8 bullets."

Plan first. Then let it loose.

plan mode

Read-only thinking

Toggle with ⇧Tab. Claude reads, searches, designs — but cannot Edit / Write / shell-mutate. Approve the plan, then exit and execute.

large refactors · scary migrations ·
unfamiliar codebases · auth changes
worktree

Isolated sandbox

Spin a git worktree on a temp branch. Agent works there. If it goes sideways, you delete the dir. Your main checkout never moves.

> work on this in a worktree
  so my dev server keeps running

Together these are the two safety belts that let you run the agent autonomously on a long task while you go to lunch. Come back to a green PR.

Things that will save you a workday.

@filename | Reference a file inline. Tab-completes paths.
# | Quick-add a CLAUDE.md note from the prompt.
⇧Tab | Toggle Plan ↔ Auto-Accept Edits mode.
Esc | Cancel current tool call without killing session.
Esc Esc | Rewind · pick an older message and branch.
/compact | Hand-roll a context summary when token budget tight.
!cmd | Shell passthrough · no agent involvement.
/cost | See spend before continuing the next sweep.
/resume | Pick up yesterday's session by id.
--continue | Resume the last session non-interactively.
--dangerously-skip-permissions | YOLO mode for sandboxed containers only.
claude -p "…" | One-shot run, prints answer, exits. CI-friendly.

Make your own /commands

Your team. Your verbs.

Drop a markdown file in .claude/commands/. The filename becomes the command. Inside, write the prompt template — with placeholders. Share it via git so every QA in your team has the same playbook.

---
description: Hunt a flaky Playwright spec
argument-hint: [spec path]
---

You are debugging a flaky Playwright test.

Spec path: $ARGUMENTS

Do these in order:
1. Read the spec end-to-end.
2. Find every waitForTimeout, sleep,
   hard-coded delay, or networkidle wait.
3. Replace with expect.poll / explicit
   element wait.
4. Run it with --repeat-each=20.
5. Report flake rate before vs after as a table.

Stop and ask if any locator looks brittle.
> /flaky tests/checkout.spec.ts

// claude expands the template, fills
// $ARGUMENTS, and runs the playbook
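
Step 2 of the playbook above is, at heart, a pattern search over the spec text. A minimal TypeScript sketch of that search follows. The helper name findHardWaits and the pattern list are illustrative assumptions, not part of Claude Code or Playwright; the agent itself reads the file rather than running this function.

```typescript
// Hypothetical helper: list the lines of a spec that contain a hard wait.
type HardWait = { line: number; kind: string; text: string };

// Illustrative patterns only; extend them for your own sleep helpers.
const HARD_WAIT_PATTERNS: Array<{ kind: string; pattern: RegExp }> = [
  { kind: "waitForTimeout", pattern: /\.waitForTimeout\s*\(/ },
  { kind: "sleep", pattern: /\bsleep\s*\(/ },
  { kind: "networkidle", pattern: /['"]networkidle['"]/ },
  { kind: "setTimeout", pattern: /\bsetTimeout\s*\(/ },
];

function findHardWaits(source: string): HardWait[] {
  const hits: HardWait[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { kind, pattern } of HARD_WAIT_PATTERNS) {
      if (pattern.test(text)) {
        // Line numbers are 1-based so they match what an editor shows.
        hits.push({ line: index + 1, kind, text: text.trim() });
      }
    }
  });
  return hits;
}

const sampleSpec = [
  "await page.goto('/checkout', { waitUntil: 'networkidle' });",
  "await page.waitForTimeout(3000);",
  "await expect(page.getByRole('heading')).toBeVisible();",
].join("\n");

console.log(findHardWaits(sampleSpec).map(h => `${h.line}:${h.kind}`));
```

Each hit is a candidate for step 3; whether expect.poll or an element wait is the right replacement still needs a human or agent to read the surrounding code.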

Useful QA commands to author

/flaky /smoke /bug-from-trace /audit-locators /gen-pom /triage-failures /report-run /quarantine
Tip · commit them to the repo. Onboarding new QAs becomes a one-line README.

A QA day · before vs after.

Before

09:00  Stand-up. 4 flaky tests overnight.
09:30  Open Jira. Read AC. Translate to Gherkin by hand.
10:30  Hunt the locator. Tweak the wait. Re-run. Cry.
12:00  Lunch · still 3 flakes.
14:00  Write spec body. Stack-overflow the matcher.
16:00  Smoke breaks. Bisect commits manually.
17:30  File bug. Attach screenshot. Update Confluence.
18:30  Out the door. Backlog grew.

After

09:00  Stand-up. /triage-failures ran overnight.
09:15  /gen-tests QA-482 · Gherkin + skeleton in 2 min.
09:45  /flaky checkout.spec.ts · agent fixes, retries ×20.
11:00  Review the diff. Approve. PR opened.
12:00  Lunch · backlog smaller.
14:00  Exploratory session with Playwright MCP.
15:00  /bug-from-trace · 3 bugs filed in 5 min.
16:00  Deep work: architecture, risk, mentoring.
17:00  Out the door. Slept fine.

One session, every job vs one agent, one job

Vibe Coding vs Software Factory.

Pile every role onto one Claude session and you get context chaos — research notes, frontend code, QA assertions, reviewer remarks all bleed into the same buffer. The fix · split roles across small, focused agents in a pipeline. Each agent owns one job, with clean context, and hands off to the next.

[Figure 5 panels · Vibe Coding: Product Analyst, Architect, Backend Dev, Frontend Dev, QA Engineer, Reviewer in one session (context chaos) · Software Factory: Researcher, Story, Spec, Backend, Frontend, Tests, Validator, one agent per job (clean context)]
Fig 5 · Vibe coding (one session does everything · context chaos) vs Software Factory (one agent per role · clean handoff)

What this means for QA

vibe

Symptom · main session is doing too much

The same Claude turn just researched the Jira, wrote Playwright tests, reviewed your migration, and drafted the PR description. By turn 30 it's forgotten which framework you're using.

factory

Fix · pipeline with explicit roles

Researcher subagent reads AC → Spec subagent writes Gherkin → Tests subagent writes Playwright specs → Validator subagent runs the suite + reports. Each gets its own context.

> Build a software factory pipeline for QA-482.
  Step 1 · researcher subagent · pull AC from Jira, summarise.
  Step 2 · spec subagent · turn AC into Gherkin scenarios.
  Step 3 · test subagent · write Playwright specs from Gherkin.
  Step 4 · validator subagent · run the suite, report pass/fail.
  Each step in its own context · pass only the artefact forward.
  Stop and ask if any step output looks off.
Rule of thumb · if your main session has scrolled past 4 unrelated topics, you've slipped into Vibe Coding. /compact or split into subagents.
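
The handoff rule in the prompt above ("pass only the artefact forward") can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions: runPipeline and the four stand-in stages are hypothetical, and real stages would be asynchronous subagent calls rather than plain functions.

```typescript
// A stage turns one artefact into the next. Real stages would be async subagent
// calls; plain functions keep this sketch self-contained.
type Stage = { name: string; run: (artefact: string) => string };

function runPipeline(stages: Stage[], initial: string): { artefact: string; trail: string[] } {
  let artefact = initial;
  const trail: string[] = [];
  for (const stage of stages) {
    // Only the previous artefact is passed forward, never earlier working notes.
    artefact = stage.run(artefact);
    if (artefact.trim() === "") {
      // Mirrors "stop and ask if any step output looks off".
      throw new Error(`Stage "${stage.name}" produced nothing; stop and ask.`);
    }
    trail.push(stage.name);
  }
  return { artefact, trail };
}

// Hypothetical stand-in stages for the four roles named in the prompt.
const demoStages: Stage[] = [
  { name: "researcher", run: ticket => `AC summary for ${ticket}` },
  { name: "spec", run: ac => `Gherkin <- ${ac}` },
  { name: "tests", run: gherkin => `Playwright specs <- ${gherkin}` },
  { name: "validator", run: specs => `Run report <- ${specs}` },
];

const pipelineResult = runPipeline(demoStages, "QA-482");
console.log(pipelineResult.trail.join(" -> "));
console.log(pipelineResult.artefact);
```

The point of the shape is that no stage can see anything except its input string, which is what keeps each context clean.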

When your repo has 12 apps and 8 test suites

Claude Code in a monorepo.

One Claude session can scope multiple packages via /add-dir. Per-package CLAUDE.md overrides the root one. Per-package skills live under packages/<app>/.claude/skills/. Run diff-aware tests so you don't smoke-test the whole world for a 1-file fix.

## Repo · Turborepo monorepo

Layout:
  apps/web/        Next.js app
  apps/admin/      admin dashboard
  apps/api/        Express API
  packages/ui/     shared components
  packages/qa/     shared Playwright fixtures

Each app has its own CLAUDE.md.
ALWAYS read the package-level CLAUDE.md
before editing inside that package.

Diff-aware test command:
  turbo test --filter='[origin/main]'

CI runs only changed packages.
> /add-dir apps/web
> /add-dir packages/qa

> the new apps/web/PricingCard component
  reuses fixtures from packages/qa. Read
  both CLAUDE.md files, then write a Playwright
  spec at apps/web/tests/pricing.spec.ts
  that imports the shared checkoutFixture.
  Run only that one spec:

  turbo test --filter=web -- --grep=pricing
Tip · per-package .claude/skills/ beats root-level — keeps locator policy local.

Monorepo cookbook

Pattern | Recipe
Diff-aware CI | turbo test --filter='[origin/main]' · only packages that changed.
Per-package skills | apps/web/.claude/skills/locator-auditor/ · stays out of apps/api.
Shared fixtures | packages/qa · import via @org/qa-fixtures in every test app.
Cross-package refactor | Run Claude with /add-dir on every affected package · one PR, one diff.
Selective Playwright projects | playwright.config.ts · one project per app, filter via --project=web.
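
To make the diff-aware idea concrete, here is a small TypeScript sketch that maps changed file paths to the package roots from the layout above. It is only an illustration: affectedPackages is a hypothetical helper, and Turborepo's own filter does this work for you from its package graph.

```typescript
// Package roots copied from the layout earlier in this section.
const PACKAGE_ROOTS = ["apps/web", "apps/admin", "apps/api", "packages/ui", "packages/qa"];

// Hypothetical helper: which package roots contain at least one changed file?
function affectedPackages(changedFiles: string[], roots: string[] = PACKAGE_ROOTS): string[] {
  const hit = new Set<string>();
  for (const file of changedFiles) {
    const root = roots.find(r => file === r || file.indexOf(r + "/") === 0);
    if (root) hit.add(root);
  }
  return Array.from(hit).sort();
}

console.log(
  affectedPackages([
    "apps/web/src/PricingCard.tsx",
    "packages/qa/fixtures/checkout.ts",
    "README.md", // outside every package, so it selects nothing
  ])
);
```

This sketch ignores dependents: a change in packages/qa would not pull in the apps that import it, so treat it as a mental model rather than a replacement for the turbo filter.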

Migration playbook · don't rewrite by hand

From Cypress · Selenium · TestCafe → Playwright.

Four-step playbook. Claude does 90% of the mechanical work · you review the diff. Cite the selenium-to-playwright skill from qaskills.sh in your prompt and it follows the official translation map.

Step 1
Audit

Claude inventories every spec, groups by complexity, flags brittle ones.

Step 2
Convert

Spec-by-spec mechanical translation · locators, waits, hooks.

Step 3
Quarantine

Anything Claude isn't 95% sure about gets an @flaky tag for human review.

Step 4
Cut over

Run both suites on CI for 1 week · drop old when delta = 0.
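
One possible reading of "delta = 0" in step 4 is that every spec has the same outcome in both suites. A minimal TypeScript sketch under that assumption; parityDelta and the sample results are hypothetical.

```typescript
type Outcome = "pass" | "fail";
type SuiteRun = Record<string, Outcome>;

// Hypothetical helper: names of specs whose outcome differs between the two
// suites, including specs that exist in only one of them.
function parityDelta(oldRun: SuiteRun, newRun: SuiteRun): string[] {
  const names = Array.from(new Set(Object.keys(oldRun).concat(Object.keys(newRun))));
  return names.filter(n => oldRun[n] !== newRun[n]).sort();
}

// Made-up sample data for illustration.
const cypressRun: SuiteRun = { login: "pass", checkout: "pass", search: "fail" };
const playwrightRun: SuiteRun = { login: "pass", checkout: "fail", search: "fail" };

console.log(parityDelta(cypressRun, playwrightRun));
```

An empty list for a full week is the cut-over signal; a spec missing from the new suite shows up in the delta instead of passing silently.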

Locator translation matrix

Cypress / Selenium | Playwright | Why
cy.get('[data-test=login]') | page.getByTestId('login') | Built-in test-id locator, no CSS lookup · set testIdAttribute: 'data-test' in the config, since the default attribute is data-testid.
cy.contains('Submit').click() | page.getByRole('button', { name: 'Submit' }).click() | Role-based · screen-reader-equivalent.
cy.intercept('POST', '/api/users') | page.route('**/api/users', …) | Native request interception.
cy.wait('@createUser') | await page.waitForResponse(r => r.url().includes('/users')) | Promise-based · no aliases.
driver.findElement(By.xpath('//button[…]')) | page.getByRole('button', { name: /…/ }) | Auto-wait + a11y tree.
Thread.sleep(2000) | await expect(locator).toBeVisible() | Auto-waiting · no sleeps.
beforeEach hook | test.beforeEach hook · or a named fixture | Fixtures compose better.
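
The first two rows of the matrix are mechanical enough to sketch as a regex rewrite in TypeScript. This is a toy under stated assumptions: translateCypressLine is a hypothetical helper, it handles only these two single-quoted patterns, and it assumes cy.contains(...).click() targets a button, as the matrix does.

```typescript
// Hypothetical, regex-based translation of two rows of the matrix above.
function translateCypressLine(line: string): string {
  const translated = line
    // cy.get('[data-test=x]') becomes page.getByTestId('x')
    .replace(/cy\.get\('\[data-test=([\w-]+)\]'\)/g, "page.getByTestId('$1')")
    // cy.contains('Text').click() becomes a role-based button locator
    .replace(/cy\.contains\('([^']+)'\)(?=\.click\(\))/g, "page.getByRole('button', { name: '$1' })");
  // Playwright actions return promises, so a translated statement needs await.
  return translated !== line && /^\s*page\./.test(translated)
    ? translated.replace(/^(\s*)page\./, "$1await page.")
    : translated;
}

console.log(translateCypressLine("cy.get('[data-test=login]').click();"));
console.log(translateCypressLine("cy.contains('Submit').click();"));
```

Anything the patterns do not recognise comes back unchanged, which is the regex equivalent of the "tag it and leave a TODO" rule: the hard cases stay visible for review.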
> migrate cypress/e2e/ to Playwright at tests/e2e/.
  Use the selenium-to-playwright skill from qaskills.sh.
  Rules:
    - Use getByRole / getByLabel / getByTestId first.
    - No waitForTimeout. Replace with expect-based waits.
    - cy.intercept → page.route. cy.task → fixture method.
    - Anything you're not 95% sure about, tag @flaky + add a TODO.
  After conversion, run the new suite locally and report:
    - total specs converted
    - pass count / fail count / @flaky count
    - diff in run-time vs old Cypress suite
pitfall

Aliased intercepts

Cypress @alias doesn't map cleanly · use waitForResponse with URL matcher instead.

pitfall

Custom commands

Cypress Cypress.Commands.add → Playwright fixtures · not 1:1. Build a fixture file.

pitfall

iframe handling

Cypress can't deeply traverse iframes natively; Playwright can via frameLocator. Better outcome.

Numbers that drive QA decisions

Flake rate · MTBF · p95 — let Claude do the math.

Claude can ingest your Playwright JSON reporter output, compute the stability metrics that matter, and rank specs by flake risk. Wire it into CI · fail the build if flake rate goes above 2%.

The four numbers

flake rate

Failed-then-passed-on-retry %

If > 2% on main, your suite is rotting. Track per-spec and per-package.

MTBF

Mean time between flakes

Runs between flake events. Useful for noisy specs · alert when it drops.

p95

p95 duration

95th percentile spec runtime. Bisect anything spiking 2× week-over-week.

pass-rate

Pass rate over 30 runs

If < 98% on a non-WIP spec, quarantine or fix · don't ship more on top.
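
The four numbers above are simple arithmetic once the runs are in a flat list. A minimal TypeScript sketch follows. The RunRecord shape and the sample data are assumptions for illustration, not the Playwright report format; MTBF is expressed in runs, as the card above defines it.

```typescript
// One record per execution of one spec. `retried` means at least one retry happened.
type RunRecord = { passed: boolean; retried: boolean; durationMs: number };

// A flake is a run that failed first and then passed on retry.
function countFlaky(runs: RunRecord[]): number {
  return runs.filter(r => r.passed && r.retried).length;
}

function flakeRate(runs: RunRecord[]): number {
  return runs.length === 0 ? 0 : (countFlaky(runs) * 100) / runs.length;
}

function passRate(runs: RunRecord[]): number {
  return runs.length === 0 ? 0 : (runs.filter(r => r.passed).length * 100) / runs.length;
}

// Mean runs between flakes: total runs divided by the number of flake events.
function meanRunsBetweenFlakes(runs: RunRecord[]): number {
  const flaky = countFlaky(runs);
  return flaky === 0 ? Infinity : runs.length / flaky;
}

// Nearest-rank percentile: the smallest duration with at least 95% of runs at or below it.
function p95(runs: RunRecord[]): number {
  if (runs.length === 0) return 0;
  const sorted = runs.map(r => r.durationMs).sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1] ?? 0;
}

// Made-up sample: 20 runs, one outright failure, one flake.
const sampleRuns: RunRecord[] = Array.from({ length: 20 }, (_, i) => ({
  passed: i !== 0, // run 0 failed even after its retry
  retried: i <= 1, // runs 0 and 1 needed a retry; run 1 passed on it
  durationMs: (i + 1) * 100,
}));

console.log(flakeRate(sampleRuns), passRate(sampleRuns), meanRunsBetweenFlakes(sampleRuns), p95(sampleRuns));
```

With this sample the flake rate is 5% and the pass rate 95%, so the spec would trip both the 2% and the 98% thresholds named above.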

---
description: Read the last 30 Playwright
  JSON reports, compute flake rate, rank
  flakiest specs, propose fixes for top 5.
argument-hint: [days]
---

You have $ARGUMENTS days of JSON reports
under .playwright/reports/.

For each spec, compute:
  - total runs
  - failed-then-passed-on-retry count
  - flake rate (= failed-then-passed-on-retry count / total runs * 100)
  - p95 duration

Sort by flake rate desc.
For the top 5, read the spec file and
propose 1-line fixes.

Output as a markdown table + a fix list.
# after the playwright job
- name: Flake gate
  run: |
    rate=$(node scripts/flake-rate.js)
    echo "flake rate: ${rate}%"
    if (( $(echo "$rate > 2.0" | bc -l) )); then
      echo "::error::Flake rate ${rate}% exceeds 2%"
      exit 1
    fi

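// scripts/flake-rate.js
// Assumes the JSON reporter writes playwright-report.json at the repo root; adjust the path if not.
const { suites } = require('../playwright-report.json');
// Specs can sit inside nested describe suites, so walk the whole tree.
const collect = s => [...(s.specs ?? []), ...(s.suites ?? []).flatMap(collect)];
const tests = suites.flatMap(collect).flatMap(spec => spec.tests);
// Playwright marks a test that failed and then passed on retry as 'flaky'.
const flaky = tests.filter(t => t.status === 'flaky').length;
console.log((tests.length ? (flaky / tests.length) * 100 : 0).toFixed(2));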
> /report-flake 7
  also: produce a 5-bullet exec summary for the QA channel —
  start with the headline number, then the biggest mover,
  then top-3 specs to fix this week. Format for Slack.

When your test framework has no public docs

Teach Claude your in-house harness.

Three ways to teach Claude a framework it's never seen — pick one or stack all three. Pattern works for any internal CLI / DSL / fixture system.

1 · CLAUDE.md

Document the rules

Locator policy, fixture system, naming conventions in a "## AcmeTest" section of CLAUDE.md. Loaded at the start of every session.

2 · examples

Show input → output pairs

Drop 8-12 example specs in .claude/skills/acmetest/examples/. Claude few-shots from them.

3 · custom MCP

Wrap your CLI

~50 lines of JSON-RPC turns acme-test run --spec X into a real MCP tool Claude can call.

## AcmeTest · our in-house test harness

Locators are NEVER raw CSS. Always use:
  acme.locate('@:')

Fixtures live in tests/_fixtures/ and are
auto-injected · NEVER instantiate manually.

Setup hook is beforeScenario, not beforeEach.

Assertions use acme.expect(x).toMatchSnapshot();
DO NOT import @playwright/test.

CLI:
  acme-test run --spec=tests/login.acme
  acme-test list-fixtures
  acme-test snapshot --update --spec=…

Reports land in .acme/reports/<ts>/.
Trace files at .acme/traces/<ts>/<spec>.trace.

Minimal custom MCP wrapper (Node)

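import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { z } from 'zod';

// execFile passes arguments as an array, so a spec path is never interpreted by a shell.
const run = promisify(execFile);

const server = new McpServer({ name: 'acme-test', version: '1.0.0' });

// Tool 1 · run one spec and return the JSON report as text.
server.tool(
  'acme_run',
  'Run an AcmeTest spec and return the JSON report.',
  { spec: z.string() },
  async ({ spec }) => {
    const { stdout } = await run('acme-test', ['run', `--spec=${spec}`, '--json']);
    return { content: [{ type: 'text', text: stdout }] };
  }
);

// Tool 2 · list the available fixtures (takes no input).
server.tool('acme_list_fixtures', 'List all available fixtures.', async () => {
  const { stdout } = await run('acme-test', ['list-fixtures']);
  return { content: [{ type: 'text', text: stdout }] };
});

// Claude Code launches this file and talks to it over stdio.
// Top-level await needs an ES module: use "type": "module" or a .mjs file.
await server.connect(new StdioServerTransport());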
{
  "mcpServers": {
    "acme-test": {
      "command": "node",
      "args": [".claude/mcp/acme-test/server.js"]
    }
  }
}
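
For a sense of what travels over that stdio pipe: when Claude calls a tool, the client sends a JSON-RPC 2.0 request with the method tools/call. A small TypeScript sketch of that message shape follows. The builder function buildToolCall is a hypothetical illustration; in practice the MCP client constructs this for you.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

// Hypothetical helper, for illustration only.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: toolName, arguments: args } };
}

console.log(JSON.stringify(buildToolCall(1, "acme_run", { spec: "tests/login.acme" })));
```

The params.name value is the tool name registered in the wrapper, and params.arguments must satisfy that tool's input schema.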

Day-by-day playbook for week one

Onboard a new QA to Claude Code in 5 days.

Hand a printable checklist to every new hire. By Friday they ship their first PR. By week two they're authoring custom skills.

Day | Task | Outcome
Mon · Setup | npm i -g @anthropic-ai/claude-code · clone the test repo · /init · read root CLAUDE.md end-to-end · install team skills via npx @qaskills/cli add <skill> | Working environment · CLAUDE.md memorised · 6 team skills installed
Tue · Read | Pair with a senior on /explore · ask Claude to walk the test directory · run /flaky tests/checkout.spec.ts together · watch the workflow | Understands repo shape · knows where fixtures + POM live
Wed · First PR | Pick a Jira ticket from the "good-first-bug" lane · ask Claude to generate Gherkin + Playwright skeleton · finish, run, push · /review before opening the PR | First PR opened with green CI
Thu · Tools | Author a personal custom command in .claude/commands/me/<name>.md · try Plan mode (⇧Tab) on a scarier task · run /security-review on yesterday's PR | Knows custom commands · plan mode · security gates
Fri · Shadow | Shadow the bug-bash · use Vision (drop screenshots) to repro 3 bugs · file each via Jira MCP · pair on /ultrareview for the senior's PR | Familiar with Vision · Jira MCP · review workflows

The buddy prompt template

> /btw I'm a new QA on this repo. I'm trying to understand
  why tests/checkout/payment.spec.ts uses a custom
  fixture instead of the shared one in packages/qa.
  Read both files, give me a 4-bullet explanation a junior
  tester can follow. Do not change any code.

# /btw runs the question as a side-thread so the senior's
# main session isn't polluted.
stumble

"It edited the wrong file"

Forgot to /clear between tasks · old context bled in. Always clear between unrelated scopes.

stumble

"My CI bill went up"

Ran --continue on yesterday's huge session. Use /compact instead, or start fresh.

stumble

"The agent is hallucinating tests"

No CLAUDE.md rules on locators. Spend 10 min writing them · save 10 hours of fights.

Week-2 KPIs (manager checklist)

Authored at least one custom slash command in .claude/commands/
Shipped 3+ PRs with /review + /security-review green
Filed 5+ bugs with Vision + Jira MCP · no manual repro steps
Can explain to a peer what a Skill is and which 3 their team relies on
Knows when to use plan mode vs auto-accept vs worktrees

Hands-on · run these now

Five drills plus a stretch. Do them in order.

01 · 5 min

First contact

Pick any repo. Run claude. Type /init. Read the generated CLAUDE.md. Edit it to reflect your rules.

02 · 10 min

Codegen a spec

Ask Claude to drive saucedemo.com via Playwright MCP, complete a checkout, and emit a spec. Run it. Expect green on first try.

03 · 10 min

Tame a flake

Find a spec with a waitForTimeout. Ask Claude to remove all sleeps and prove stability with --repeat-each=20.

04 · 10 min

Bug from screenshot

Drop a UI bug screenshot. Ask Claude to repro at the right viewport, capture a trace, and draft a Jira-ready bug write-up.

05 · 20 min

Portfolio scaffold

Run the mega-prompt from the project chapter. End with a working localhost:3000 and a passing Playwright suite.

06 · stretch

Author /flaky

Build the custom /flaky command in .claude/commands/. Commit it. Run it on a real spec. Share with a teammate.

Capstone · channels → skill → site → live URL

Your channels become your portfolio.

You already produce QA content across 6 channels. We'll teach Claude to read them, package you as a Skill, generate a portfolio site, and ship it to app.thetestingacademy.com/masterclass/ClaudeCode.html.

GitHub LinkedIn Blog YouTube Instagram Medium

Four prompts. Zero hand-coding. Ends with a live URL + Playwright suite + Lighthouse gate + CI pipeline.

Test engineer.
Builder of QA crews.

240k+ engineers learn QA from my YouTube. I break, write, ship, repeat.

Playwright · AI · Selenium · K6
The Testing Academy  240k YouTube · 12k newsletter.
Open-source          github.com/promode · 30+ repos.
Talks                Selenium Conf · STPCon · meetups.
Writing              Medium · personal blog · LinkedIn.

Step 1 · package yourself as a Skill

Aggregate 6 channels into one Skill.

The Skill becomes the single source of truth — every later prompt reads from it. Whenever you publish new content, you re-run the Skill and the site updates itself.

> create a new skill at .claude/skills/pramod-me/SKILL.md that
  packages me as data. Aggregate the following channels and produce
  a single data.json file in the same folder:

  - GitHub      → https://github.com/promode
  - LinkedIn    → https://www.linkedin.com/in/thetestingacademy
  - Blog        → https://scrolltest.com
  - YouTube     → https://www.youtube.com/@TheTestingAcademy
  - Instagram   → https://instagram.com/thetestingacademy
  - Medium      → https://medium.com/@thetestingacademy

  For each channel:
    1. Fetch the public page with WebFetch.
    2. Extract: handle, follower count if visible, top 6 items
       (repo / post / video / reel / article) with title, url, date.
    3. Pull a one-line bio + headshot URL where available.

  Then in SKILL.md frontmatter:
    name: pramod-me
    description: Use whenever the user needs current data about
      Pramod Dutta — bio, top repos, latest videos, recent talks,
      newest blog posts — to render a personal site or CV.
    allowed-tools: WebFetch, Read, Write, Bash

  Procedure section must:
    - Read data.json if <= 7 days old, otherwise refresh from sources.
    - Expose 5 helpers: getBio(), getTopRepos(), getLatestVideos(),
      getRecentPosts(), getTalks().
    - Cache to data.json. Pretty-print, ASCII-only, sorted by date desc.

  Verify by running the skill end-to-end and pasting a summary of
  the JSON it produced. Stop if any fetch fails — ask before retrying.
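
The cache rule in the procedure section (reuse data.json if it is at most 7 days old, keep items sorted newest first) comes down to two small functions. A TypeScript sketch, assuming ISO 8601 date strings; Item, isFresh and sortByDateDesc are hypothetical names, and the sample entries are made up.

```typescript
// Dates are assumed to be ISO 8601 strings, e.g. "2025-01-31".
type Item = { title: string; url: string; date: string };

const DAY_MS = 24 * 60 * 60 * 1000;

// True when the cache is at most maxAgeDays old (inclusive).
function isFresh(fetchedAt: Date, now: Date, maxAgeDays: number = 7): boolean {
  return now.getTime() - fetchedAt.getTime() <= maxAgeDays * DAY_MS;
}

// Newest first, without mutating the input array.
function sortByDateDesc(items: Item[]): Item[] {
  return items.slice().sort((a, b) => Date.parse(b.date) - Date.parse(a.date));
}

// Placeholder entries for illustration only.
const sampleItems: Item[] = [
  { title: "older post", url: "https://example.com/a", date: "2025-01-05" },
  { title: "newest post", url: "https://example.com/b", date: "2025-03-01" },
  { title: "middle post", url: "https://example.com/c", date: "2025-02-10" },
];

console.log(sortByDateDesc(sampleItems).map(i => i.title));
console.log(isFresh(new Date("2025-01-01T00:00:00Z"), new Date("2025-01-08T00:00:00Z")));
```

Passing `now` in explicitly, rather than calling Date.now() inside, keeps the freshness check easy to test.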

Step 2 · scaffold + content + style

One mega-prompt → full Next.js site.

> using the pramod-me skill, scaffold a Next.js 15 portfolio at
  ./qa-portfolio. Requirements:

  STACK
    - Next 15 (app router), TypeScript strict, Tailwind, MDX.
    - Inter (sans) + JetBrains Mono. Teal accent #1a7c79.
    - Light theme, doc-style layout. No purple gradients, no neon.

  ROUTES
    /              hero · headline · 3 chip tags · CTA "watch on YouTube"
    /work          top 6 repos pulled from getTopRepos()
    /talks         talks + slide decks
    /writing       latest blog + Medium articles via getRecentPosts()
    /videos        embed latest 6 YouTube videos via getLatestVideos()
    /social        LinkedIn / Instagram cards, last 3 posts each
    /contact       email · calendar embed · X / GitHub links

  COMPONENTS
    - <ChannelStrip /> at footer · 6 icons → 6 URLs.
    - <Card /> with title, date, source-badge, hover lift.
    - <Hero /> · pulls from getBio() at build time.

  DATA
    - Read from .claude/skills/pramod-me/data.json.
    - Load data in server components for /work /talks /writing /videos
      (getStaticProps is Pages Router only).
    - ISR every 24h on prod (export const revalidate = 86400).

  SEO + META
    - OpenGraph image generated dynamically per route.
    - JSON-LD Person schema on home.
    - sitemap.xml + robots.txt.

  After scaffold finishes, run pnpm dev and confirm the
  home page renders without console errors. Take a screenshot at
  390x844 and 1440x900, save them as docs/hero-mobile.png
  and docs/hero-desktop.png. Then stop and report.

Step 3 · Playwright + Lighthouse + GHA

Cover every route. Gate every merge.

> add a Playwright suite + a CI pipeline to ./qa-portfolio.

  PLAYWRIGHT
    - Install @playwright/test + @axe-core/playwright.
    - 3 projects: mobile (390x844 · Pixel), tablet (768x1024 · iPad),
      desktop (1440x900 · Chromium).
    - Suites:
        tests/smoke/      every route returns 200 + has <h1>
        tests/a11y/       new AxeBuilder({ page }).analyze() on every route, 0 critical issues
        tests/visual/     screenshot per route, threshold 0.2
        tests/links/      crawl all links from /, fail on 404
        tests/seo/        meta description + og:image present per route
    - Trace on retry, screenshot always, video on failure.
    - HTML reporter on CI.

  LIGHTHOUSE
    - lhci autorun on prod URL after deploy.
    - Gates: perf 90, a11y 95, best-practices 95, seo 100.
    - Fail the job below threshold.

  CI · .github/workflows/qa.yml
    on: pull_request, push to main
    jobs:
      lint        · eslint + prettier
      typecheck   · tsc --noEmit
      e2e         · playwright on 3 viewports
      lighthouse  · lhci autorun
      build       · pnpm build
    matrix the e2e job by project. shard 4 ways.

  Run the full suite locally. If anything fails, fix it.
  When green, commit each layer as its own conventional commit.
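
"Matrix the e2e job by project, shard 4 ways" expands to 3 × 4 = 12 parallel jobs. A TypeScript sketch of that expansion, using Playwright's --project and --shard flags; shardMatrix is a hypothetical helper, and in CI the matrix itself would normally live in the workflow file.

```typescript
// Hypothetical helper: one command per (project, shard) pair.
function shardMatrix(projects: string[], shards: number): string[] {
  const commands: string[] = [];
  for (const project of projects) {
    for (let i = 1; i <= shards; i++) {
      // Shards are 1-based: --shard=1/4 up to --shard=4/4.
      commands.push(`npx playwright test --project=${project} --shard=${i}/${shards}`);
    }
  }
  return commands;
}

const e2eJobs = shardMatrix(["mobile", "tablet", "desktop"], 4);
console.log(e2eJobs.length);
console.log(e2eJobs[0]);
```

Each command is an independent CI job, so wall-clock time is set by the slowest shard rather than the whole suite.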

Step 4 · Vercel + Testing Academy upload

From localhost to thetestingacademy.com.

> deploy ./qa-portfolio to vercel.
  - run vercel login if needed (hand
    me the OTP prompt).
  - project name: pramod-qa
  - link to repo github.com/promode/qa-portfolio
  - env vars (encrypted):
      ANTHROPIC_API_KEY
      YOUTUBE_API_KEY
      MEDIUM_RSS_URL
  - run vercel --prod.
  - print preview + prod URLs.
  - run playwright suite against
    the PROD url; paste HTML report.
  - finally, attach custom domain:
      pramod.thetestingacademy.com
> publish this masterclass deck to
  app.thetestingacademy.com/masterclass/ClaudeCode.html.

  Steps:
  - rename index.html → ClaudeCode.html.
  - SSH/SFTP to the app host using
    creds from ~/.ssh/tta_deploy.
  - place under
    /var/www/app/masterclass/.
  - chmod 644, chown www-data.
  - reload nginx.
  - verify with curl + a Playwright
    snapshot of the live URL.
  - commit the deck source to
    github.com/thetestingacademy/masterclass-decks
    on a new branch · open a PR.
Live URL after this prompt | Owner | Purpose
pramod.thetestingacademy.com | Vercel | Personal portfolio · auto-rebuild on push.
app.thetestingacademy.com/masterclass/ClaudeCode.html | Nginx | This deck · shareable for the class.
github.com/promode/qa-portfolio | GitHub | Source · CI gates every PR.

Live demo · fill the form → see your portfolio appear

Try the portfolio generator right here.

Fill in your channels. Click Generate. A complete single-file portfolio renders in the preview pane on the right. Download it, deploy it. This is the same HTML Claude Code produces in the capstone — only here it runs in your browser so you can preview it instantly.

Identity

Stats

Channels

Press Generate to render.
portfolio.html · live preview

After download · ship it to Vercel

# 1. install vercel cli once
npm i -g vercel

# 2. make a folder, drop the file in
mkdir my-qa-portfolio && cd my-qa-portfolio
mv ~/Downloads/portfolio.html ./index.html

# 3. push to vercel · prod
vercel --prod

# follow prompts → pick project name → done.
# you get a https URL in ~20s.
# add a custom domain
vercel domains add pramod.thetestingacademy.com
vercel alias set <deployment-url> pramod.thetestingacademy.com

# add CI: redeploy on every push
vercel link
git init && git add . && git commit -m "init"
gh repo create pramod-qa --public --source=. --push

# vercel auto-detects the repo and
# rebuilds on every push to main.
Pro tip · the generator runs in this page — no server, no signup. Same template Claude Code produces during the capstone.

Print-friendly reference · pin this above your desk

Claude Code · full cheat sheet.

Every keybinding, slash command, skill knob, hook tag, MCP transport, env var. Lifted from Claude Code v2.1+ release notes. Use ⌘P to print this page; the grid below collapses cleanly to A4.

⌨ Keyboard shortcuts
General controls
Ctrl+C          Cancel input / generation
Ctrl+D          Exit session
Ctrl+L          Clear prompt input + full-screen redraw
Ctrl+O          Toggle transcript viewer (verbose tool usage); cycles focus in fullscreen
Ctrl+U          Clear entire input buffer
Ctrl+Y          Restore cleared input buffer
Ctrl+R          Reverse search history
Ctrl+G          Open in editor (same as Ctrl+X Ctrl+E)
Ctrl+X Ctrl+K   Kill all background agents (twice to confirm)
Ctrl+B          Background running tasks
Esc Esc         Rewind or summarize
Mode switching
⇧Tab            Cycle permission modes (default → acceptEdits → plan → …)
Mac · Option as Meta
M               Switch model
T               Toggle extended thinking
0               Toggle fast mode
Input
\ Enter         Newline
v               Vim visual mode (char select + operators)
V               Vim visual-line mode
Prefixes
/               Slash command
!               Direct bash
@               File mention + autocomplete
#               Append note to CLAUDE.md
/ Slash commands · complete list
Session
/help                      List every command + skill
/init                      Create CLAUDE.md from repo
/clear                     Wipe conversation, keep cwd
/compact                   Summarise history, free context
/resume                    Resume old session by id
/status                    Session state + token budget
/cost                      Spend so far this session
/export                    Save conversation to file
/rename                    Rename current session
/btw [q]                   Side question without polluting context
/insights                  Analyze sessions report
/extra-usage               Extra usage when rate-limited
/usage                     Per-category cost breakdown
Config & tools
/config                    Open settings · theme, model, perms
/model                     Switch model mid-session
/permissions               Manage allowed tools / commands
/add-dir                   Bring another folder into scope
/memory                    Edit CLAUDE.md inline
/hooks                     Manage hooks
/mcp                       Manage MCP servers
/agents                    Manage agent configs
/skills                    List available skills
/keybindings               Customize keyboard shortcuts
/terminal-setup            Configure terminal keybindings
/scroll-speed              Adjust output scroll speed
/ide                       IDE integrations status
/voice                     Push-to-talk voice dictation
/doctor                    Diagnose installation
/upgrade                   Update CLI to latest
/release-notes             What's new in your version
/feedback                  Submit feedback (alias /bug)
/desktop                   Continue in Desktop app
/login · /logout           Auth handling
Review
/review [PR]               Review PR locally
/ultrareview [PR#]         Cloud multi-agent review
/security-review           Scan diff for vulnerabilities
/code-review               Effort levels: low / med / high
/pr-comments               Pull PR review comments
Scheduling & remote
/loop [interval] [prompt]  Recurring task
--remote                   Web session via claude.ai
cat file | claude -p       Pipe input · headless one-shot
⚙ Config & env
Config files
~/.claude/settings.json        User settings
.claude/settings.json          Project shared settings
.claude/settings.local.json    Local only · gitignored
~/.claude.json                 OAuth · MCP · state
.mcp.json                      Project MCP servers
managed-settings.d/            Drop-in policy fragments
Key settings
modelOverrides                 Map model picker → custom IDs
autoMode.hard_deny             Unconditional auto-mode deny rules
hooks: if                      Conditional hooks (permission rule syntax)
DISABLE_PROMPT_CACHING         Warn at startup if cache disabled
Monitor tool                   Stream events from background scripts
PermissionDenied               Hook · auto-model denial
showThinkingSummaries          Opt-in (off by default)
hooks: "defer"                 Pause headless → resume later
type: "mcp_tool"               Hook step invokes MCP tool
continueOnBlock                Hook config · keep running after blocked tool
disableSkillShellExec          Block !`cmd`
refreshInterval                Re-run custom status N sec
Key env vars
ANTHROPIC_API_KEY              API key
ANTHROPIC_MODEL                Default model
ANTHROPIC_BASE_URL             Proxy / gateway override
ANTHROPIC_BETAS                Additional beta headers
ANTHROPIC_CUSTOM_MODEL_OPTION  Custom /model entry
MAX_THINKING_TOKENS            0 = off
ENABLE_PROMPT_CACHING_1H       Opt into 1h cache TTL
Memory & CLAUDE.md
./CLAUDE.md                    Project (team-shared)
./CLAUDE.local.md              Local personal notes · gitignored
~/.claude/CLAUDE.md            Personal · all projects
/etc/claude-code/CLAUDE.md     Managed policy (Linux/WSL · org-wide)
MCP transport flags
--transport http               Remote HTTP (recommended)
--transport stdio              Local process
--transport sse                Remote SSE
⚡ Skills & agents
Built-in skills
Skill tool                  Discovers built-in slash commands (/init, /review, /security-review…)
/code-review                Code review · low / med / high effort
/batch                      Large parallel changes · 5–30 worktrees
/debug [desc]               Troubleshoot from debug log
/loop [interval]            Recurring scheduled task
/claude-api                 Load API + SDK reference
Custom skill locations
.claude/skills/<name>/      Project skills
~/.claude/skills/<name>/    Personal skills
Skill frontmatter
description                 Auto-invocation trigger phrase
allowed-tools               Skip permission prompts
model                       Override model for skill
effort                      Override effort level
paths: [globs]              Path-specific (YAML list)
context: fork               Run in subagent
$ARGUMENTS                  User input placeholder
${CLAUDE_SKILL_DIR}         Skill's own dir
${CLAUDE_EFFORT}            Current effort level (skill var)
!`cmd`                      Dynamic context injection
plugin bin/                 Ship executables for Bash tool
Built-in agents
Explore                     Read-only Haiku · locator agent
Plan                        Research for plan mode
General                     Full tools · complex tasks
Bash                        Terminal separate context
Agent frontmatter
permissionMode              default / acceptEdits / plan / dontAsk / bypassPermissions
isolation: worktree         Run in git worktree
memory: user|project|local  Persistent memory scope
background: true            Background task
maxTurns                    Limit agentic turns
initialPrompt               Auto-submit first turn
SendMessage                 Resume agents (replaces resume)
@agent-name                 Mention named subagent
Tip · ⌘P / Ctrl P to print this card. Survives A4 with 4-column → 2-column auto-collapse.

Don't do these. Ever.

⊘ no

Letting the agent run on main

Use worktrees or a feature branch. Never let auto-mode write to your protected branch.

⊘ no

Trusting "I ran the tests" without proof

Ask for the exit code or the report path. Agents can hallucinate green.

⊘ no

One giant prompt for everything

Break the task. Plan first. Execute second. Verify third. Smaller turns = better diffs.

⊘ no

Skipping CLAUDE.md

Without it the agent guesses your conventions and you fight it every turn. Spend 10 min, save 10 hours.

⊘ no

Pasting secrets in prompts

Use env vars. Use 1Password CLI. Never put a token where the agent — or its logs — can see it raw.

⊘ no

Auto-merging Claude's PRs

You read the diff. Always. The agent is the writer; you are the editor.

The new QA toolbelt.

cli

Claude Code

The conductor. Lives in your terminal, reads your repo, fires every other tool.

runner

Playwright

The hands. Drives browsers and APIs. Default for E2E + API.

mcp

Playwright MCP

Bridge between agent and runner. Auto-locators via a11y tree.

mcp

Atlassian MCP

Jira + Confluence. Read AC, file bugs, post run reports.

vcs

GitHub + gh

PRs, checks, releases. Claude calls gh directly — no extra MCP.

host

Vercel

Preview-per-PR. Run Playwright against the preview before merge.

obs

Trace + Lighthouse

Visual evidence + perf gates. Claude attaches both to every PR.

brain

You

Strategy. Risk. Judgment. The only thing the agent can't replace.

Where to go next.

docs

docs.claude.com / code

Official reference. Always the source of truth for flags + tools.

repo

github · anthropics / claude-code

Issues, recipes, plugin authoring. Star it.

mcp

modelcontextprotocol.io

Spec + server registry. Find an MCP for almost anything.

play

playwright.dev

Auto-waiting, fixtures, traces. Master these and Claude works for you.

comm

The Testing Academy

Newsletter, courses, the community where we keep learning together.

try

Your repo · tonight

Open a real QA repo. Type claude. Type /init. Begin.

END · Q&A · thetestingacademy.com

Now stop
typing tests.
Start
writing intent.

Speaker  Pramod Dutta
Brand    The Testing Academy
Mail     thetestingacademy@gmail.com
Status   Class dismissed.