The Testing Academy · Masterclass

Claude Code for QA.

A free Claude Code tutorial for QA engineers and SDETs — learn AI test automation, agentic testing workflows, Playwright MCP, and how to use Claude AI for testing across 46 hands-on chapters.

Host · Pramod Dutta

Track · End-to-End Workflow

Audience · QA / SDET

Project · QA Portfolio + Deploy

The 30-second pitch

Four things you walk away able to do tonight.

flake

Flaky test?

Claude rewrites the wait, replaces waitForTimeout with expect.poll, runs it 20× to prove stability — before you finish your coffee.

spec

Jira ticket vague?

Claude reads the AC, generates Gherkin scenarios + a Playwright skeleton, files the edge-case bugs it spots — straight to QA-942.

repro

Visual bug from support?

Drag the screenshot. Claude infers viewport, route, OS. Reproduces, captures a Playwright trace, attaches it to the bug.

migrate

Migration anxiety?

Cypress → Playwright. 240 specs. Claude converts in one overnight run · you wake to a passing suite + a PR ready to review.

Claude Code is a coding agent that lives in your terminal — reads your repo, drives your browser, writes your tests, runs your suite, files the bug. For a QA, that flips the SDLC.

You stop being the bottleneck on the right and start being the brain on the left. You design the contract · the agent does the toil. Same job title · 10× leverage.

10×

Test authoring throughput when paired with an agent loop.

0

Flaky locators when the agent reads the live DOM via MCP.

∞

Patience to write the boring 80% of test bodies.

1

Tester still in the loop — you. Design, judge, ship.

Where Claude Code sits in your stack.

Think of it as a terminal-native pair tester. It speaks files, shells, and — through MCP servers — browsers, Jira, Confluence, Notion, GitHub, calendars, even your inbox. Your job is to orchestrate, not to type.

Step 1 · Read

It scans the repo, the failing log, the screenshot, the Jira ticket.

Step 2 · Plan

It proposes a plan in plan-mode. You approve, edit, or redirect.

Step 3 · Act

Edits files, runs commands, drives Playwright, posts comments.

Step 4 · Verify

Runs the suite. Re-reads diffs. Confirms green before claiming done.

Step 5 · Ship

Commits, opens PR, deploys preview. You review one diff, not 40.

Setup · macOS / Linux / WSL

Five lines. One terminal.

# 1. install
npm install -g @anthropic-ai/claude-code

# 2. cd into your QA repo
cd ~/work/qa-portfolio

# 3. start a session
claude

# 4. first thing you say
> /init  → writes CLAUDE.md

# 5. then ask anything
> read the repo and tell me what test framework we use

A session is just a long REPL with a model that can read anything in the working directory on demand. Anything you type with no leading / or ! is a prompt. Anything starting with / is a command. Anything starting with ! is a passthrough shell call.

node ≥ 18 · git · ripgrep (rec) · gh cli (rec) · uv / pip

Tip · run inside a git repo for safest diffs

A session has four moving parts.

01 · Context

Working dir

The folder you launched in. The agent will read files here. Treat it like the scope of one feature, not your whole laptop.

02 · Memory

CLAUDE.md

Project rules, conventions, do/don't. Loaded every turn. Global one lives at ~/.claude/CLAUDE.md.

03 · Tools

Read / Edit / Write / Bash

The four primitives. Playwright, Jira, and Notion plug in on top via MCP servers; GitHub goes through the gh CLI in Bash.

04 · Modes

Permission · Plan · Worktree

Control what runs automatically. Plan-mode = read-only thinking. Worktree = isolated branch sandbox.

Most QA workflows live entirely inside Read + Bash + a Playwright MCP. You almost never need root, and you almost never need to leave the terminal.

Three ways to talk to the agent.

prompt

Plain English

Free text. The agent decides which tools to use.

> run the login spec and screenshot
  every failure

/command

Slash command

Built-in or custom. Predictable, repeatable.

> /init
> /review
> /compact

!shell

Shell escape

Run any shell command directly, no agent.

> !npx playwright test --headed
> !git status

Rule of thumb · use prompts for thinking, slash commands for workflows you repeat, and shell when you already know the exact command.
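
The prefix rule above is small enough to sketch in code. This is only an illustration of how a line of input could be sorted into the three channels; `classifyInput` and its types are hypothetical names, not part of the Claude Code CLI.

```typescript
type InputKind = "prompt" | "command" | "shell";

interface ClassifiedInput {
  kind: InputKind;
  body: string; // input with the leading "/" or "!" stripped
}

// Hypothetical helper: applies the prefix rule from the text, nothing more.
function classifyInput(line: string): ClassifiedInput {
  const trimmed = line.trim();
  const first = trimmed.charAt(0);
  if (first === "/") {
    return { kind: "command", body: trimmed.slice(1) };
  }
  if (first === "!") {
    return { kind: "shell", body: trimmed.slice(1) };
  }
  return { kind: "prompt", body: trimmed };
}

const sampleInputs = ["/init", "!git status", "run the login spec"];
sampleInputs.forEach((sample) => {
  console.log(sample + " -> " + classifyInput(sample).kind);
});
```

The useful part for a QA is the mental model: only the "prompt" branch spends model tokens on deciding what to do, so reach for "/" or "!" whenever the action is already known.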

The complete cheat-sheet

Every slash command a QA touches.

Built-in · session	Purpose
/help	List every command + skill.
/init	Generate a CLAUDE.md from the repo.
/clear	Wipe conversation, keep cwd.
/compact	Summarise history, free context.
/resume	Resume an old session by id.
/status	Show session state + token budget.
/cost	Spend so far in this session.
/export	Save the conversation to a file.
/bug	File a Claude Code bug report.
/release-notes	Show what's new in your version.
/upgrade	Upgrade your subscription plan · update the CLI itself with `claude update`.
/login · /logout	Auth handling.

Built-in · config & review	Purpose
/config	Open settings · theme, model, perms.
/model	Switch model mid-session.
/permissions	Manage allowed tools / commands.
/add-dir	Bring another folder into scope.
/memory	Edit CLAUDE.md inline.
/hooks	Inspect configured hooks.
/agents	List + manage subagents.
/mcp	List / debug MCP servers.
/review	Review current diff / PR.
/security-review	OWASP-style audit of changes.
/pr-comments	Pull PR review comments.
/ide · /vim	IDE bridge · vim keybindings.

QA-authored customs to add to .claude/commands/ · /flaky · /smoke · /bug-from-trace · /audit-locators · /gen-pom · /triage-failures · /quarantine · /report-run.

Need the full cheat-sheet?

The complete keyboard shortcuts · slash commands · config & env · skills & agents reference card lives in the Reference appendix at the end of the deck. Jump there anytime — print it, pin it above your desk.


Read · Edit · Write · Bash.
Plus Grep · Glob · WebFetch. Everything else is sugar on top.

> open tests/login.spec.ts, find the assertion
  that checks the toast, and tighten it to verify
  both text and aria-role

// agent will:
  Read(tests/login.spec.ts)
  Edit(tests/login.spec.ts)  // 1 hunk
  Bash(npx playwright test login --reporter=line)
  // reports green ✓

You never call the tools directly — you describe the intent and the agent picks the tool. The cool part: it shows you every tool call before / as it runs, so you stay in the loop.

Read

cat with a brain

Pulls only the slice it needs — line ranges, page ranges, image content.

Edit

exact-string swap

Fails loudly if the target isn't unique. Safer than sed.

Write

new files

Whole-file create / overwrite. Used sparingly.

Bash

your shell

Runs tests, git, curl, anything. Honours permissions.
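
To make the "exact-string swap" idea concrete, here is a minimal sketch of a replace that refuses to act unless the target appears exactly once. It illustrates the behaviour described for Edit; it is not the tool's actual implementation, and `replaceUnique` is a hypothetical name.

```typescript
// Count non-overlapping occurrences of target inside source.
function countOccurrences(source: string, target: string): number {
  if (target.length === 0) {
    return 0;
  }
  let count = 0;
  let index = source.indexOf(target);
  while (index !== -1) {
    count += 1;
    index = source.indexOf(target, index + target.length);
  }
  return count;
}

// Hypothetical sketch: swap target for replacement, but only if it is unambiguous.
function replaceUnique(source: string, target: string, replacement: string): string {
  const hits = countOccurrences(source, target);
  if (hits !== 1) {
    throw new Error("expected exactly 1 match for target, found " + hits);
  }
  const at = source.indexOf(target);
  return source.slice(0, at) + replacement + source.slice(at + target.length);
}

const specSource = [
  "await page.waitForTimeout(2000);",
  "await expect(toast).toBeVisible();",
].join("\n");

// Removes the fixed sleep and leaves the assertion untouched.
console.log(replaceUnique(specSource, "await page.waitForTimeout(2000);\n", ""));
```

Compared with a global sed substitution, the failure mode is a loud error instead of a silent edit in the wrong place, which is the property that matters when an agent is editing test code unattended.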

The single file that changes everything

Teach the agent your house rules once.

A CLAUDE.md at the repo root is auto-loaded every turn. Put your test framework, your locator policy, your no-flake rules. The agent obeys it.

# QA conventions — qa-portfolio

## Locators — STRICT
- Prefer getByRole / getByLabel.
- Never use raw .locator('xpath=…').
- Brittle CSS selectors must include a comment why.

## Waits
- No page.waitForTimeout() in committed code.
- Use auto-waiting + expect.poll only.

## Test data
- Generate via faker; never hard-code emails.

## Commits
- Conventional Commits. No co-author trailers.

Three levels of memory cascade: global → project → local. Local overrides project, project overrides global. The most specific scope wins.

Scope	Path
Global	~/.claude/CLAUDE.md
Project	./CLAUDE.md
Local (gitignored)	./CLAUDE.local.md
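
The precedence in the table can be pictured as a layered merge. The sketch below is a toy model of that ordering only: it assumes each rule has a key, which real CLAUDE.md files do not, and it says nothing about how Claude Code actually reads or weighs these files. All names are hypothetical.

```typescript
type Scope = "global" | "project" | "local";

interface MemoryRule {
  key: string;   // e.g. "waits"
  value: string; // the instruction text
}

interface MemoryLayer {
  scope: Scope;
  rules: MemoryRule[];
}

type ResolvedMemory = { [key: string]: { value: string; from: Scope } };

// Toy model: apply scopes from least to most specific so later ones overwrite.
function resolveMemory(layers: MemoryLayer[]): ResolvedMemory {
  const order: Scope[] = ["global", "project", "local"];
  const resolved: ResolvedMemory = {};
  order.forEach((scope) => {
    layers
      .filter((layer) => layer.scope === scope)
      .forEach((layer) => {
        layer.rules.forEach((rule) => {
          resolved[rule.key] = { value: rule.value, from: scope };
        });
      });
  });
  return resolved;
}

const memoryLayers: MemoryLayer[] = [
  { scope: "local", rules: [{ key: "waits", value: "Allow waitForTimeout while debugging locally" }] },
  { scope: "global", rules: [
    { key: "waits", value: "Prefer auto-waiting" },
    { key: "commits", value: "Conventional Commits" },
  ] },
  { scope: "project", rules: [{ key: "waits", value: "No page.waitForTimeout() in committed code" }] },
];

const resolvedMemory = resolveMemory(memoryLayers);
Object.keys(resolvedMemory).forEach((key) => {
  console.log(key + " <- " + resolvedMemory[key].from + ": " + resolvedMemory[key].value);
});
```

A practical consequence: a personal override in the local file can quietly relax a team rule, so keep strict policies (locators, waits) in the committed project file and review local overrides before debugging surprises.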

Demo · run /init to bootstrap it.

What goes inside CLAUDE.md

Nine sections every QA repo needs.

Treat CLAUDE.md as your team's pair-programming contract. Write it once, every session obeys it. Run /init to scaffold, then edit by hand. Lives at the repo root.

# QA Conventions · qa-portfolio

## 1. Stack
- Framework: Playwright 1.49+
- Test runner: @playwright/test
- Lang: TypeScript strict
- Node: 20.x · pnpm

## 2. Folder layout
- tests/e2e/        · browser specs
- tests/api/        · APIRequestContext
- tests/fixtures/   · shared fixtures
- tests/pom/        · page objects
- tests/data/       · faker builders

## 3. Locators — STRICT
- Prefer getByRole, getByLabel, getByTestId.
- Never raw .locator('xpath=…').
- Brittle CSS must include a "// why" comment.

## 4. Waits
- No page.waitForTimeout.
- Use auto-wait + expect.poll only.
- Retry once, then quarantine.

## 5. Data
- Generate with @faker-js/faker.
- Never hard-code emails / phones / addresses.
- Test users: env-based, not committed.

## 6. Tagging
- @smoke @regression @flaky @wip
- CI runs @smoke on every PR.

## 7. Reporting
- HTML + JSON reporter on CI.
- Attach trace + screenshot on retry.

## 8. Commits
- Conventional Commits.
- No co-author trailers.
- No "🤖 Generated with…" footers.

## 9. Do / Don't
- DO: ask before deleting any spec.
- DO: run the impacted spec after every edit.
- DON'T: edit playwright.config without a plan.
- DON'T: bump deps without a separate PR.

Why each section matters

Stack · Agent picks correct imports / matchers.

Layout · New files land in the right folder.

Locators · Kills the most common flake source.

Waits · No fixed sleeps survive review.

Data · No PII or hard-coded secrets.

Tagging · CI lanes stay predictable.

Reporting · Bug repros come with evidence.

Commits · Clean git history.

Do/Don't · Hard rails on destructive ops.

Three scopes load in cascade · ~/.claude/CLAUDE.md (global) → ./CLAUDE.md (project) → ./CLAUDE.local.md (gitignored personal). The most specific scope wins.

Pro tip · start a prompt with # to append a line to CLAUDE.md live.

Hire specialists, not generalists.

A subagent is a separate Claude session spawned for one bounded job. Its output is summarised back — your main context stays clean. Think of them as contractors who clock out when done. Note · the Plan subagent below is the planner specialist — distinct from Plan-mode (⇧Tab toggle, read-only thinking on the main agent).

research

Explore

Read-only code locator. "Where is X defined? What calls Y?" Fast.

> explore: find every place we
  click "Add to cart"

plan

Plan

Architect mode. Designs the implementation plan before any code change.

> plan: add a parallel visual
  regression suite for /pricing

review

code-reviewer

Audits a diff. One line per finding, severity-tagged. No fluff.

> review the last 3 commits
  for race conditions

build

e2e-runner

Owns Playwright. Generates, maintains, quarantines flaky specs.

> e2e: add coverage for
  password-reset happy path

tdd

tdd-guide

Enforces tests-first. Will refuse to write impl before a failing test.

> tdd: implement the new
  promo-code validator

sec

security-reviewer

OWASP top-10 sweep on the diff. Flags secrets, SSRF, injection, XSS.

> security: review the new
  /auth/reset endpoint

Two ways to scale your main agent

Subagent vs Agent Team.

Old model · main agent spawns isolated subagents; results flow upward. New model · main agent acts as team lead, all teammates share a task list, communicate peer-to-peer. Pick the model that fits the job.

Fig 1 · Subagents (hierarchy) vs Agent Teams (shared task list + peer comms)

When to pick which

Task shape	Pick	Why
Independent fan-out (review 20 PRs)	Subagents	No shared state needed; isolated runs scale linearly.
Coordinated build (scaffold + tests + docs)	Agent Team	Shared task list keeps teammates from duplicating work.
One-shot codebase locate	Subagent	Single Explore agent · output summarised back.
Long-running migration	Agent Team	Peer-to-peer ack lets agents hand off mid-flight.
Bug-bash across surfaces	Agent Team	Shared task list = no two agents reproducing same bug.

From contractors → task board → engineering team

The three levels of Claude Code agents.

As coordination needs grow, you climb the ladder · isolated Subagents → managed Agent View → collaborative Agent Teams. Each level adds shared state and inter-agent communication. Pick the lowest level that fits the job.

Fig 4 · original by @0x_rody on X · 3 levels of Claude Code agents — pick the lowest that fits the job

When to pick which level

You need	Level	Why
Review a single PR	Subagent (L1)	One bounded ask · contractor you send a brief.
Run 8 lint + a11y + perf sweeps in parallel	Agent View (L2)	Need to dispatch + peek progress without losing terminal.
Build a feature with spec → backend → frontend → tests	Agent Team (L3)	Tasks depend on each other · shared task list keeps state.
Migrate Cypress → Playwright across 12 packages	Agent Team (L3)	Cross-package coordination · agents talk + claim work.
"Hunt this flake"	Subagent (L1)	One spec, one fix · zero need for shared state.

What is a Skill

Skills are playbooks Claude reads on demand.

A Skill is a reusable, versioned collection of instructions, resources, and examples that teaches Claude Code how to complete a specific type of task. Drop one into .claude/skills/<name>/SKILL.md (repo) or ~/.claude/skills/<name>/SKILL.md (personal). Claude auto-discovers and auto-triggers on description match.

📘 expertise

Reusable Expertise

Encapsulate proven workflows and best practices once. Every future session uses them.

🎯 consistency

Consistent Results

Deliver reliable outcomes by following defined steps + context. Same input · same output.

🧩 composable

Composable

Use alone or combine with hooks + plugins for powerful, multi-step automations.

🤝 share

Share & Collaborate

Ship skills via git to your team — or release to the community marketplace.

Fig 2 · Skill lifecycle · how Claude finds, loads, and executes a Skill on your behalf

How a Skill runs · 5 steps

Step 1 · You request a task

Describe what you need in Claude Code · plain prompt.

Step 2 · Claude selects a Skill

Most relevant skill chosen by matching its description field.

Step 3 · Skill executes

Claude follows the skill's procedure · uses tools, plugins, files.

Step 4 · Results produced

Structured outputs · code, files, or actions as defined.

Step 5 · Claude responds

Presents results + continues the conversation.

A real skill · flake-hunter

---
name: flake-hunter
description: Use when a Playwright spec
  fails intermittently. Locates the wait,
  rewrites it to expect-based polling,
  re-runs 10× to confirm stability.
---

## Steps
1. Read the failing spec.
2. Identify any waitForTimeout / sleep.
3. Replace with expect.poll().
4. Run npx playwright test --repeat-each=10.
5. Report flake-rate before vs after.

QA-relevant built-ins

playwright-e2e · playwright-api · playwright-cli · pytest-patterns · tdd-workflow · e2e · selenium-pom · security-review · vibe-check · seo-audit · marp (slides)

Trigger live · type "hunt the flake in checkout.spec.ts". Claude matches against the skill's description field and loads it automatically.

Pro tip · best description reads "Use when [condition] · [what it does] · [stop conditions]."

Canonical docs · build your own

official docs

docs.claude.com · Skills

Full reference · SKILL.md frontmatter, lifecycle, allowed-tools, packaging, distribution.

github

anthropics/skills

First-party skill examples · skill-creator, pdf, xlsx, pptx, docx, content pipelines.

spec

Agent Skills overview

How skills, agents, hooks, and MCP plug together in the Claude ecosystem.

Build your own skill in under 10 minutes

A Skill is a markdown file with a recipe.

Frontmatter declares name + description. Body holds the procedure. Drop in ~/.claude/skills/<skill>/SKILL.md (global) or .claude/skills/<skill>/SKILL.md (repo-scoped). Claude auto-discovers and triggers on description match.

---
name: flake-hunter
description: Use when a Playwright spec
  fails intermittently or has any sleep /
  waitForTimeout / networkidle wait. Locates
  the bad wait, rewrites with expect.poll,
  re-runs 20x to confirm stability.
allowed-tools: Read, Edit, Bash, Grep, Glob
---

# Flake Hunter

## When to trigger
- Spec failed retry on CI.
- User says "this is flaky" / "intermittent".
- Code contains waitForTimeout / sleep.

## Procedure
1. Read the spec end-to-end.
2. Grep for: waitForTimeout, sleep,
   networkidle, hard delays.
3. For each hit, replace with expect.poll
   or explicit element wait.
4. Run npx playwright test {file} \
     --repeat-each=20 --workers=1.
5. Report flake-rate before vs after
   as a markdown table.
6. If still flaky > 5%, surface the
   most-likely locator candidate.

## Stop conditions
- 20/20 pass · report success.
- Any locator looks brittle · ask user.

Frontmatter fields

Field	Purpose
name	Slug · used in `/skills`.
description	The trigger phrase. Be vivid — claude matches on this.
allowed-tools	Whitelist of tools the skill may call.
model	Optional override · sonnet / opus / haiku.
color	Optional · sidebar accent.
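
To show how little structure a SKILL.md header needs, here is a minimal sketch that pulls the frontmatter fields out of a skill file. It only handles `key: value` lines and indented continuation lines, so treat it as an illustration rather than a YAML parser; `parseSkillFrontmatter` is a hypothetical helper, not how Claude Code loads skills.

```typescript
interface SkillMeta {
  [field: string]: string;
}

// Minimal sketch: "key: value" lines plus indented continuation lines.
// A real loader would use a proper YAML parser.
function parseSkillFrontmatter(markdown: string): SkillMeta {
  const lines = markdown.split("\n");
  const meta: SkillMeta = {};
  if (lines[0].trim() !== "---") {
    return meta; // no frontmatter block
  }
  let currentKey = "";
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "---") {
      break; // closing delimiter
    }
    const match = /^([A-Za-z][\w-]*):\s*(.*)$/.exec(line);
    if (match) {
      currentKey = match[1];
      meta[currentKey] = match[2].trim();
    } else if (currentKey !== "" && /^\s+\S/.test(line)) {
      // Indented line: fold it into the previous field's value.
      meta[currentKey] = (meta[currentKey] + " " + line.trim()).trim();
    }
  }
  return meta;
}

const skillFile = [
  "---",
  "name: flake-hunter",
  "description: Use when a Playwright spec",
  "  fails intermittently.",
  "allowed-tools: Read, Edit, Bash",
  "---",
  "# Flake Hunter",
].join("\n");

const skillMeta = parseSkillFrontmatter(skillFile);
console.log(skillMeta["name"] + ": " + skillMeta["description"]);
```

A check like this could run in CI to fail the build when a committed skill is missing its name or description, since those two fields are what discovery relies on.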

Scaffold faster · use skill-creator

> /skill-creator
// or, with the skill name:
> use skill-creator to build a
  skill called "locator-auditor"
  that scans tests/ for raw xpath
  and proposes role-based replacements

Reference links

docs · docs.claude.com/en/docs/claude-code/skills

repo · github · anthropics/skills

spec · SKILL.md frontmatter reference

tool · skill-creator · scaffolds new skills

guide · anthropic-skills · skill-creator (built-in)

Don't write skills from scratch · install them

Skill marketplaces · qaskills.sh + skills.sh.

Two community registries let you npx-install pre-built skills into your Claude Code (or Cursor / Copilot / Windsurf). qaskills.sh is QA-only · 450+ testing skills. skills.sh is the broader directory across all domains. Both ship via one command · zero copy-paste.

qa-only · 450+

qaskills.sh — QA Skills Directory

450+ ready-to-install skills for testing — Playwright E2E, Selenium, API generation, security audits, WCAG 2.2 a11y, Jest unit, flake detection, perf, CI optimisation. Browse by category or leaderboard. Works with Claude Code, Cursor, Copilot, Windsurf, Gemini · 26+ agents total.

npx @qaskills/cli add playwright-e2e
npx @qaskills/cli add api-test-generator
npx @qaskills/cli add flaky-test-detector
npx @qaskills/cli add wcag-audit
npx @qaskills/cli add selenium-converter

general · all domains

skills.sh — Agent Skills Directory

Searchable marketplace of agent skills across React, Next.js, Design, Mobile, Databases, Testing, Marketing. Browse trending / official / security tracks. Install any owner/repo with one command.

npx skills add anthropic/frontend-design
npx skills add vercel/react-best-practices
npx skills add microsoft/azure-ai
npx skills add anthropic/skill-creator
npx skills add <owner/repo>  # any GitHub skill

QA-relevant skills worth installing first day

Skill	Source	What it gives you
playwright-e2e	qaskills.sh	Best-practice Playwright spec authoring with Page Object Model + fixtures.
api-test-generator	qaskills.sh	Generates positive · negative · schema · auth coverage from one curl or OpenAPI URL.
flaky-test-detector	qaskills.sh	Hunts `waitForTimeout` · `networkidle` · sleeps · rewrites with `expect.poll`.
wcag-audit	qaskills.sh	WCAG 2.2 a11y sweep · pulls `axe-core` · reports per-rule violations.
selenium-to-playwright	qaskills.sh	Mechanical migration helper · POM + waits + selectors translation.
perf-baseline	qaskills.sh	k6 / Lighthouse baseline harness · saves perf budget JSON.
frontend-design	skills.sh · anthropic	Distinctive UI generation · already used inside this masterclass deck.
skill-creator	skills.sh · anthropic	Scaffolds new skills with frontmatter + procedure + examples folder.
security-review	skills.sh · official	OWASP top-10 sweep over a diff · same engine as `/security-review`.

# 1. land in your test repo
cd ~/work/my-qa-repo && claude

# 2. paste this prompt — claude installs the lot for you
> install these skills into .claude/skills/ and update CLAUDE.md
  to mention each one is available:
    - npx @qaskills/cli add playwright-e2e
    - npx @qaskills/cli add api-test-generator
    - npx @qaskills/cli add flaky-test-detector
    - npx @qaskills/cli add wcag-audit
    - npx skills add anthropic/skill-creator
    - npx skills add anthropic/security-review
  After install, run /skills and confirm all six show up.

when

Use marketplace skill

Generic, reusable patterns · authoring tests, audits, migrations. Don't reinvent.

when

Write your own

Company-specific rules, internal framework wrappers, secret locator policy.

when

Fork an existing one

Marketplace skill ~90% right · clone to .claude/skills/, tweak the description + procedure, commit.

Pro tip · npx @qaskills/cli list shows everything installed in current repo · pair with /skills inside Claude to verify load order.

Hooks fire around tool calls.

A hook is a shell command the harness runs at a lifecycle event. Use them to auto-format on save, run a smoke test after every write, or block edits to main.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{
          "type": "command",
          "command": "jq -r '.tool_input.file_path' | xargs npx prettier --write"
        }]
      }
    ],
    "SessionStart": [{
      "hooks": [{
        "type": "command",
        "command": "echo '🧪 QA mode ready. Run /init.'"
      }]
    }]
  }
}

Event	Fires
SessionStart	Once when you launch `claude`.
UserPromptSubmit	Every time you press enter.
PreToolUse	Before any tool runs (can block).
PostToolUse	After each tool call succeeds.
Stop	When the assistant finishes a turn.

QA pattern · PostToolUse on Edit → run the spec that owns the changed file. Feedback loop drops to seconds.
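
The QA pattern above needs one piece of glue: turning the hook's payload into "which spec should run". Below is a minimal sketch of that mapping. It assumes the hook receives a JSON payload with `tool_name` and `tool_input.file_path` (check the hooks reference for your CLI version), and it assumes a hypothetical repo convention where `src/<area>/...` is covered by `tests/e2e/<area>.spec.ts`.

```typescript
interface HookPayload {
  tool_name: string;
  tool_input: { file_path?: string };
}

// Assumed convention (not from the source): src/<area>/... maps to tests/e2e/<area>.spec.ts.
function specForChangedFile(filePath: string): string | null {
  if (/\.spec\.ts$/.test(filePath)) {
    return filePath; // an edited spec owns itself
  }
  const match = /(?:^|\/)src\/([^/]+)\//.exec(filePath);
  return match ? "tests/e2e/" + match[1] + ".spec.ts" : null;
}

// Returns the command a PostToolUse hook could run, or null to do nothing.
function commandForHook(rawPayload: string): string | null {
  const payload = JSON.parse(rawPayload) as HookPayload;
  if (!/^(Edit|Write)$/.test(payload.tool_name)) {
    return null; // same filter as the "Edit|Write" matcher
  }
  const filePath = payload.tool_input.file_path;
  if (!filePath) {
    return null;
  }
  const spec = specForChangedFile(filePath);
  return spec ? "npx playwright test " + spec + " --reporter=line" : null;
}

const samplePayload = JSON.stringify({
  tool_name: "Edit",
  tool_input: { file_path: "/repo/src/checkout/cart.ts" },
});
console.log(commandForHook(samplePayload));
```

Returning null for unmapped files keeps the hook cheap: an edit to a README or config file triggers nothing, while an edit under a mapped area gets its spec run within seconds.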

MCP turns Claude into a browser, a Jira, a Notion, a Gmail.

Model Context Protocol is the USB-C of AI tools. A small server exposes verbs (navigate, click, createJiraIssue); Claude calls them like any other tool. For QA, the killer one is the Playwright MCP.

Fig 3 · MCP architecture · QA → Claude Code → MCP servers → Playwright / Jira / GitHub / Notion / your custom API

browser

Playwright MCP

Drive Chromium, Firefox, WebKit. Snapshot the a11y tree, click by role, screenshot, network log.

tickets

Atlassian MCP

Read/write Jira issues, Confluence pages. File bugs straight from a failed test.

repo

GitHub via `gh`

PRs, issues, checks, releases. No extra MCP needed — uses gh CLI under the hood.

docs

Notion MCP

Pull the test plan, push the run report. Round-trip in one prompt.

comms

Gmail / Calendar

Draft the release email, find the freeze window, never alt-tab.

custom

Your own MCP

Wrap your internal API. Spec is just JSON-RPC. ~50 lines to start.
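
Since the wire format is JSON-RPC, the core of a custom server is a dispatcher from a `tools/call` request to a handler. The sketch below shows only that shape, in memory, with a made-up `getBuildStatus` tool; the result payload is simplified, and a real server would be built on an official MCP SDK with a transport, tool listing, and schemas.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name: string; arguments: { [key: string]: unknown } };
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

type ToolHandler = (args: { [key: string]: unknown }) => unknown;

// Hypothetical in-memory registry; the tool name and its reply are invented.
const toolRegistry: { [toolName: string]: ToolHandler } = {
  getBuildStatus: (args) => ({ build: args["buildId"], status: "green" }),
};

function handleRequest(request: JsonRpcRequest): JsonRpcResponse {
  if (request.method !== "tools/call" || !request.params) {
    // -32601 is the JSON-RPC "Method not found" code.
    return { jsonrpc: "2.0", id: request.id, error: { code: -32601, message: "Method not found" } };
  }
  const handler = toolRegistry[request.params.name];
  if (!handler) {
    // -32602 is the JSON-RPC "Invalid params" code.
    return { jsonrpc: "2.0", id: request.id, error: { code: -32602, message: "Unknown tool" } };
  }
  return { jsonrpc: "2.0", id: request.id, result: handler(request.params.arguments) };
}

const demoRequest: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "getBuildStatus", arguments: { buildId: "b-42" } },
};
console.log(JSON.stringify(handleRequest(demoRequest)));
```

For QA, the interesting design choice is what to expose: read-only verbs such as build status or feature-flag state are low risk, while verbs that mutate test environments deserve a permission rule before the agent can call them.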

Demo · authoring a spec without typing locators

"Open saucedemo.com, log in, add a backpack, screenshot the cart."

claude · qa-portfolio

› open saucedemo, login as standard_user, add the Sauce Labs Backpack to cart, screenshot the cart page, then generate a Playwright spec from what you just did

▸ browser_navigate('https://saucedemo.com')
▸ browser_snapshot() · captured a11y tree
▸ browser_fill_form(user='standard_user', pass=•••)
▸ browser_click(role=button, name='Login')
▸ browser_click(role=button, name='Add to cart', within='Sauce Labs Backpack')
▸ browser_click(role=link, name='cart')
▸ browser_take_screenshot(path='cart.png')
▸ Write(tests/cart-add.spec.ts) ✓
done · spec runs green on first try

import { test, expect } from '@playwright/test';

test('standard_user adds backpack to cart', async ({ page }) => {
  await page.goto('https://www.saucedemo.com');
  await page.getByRole('textbox', { name: /user/i }).fill('standard_user');
  // Password inputs expose no textbox role, so target the placeholder instead.
  await page.getByPlaceholder('Password').fill('secret_sauce');
  await page.getByRole('button', { name: 'Login' }).click();

  await page.getByRole('button', { name: /add to cart/i }).first().click();
  await page.getByRole('link', { name: /cart/i }).click();

  await expect(page.getByText('Sauce Labs Backpack')).toBeVisible();
});

Notice: every locator is user-facing (getByRole, getByText and friends). No CSS, no XPath. The agent reads the a11y tree, so it picks accessible names by default.

API tests from a single curl.

Paste a curl, an OpenAPI URL, or a Postman export. The agent infers the contract and generates a fixture-based suite — schema validation, negative paths, auth variants, the lot.

> generate API tests for this endpoint with positive,
  negative, schema, and auth coverage. use playwright
  APIRequestContext, group by describe, log only on fail.

curl -X POST https://api.demo.dev/v1/users \
  -H "Authorization: Bearer $T" \
  -H 'Content-Type: application/json' \
  -d '{"email":"a@b.co","plan":"pro"}'

import { test, expect } from '@playwright/test';
import { z } from 'zod';

const User = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  plan: z.enum(['free', 'pro'])
});

// Assumes baseURL and the Authorization header are set under `use` in playwright.config.
test.describe('POST /v1/users', () => {
  test('creates a pro user', async ({ request }) => {
    const r = await request.post('/v1/users', {
      data: { email: 'a@b.co', plan: 'pro' }
    });
    expect(r.status()).toBe(201);
    User.parse(await r.json());
  });

  test('rejects bad email', async ({ request }) => {
    const r = await request.post('/v1/users', {
      data: { email: 'not-an-email', plan: 'pro' }
    });
    expect(r.status()).toBe(400);
  });
});
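
The "negative paths" part of such a suite is often table-driven: take one known-good payload and derive a failing variant per field. A minimal sketch of that derivation is below. The helper name is hypothetical, and whether the API really answers 400 for each missing field is an assumption to confirm against the contract.

```typescript
type JsonBody = { [field: string]: unknown };

interface NegativeCase {
  title: string;
  body: JsonBody;
}

// Derive one "missing field" case per key of a known-good payload.
function missingFieldCases(validBody: JsonBody): NegativeCase[] {
  return Object.keys(validBody).map((field) => {
    const body: JsonBody = {};
    Object.keys(validBody).forEach((key) => {
      if (key !== field) {
        body[key] = validBody[key];
      }
    });
    return { title: "rejects missing " + field, body: body };
  });
}

const validUser: JsonBody = { email: "a@b.co", plan: "pro" };
missingFieldCases(validUser).forEach((negativeCase) => {
  console.log(negativeCase.title + " -> " + JSON.stringify(negativeCase.body));
});
```

Inside a Playwright describe block, each generated case could become one test that posts `body` and asserts the rejection status, which keeps the negative coverage in step with the payload as fields are added.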

Tests from requirements

A Jira ticket in. A test plan out.

> fetch QA-482 from jira, read the acceptance
  criteria, produce: 1) a Gherkin scenarios file,
  2) a Playwright skeleton, 3) a coverage matrix
  mapping each AC to a test id.

Uses · Atlassian MCP + Write tool

Feature: Guest checkout

  Scenario: Apply valid promo
    Given I have a backpack in my cart
    When I apply promo "SAVE10"
    Then the subtotal drops by 10%
    And the promo chip shows "-10%"

  Scenario: Reject expired promo
    Given I have any item in my cart
    When I apply promo "BLACKFRIDAY23"
    Then I see error "This code has expired"

  Scenario Outline: Country-specific tax
    Given I check out from <country>
    Then tax shows <tax>%

    Examples:
      | country | tax |
      | IN      | 18  |
      | US-CA   | 9   |
      | DE      | 19  |
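
A Scenario Outline is run once per Examples row, with each angle-bracket placeholder replaced by that row's value. The sketch below shows that expansion for the tax outline, assuming its steps use `<country>` and `<tax>` placeholders that match the Examples columns. It is a hand-rolled illustration with hypothetical names, not how Cucumber or any Gherkin runner is implemented.

```typescript
interface ExampleRow {
  [column: string]: string;
}

// Substitute every <column> placeholder in a step with the row's value.
// Unknown placeholders are left untouched so typos stay visible.
function fillStep(step: string, row: ExampleRow): string {
  return step.replace(/<([^<>]+)>/g, (whole: string, column: string) => {
    return Object.prototype.hasOwnProperty.call(row, column) ? row[column] : whole;
  });
}

// One concrete scenario (list of steps) per Examples row.
function expandOutline(steps: string[], rows: ExampleRow[]): string[][] {
  return rows.map((row) => steps.map((step) => fillStep(step, row)));
}

const outlineSteps = ["Given I check out from <country>", "Then tax shows <tax>%"];
const exampleRows: ExampleRow[] = [
  { country: "IN", tax: "18" },
  { country: "US-CA", tax: "9" },
  { country: "DE", tax: "19" },
];

expandOutline(outlineSteps, exampleRows).forEach((scenario) => {
  console.log(scenario.join(" / "));
});
```

The same expansion gives a natural source of test ids for the coverage matrix: one id per row, so each acceptance criterion can point at the exact data variant that exercises it.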

From a screenshot to a filed bug.

> [screenshot.png attached]
  user reports the price chip overflows on
  the pricing card on mobile. reproduce in
  Chromium @ 390×844, capture a trace,
  attach screenshot, file a Jira bug in
  QA project with steps + expected/actual.

The agent reads the image, drives a 390×844 viewport, captures a Playwright trace, posts everything to Jira as a single bundled bug. You sip coffee.

claude · bug-bash

› analyzing screenshot…
▸ identified component: PricingCard / .price__chip
▸ browser_resize(390, 844)
▸ browser_navigate('/pricing')
▸ overflow confirmed → trace.zip (1.2 MB)
▸ createJiraIssue(QA, type=Bug)
→ QA-941 "Pricing chip overflow @ 390×844"
steps · expected · actual · screenshot · trace
severity: Minor · component: pricing-card
done

Drag a screenshot. Get tests.

Claude is multimodal — it reads images natively. Paste a Figma export, a customer's broken-UI screenshot, a flaky CI run's failure image. It identifies the component, locates it in code, proposes a fix or a test.

design

Figma → test

Drop a Figma frame. Get a visual-regression spec that snapshots the matching live route.

support

User report → repro

Paste customer screenshot. Claude infers viewport, route, OS hints, and reproduces.

ci

CI failure → root cause

Drop a failed-snapshot diff. Claude reads both, explains the visual delta in plain English.

Live · paste any screenshot into the prompt — it just works.

QA reviews code too. Now they have leverage.

Command	Outcome
/review	Severity-tagged findings on the current diff.
/security-review	Auth, SSRF, injection, secret leakage.
/ultrareview	Multi-agent cloud review of the branch / PR.
/caveman-review	One-line-per-finding terse review.

Each finding follows the format path:line: severity: problem · fix. Easy to triage, easy to paste into a PR comment.

tests/login.spec.ts:14: ⚠ medium:
  page.waitForTimeout(2000) introduces
  fixed sleep. fix: use expect.poll()
  on the dashboard heading.

src/auth/middleware.ts:42: 🔴 high:
  token comparison uses ==, vulnerable
  to timing attack. fix: crypto.timingSafeEqual.

playwright.config.ts:8: ◇ low:
  retries: 3 hides flakes. fix: cap at 1
  and quarantine instead.
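
Because the findings share one shape, they are easy to parse for triage dashboards or PR bots. Here is a minimal sketch that assumes each finding has been joined onto a single line, that severity is one of low, medium, or high, and that the literal keyword `fix:` separates the problem from the remedy, as in the samples above. The names are hypothetical.

```typescript
type Severity = "low" | "medium" | "high";

interface Finding {
  path: string;
  line: number;
  severity: Severity;
  problem: string;
  fix: string;
}

// Parses "path:line: [icon] severity: problem fix: remedy"; returns null on no match.
function parseFinding(text: string): Finding | null {
  const pattern = /^(\S+):(\d+):\s*\S*\s*(low|medium|high):\s*(.*?)\s*fix:\s*(.*)$/;
  const match = pattern.exec(text.trim());
  if (!match) {
    return null;
  }
  return {
    path: match[1],
    line: parseInt(match[2], 10),
    severity: match[3] as Severity,
    problem: match[4],
    fix: match[5],
  };
}

const sampleFinding =
  "tests/login.spec.ts:14: ⚠ medium: page.waitForTimeout(2000) introduces fixed sleep. " +
  "fix: use expect.poll() on the dashboard heading.";
console.log(JSON.stringify(parseFinding(sampleFinding), null, 2));
```

Once parsed, findings can be grouped by path or filtered by severity before they are pasted into a PR comment, which keeps a long review readable.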

Run Claude in your pipeline.

The same CLI is headless. Put it on GitHub Actions to auto-triage failing tests, generate PR review comments, or open a Jira when a smoke fails on main.

name: qa-bot
on: { pull_request: { types: [opened, synchronize] } }
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Playwright
        run: npx playwright install --with-deps
      - name: Claude review + smoke
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npx -y @anthropic-ai/claude-code -p \
            "review this PR for QA risk, then run npx playwright test --grep @smoke
             and post a summary as a PR comment"

The -p flag (print/headless) gives you a one-shot run that prints its answer to stdout and exits. Perfect for cron, CI, or wrapping in your own scripts.

Cost guardrails · keep CI bills bounded

budget

Per-job ceiling

Wrap claude -p with a timeout (timeout 600s) and cap agent turns with --max-turns. Fail-fast beats infinite spend.

model

Pick the cheapest model that works

Use Haiku for triage / log-summary jobs. Reserve Sonnet for review jobs. Avoid Opus on every PR.

scope

Diff-only context

Pipe git diff origin/main instead of the full repo. 90% of review jobs need only the diff plus 2-3 related files.

cache

Prompt caching on

Set ENABLE_PROMPT_CACHING_1H=1. Repeated reviewer prompts hit the 1h cache, and cached input tokens bill at roughly 10% of the base input price.

batch

Batch · don't fan out

One Claude job reviewing 5 PRs in sequence beats 5 jobs reviewing 1 PR each — shared system prompt cached.

alarm

Spend alerts

Anthropic console · set a monthly soft + hard limit. Slack webhook on threshold = no surprise invoice.

git diff origin/main | timeout 600s npx -y @anthropic-ai/claude-code -p \
  --model haiku \
  --max-turns 6 \
  "review the changes in this PR for QA risk only.
   focus on: flaky waits, hard-coded data, missing assertions.
   skip style nits. cap report at 8 bullets."
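
The caching guardrail above can be sanity-checked with a few lines of arithmetic. The sketch below compares the input cost of a shared prompt prefix across several runs, in units of "base input tokens". The multipliers (2.0 for a 1h cache write, 0.1 for a cache read) are assumptions to verify against Anthropic's current pricing page, and the token counts are invented.

```typescript
// Assumed multipliers relative to the base input-token price (verify against current pricing).
const CACHE_WRITE_1H_MULTIPLIER = 2.0;
const CACHE_READ_MULTIPLIER = 0.1;

// Cost of the shared prefix across `runs` jobs with no caching.
function prefixCostUncached(prefixTokens: number, runs: number): number {
  return prefixTokens * Math.max(runs, 0);
}

// Cost when the first run writes the cache and later runs read it.
function prefixCostCached(prefixTokens: number, runs: number): number {
  if (runs <= 0) {
    return 0;
  }
  const firstRun = prefixTokens * CACHE_WRITE_1H_MULTIPLIER;
  const laterRuns = prefixTokens * CACHE_READ_MULTIPLIER * (runs - 1);
  return firstRun + laterRuns;
}

const sharedPrefixTokens = 8000; // hypothetical reviewer prompt + CLAUDE.md
const runsWithinTheHour = 10;

console.log("uncached: " + prefixCostUncached(sharedPrefixTokens, runsWithinTheHour));
console.log("cached:   " + prefixCostCached(sharedPrefixTokens, runsWithinTheHour));
```

Under these assumptions a single isolated run costs more with caching than without, because the write is priced above a normal read; the saving only appears once several jobs reuse the same prefix inside the cache window, which is also the argument for batching reviews.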

Plan first. Then let it loose.

plan mode

Read-only thinking

Toggle with ⇧Tab. Claude reads, searches, designs — but cannot Edit / Write / shell-mutate. Approve the plan, then exit and execute.

large refactors · scary migrations ·
unfamiliar codebases · auth changes

worktree

Isolated sandbox

Spin a git worktree on a temp branch. Agent works there. If it goes sideways, you delete the dir. Your main checkout never moves.

> work on this in a worktree
  so my dev server keeps running

Together these are the two safety belts that let you run the agent autonomously on a long task while you go to lunch. Come back to a green PR.

Things that will save you a workday.

@filename · Reference a file inline. Tab-completes paths.

# · Quick-add a CLAUDE.md note from the prompt.

⇧Tab · Cycle permission modes (normal → auto-accept edits → plan).

Esc · Cancel current tool call without killing session.

Esc Esc · Rewind: pick an older message and branch.

/compact · Summarise the conversation when the token budget gets tight.

!cmd · Shell passthrough, no agent involvement.

/cost · See spend before continuing the next sweep.

/resume · Pick up yesterday's session by id.

--continue · Reopen the most recent session in this directory.

--dangerously-skip-permissions · YOLO mode for sandboxed containers only.

claude -p "…" · One-shot run, prints answer, exits. CI-friendly.

Make your own /commands

Your team. Your verbs.

Drop a markdown file in .claude/commands/. The filename becomes the command. Inside, write the prompt template — with placeholders. Share it via git so every QA in your team has the same playbook.

---
description: Hunt a flaky Playwright spec
argument-hint: [spec path]
---

You are debugging a flaky Playwright test.

Spec path: $ARGUMENTS

Do these in order:
1. Read the spec end-to-end.
2. Find every waitForTimeout, sleep,
   hard-coded delay, or networkidle wait.
3. Replace with expect.poll / explicit
   element wait.
4. Run it --repeat-each=20.
5. Report flake rate before vs after as a table.

Stop and ask if any locator looks brittle.

> /flaky tests/checkout.spec.ts

// claude expands the template, fills
// $ARGUMENTS, and runs the playbook
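
The expansion step is plain text substitution, which the following sketch imitates: drop the frontmatter, then replace every `$ARGUMENTS` with what the user typed after the command. This is an illustration of the idea under those assumptions, with a hypothetical helper name, not the CLI's actual template engine.

```typescript
// Hypothetical expander: strips the frontmatter block, then substitutes $ARGUMENTS.
function expandCommandTemplate(fileText: string, args: string): string {
  const body = fileText.replace(/^---\n[\s\S]*?\n---\n/, "");
  // split/join avoids the special meaning of "$" in String.replace replacements.
  return body.split("$ARGUMENTS").join(args).trim();
}

const flakyCommandFile = [
  "---",
  "description: Hunt a flaky Playwright spec",
  "---",
  "",
  "Spec path: $ARGUMENTS",
  "Run it --repeat-each=20.",
].join("\n");

console.log(expandCommandTemplate(flakyCommandFile, "tests/checkout.spec.ts"));
```

Seeing the command as a template makes it easier to review: the markdown file in git is the whole behaviour, so a change to the team playbook is an ordinary diff.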

Useful QA commands to author

/flaky · /smoke · /bug-from-trace · /audit-locators · /gen-pom · /triage-failures · /report-run · /quarantine

Tip · commit them to the repo. Onboarding new QAs becomes a one-line README.

A QA day · before vs after.

Before

09:00 · Stand-up. 4 flaky tests overnight.

09:30 · Open Jira. Read AC. Translate to Gherkin by hand.

10:30 · Hunt the locator. Tweak the wait. Re-run. Cry.

12:00 · Lunch · still 3 flakes.

14:00 · Write spec body. Stack-overflow the matcher.

16:00 · Smoke breaks. Bisect commits manually.

17:30 · File bug. Attach screenshot. Update Confluence.

18:30 · Out the door. Backlog grew.

After

09:00 · Stand-up. /triage-failures ran overnight.

09:15 · /gen-tests QA-482 · Gherkin + skeleton in 2 min.

09:45 · /flaky checkout.spec.ts · agent fixes, retries ×20.

11:00 · Review the diff. Approve. PR opened.

12:00 · Lunch · backlog smaller.

14:00 · Exploratory session with Playwright MCP.

15:00 · /bug-from-trace · 3 bugs filed in 5 min.

16:00 · Deep work: architecture, risk, mentoring.

17:00 · Out the door. Slept fine.

One session, every job vs one agent, one job

Vibe Coding vs Software Factory.

Pile every role onto one Claude session and you get context chaos — research notes, frontend code, QA assertions, reviewer remarks all bleed into the same buffer. The fix · split roles across small, focused agents in a pipeline. Each agent owns one job, with clean context, and hands off to the next.

Fig 5 · Vibe coding (one session does everything · context chaos) vs Software Factory (one agent per role · clean handoff)

What this means for QA

vibe

Symptom · main session is doing too much

The same Claude turn just researched the Jira, wrote Playwright tests, reviewed your migration, and drafted the PR description. By turn 30 it's forgotten which framework you're using.

factory

Fix · pipeline with explicit roles

Researcher subagent reads AC → Spec subagent writes Gherkin → Tests subagent writes Playwright specs → Validator subagent runs the suite + reports. Each gets its own context.

> Build a software factory pipeline for QA-482.
  Step 1 · researcher subagent · pull AC from Jira, summarise.
  Step 2 · spec subagent · turn AC into Gherkin scenarios.
  Step 3 · test subagent · write Playwright specs from Gherkin.
  Step 4 · validator subagent · run the suite, report pass/fail.
  Each step in its own context · pass only the artefact forward.
  Stop and ask if any step output looks off.
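The "pass only the artefact forward" rule can be sketched in a few lines of TypeScript. This is a toy model under stated assumptions: the stage bodies are stand-in functions, not real subagents, and the acceptance-criteria text is invented for the example.

```typescript
// A stage takes the previous artefact (plain text here) and returns the next one.
// Stage names mirror the prompt above; the bodies are stand-ins, not real subagents.
type Stage = { name: string; run: (artefact: string) => string };

function runPipeline(stages: Stage[], input: string): { artefact: string; log: string[] } {
  const log: string[] = [];
  let artefact = input;
  for (const stage of stages) {
    // Each stage sees only the artefact handed to it, never the earlier stages' working notes.
    artefact = stage.run(artefact);
    log.push(`${stage.name} -> ${artefact.length} chars`);
  }
  return { artefact, log };
}

const factoryStages: Stage[] = [
  { name: "researcher", run: (ticket) => `AC: user can pay with a saved card (${ticket})` },
  { name: "spec", run: (ac) => `Scenario: pay with saved card\n  # from ${ac}` },
  { name: "tests", run: (gherkin) => `test('pay with saved card', async () => {}); // ${gherkin.split("\n")[0]}` },
  { name: "validator", run: (spec) => (spec.includes("test(") ? "PASS: 1 spec found" : "FAIL: no spec") },
];

const factoryResult = runPipeline(factoryStages, "QA-482");
console.log(factoryResult.artefact);
console.log(factoryResult.log.join("\n"));
```

The design point is the function signature: a stage cannot read anything except its input, which is what keeps each context clean.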

Rule of thumb · if your main session has scrolled past 4 unrelated topics, you've slipped into Vibe Coding. /compact or split into subagents.

When your repo has 12 apps and 8 test suites

Claude Code in a monorepo.

One Claude session can scope multiple packages via /add-dir. A per-package CLAUDE.md layers package-specific rules on top of the root one. Per-package skills live under apps/<app>/.claude/skills/ (or packages/<pkg>/.claude/skills/). Run diff-aware tests so you don't smoke-test the whole world for a 1-file fix.

## Repo · Turborepo monorepo

Layout:
  apps/web/        Next.js app
  apps/admin/      admin dashboard
  apps/api/        Express API
  packages/ui/     shared components
  packages/qa/     shared Playwright fixtures

Each app has its own CLAUDE.md.
ALWAYS read the package-level CLAUDE.md
before editing inside that package.

Diff-aware test command:
  turbo test --filter='[origin/main]'

CI runs only changed packages.

> /add-dir apps/web
> /add-dir packages/qa

> the new apps/web/PricingCard component
  reuses fixtures from packages/qa. Read
  both CLAUDE.md files, then write a Playwright
  spec at apps/web/tests/pricing.spec.ts
  that imports the shared checkoutFixture.
  Run only that one spec:

  turbo test --filter=web -- --grep=pricing

Tip · per-package .claude/skills/ beats root-level — keeps locator policy local.

Monorepo cookbook

Pattern	Recipe
Diff-aware CI	`turbo test --filter='[origin/main]'` · only packages that changed.
Per-package skills	`apps/web/.claude/skills/locator-auditor/` · stays out of `apps/api`.
Shared fixtures	`packages/qa` · import via `@org/qa-fixtures` in every test app.
Cross-package refactor	Run Claude with `/add-dir` on every affected package · one PR, one diff.
Selective Playwright projects	`playwright.config.ts` · one project per app, filter via `--project=web`.
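The idea behind diff-aware CI can be sketched as a path-to-package lookup. This is a simplified illustration, not how Turborepo computes `--filter='[origin/main]'` (which also follows the dependency graph); `changedPackages` is a hypothetical helper and the file paths are made up.

```typescript
// Package roots from the layout above; in a real repo you would read these from the workspace config.
const packageRoots = ["apps/web", "apps/admin", "apps/api", "packages/ui", "packages/qa"];

/** Map changed file paths to the set of package roots that contain them. */
function changedPackages(changedFiles: string[], roots: string[]): string[] {
  const hit = new Set<string>();
  for (const file of changedFiles) {
    const root = roots.find((r) => file === r || file.startsWith(r + "/"));
    if (root !== undefined) hit.add(root);
  }
  return Array.from(hit).sort();
}

const touched = changedPackages(
  ["apps/web/components/PricingCard.tsx", "packages/qa/fixtures/checkout.ts", "README.md"],
  packageRoots,
);
console.log(touched); // apps/web and packages/qa; README.md belongs to no package
```

Only the packages in the result need their tests run, which is the saving the cookbook row describes.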

Migration playbook · don't rewrite by hand

From Cypress · Selenium · TestCafe → Playwright.

Four-step playbook. Claude does most of the mechanical work · you review the diff. Cite the selenium-to-playwright skill from qaskills.sh in your prompt and it follows that skill's translation map.

Step 1

Audit

Claude inventories every spec, groups by complexity, flags brittle ones.

▸

Step 2

Convert

Spec-by-spec mechanical translation · locators, waits, hooks.

▸

Step 3

Quarantine

Anything Claude isn't 95% sure about gets an @flaky tag for human review.

▸

Step 4

Cut over

Run both suites on CI for 1 week · drop the old one when the results delta is 0.
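One way to define "delta = 0" for the cut-over step is a spec-by-spec comparison of the two suites' outcomes. The sketch below assumes each run can be reduced to a map of spec title to outcome; `suiteDelta` and the sample data are hypothetical.

```typescript
type Outcome = "passed" | "failed";
type SuiteRun = Record<string, Outcome>; // spec title -> outcome

/** List specs whose outcome differs between the old and new suites, or that exist in only one. */
function suiteDelta(oldRun: SuiteRun, newRun: SuiteRun): string[] {
  const titles = new Set([...Object.keys(oldRun), ...Object.keys(newRun)]);
  const delta: string[] = [];
  titles.forEach((title) => {
    if (oldRun[title] !== newRun[title]) {
      delta.push(`${title}: old=${oldRun[title] ?? "missing"} new=${newRun[title] ?? "missing"}`);
    }
  });
  return delta.sort();
}

// Made-up results for one CI run of each suite.
const cypressRun: SuiteRun = { login: "passed", checkout: "passed", search: "failed" };
const playwrightRun: SuiteRun = { login: "passed", checkout: "failed", search: "failed" };

const cutOverDelta = suiteDelta(cypressRun, playwrightRun);
console.log(cutOverDelta);
console.log(cutOverDelta.length === 0 ? "safe to drop the old suite" : "keep both suites running");
```

A spec that fails in both suites does not count as a delta here; it is a product or test bug, not a migration regression.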

Locator translation matrix

Cypress / Selenium	Playwright	Why
`cy.get('[data-test=login]')`	`page.getByTestId('login')`	Built-in test-id locator · set `testIdAttribute: 'data-test'` in the config, since the default attribute is `data-testid`.
`cy.contains('Submit').click()`	`page.getByRole('button', { name: 'Submit' }).click()`	Role-based · screen-reader-equivalent.
`cy.intercept('POST', '/api/users')`	`page.route('**/api/users', …)`	Native request interception · the glob matches the full URL.
`cy.wait('@createUser')`	`await page.waitForResponse(r => r.url().includes('/users'))`	Promise-based · no aliases.
`driver.findElement(By.xpath('//button[…]'))`	`page.getByRole('button', { name: /…/ })`	Auto-wait + a11y tree.
`Thread.sleep(2000)`	`await expect(locator).toBeVisible()`	Auto-waiting · no sleeps.
`beforeEach` hook	`test.beforeEach` hook · or a named fixture	Fixtures compose better.
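To show what "mechanical translation" means, here is a toy rule-driven pass over two rows of the matrix plus fixed numeric waits. It is a sketch under loud assumptions: real migrations need an AST rather than regexes, the rules are mine and not the skill's translation map, and `translateLine` is a hypothetical name.

```typescript
// Toy rewrite rules. They only show the shape of a rule-driven pass; they are not
// the selenium-to-playwright skill's map and will miss most real-world code.
const translationRules: Array<{ pattern: RegExp; replacement: string }> = [
  { pattern: /cy\.get\('\[data-test=([\w-]+)\]'\)/g, replacement: "page.getByTestId('$1')" },
  {
    pattern: /cy\.contains\('([^']+)'\)\.click\(\)/g,
    replacement: "await page.getByRole('button', { name: '$1' }).click()",
  },
  { pattern: /cy\.wait\(\d+\);?/g, replacement: "// TODO(@flaky): replace fixed wait with an expect-based wait" },
];

/** Apply every rule in order to one line of Cypress source. */
function translateLine(line: string): string {
  return translationRules.reduce((out, rule) => out.replace(rule.pattern, rule.replacement), line);
}

const cypressLines = ["cy.get('[data-test=login]')", "cy.contains('Submit').click()", "cy.wait(2000);"];
const playwrightLines = cypressLines.map(translateLine);
console.log(playwrightLines.join("\n"));
```

Anything the rules do not recognise passes through unchanged, which is where the "tag @flaky + add a TODO" rule from the prompt earns its keep.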

> migrate cypress/e2e/ to Playwright at tests/e2e/.
  Use the selenium-to-playwright skill from qaskills.sh.
  Rules:
    - Use getByRole / getByLabel / getByTestId first.
    - No waitForTimeout. Replace with expect-based waits.
    - cy.intercept → page.route. cy.task → fixture method.
    - Anything you're not 95% sure about, tag @flaky + add a TODO.
  After conversion, run the new suite locally and report:
    - total specs converted
    - pass count / fail count / @flaky count
    - diff in run-time vs old Cypress suite

pitfall

Aliased intercepts

Cypress @alias doesn't map cleanly · use waitForResponse with URL matcher instead.

pitfall

Custom commands

Cypress.Commands.add → Playwright fixtures · not 1:1. Build a fixture file.

pitfall

iframe handling

Cypress can't deeply traverse iframes natively; Playwright can via frameLocator. Better outcome.

Numbers that drive QA decisions

Flake rate · MTBF · p95 — let Claude do the math.

Claude can ingest your Playwright JSON reporter output, compute the stability metrics that matter, and rank specs by flake risk. Wire it into CI · fail the build if flake rate goes above 2%.

The four numbers

flake rate

Failed-then-passed-on-retry %

If > 2% on main, your suite is rotting. Track per-spec and per-package.

MTBF

Mean time between flakes

Runs between flake events. Useful for noisy specs · alert when it drops.

p95

p95 duration

95th percentile spec runtime. Bisect anything spiking 2× week-over-week.

pass-rate

Pass rate over 30 runs

If < 98% on a non-WIP spec, quarantine or fix · don't ship more on top.
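The four numbers are simple enough to compute by hand. The sketch below assumes each spec's history is a list of runs with a status and a duration; the `RunResult` shape, the function names, and the sample data are all hypothetical, and whether a flaky run counts as a pass is a choice you should make explicitly.

```typescript
// One entry per CI run of a single spec. "flaky" = failed first, passed on retry.
type RunResult = { status: "passed" | "failed" | "flaky"; durationMs: number };

const countStatus = (runs: RunResult[], status: RunResult["status"]): number =>
  runs.filter((r) => r.status === status).length;

/** Percentage of runs that only passed after a retry. */
function flakeRate(runs: RunResult[]): number {
  return runs.length === 0 ? 0 : (countStatus(runs, "flaky") * 100) / runs.length;
}

/** Percentage of clean first-attempt passes (assumption: a flaky run does not count as a pass). */
function passRate(runs: RunResult[]): number {
  return runs.length === 0 ? 0 : (countStatus(runs, "passed") * 100) / runs.length;
}

/** Mean runs between flake events; Infinity when the spec never flaked. */
function mtbfRuns(runs: RunResult[]): number {
  const flakes = countStatus(runs, "flaky");
  return flakes === 0 ? Infinity : runs.length / flakes;
}

/** Nearest-rank 95th percentile of run duration in milliseconds. */
function p95(runs: RunResult[]): number {
  if (runs.length === 0) return 0;
  const sorted = runs.map((r) => r.durationMs).sort((a, b) => a - b);
  return sorted[Math.ceil((95 * sorted.length) / 100) - 1] ?? 0;
}

// Twenty made-up runs: durations 100..2000 ms, one flake, one hard failure.
const history: RunResult[] = Array.from({ length: 20 }, (_, i): RunResult => ({
  status: i === 4 ? "flaky" : i === 11 ? "failed" : "passed",
  durationMs: (i + 1) * 100,
}));

console.log({
  flakeRate: flakeRate(history), // 5
  passRate: passRate(history), // 90
  mtbfRuns: mtbfRuns(history), // 20
  p95Ms: p95(history), // 1900
});
```

With these sample runs the spec breaches both thresholds from the cards above: flake rate over 2% and pass rate under 98%.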

---
description: Read the last N days of Playwright
  JSON reports, compute flake rate, rank
  flakiest specs, propose fixes for top 5.
argument-hint: [days]
---

You have $ARGUMENTS days of JSON reports
under .playwright/reports/.

For each spec, compute:
  - total runs
  - failed-then-passed-on-retry count
  - flake rate (= failed-then-passed count / total runs * 100)
  - p95 duration

Sort by flake rate desc.
For the top 5, read the spec file and
propose 1-line fixes.

Output as a markdown table + a fix list.

# after the playwright job
- name: Flake gate
  run: |
    rate=$(node scripts/flake-rate.js)
    echo "flake rate: ${rate}%"
    if (( $(echo "$rate > 2.0" | bc -l) )); then
      echo "::error::Flake rate ${rate}% exceeds 2%"
      exit 1
    fi

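// scripts/flake-rate.js · run from the repo root, next to playwright-report.json
const { suites } = require(require('node:path').resolve('playwright-report.json'));
// Suites nest (file → describe blocks), so collect specs recursively.
const walk = s => [...(s.specs ?? []), ...(s.suites ?? []).flatMap(walk)];
const tests = suites.flatMap(walk).flatMap(spec => spec.tests);
// The JSON reporter marks a test "flaky" when it failed first and passed on retry.
const flaky = tests.filter(t => t.status === 'flaky').length;
console.log((tests.length ? (flaky / tests.length) * 100 : 0).toFixed(2));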

> /report-flake 7
  also: produce a 5-bullet exec summary for the QA channel —
  start with the headline number, then the biggest mover,
  then top-3 specs to fix this week. Format for Slack.

When your test framework has no public docs

Teach Claude your in-house harness.

Three ways to teach Claude a framework it's never seen — pick one or stack all three. Pattern works for any internal CLI / DSL / fixture system.

1 · CLAUDE.md

Document the rules

Locator policy, fixture system, naming conventions in a "## AcmeTest" section of CLAUDE.md. Loaded into context at the start of every session.

2 · examples

Show input → output pairs

Drop 8-12 example specs in .claude/skills/acmetest/examples/. Claude few-shots from them.

3 · custom MCP

Wrap your CLI

~50 lines of JSON-RPC turns acme-test run --spec X into a real MCP tool Claude can call.

## AcmeTest · our in-house test harness

Locators are NEVER raw CSS. Always use:
  acme.locate('@:')

Fixtures live in tests/_fixtures/ and are
auto-injected · NEVER instantiate manually.

Setup hook is beforeScenario, not beforeEach.

Assertions use acme.expect(x).toMatchSnapshot();
DO NOT import @playwright/test.

CLI:
  acme-test run --spec=tests/login.acme
  acme-test list-fixtures
  acme-test snapshot --update --spec=…

Reports land in .acme/reports/<ts>/.
Trace files at .acme/traces/<ts>/<spec>.trace.

Minimal custom MCP wrapper (Node)

// ES module · assumes @modelcontextprotocol/sdk 1.x and zod are installed.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

// execFile passes arguments as an array, so a spec path is never parsed by a shell.
const run = promisify(execFile);
const text = (stdout) => ({ content: [{ type: 'text', text: stdout }] });

const server = new McpServer({ name: 'acme-test', version: '1.0.0' });

// Tool 1 · run one spec and hand the JSON report back to Claude.
server.tool('acme_run', 'Run an AcmeTest spec and return the JSON report.',
  { spec: z.string() },
  async ({ spec }) => text((await run('acme-test', ['run', `--spec=${spec}`, '--json'])).stdout));

// Tool 2 · no inputs, just list what fixtures exist.
server.tool('acme_list_fixtures', 'List all available fixtures.',
  async () => text((await run('acme-test', ['list-fixtures'])).stdout));

// Serve over stdio so Claude Code can launch this file as a local process.
await server.connect(new StdioServerTransport());

{
  "mcpServers": {
    "acme-test": {
      "command": "node",
      "args": [".claude/mcp/acme-test/server.js"]
    }
  }
}
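For a sense of what travels over stdio once the server is registered, here is a sketch of the JSON-RPC 2.0 messages for one `tools/call`. The tool name and spec path come from the example above; the `id` value and the report text are arbitrary, and the helper names are hypothetical.

```typescript
// Message shapes follow the MCP tools/call convention over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: "2.0";
  id: number;
  result: { content: Array<{ type: "text"; text: string }>; isError?: boolean };
}

/** Build the request a client sends when the model decides to call a tool. */
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

/** Join the text parts of a tool result into one string. */
function textOf(response: ToolCallResponse): string {
  return response.result.content.map((part) => part.text).join("\n");
}

const request = buildToolCall(1, "acme_run", { spec: "tests/login.acme" });
console.log(JSON.stringify(request));

// A made-up reply, shaped like what the wrapper above returns.
const reply: ToolCallResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: { content: [{ type: "text", text: '{"passed":3,"failed":0}' }] },
};
console.log(textOf(reply));
```

The SDK builds and parses these messages for you; the sketch is only meant to demystify the "JSON-RPC" part.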

Day-by-day playbook for week one

Onboard a new QA to Claude Code in 5 days.

Hand a printable checklist to every new hire. By Friday they ship their first PR. By week two they're authoring custom skills.

Day	Task	Outcome
Mon · Setup	`npm i -g @anthropic-ai/claude-code` · clone the test repo · `/init` · read root `CLAUDE.md` end-to-end · install team skills via `npx @qaskills/cli add <skill>`	Working environment · CLAUDE.md memorised · 6 team skills installed
Tue · Read	Pair with a senior on `/explore` · ask Claude to walk the test directory · run `/flaky tests/checkout.spec.ts` together · watch the workflow	Understands repo shape · knows where fixtures + POM live
Wed · First PR	Pick a Jira ticket from the "good-first-bug" lane · ask Claude to generate Gherkin + Playwright skeleton · finish, run, push · `/review` before opening the PR	First PR opened with green CI
Thu · Tools	Author a personal custom command in `.claude/commands/me/<name>.md` · try Plan mode (`⇧Tab`) on a scarier task · run `/security-review` on yesterday's PR	Knows custom commands · plan mode · security gates
Fri · Shadow	Shadow the bug-bash · use Vision (drop screenshots) to repro 3 bugs · file each via Jira MCP · pair on `/ultrareview` for the senior's PR	Familiar with Vision · Jira MCP · review workflows

The buddy prompt template

> /btw I'm a new QA on this repo. I'm trying to understand
  why tests/checkout/payment.spec.ts uses a custom
  fixture instead of the shared one in packages/qa.
  Read both files, give me a 4-bullet explanation a junior
  tester can follow. Do not change any code.

# /btw runs the question as a side-thread so the senior's
# main session isn't polluted.

stumble

"It edited the wrong file"

Forgot to /clear between tasks · old context bled in. Always clear between unrelated scopes.

stumble

"My CI bill went up"

Ran --continue on yesterday's huge session. Use /compact instead, or start fresh.

stumble

"The agent is hallucinating tests"

No CLAUDE.md rules on locators. Spend 10 min writing them · save 10 hours of fights.

Week-2 KPIs (manager checklist)

✓ Authored at least one custom slash command in .claude/commands/

✓ Shipped 3+ PRs with /review + /security-review green

✓ Filed 5+ bugs with Vision + Jira MCP · no manual repro steps

✓ Can explain to a peer what a Skill is and which 3 their team relies on

✓ Knows when to use plan mode vs auto-accept vs worktrees

Hands-on · run these now

Five drills plus a stretch. Do them in order.

01 · 5 min

First contact

Pick any repo. Run claude. Type /init. Read the generated CLAUDE.md. Edit it to reflect your rules.

02 · 10 min

Codegen a spec

Ask Claude to drive saucedemo.com via Playwright MCP, complete a checkout, and emit a spec. Run it. Expect green on first try.

03 · 10 min

Tame a flake

Find a spec with a waitForTimeout. Ask Claude to remove all sleeps and prove stability with --repeat-each=20.

04 · 10 min

Bug from screenshot

Drop a UI bug screenshot. Ask Claude to repro at the right viewport, capture a trace, and draft a Jira-ready bug write-up.

05 · 20 min

Portfolio scaffold

Run the mega-prompt from the project chapter. End with a working localhost:3000 and a passing Playwright suite.

06 · stretch

Author /flaky

Build the custom /flaky command in .claude/commands/. Commit it. Run it on a real spec. Share with a teammate.

Capstone · channels → skill → site → live URL

Your channels become your portfolio.

You already produce QA content across 6 channels. We'll teach Claude to read them, package you as a Skill, generate a portfolio site, and ship it to a live URL (pramod.thetestingacademy.com in this walkthrough). The deck itself is published to app.thetestingacademy.com/masterclass/ClaudeCode.html.

GitHub LinkedIn Blog YouTube Instagram Medium

Four prompts. Zero hand-coding. Ends with a live URL + Playwright suite + Lighthouse gate + CI pipeline.

Test engineer.
Builder of QA crews.

240k+ engineers learn QA from my YouTube. I break, write, ship, repeat.

Playwright · AI · Selenium · K6

The Testing Academy · 240k YouTube · 12k newsletter.

Open-source · github.com/promode · 30+ repos.

Talks · Selenium Conf · STPCon · meetups.

Writing · Medium · personal blog · LinkedIn.

Step 1 · package yourself as a Skill

Aggregate 6 channels into one Skill.

The Skill becomes the single source of truth — every later prompt reads from it. Whenever you publish new content, you re-run the Skill and the site updates itself.

> create a new skill at .claude/skills/pramod-me/SKILL.md that
  packages me as data. Aggregate the following channels and produce
  a single data.json file in the same folder:

  - GitHub      → https://github.com/promode
  - LinkedIn    → https://www.linkedin.com/in/thetestingacademy
  - Blog        → https://scrolltest.com
  - YouTube     → https://www.youtube.com/@TheTestingAcademy
  - Instagram   → https://instagram.com/thetestingacademy
  - Medium      → https://medium.com/@thetestingacademy

  For each channel:
    1. Fetch the public page with WebFetch.
    2. Extract: handle, follower count if visible, top 6 items
       (repo / post / video / reel / article) with title, url, date.
    3. Pull a one-line bio + headshot URL where available.

  Then in SKILL.md frontmatter:
    name: pramod-me
    description: Use whenever the user needs current data about
      Pramod Dutta — bio, top repos, latest videos, recent talks,
      newest blog posts — to render a personal site or CV.
    allowed-tools: WebFetch, Read, Write, Bash

  Procedure section must:
    - Read data.json if <= 7 days old, otherwise refresh from sources.
    - Expose 5 helpers: getBio(), getTopRepos(), getLatestVideos(),
      getRecentPosts(), getTalks().
    - Cache to data.json. Pretty-print, ASCII-only, sorted by date desc.

  Verify by running the skill end-to-end and pasting a summary of
  the JSON it produced. Stop if any fetch fails — ask before retrying.
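The two cache rules in the Procedure section ("<= 7 days old" and "sorted by date desc") are easy to pin down in code. This is a minimal sketch, assuming ISO date strings in data.json; `isFresh`, `newestFirst`, and the sample items are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

/** True when the cache was written at most `maxAgeDays` ago. */
function isFresh(cachedAt: Date, now: Date, maxAgeDays = 7): boolean {
  return now.getTime() - cachedAt.getTime() <= maxAgeDays * DAY_MS;
}

type ChannelItem = { title: string; url: string; date: string }; // ISO date, e.g. "2025-01-31"

/** Sort newest first without mutating the input. ISO dates sort correctly as strings. */
function newestFirst(items: ChannelItem[]): ChannelItem[] {
  return [...items].sort((a, b) => b.date.localeCompare(a.date));
}

// Made-up timestamps and items for the example.
const writtenAt = new Date("2025-01-01T00:00:00Z");
console.log(isFresh(writtenAt, new Date("2025-01-08T00:00:00Z"))); // true: exactly 7 days old
console.log(isFresh(writtenAt, new Date("2025-01-08T00:00:01Z"))); // false: one second past 7 days

const sortedItems = newestFirst([
  { title: "older post", url: "https://example.com/a", date: "2024-11-02" },
  { title: "newer post", url: "https://example.com/b", date: "2025-01-15" },
]);
console.log(sortedItems.map((item) => item.title));
```

Writing the boundary down ("exactly 7 days still counts as fresh") removes one thing the agent would otherwise have to guess.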

Step 2 · scaffold + content + style

One mega-prompt → full Next.js site.

> using the pramod-me skill, scaffold a Next.js 15 portfolio at
  ./qa-portfolio. Requirements:

  STACK
    - Next 15 (app router), TypeScript strict, Tailwind, MDX.
    - Inter (sans) + JetBrains Mono. Teal accent #1a7c79.
    - Light theme, doc-style layout. No purple gradients, no neon.

  ROUTES
    /              hero · headline · 3 chip tags · CTA "watch on YouTube"
    /work          top 6 repos pulled from getTopRepos()
    /talks         talks + slide decks
    /writing       latest blog + Medium articles via getRecentPosts()
    /videos        embed latest 6 YouTube videos via getLatestVideos()
    /social        LinkedIn / Instagram cards, last 3 posts each
    /contact       email · calendar embed · X / GitHub links

  COMPONENTS
    - <ChannelStrip /> at footer · 6 icons → 6 URLs.
    - <Card /> with title, date, source-badge, hover lift.
    - <Hero /> · pulls from getBio() at build time.

  DATA
    - Read from .claude/skills/pramod-me/data.json.
    - Static rendering (server components) for /work /talks /writing /videos;
      the app router has no getStaticProps.
    - ISR every 24h on prod via export const revalidate = 86400.

  SEO + META
    - OpenGraph image generated dynamically per route.
    - JSON-LD Person schema on home.
    - sitemap.xml + robots.txt.

  After scaffold finishes, run pnpm dev and confirm the
  home page renders without console errors. Take a screenshot at
  390x844 and 1440x900, save them as docs/hero-mobile.png
  and docs/hero-desktop.png. Then stop and report.
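The "JSON-LD Person schema on home" requirement boils down to one small object. The sketch below uses the schema.org Person vocabulary; the `Bio` shape and `personJsonLd` are hypothetical stand-ins for whatever getBio() ends up returning.

```typescript
// Hypothetical shape; the real getBio() output depends on how the skill is written.
type Bio = { name: string; jobTitle: string; url: string; channels: string[] };

/** Serialise a bio as a schema.org Person, ready for a <script type="application/ld+json"> tag. */
function personJsonLd(bio: Bio): string {
  return JSON.stringify(
    {
      "@context": "https://schema.org",
      "@type": "Person",
      name: bio.name,
      jobTitle: bio.jobTitle,
      url: bio.url,
      sameAs: bio.channels,
    },
    null,
    2,
  );
}

const jsonLd = personJsonLd({
  name: "Pramod Dutta",
  jobTitle: "Test engineer",
  url: "https://pramod.thetestingacademy.com",
  channels: ["https://github.com/promode", "https://www.youtube.com/@TheTestingAcademy"],
});
console.log(jsonLd);
```

`sameAs` is where the six channel URLs belong; it tells crawlers those profiles are the same person.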

Step 3 · Playwright + Lighthouse + GHA

Cover every route. Gate every merge.

> add a Playwright suite + a CI pipeline to ./qa-portfolio.

  PLAYWRIGHT
    - Install @playwright/test + @axe-core/playwright.
    - 3 projects: mobile (390x844 · Pixel), tablet (768x1024 · iPad),
      desktop (1440x900 · Chromium).
    - Suites:
        tests/smoke/      every route returns 200 + has <h1>
        tests/a11y/       AxeBuilder.analyze() on every route, 0 critical violations
        tests/visual/     screenshot per route, threshold 0.2
        tests/links/      crawl all links from /, fail on 404
        tests/seo/        meta description + og:image present per route
    - Trace on retry, screenshot always, video on failure.
    - HTML reporter on CI.

  LIGHTHOUSE
    - lhci autorun on prod URL after deploy.
    - Gates: perf 90, a11y 95, best-practices 95, seo 100.
    - Fail the job below threshold.

  CI · .github/workflows/qa.yml
    on: pull_request, push to main
    jobs:
      lint        · eslint + prettier
      typecheck   · tsc --noEmit
      e2e         · playwright on 3 viewports
      lighthouse  · lhci autorun
      build       · pnpm build
    matrix the e2e job by project. shard 4 ways.

  Run the full suite locally. If anything fails, fix it.
  When green, commit each layer as its own conventional commit.
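"Matrix the e2e job by project, shard 4 ways" means one CI job per (project, shard) pair. A minimal sketch of that expansion, using Playwright's `--project` and `--shard` flags; `e2eMatrix` is a hypothetical helper and the project names are the three from the prompt.

```typescript
const e2eProjects = ["mobile", "tablet", "desktop"];

/** One CLI invocation per (project, shard) pair. */
function e2eMatrix(projectNames: string[], shardTotal: number): string[] {
  const commands: string[] = [];
  for (const project of projectNames) {
    for (let shard = 1; shard <= shardTotal; shard++) {
      commands.push(`npx playwright test --project=${project} --shard=${shard}/${shardTotal}`);
    }
  }
  return commands;
}

const e2eJobs = e2eMatrix(e2eProjects, 4);
console.log(e2eJobs.length); // 3 projects x 4 shards = 12 jobs
console.log(e2eJobs[0]);
```

Twelve parallel jobs is the cost side of this setup; check it against your CI minutes before copying the prompt as-is.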

Step 4 · Vercel + Testing Academy upload

From `localhost` to thetestingacademy.com.

> deploy ./qa-portfolio to vercel.
  - run vercel login if needed (hand
    me the OTP prompt).
  - project name: pramod-qa
  - link to repo github.com/promode/qa-portfolio
  - env vars (encrypted):
      ANTHROPIC_API_KEY
      YOUTUBE_API_KEY
      MEDIUM_RSS_URL
  - run vercel --prod.
  - print preview + prod URLs.
  - run playwright suite against
    the PROD url; paste HTML report.
  - finally, attach custom domain:
      pramod.thetestingacademy.com

> publish this masterclass deck to
  app.thetestingacademy.com/masterclass
  /ClaudeCode.html.

  Steps:
  - rename index.html → ClaudeCode.html.
  - SSH/SFTP to the app host using
    creds from ~/.ssh/tta_deploy.
  - place under
    /var/www/app/masterclass/.
  - chmod 644, chown www-data.
  - reload nginx.
  - verify with curl + a Playwright
    snapshot of the live URL.
  - commit the deck source to
    github.com/thetestingacademy/
    masterclass-decks on a new
    branch · open a PR.

Live URL after this prompt	Owner	Purpose
pramod.thetestingacademy.com	Vercel	Personal portfolio · auto-rebuild on push.
app.thetestingacademy.com/masterclass/ClaudeCode.html	Nginx	This deck · shareable for the class.
github.com/promode/qa-portfolio	GitHub	Source · CI gates every PR.

Live demo · fill the form → see your portfolio appear

Try the portfolio generator right here.

Fill in your channels. Click Generate. A complete single-file portfolio renders in the preview pane on the right. Download it, deploy it. This is the same HTML Claude Code produces in the capstone — only here it runs in your browser so you can preview it instantly.

Identity

Photo (optional · used as avatar) · Full name · Handle · Tagline · Location · Short bio

Stats

Followers · Repos · Talks · Years XP

Channels

GitHub URL · LinkedIn URL · Blog URL · YouTube URL · Instagram URL · Medium URL · X / Twitter URL · Email · Skill chips (comma-separated) · Pinned (3 lines · title | one-line desc | url)


After download · ship it to Vercel

# 1. install vercel cli once
npm i -g vercel

# 2. make a folder, drop the file in
mkdir my-qa-portfolio && cd my-qa-portfolio
mv ~/Downloads/portfolio.html ./index.html

# 3. push to vercel · prod
vercel --prod

# follow prompts → pick project name → done.
# you get a https URL in ~20s.

# add a custom domain
vercel domains add pramod.thetestingacademy.com
vercel alias set <deployment-url> pramod.thetestingacademy.com

# add CI: redeploy on every push
git init && git add . && git commit -m "init"
gh repo create pramod-qa --public --source=. --push
vercel git connect

# once the repo is connected, vercel
# rebuilds on every push to main.

Pro tip · the generator runs in this page — no server, no signup. Same template Claude Code produces during the capstone.

Print-friendly reference · pin this above your desk

Claude Code · full cheat sheet.

Every keybinding, slash command, skill knob, hook tag, MCP transport, env var. Lifted from Claude Code v2.1+ release notes. Use ⌘P to print this page; the grid below collapses cleanly to A4.

⌨ Keyboard shortcuts

General controls

Ctrl+C · Cancel input / generation

Ctrl+D · Exit session

Ctrl+L · Clear prompt input + full-screen redraw

Ctrl+O · Toggle transcript viewer (verbose tool usage); cycles focus in fullscreen

Ctrl+U · Clear entire input buffer

Ctrl+Y · Restore cleared input buffer

Ctrl+R · Reverse search history

Ctrl+G · Open in editor (same as Ctrl+X Ctrl+E)

Ctrl+X Ctrl+K · Kill all background agents (twice to confirm)

Ctrl+B · Background running tasks

Esc Esc · Rewind or summarize

Mode switching

⇧Tab · Cycle permission modes (default → acceptEdits → plan → …)

Mac · Option as Meta

⌥M · Switch model

⌥T · Toggle extended thinking

⌥0 · Toggle fast mode

Input

\ Enter · Newline

v · Vim visual mode (char select + operators)

V · Vim visual-line mode

Prefixes

/ · Slash command

! · Direct bash

@ · File mention + autocomplete

# · Append note to CLAUDE.md

/ Slash commands · complete list

Session

/help · List every command + skill

/init · Create CLAUDE.md from repo

/clear · Wipe conversation, keep cwd

/compact · Summarise history, free context

/resume · Resume old session by id

/status · Session state + token budget

/cost · Spend so far this session

/export · Save conversation to file

/rename · Rename current session

/btw [q] · Side question without polluting context

/insights · Analyze sessions report

/extra-usage · Extra usage when rate-limited

/usage · Per-category cost breakdown

Config & tools

/config · Open settings · theme, model, perms

/model · Switch model mid-session

/permissions · Manage allowed tools / commands

/add-dir · Bring another folder into scope

/memory · Edit CLAUDE.md inline

/hooks · Manage hooks

/mcp · Manage MCP servers

/agents · Manage agent configs

/skills · List available skills

/keybindings · Customize keyboard shortcuts

/terminal-setup · Configure terminal keybindings

/scroll-speed · Adjust output scroll speed

/ide · IDE integrations status

/voice · Push-to-talk voice dictation

/doctor · Diagnose installation

/upgrade · Update CLI to latest

/release-notes · What's new in your version

/feedback · Submit feedback (alias /bug)

/desktop · Continue in Desktop app

/login · /logout · Auth handling

Review

/review [PR] · Review PR locally

/ultrareview [PR#] · Cloud multi-agent review

/security-review · Scan diff for vulnerabilities

/code-review · Effort levels: low / med / high

/pr-comments · Pull PR review comments

Scheduling & remote

/loop [interval] [prompt] · Recurring task

--remote · Web session via claude.ai

cat file | claude -p · Pipe input · headless one-shot

⚙ Config & env

Config files

~/.claude/settings.json · User settings

.claude/settings.json · Project shared settings

.claude/settings.local.json · Local only · gitignored

~/.claude.json · OAuth · MCP · state

.mcp.json · Project MCP servers

managed-settings.d/ · Drop-in policy fragments

Key settings

modelOverrides · Map model picker → custom IDs

autoMode.hard_deny · Unconditional auto-mode deny rules

hooks: if · Conditional hooks (permission rule syntax)

DISABLE_PROMPT_CACHING · Warn at startup if cache disabled

Monitor tool · Stream events from background scripts

PermissionDenied · Hook · auto-model denial

showThinkingSummaries · Opt-in (off by default)

hooks: "defer" · Pause headless → resume later

type: "mcp_tool" · Hook step invokes MCP tool

continueOnBlock · Hook config · keep running after blocked tool

disableSkillShellExec · Block !`cmd`

refreshInterval · Re-run custom status N sec

Key env vars

ANTHROPIC_API_KEY · API key

ANTHROPIC_MODEL · Default model

ANTHROPIC_BASE_URL · Proxy / gateway override

ANTHROPIC_BETAS · Additional beta headers

ANTHROPIC_CUSTOM_MODEL_OPTION · Custom /model entry

MAX_THINKING_TOKENS · 0 = off

ENABLE_PROMPT_CACHING_1H · Opt into 1h cache TTL

Memory & CLAUDE.md

./CLAUDE.md · Project (team-shared)

./CLAUDE.local.md · Local personal notes · gitignored

~/.claude/CLAUDE.md · Personal · all projects

/etc/claude-code/CLAUDE.md · Managed policy (Linux/WSL · org-wide)

MCP transport flags

--transport http · Remote HTTP (recommended)

--transport stdio · Local process

--transport sse · Remote SSE

⚡ Skills & agents

Built-in skills

Skill tool · Discovers built-in slash commands (/init, /review, /security-review…)

/code-review · Code review · low / med / high effort

/batch · Large parallel changes · 5–30 worktrees

/debug [desc] · Troubleshoot from debug log

/loop [interval] · Recurring scheduled task

/claude-api · Load API + SDK reference

Custom skill locations

.claude/skills/<name>/ · Project skills

~/.claude/skills/<name>/ · Personal skills

Skill frontmatter

description · Auto-invocation trigger phrase

allowed-tools · Skip permission prompts

model · Override model for skill

effort · Override effort level

paths: [globs] · Path-specific (YAML list)

context: fork · Run in subagent

$ARGUMENTS · User input placeholder

${CLAUDE_SKILL_DIR} · Skill's own dir

${CLAUDE_EFFORT} · Current effort level (skill var)

!`cmd` · Dynamic context injection

plugin bin/ · Ship executables for Bash tool

Built-in agents

Explore · Read-only Haiku · locator agent

Plan · Research for plan mode

General · Full tools · complex tasks

Bash · Terminal separate context

Agent frontmatter

permissionMode · default / acceptEdits / plan / dontAsk / bypassPermissions

isolation: worktree · Run in git worktree

memory: user|project|local · Persistent memory scope

background: true · Background task

maxTurns · Limit agentic turns

initialPrompt · Auto-submit first turn

SendMessage · Resume agents (replaces resume)

@agent-name · Mention named subagent

Tip · ⌘P / Ctrl P to print this card. Survives A4 with 4-column → 2-column auto-collapse.

Don't do these. Ever.

⊘ no

Letting the agent run on main

Use worktrees or a feature branch. Never let auto-mode write to your protected branch.

⊘ no

Trusting "I ran the tests" without proof

Ask for the exit code or the report path. Agents can hallucinate green.

⊘ no

One giant prompt for everything

Break the task. Plan first. Execute second. Verify third. Smaller turns = better diffs.

⊘ no

Skipping CLAUDE.md

Without it the agent guesses your conventions and you fight it every turn. Spend 10 min, save 10 hours.

⊘ no

Pasting secrets in prompts

Use env vars. Use 1Password CLI. Never put a token where the agent — or its logs — can see it raw.

⊘ no

Auto-merging Claude's PRs

You read the diff. Always. The agent is the writer; you are the editor.

The new QA toolbelt.

cli

Claude Code

The conductor. Lives in your terminal, reads your repo, fires every other tool.

runner

Playwright

The hands. Drives browsers and APIs. Default for E2E + API.

mcp

Playwright MCP

Bridge between agent and runner. Auto-locators via a11y tree.

mcp

Atlassian MCP

Jira + Confluence. Read AC, file bugs, post run reports.

vcs

GitHub + gh

PRs, checks, releases. Claude calls gh directly — no extra MCP.

host

Vercel

Preview-per-PR. Run Playwright against the preview before merge.

obs

Trace + Lighthouse

Visual evidence + perf gates. Claude attaches both to every PR.

brain

You

Strategy. Risk. Judgment. The only thing the agent can't replace.

Where to go next.

docs

docs.claude.com / code

Official reference. Always the source of truth for flags + tools.

repo

github · anthropics / claude-code

Issues, recipes, plugin authoring. Star it.

mcp

modelcontextprotocol.io

Spec + server registry. Find an MCP for almost anything.

play

playwright.dev

Auto-waiting, fixtures, traces. Master these and Claude works for you.

comm

The Testing Academy

Newsletter, courses, the community where we keep learning together.

try

Your repo · tonight

Open a real QA repo. Type claude. Type /init. Begin.

END · Q&A · thetestingacademy.com


Now stop
typing tests.
Start
writing intent.

Speaker · Pramod Dutta

Brand · The Testing Academy

Mail · thetestingacademy@gmail.com

Status · Class dismissed.
