The Ultimate Guide to Playwright Website Testing

Playwright has become one of the strongest choices for end-to-end testing because it treats the browser like a real user environment instead of a mocked HTTP client. It can drive Chromium, Firefox, and WebKit; record traces; emulate mobile devices; intercept network calls; test multiple tabs; and run the same suite locally, in CI, or through an AI-assisted MCP workflow.

This guide covers a production-ready Playwright setup for website testing: the latest CLI workflow, local debugging, GitHub Actions, the Playwright MCP Server, and complex login scenarios with identity providers such as Microsoft Entra ID, Okta, Auth0, Keycloak, and similar SSO platforms.

1. Why Playwright for Website Testing?

Playwright is useful when your tests need to verify the actual browser experience:

Navigation, redirects, cookies, storage, sessions, and tabs
Cross-browser behavior across Chromium, Firefox, and WebKit
Component interactions that depend on JavaScript and browser APIs
Screenshots, videos, traces, and HTML reports for debugging
Authentication flows that involve redirects to external identity providers
CI/CD quality gates before deployment

It is especially strong for modern applications built with frameworks such as Astro, Next.js, React, Vue, Svelte, Angular, Spring Boot frontends, Laravel, Django, Rails, and static-site generators.

2. Install Playwright with the Latest CLI

For a new project, use the latest initializer:

npm init playwright@latest

For an existing project:

npm install -D @playwright/test
npx playwright install

On Linux CI runners, install browser system dependencies as well:

npx playwright install --with-deps

The initializer typically creates:

playwright.config.ts
tests/
tests-examples/

Common CLI commands:

Command	Purpose
`npx playwright test`	Run the full test suite
`npx playwright test --ui`	Open the interactive test runner
`npx playwright test --headed`	Run with visible browsers
`npx playwright test --debug`	Step through tests with inspector
`npx playwright codegen https://example.com`	Generate tests by recording browser actions
`npx playwright show-report`	Open the HTML report
`npx playwright show-trace trace.zip`	Inspect a trace file

For day-to-day development, --ui, --debug, and codegen are the fastest ways to build reliable tests without guessing selectors.

3. Recommended Project Structure

A maintainable Playwright suite should separate test intent from reusable helpers:

tests/
  smoke/
    homepage.spec.ts
    navigation.spec.ts
  auth/
    entra-login.spec.ts
    okta-login.spec.ts
  checkout/
    cart.spec.ts
  fixtures/
    authenticated-page.ts
  pages/
    LoginPage.ts
    DashboardPage.ts
  storage/
    user.json
playwright.config.ts

Use this structure as the suite grows:

smoke/ for fast critical-path checks
auth/ for identity-provider login and session tests
pages/ for page-object abstractions when flows become repetitive
fixtures/ for shared authenticated contexts and test setup
storage/ for generated storage state files, usually ignored by Git

Keep tests readable. A test should explain the user journey, not the low-level browser mechanics.
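
As an illustration of the pages/ folder, a page object can hide the locators behind intent-level methods. This is a minimal sketch, not a prescribed layout: the /login route, the "Email" and "Password" labels, and the "Sign in" button name are all hypothetical and would need to match your own form.

```typescript
// tests/pages/LoginPage.ts
import { type Locator, type Page } from "@playwright/test";

export class LoginPage {
  readonly username: Locator;
  readonly password: Locator;
  readonly submit: Locator;

  constructor(private readonly page: Page) {
    // Hypothetical labels and button name; adjust to your login form.
    this.username = page.getByLabel("Email");
    this.password = page.getByLabel("Password");
    this.submit = page.getByRole("button", { name: "Sign in" });
  }

  // Relative URL, resolved against baseURL from playwright.config.ts.
  async goto(): Promise<void> {
    await this.page.goto("/login");
  }

  // One intent-level step instead of three low-level actions in every test.
  async login(user: string, pass: string): Promise<void> {
    await this.username.fill(user);
    await this.password.fill(pass);
    await this.submit.click();
  }
}
```

A spec would then read as a user journey: create new LoginPage(page), call goto() and login(...), and assert on what the user should see next.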

4. Core Configuration

A production Playwright configuration usually defines:

The base URL for local and CI environments
Browser projects
Retries in CI
Trace/video/screenshot behavior
Web server startup for local app testing
Reporters for humans and CI systems

An example playwright.config.ts covering these settings:

import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  timeout: 30_000,
  expect: {
    timeout: 5_000,
  },
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: process.env.CI ? [["github"], ["html"]] : "html",
  use: {
    baseURL: process.env.BASE_URL || "http://localhost:4321",
    trace: "on-first-retry",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
    { name: "mobile-chrome", use: { ...devices["Pixel 5"] } },
  ],
  webServer: {
    command: "npm run dev",
    url: "http://localhost:4321",
    reuseExistingServer: !process.env.CI,
  },
});

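For a static site, you may prefer testing the production build. Serve it on the port your baseURL expects, and run the server in the background (or in a second terminal) so the test command can start. Locally, the config above reuses a server that is already running on that port:

npm run build
npx serve dist -l 4321 &
npx playwright test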

Testing the built output catches asset, routing, sitemap, search-index, and static-rendering problems that dev servers can hide.
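
If you would rather have Playwright manage the build and the server itself, the webServer block can run both. A minimal sketch, assuming the build writes static files to dist/ and that the serve package is reachable through npx; the port simply mirrors the baseURL used earlier.

```typescript
// playwright.config.ts (fragment): test the production build instead of the dev server.
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: { baseURL: "http://localhost:4321" },
  webServer: {
    // Assumption: "npm run build" emits the site into dist/.
    command: "npm run build && npx serve dist -l 4321",
    url: "http://localhost:4321",
    reuseExistingServer: !process.env.CI,
    // Builds can take longer than the default 60 s startup timeout.
    timeout: 120_000,
  },
});
```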

5. Writing Reliable Tests

Good Playwright tests use user-facing selectors first:

await page.getByRole("link", { name: "Docs" }).click();
await expect(
  page.getByRole("heading", { name: "Documentation" })
).toBeVisible();

Selector priority:

getByRole() for accessible UI
getByLabel() for form fields
getByText() for stable visible text
getByTestId() for elements without accessible names
CSS selectors only when necessary

Avoid brittle selectors such as deeply nested CSS paths or generated class names. If a test fails because a button moved in the DOM, the selector was probably too coupled to implementation details.

6. Authentication Strategy: Do Not Test the Identity Provider Every Time

Enterprise login is the hardest part of browser testing. Microsoft Entra ID, Okta, Auth0, Keycloak, Ping Identity, and similar providers introduce redirects, MFA prompts, bot detection, device trust policies, conditional access rules, and rate limits.

The key principle:

Test your application’s authentication integration, but do not make every test depend on a live identity-provider login.

Use three layers:

Layer	Purpose	Frequency
Mocked or seeded auth state	Fast app behavior tests	Every PR
Real login smoke test	Verify SSO integration still works	Limited PR/nightly
Manual or secure environment validation	MFA, device trust, conditional access	Release or scheduled

Most suites should authenticate once, save the browser storage state, and reuse it across tests.
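
One way to keep the layers apart is to tag the real-login test so CI can include or exclude it. The sketch below assumes a Playwright version that supports the tag option (1.42 or later); the /dashboard route, the "Dashboard" heading, and the file name are hypothetical.

```typescript
// tests/auth/tiers.spec.ts (hypothetical file name)
import { test, expect } from "@playwright/test";

// Layer 1: runs on every PR with a saved session and never touches the IdP.
test.describe("dashboard with seeded session", () => {
  test.use({ storageState: "tests/storage/user.json" });

  test("shows the dashboard heading", async ({ page }) => {
    await page.goto("/dashboard");
    await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
  });
});

// Layer 2: real SSO round trip, tagged so it can be scheduled separately.
test("real SSO login smoke test", { tag: "@idp" }, async ({ page }) => {
  test.skip(!process.env.TEST_USERNAME, "IdP credentials are not configured");
  await page.goto("/dashboard");
  // ...provider-specific login steps go here...
  await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
});
```

With this split, a PR job could run npx playwright test --grep-invert @idp, while a nightly job runs npx playwright test --grep @idp.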

7. Storage State for Authenticated Sessions

Playwright can save cookies and local storage after login:

await page.context().storageState({ path: "tests/storage/user.json" });

Then tests can reuse that state:

test.use({ storageState: "tests/storage/user.json" });

Recommended flow:

Run a setup project that logs in once.
Save the storage state.
Run authenticated tests using that storage state.
Regenerate state when it expires.
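
The first two steps can be sketched as a setup file. This is an illustration under assumptions rather than a drop-in implementation: the /login route, the field labels, and the "Dashboard" heading are hypothetical, and TEST_USERNAME and TEST_PASSWORD are expected to come from the environment.

```typescript
// tests/auth.setup.ts: log in once and save the session for other projects.
import { test as setup, expect } from "@playwright/test";

const authFile = "tests/storage/user.json";

setup("authenticate", async ({ page }) => {
  const username = process.env.TEST_USERNAME;
  const password = process.env.TEST_PASSWORD;
  if (!username || !password) {
    throw new Error("TEST_USERNAME and TEST_PASSWORD must be set");
  }

  // Hypothetical route, labels, and button name; adjust to your application.
  await page.goto("/login");
  await page.getByLabel("Email").fill(username);
  await page.getByLabel("Password").fill(password);
  await page.getByRole("button", { name: "Sign in" }).click();

  // Wait until the app has actually created the session before saving it.
  await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
  await page.context().storageState({ path: authFile });
});
```

In playwright.config.ts, a project such as { name: "setup", testMatch: /.*\.setup\.ts/ } would run this file, and each browser project would list dependencies: ["setup"] and point use.storageState at the saved file.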

Never commit real session cookies or tokens. Add generated storage files to .gitignore unless they contain only local mock data.
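
To decide when to regenerate the state, a small helper can inspect the saved file before the suite runs. The sketch below relies only on the cookies array that Playwright writes into a storage state file; the cookie name you check (for example "session") is an assumption about your application.

```typescript
import { existsSync, readFileSync } from "node:fs";

// Subset of the cookie fields found in a Playwright storage state file.
interface StoredCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie
}

interface StorageState {
  cookies: StoredCookie[];
}

/**
 * Returns true when the named cookie exists and is still valid
 * `marginSeconds` after `nowMs`. Session cookies (expires === -1) carry
 * no recorded expiry, so they are treated as fresh.
 */
function isCookieFresh(
  state: StorageState,
  cookieName: string,
  nowMs: number = Date.now(),
  marginSeconds = 300,
): boolean {
  const cookie = state.cookies.find((c) => c.name === cookieName);
  if (!cookie) return false;
  if (cookie.expires === -1) return true;
  return cookie.expires > nowMs / 1000 + marginSeconds;
}

/** Loads a storage state file and checks one cookie; a missing file counts as stale. */
function isStorageStateFresh(path: string, cookieName: string): boolean {
  if (!existsSync(path)) return false;
  const state = JSON.parse(readFileSync(path, "utf8")) as StorageState;
  return isCookieFresh(state, cookieName);
}
```

A setup step could call isStorageStateFresh("tests/storage/user.json", "session") and skip the login when it returns true.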

8. Microsoft Entra ID Login Flows

Microsoft Entra ID login flows commonly include:

Redirect from the application to login.microsoftonline.com
Username entry
Password entry
MFA or number matching
“Stay signed in?” prompt
Redirect back to the application callback URL
Application session creation
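
The steps above might translate into a single smoke test along these lines. Treat every selector as an assumption: the loginfmt and passwd field names and the "Next", "Sign in", and "No" button labels reflect the hosted sign-in page as commonly rendered, and tenant branding, language, or policy can change them. The sketch also assumes MFA is not required for the test account and that /dashboard is a protected route in your app.

```typescript
// tests/auth/entra-login.spec.ts
import { test, expect } from "@playwright/test";

test("full Entra ID redirect login", async ({ page }) => {
  const username = process.env.TEST_USERNAME ?? "";
  const password = process.env.TEST_PASSWORD ?? "";
  test.skip(!username || !password, "Entra test credentials are not configured");

  // Assumption: visiting a protected route triggers the redirect to the IdP.
  await page.goto("/dashboard");
  await page.waitForURL(/login\.microsoftonline\.com/);

  // Assumed field names and button labels on the hosted sign-in page.
  await page.locator('input[name="loginfmt"]').fill(username);
  await page.getByRole("button", { name: "Next", exact: true }).click();
  await page.locator('input[name="passwd"]').fill(password);
  await page.getByRole("button", { name: "Sign in", exact: true }).click();

  // "Stay signed in?" prompt; declining keeps the test session short-lived.
  await page.getByRole("button", { name: "No", exact: true }).click();

  // Back in the application: assert on app state, not on the IdP.
  // "Dashboard" is a hypothetical heading.
  await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
});
```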

Practical recommendations:

Use a dedicated test tenant or test app registration.
Use a dedicated test user with least privilege.
Disable MFA only in non-production test tenants when policy allows it.
Prefer service-owned test accounts and short-lived secrets.
Store credentials in CI secrets, not in source code.
Use a nightly real-login test if Conditional Access makes PR testing unstable.

For apps using OAuth/OIDC, many teams test most behavior by creating a valid app session through a backend test endpoint or seeded database state, then reserve one Playwright test for the full Entra redirect flow.
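
A backend-created session could be captured like this. The /api/test/session endpoint, its payload, the x-test-secret header, and the TEST_SESSION_SECRET variable are all hypothetical: they stand in for whatever test-only hook your backend exposes, and such a hook should never exist in production.

```typescript
// tests/auth/api-session.setup.ts
import { test as setup, expect } from "@playwright/test";

const authFile = "tests/storage/user.json";

setup("create an app session without the IdP", async ({ request }) => {
  // Hypothetical test-only endpoint that answers with Set-Cookie for a seeded user.
  const response = await request.post("/api/test/session", {
    data: { role: "standard-user" },
    headers: { "x-test-secret": process.env.TEST_SESSION_SECRET ?? "" },
  });
  expect(response.ok()).toBeTruthy();

  // Cookies set by the response live in the request context and are saved here.
  await request.storageState({ path: authFile });
});
```

Tests that load this storage state start already signed in, so only the dedicated redirect test depends on the identity provider being reachable.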

If MFA is required, do not automate personal MFA devices. Use a controlled test policy, a service-owned test account, or a dedicated pre-authenticated test environment.

9. Okta Login Flows

Okta flows are similar but often include organization-specific policies:

Custom Okta domain redirects
Identifier-first login
Password entry
Okta Verify push, TOTP, WebAuthn, or SMS factors
App assignment checks
Redirect back through OIDC or SAML

Recommendations:

Use a dedicated Okta application for tests.
Use a dedicated test group and test user.
Keep app assignment and factor policies deterministic.
Avoid shared human accounts.
Prefer API-created users for isolated test environments.
Run full Okta login smoke tests separately from fast PR checks.

For SAML applications, validate both sides: the browser redirect flow and the application session that is created after the SAML response is consumed.

10. Handling MFA, Captchas, and Conditional Access

Some authentication steps are intentionally hard to automate. That is a security feature, not a Playwright limitation.

Use this decision model:

Scenario	Recommended approach
MFA disabled in test tenant	Automate full login in setup
MFA required by policy	Run limited scheduled tests or use controlled test factors
Captcha appears	Do not bypass; use a test environment without captcha
Conditional Access varies by IP/device	Use stable CI runners or a dedicated test policy
Passwordless/WebAuthn required	Prefer mocked app sessions for PR tests and manual release validation
External IdP rate limits login	Reuse storage state and reduce login frequency

Do not weaken production authentication to make tests easier. Instead, design a test environment with explicit policies for automation.

11. GitHub Actions for Playwright

A practical GitHub Actions workflow should:

Install dependencies from a lockfile
Install Playwright browsers
Build or start the application
Run tests
Upload the HTML report as an artifact, even when tests fail

Example workflow:

name: Playwright Tests

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  playwright:
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Run Playwright tests
        run: npx playwright test
        env:
          BASE_URL: http://localhost:4321
          TEST_USERNAME: ${{ secrets.TEST_USERNAME }}
          TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}

      - name: Upload Playwright report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7

If your Playwright config starts the app using webServer, the workflow does not need a separate server step. If you start the app manually, make sure the workflow waits until the URL is ready before running tests.
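
For the manual case, the wait can be a small polling helper. This is a sketch that assumes a Node version with a global fetch (18 or later); the timeout and interval defaults are arbitrary choices, not recommendations from Playwright.

```typescript
/**
 * Polls `url` until it answers with a non-error status or `timeoutMs` elapses.
 * Connection errors are expected while the server is still starting.
 */
async function waitForUrl(url: string, timeoutMs = 60_000, intervalMs = 500): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  let lastError: unknown = new Error("no attempt made");

  while (Date.now() < deadline) {
    try {
      const response = await fetch(url);
      if (response.status < 400) return; // server is up and serving
      lastError = new Error(`HTTP ${response.status}`);
    } catch (error) {
      lastError = error; // e.g. connection refused
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Timed out waiting for ${url}: ${String(lastError)}`);
}
```

A global setup file or a short script step could call waitForUrl("http://localhost:4321") before the tests begin.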

12. Sharding for Faster CI

Large suites should be sharded across multiple jobs:

strategy:
  fail-fast: false
  matrix:
    shardIndex: [1, 2, 3, 4]
    shardTotal: [4]

steps:
  - run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}

Sharding reduces feedback time, but it requires tests to be independent. Avoid shared mutable users, shared carts, shared orders, or shared global state unless each shard receives isolated test data.
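
One simple way to give each shard and worker its own data is to derive identifiers from the run, shard, and worker. The helper below is an illustrative sketch; the naming scheme and the example.test domain are assumptions.

```typescript
/**
 * Builds an e-mail address that is unique per CI run, shard, and worker,
 * so parallel tests never share the same user record.
 */
function uniqueTestEmail(
  runId: string,
  shardIndex: number,
  workerIndex: number,
  domain = "example.test",
): string {
  // Keep only characters that are safe in the local part of an address.
  const safeRun = runId.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return `e2e-${safeRun}-s${shardIndex}-w${workerIndex}@${domain}`;
}
```

Inside a test, the worker index is available as test.info().workerIndex, GITHUB_RUN_ID can serve as the run ID on GitHub Actions, and the shard index could be passed in through an environment variable set from matrix.shardIndex (the variable name is up to you).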

13. Playwright MCP Server

The Playwright MCP Server exposes browser automation capabilities through the Model Context Protocol. It lets compatible AI agents inspect pages, click elements, type text, capture snapshots, and verify flows using a real browser session.

Install and run it with the current package:

npx @playwright/mcp@latest

Typical MCP client configuration:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Use cases:

Ask an agent to explore a website and identify broken flows
Generate Playwright test drafts from real interactions
Debug selector problems with accessibility snapshots
Reproduce bugs in a browser instead of describing them manually
Validate UI changes during development

For CI, keep deterministic Playwright tests as the source of truth. Use MCP to accelerate exploration, debugging, and test authoring.

14. AI-Assisted Test Authoring Workflow

A strong workflow combines MCP exploration with committed Playwright tests:

Start the local app.
Connect an MCP-capable agent to Playwright.
Ask the agent to navigate the user journey.
Convert the observed flow into a Playwright test.
Replace fragile selectors with role-based locators.
Run the test locally with npx playwright test --debug.
Commit the deterministic test.
Run it in GitHub Actions.

MCP is excellent for discovery. Your repository should still contain plain Playwright tests that humans can review, version, and run without an AI agent.

15. Testing Complex User Journeys

Complex scenarios need deliberate test-data design:

Scenario	Pattern
Login then dashboard	Save storage state in setup
Multi-role authorization	Create separate storage states per role
Checkout or payment	Use sandbox payment providers
File upload	Store test fixtures in the repo
Email verification	Use test inbox APIs or backend test hooks
Multi-tab flows	Use Playwright context and page events
External redirects	Assert both redirect URL and final app state
Feature flags	Set flags explicitly per test environment

The best tests are independent, repeatable, and explicit about the state they require.
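
As an example of the multi-tab pattern, the test below waits for the page event on the browser context. The "Help" link and the /docs/ URL pattern are hypothetical; the important detail is that the wait starts before the click that opens the tab.

```typescript
import { test, expect } from "@playwright/test";

test("help link opens documentation in a new tab", async ({ page, context }) => {
  await page.goto("/");

  // Start listening first, so the new page cannot be missed.
  const newPagePromise = context.waitForEvent("page");
  // Hypothetical link that opens with target="_blank".
  await page.getByRole("link", { name: "Help" }).click();
  const helpPage = await newPagePromise;

  // Assert on the new tab, then confirm the original tab is untouched.
  await expect(helpPage).toHaveURL(/docs/);
  await expect(page).toHaveURL("/");
});
```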

16. Security Best Practices

End-to-end tests often touch sensitive systems. Treat test automation like production code:

Store secrets only in GitHub Actions secrets or a secure vault.
Use dedicated test accounts with least privilege.
Rotate test credentials regularly.
Never commit storage state containing real tokens.
Avoid logging passwords, cookies, authorization headers, or ID tokens.
Separate production and test identity-provider tenants when possible.
Use short-lived environments for high-risk flows.
Review traces before uploading them publicly, because traces can contain URLs, form values, screenshots, and network metadata.

For open-source repositories, be extra careful: pull requests from forks do not receive normal repository secrets, and they should not run privileged login flows.
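
If test helpers log request or session data, a redaction step can reduce the chance of secrets reaching CI logs. This is a minimal sketch: the key pattern is an assumption and would need to be extended for your own field names, and it does not scan string values.

```typescript
// Keys that look sensitive; matching is case-insensitive and by substring.
const SENSITIVE_KEY = /pass(word)?|secret|token|cookie|authorization/i;

/** Returns a deep copy of `value` with sensitive-looking fields masked. */
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const result: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      result[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redact(inner);
    }
    return result;
  }
  return value; // primitives pass through unchanged
}
```

For example, logging redact(requestInfo) instead of requestInfo keeps field names visible for debugging while masking their values.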

17. Debugging Failures

When a Playwright test fails, inspect artifacts in this order:

HTML report
Trace viewer
Screenshot
Video
Console logs
Network requests

Useful commands:

npx playwright show-report
npx playwright show-trace test-results/path-to-trace.zip
npx playwright test tests/auth/entra-login.spec.ts --headed --debug

The trace viewer is usually the fastest path to the root cause because it shows DOM snapshots, actions, network events, console output, and timing.

18. Common Anti-Patterns

Avoid these mistakes:

Logging in through the IdP before every test
Reusing one shared account across all parallel tests
Depending on test execution order
Using waitForTimeout() instead of web-first assertions
Committing real cookies or tokens
Testing production with destructive test data
Making PR checks depend on MFA prompts
Ignoring accessibility selectors
Treating MCP exploration as a replacement for committed tests

If the suite is flaky, slow, or hard to debug, the problem is usually test data, authentication design, or selectors.
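
The waitForTimeout() item is the easiest to fix mechanically. The sketch below contrasts a fixed sleep with a web-first assertion; the /settings route, the "Save" button, and the confirmation text are hypothetical.

```typescript
import { test, expect } from "@playwright/test";

test("saving settings shows a confirmation", async ({ page }) => {
  await page.goto("/settings");
  await page.getByRole("button", { name: "Save" }).click();

  // Anti-pattern: a fixed sleep is either too short (flaky) or too long (slow).
  // await page.waitForTimeout(3000);

  // Web-first assertion: retries until the text is visible or the expect timeout expires.
  await expect(page.getByText("Settings saved")).toBeVisible();
});
```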

19. Recommended Test Pyramid for Playwright

For most teams:

Many unit and component tests
A focused set of Playwright smoke tests
A smaller set of authenticated role-based journeys
A tiny number of full external IdP login tests
Scheduled deep regression suites

Playwright is powerful, but it should not carry every kind of test. Use it where browser realism matters.

Final Thoughts

Playwright is more than a browser automation library. With the latest CLI, GitHub Actions integration, trace-first debugging, and the Playwright MCP Server, it becomes a complete workflow for building, exploring, validating, and continuously testing real websites.

The most important design choice is authentication strategy. For providers such as Microsoft Entra ID and Okta, automate what is stable, isolate what is risky, and avoid turning every test into a live SSO challenge. A fast suite reuses trusted state; a secure suite protects credentials and tokens; a reliable suite keeps full IdP login checks focused and intentional.