What is Visual Regression Testing?
Visual regression testing is the practice of automatically capturing screenshots of your web application and comparing them against previously approved baseline images to detect unintended visual changes. While functional tests verify that a button works, visual tests verify that the button looks right — correct color, correct position, correct size, no overlapping elements.
Traditional functional test suites can all pass — every assertion green — while the UI is visibly broken. Imagine a CSS change that shifts a navigation bar 40px to the left, or a font-weight change that makes a hero headline barely readable, or a layout reflow on a 375px viewport that stacks elements on top of each other. None of these would fail a Selenium or Playwright assertion that only checks element presence or text content. Visual regression tests catch exactly these categories of defect.
Real examples of UI bugs that slip through functional tests
- Overlapping text: A CSS z-index change causes a modal's close button to render behind the modal backdrop — the button is present in the DOM and clickable via script, but invisible to a real user.
- Missing images: A broken CDN path causes product images to fall back to empty alt text boxes. The product name text still passes assertion.
- Broken layout on small viewport: A flex container overflows at 375px, stacking elements vertically in an unintended order. Desktop tests pass; mobile users see a broken page.
- Color contrast regression: A design system update changes a button's background from #2563eb to #60a5fa — both are "blue," but the lighter shade now fails WCAG AA contrast requirements against white text.
- Font rendering: A web font fails to load, falling back to system serif — the page works functionally but looks completely wrong.
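The contrast example above can be checked with a few lines of arithmetic. The following is a minimal sketch of the WCAG 2.x contrast-ratio formula applied to the two blues against white text; the function names are my own and the 4.5:1 threshold is the AA minimum for normal-size text.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, always >= 1."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG AA minimum for normal-size text

for shade in ("#2563eb", "#60a5fa"):
    ratio = contrast_ratio(shade, "#ffffff")
    verdict = "passes AA" if ratio >= AA_NORMAL_TEXT else "fails AA"
    print(shade, round(ratio, 2), verdict)
```

The darker blue comes out at roughly 5:1 and the lighter one at roughly 2.5:1, which is why the second fails. A check like this works as a functional assertion on design tokens, but it only covers colors you think to test; a visual diff flags the change wherever it appears.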
Architecture Overview
Understanding how visual testing tools work under the hood helps you configure them correctly and debug failures effectively. The typical visual testing pipeline follows this flow:
Test code (Selenium / Playwright / Cypress) → SDK (Eyes / Percy) → Screenshot (full page / viewport) → Comparison (cloud service) → Diff engine (AI / pixel) → Results (dashboard / PR comment)
Your test code instructs the SDK to take a checkpoint screenshot at a named step. The SDK sends the screenshot to a cloud service that stores it and compares it against the baseline for that step name. The diff engine produces a result — pass if within tolerance, fail if visual changes are detected. Results are surfaced in the tool's dashboard and, when integrated with GitHub/GitLab, as PR comments or status checks.
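To make "pass if within tolerance" concrete, here is a toy pixel-diff sketch. It is not how Applitools or Percy compare images internally; it assumes screenshots are already decoded into rows of (r, g, b) tuples, and the names `diff_ratio` and `checkpoint_passes` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (r, g, b) pixels


def diff_ratio(baseline: Image, candidate: Image, channel_threshold: int = 0) -> float:
    """Fraction of pixels with any channel differing by more than channel_threshold."""
    same_shape = len(baseline) == len(candidate) and all(
        len(b_row) == len(c_row) for b_row, c_row in zip(baseline, candidate)
    )
    if not same_shape:
        return 1.0  # in this toy model a size change counts as a full mismatch
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for b_row, c_row in zip(baseline, candidate)
        for b_px, c_px in zip(b_row, c_row)
        if any(abs(b - c) > channel_threshold for b, c in zip(b_px, c_px))
    )
    return changed / total


def checkpoint_passes(baseline: Image, candidate: Image,
                      max_diff_ratio: float = 0.0, channel_threshold: int = 0) -> bool:
    """A checkpoint passes when the changed-pixel fraction stays within tolerance."""
    return diff_ratio(baseline, candidate, channel_threshold) <= max_diff_ratio


# A 2x2 white baseline and a candidate with one pixel turned black.
white = [[(255, 255, 255), (255, 255, 255)], [(255, 255, 255), (255, 255, 255)]]
one_changed = [[(255, 255, 255), (0, 0, 0)], [(255, 255, 255), (255, 255, 255)]]

print(diff_ratio(white, one_changed))
print(checkpoint_passes(white, one_changed, max_diff_ratio=0.3))
```

The two knobs correspond to the two kinds of tolerance you usually tune: how different a single pixel may be, and how many pixels may differ before the checkpoint fails.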
Applitools Eyes Setup with Selenium Python
Applitools Eyes is one of the most feature-rich commercial visual testing platforms. It offers AI-powered comparison, cross-browser rendering via the Ultrafast Test Cloud, and a dashboard for managing baselines.
Installation
pip install eyes-selenium
Basic test structure
from selenium import webdriver
from applitools.selenium import Eyes, Target


class TestVisualLogin:
    def setup_method(self):
        self.driver = webdriver.Chrome()
        self.eyes = Eyes()
        # Placeholder: in CI, load the key from the APPLITOOLS_API_KEY
        # environment variable instead of hardcoding it
        self.eyes.api_key = "YOUR_APPLITOOLS_API_KEY"

    def test_login_page_visual(self):
        self.driver.get("https://example.com/login")

        # Open an Eyes session (app name, test name, viewport size)
        self.eyes.open(
            driver=self.driver,
            app_name="My Web App",
            test_name="Login Page Visual",
            viewport_size={"width": 1280, "height": 800}
        )

        # Take a full-page checkpoint
        self.eyes.check_window("Login Page - Initial State")

        self.driver.find_element("id", "email").send_keys("user@example.com")
        self.driver.find_element("id", "password").send_keys("Password123")

        # Take another checkpoint after filling the form
        self.eyes.check_window("Login Page - Form Filled")

        # Close Eyes and get the test result
        results = self.eyes.close(raise_ex=False)
        assert results.is_passed, f"Visual differences detected: {results.url}"

    def teardown_method(self):
        # Abort the session if the test failed before eyes.close() ran
        self.eyes.abort_if_not_closed()
        self.driver.quit()
The key methods are:
- eyes.open(): Starts an Eyes test session and sets the baseline key (app name + test name + viewport)
- eyes.check_window(): Takes a full-page screenshot at this checkpoint and compares it to the baseline
- eyes.close(raise_ex=False): Ends the session and returns results without throwing; you handle the assertion
- eyes.abort_if_not_closed(): Cleanup in teardown; closes any session left open by a failed test
Checking a specific region
from applitools.selenium import Target, Region

# Check only the header element
self.eyes.check(
    "Header Region",
    Target.region(self.driver.find_element("css selector", "header.site-header"))
)

# Check a specific coordinate region (x, y, width, height)
self.eyes.check(
    "Banner",
    Target.region(Region(0, 0, 1280, 200))
)
Applitools AI Match Levels
The match level controls how strictly Applitools compares the screenshot to the baseline. Choosing the right level for each test is critical to avoiding both false positives and missed regressions.
| Match Level | What It Checks | Best Used For | Tolerance |
|---|---|---|---|
| Exact | Pixel-perfect match — every pixel must be identical | Static images, fixed assets, canvas elements | Zero tolerance |
| Strict | Human-visible changes — catches anything a user would notice | Most UI components — buttons, forms, headers | Low — ignores sub-pixel antialiasing |
| Layout | Structure and layout only — ignores text content and colors | Pages with dynamic content (names, dates, prices) | High — content-agnostic |
| Content | Legacy name for Ignore Colors: like Strict, but color differences are ignored | Older suites that still reference this level | Medium |
| Ignore Colors | Like Strict (content and layout are compared), but color differences are ignored | Dark mode vs light mode comparisons, theme variants | Medium |
from applitools.selenium import MatchLevel, Target

# Set match level on the Eyes instance (applies to all checks)
self.eyes.match_level = MatchLevel.LAYOUT

# Or set it per-checkpoint using Target
self.eyes.check(
    "Dashboard",
    Target.window().fully().match_level(MatchLevel.STRICT)
)
Why Layout mode reduces flakiness: Pages with timestamps, user names, dynamic prices, or advertisement content will always fail Strict comparison after the first run because the content changes. Layout mode verifies that the structural elements are in the right positions without caring about what text they contain. This is the mode I reach for on most dashboard and listing pages where the data changes but the layout should not.
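The difference between the two modes is easier to see in a toy model. The sketch below is not Applitools' algorithm, which works on rendered images and is proprietary; it assumes each page has already been reduced to a list of elements with a role, a bounding box, and text, and the names `Element`, `strict_match`, and `layout_match` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Element:
    role: str                       # e.g. "heading", "price"
    box: Tuple[int, int, int, int]  # x, y, width, height
    text: str


def strict_match(baseline: List[Element], current: List[Element]) -> bool:
    """Everything must be the same, including the text content."""
    return baseline == current


def layout_match(baseline: List[Element], current: List[Element]) -> bool:
    """Only the structure matters: the same roles in the same boxes."""
    return [(e.role, e.box) for e in baseline] == [(e.role, e.box) for e in current]


baseline = [
    Element("heading", (0, 0, 400, 40), "Today's price"),
    Element("price", (0, 50, 120, 30), "$19.99"),
]
# Same layout, but the live price has changed since the baseline was taken.
new_price = [baseline[0], Element("price", (0, 50, 120, 30), "$21.49")]
# Same text, but the price element has been pushed 40px down.
shifted = [baseline[0], Element("price", (0, 90, 120, 30), "$19.99")]

print(strict_match(baseline, new_price), layout_match(baseline, new_price))
print(strict_match(baseline, shifted), layout_match(baseline, shifted))
```

A changing price trips the strict comparison but not the layout one, while a shifted element trips both. That is the trade: Layout stops reporting content churn but still reports structural movement.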
Baseline Management
The baseline is the "approved correct" screenshot that all future runs compare against. Baseline management is where visual testing workflows live or die.
First run — establishing the baseline
The very first time Applitools encounters a new test name and viewport combination, it has no baseline. It accepts the screenshot automatically and creates the baseline. Subsequent runs compare against this accepted screenshot. This means the first run always "passes" — you should review it manually to confirm the initial state is correct before relying on it as a reference.
Accepting and rejecting diffs
When a visual difference is detected, the test shows as "Unresolved" in the Applitools dashboard. You review the diff side-by-side and either:
- Accept (thumbs up): The change was intentional — update the baseline to the new screenshot
- Reject (thumbs down): The change is a bug — mark as failed, team fixes the code
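The first-run and accept/reject rules can be summarized as a small state machine. This is a toy in-memory model, not the Applitools API: `BaselineStore`, its status strings, and the byte-for-byte equality check are all assumptions made for illustration.

```python
from typing import Dict


class BaselineStore:
    """Toy model of the baseline lifecycle: new, passed, unresolved, accept, reject."""

    def __init__(self) -> None:
        self._baselines: Dict[str, bytes] = {}
        self._pending: Dict[str, bytes] = {}

    def check(self, key: str, screenshot: bytes) -> str:
        if key not in self._baselines:
            # First run: nothing to compare against, so the screenshot becomes the baseline.
            self._baselines[key] = screenshot
            return "new"
        if self._baselines[key] == screenshot:
            return "passed"
        # A difference is parked until a human reviews it.
        self._pending[key] = screenshot
        return "unresolved"

    def accept(self, key: str) -> None:
        """Intentional change: promote the pending screenshot to baseline."""
        self._baselines[key] = self._pending.pop(key)

    def reject(self, key: str) -> None:
        """Bug: discard the pending screenshot and keep the old baseline."""
        self._pending.pop(key)


store = BaselineStore()
print(store.check("Login Page - Initial State", b"v1"))
print(store.check("Login Page - Initial State", b"v2"))
store.accept("Login Page - Initial State")
print(store.check("Login Page - Initial State", b"v2"))
```

Note that the "new" result is returned without any review step, which is exactly why the first run deserves a manual look.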
Branching baselines for feature branches
Applitools supports branching baselines that mirror your Git branches. Set the branch name in your test configuration:
# Set branch from environment variable (CI provides this)
import os
self.eyes.branch_name = os.environ.get("BRANCH_NAME", "main")
self.eyes.parent_branch_name = "main"
When a feature branch test first runs, Applitools copies the baseline from the parent branch. Changes made and accepted on the feature branch only affect that branch's baseline — merging the branch to main prompts a baseline merge as well. This prevents feature branches from polluting the main baseline.
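A rough mental model for branch baselines is a per-branch dictionary with a fallback to the parent. The sketch below is a hypothetical in-memory model, not Applitools' implementation; in particular it looks up the parent lazily instead of copying, which gives the same result when reading.

```python
from typing import Dict, Optional


class BranchBaselines:
    """Toy model: each branch has its own baselines and falls back to its parent."""

    def __init__(self) -> None:
        self._by_branch: Dict[str, Dict[str, bytes]] = {}

    def save(self, branch: str, key: str, screenshot: bytes) -> None:
        self._by_branch.setdefault(branch, {})[key] = screenshot

    def resolve(self, branch: str, parent: str, key: str) -> Optional[bytes]:
        own = self._by_branch.get(branch, {})
        if key in own:
            return own[key]  # a baseline accepted on this branch wins
        return self._by_branch.get(parent, {}).get(key)  # otherwise inherit

    def merge(self, branch: str, parent: str) -> None:
        """On Git merge, the branch's accepted baselines replace the parent's."""
        self._by_branch.setdefault(parent, {}).update(self._by_branch.pop(branch, {}))


baselines = BranchBaselines()
baselines.save("main", "Login Page", b"old-design")
baselines.save("feature/redesign", "Login Page", b"new-design")

print(baselines.resolve("feature/redesign", "main", "Login Page"))
print(baselines.resolve("main", "main", "Login Page"))
```

Until `merge` is called, runs on main keep comparing against the old design, so work in progress on the feature branch cannot cause failures elsewhere.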
Percy (BrowserStack) Setup
Percy is BrowserStack's visual testing platform. It integrates tightly with GitHub and GitLab pull request workflows, making it a popular choice for teams already using BrowserStack for cross-browser testing.
Installation
pip install percy-selenium
Basic Percy test with Selenium
from selenium import webdriver
from percy import percy_snapshot


class TestPercyVisual:
    def setup_method(self):
        self.driver = webdriver.Chrome()

    def test_homepage_visual(self):
        self.driver.get("https://example.com")

        # Take a Percy snapshot; the name is the baseline key
        percy_snapshot(self.driver, "Homepage")

        # Navigate to login
        self.driver.find_element("link text", "Sign In").click()
        percy_snapshot(self.driver, "Login Page")

    def teardown_method(self):
        self.driver.quit()
Percy requires the PERCY_TOKEN environment variable to be set. The SDK captures a DOM snapshot of the page and uploads it; Percy then handles cross-browser rendering and comparison in its cloud. There is no client-side baseline comparison: everything happens server-side, and results appear in the Percy dashboard and as PR comments.
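Because the token is read from the environment, it can be worth failing fast when it is missing, before any browser starts. The helper below is a small sketch you might call from a pytest `conftest.py`; `require_env` is a hypothetical name and is not part of the Percy SDK.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of an environment variable, or raise if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the visual tests")
    return value


# Example with an explicit mapping so the snippet runs without a real token.
print(require_env("PERCY_TOKEN", {"PERCY_TOKEN": "example-token"}))
```

Passing the mapping as a parameter keeps the helper easy to test; in real use you would call `require_env("PERCY_TOKEN")` and let it read `os.environ`.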
Running Percy tests
# Set your Percy token
export PERCY_TOKEN=your_percy_token_here
# Percy wraps your test command
npx percy exec -- pytest tests/visual/
Percy GitHub Integration
Percy's most compelling feature for modern teams is its automatic GitHub pull request integration. Once you install the Percy GitHub App on your repository:
- Every PR that triggers Percy tests gets a status check: percy/web
- The status check shows the count of visual changes: "8 visual changes found"
- Clicking "Details" opens the Percy dashboard filtered to that PR's build
- Reviewers see the before/after diff inline and can approve changes with a click
- Once all changes are approved, the Percy status check turns green
- You can configure branch protection to require the Percy check before merging
This workflow integrates visual review into the code review process. A designer or QA engineer can review and approve visual changes in the Percy UI without touching the codebase — the developer gets a clear signal that the visual changes are intentional and approved.
Cypress + Percy Integration
Percy has a first-class Cypress integration that feels native to the Cypress ecosystem:
# Install Percy Cypress SDK
npm install --save-dev @percy/cypress @percy/cli

// In cypress/support/e2e.js (or index.js for older Cypress)
import '@percy/cypress';

// cypress/e2e/visual.cy.js
describe('Visual Regression Tests', () => {
  beforeEach(() => {
    cy.visit('https://example.com');
  });

  it('captures homepage visual snapshot', () => {
    cy.get('[data-testid="hero-section"]').should('be.visible');
    cy.percySnapshot('Homepage - Hero Section');
  });

  it('captures product listing visual snapshot', () => {
    // Relative path: assumes baseUrl is set in the Cypress config
    cy.visit('/products');
    cy.get('.product-grid').should('be.visible');
    cy.percySnapshot('Products Page', { widths: [375, 768, 1280] });
  });

  it('captures navigation states', () => {
    cy.get('.nav-toggle').click();
    cy.get('.site-nav').should('have.class', 'open');
    cy.percySnapshot('Navigation - Mobile Open');
  });
});

# Run with Percy
npx percy exec -- cypress run
The widths option in cy.percySnapshot() tells Percy to render the snapshot at multiple viewport widths in a single run, giving you responsive coverage from a single test call.
Playwright + Applitools
Applitools has a dedicated Playwright SDK that can use the Ultrafast Grid. Instead of capturing a screenshot in each local browser through the driver, the SDK uploads a DOM snapshot of your page, and Applitools renders it in their cloud across all configured browsers in parallel. One test run, multiple browser results.
pip install eyes-playwright
from playwright.sync_api import sync_playwright
from applitools.playwright import (
    Configuration, Eyes, RunnerOptions, Target, VisualGridRunner,
)
from applitools.common import BatchInfo, BrowserType


def test_playwright_visual():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # The Ultrafast Grid is only used when Eyes is created with a
        # VisualGridRunner; a plain Eyes() captures in the local browser only
        runner = VisualGridRunner(RunnerOptions().test_concurrency(5))
        eyes = Eyes(runner)

        config = Configuration()
        config.batch = BatchInfo("Playwright Visual Batch")

        # Add browsers for Ultrafast Grid cross-browser rendering
        config.add_browser(1280, 800, BrowserType.CHROME)
        config.add_browser(1280, 800, BrowserType.FIREFOX)
        config.add_browser(375, 812, BrowserType.SAFARI)
        eyes.set_configuration(config)

        eyes.open(page, "My App", "Playwright Full Page Test")
        page.goto("https://example.com")

        # Check the entire page including below-the-fold content
        eyes.check("Full Page", Target.window().fully())

        page.click('[data-testid="cta-button"]')
        eyes.check("After CTA Click", Target.window())

        results = eyes.close(raise_ex=False)
        browser.close()

        assert results.is_passed
Responsive Visual Testing
Testing only at 1280×800 gives false confidence. Real users access your app on phones, tablets, and wide monitors. Both Applitools and Percy support multi-viewport testing in a single run.
# Applitools: add multiple browsers/viewports to Configuration
from applitools.common import BrowserType, DeviceName, ScreenOrientation

config.add_browser(375, 812, BrowserType.CHROME)    # Mobile portrait
config.add_browser(768, 1024, BrowserType.CHROME)   # Tablet portrait
config.add_browser(1280, 800, BrowserType.CHROME)   # Desktop
config.add_browser(1920, 1080, BrowserType.CHROME)  # Wide desktop

# Add emulated mobile devices
config.add_device_emulation(DeviceName.iPhone_X, ScreenOrientation.PORTRAIT)
config.add_device_emulation(DeviceName.iPad_Pro, ScreenOrientation.LANDSCAPE)

# Percy: specify widths directly in the snapshot call
percy_snapshot(
    driver,
    "Homepage Responsive",
    widths=[375, 768, 1024, 1280, 1920]
)
Visual Testing in CI — GitHub Actions
Visual regression tests only add continuous value when they run on every pull request automatically. Here are complete GitHub Actions workflows for both Applitools and Percy.
Applitools in GitHub Actions
# .github/workflows/visual-applitools.yml
name: Visual Tests - Applitools
on: [pull_request]
jobs:
  visual-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: pip install pytest selenium eyes-selenium
      - name: Install Chrome
        uses: browser-actions/setup-chrome@latest
      - name: Run visual tests
        env:
          APPLITOOLS_API_KEY: ${{ secrets.APPLITOOLS_API_KEY }}
          BRANCH_NAME: ${{ github.head_ref }}
        run: pytest tests/visual/ -v --tb=short
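One detail in this workflow is worth handling in the test code: `github.head_ref` is only populated for pull request events, so BRANCH_NAME is empty if the same job later runs on a push. The sketch below resolves a branch name with fallbacks; `resolve_branch_name` is a hypothetical helper, while GITHUB_HEAD_REF and GITHUB_REF_NAME are default variables provided by GitHub Actions.

```python
import os
from typing import Mapping


def resolve_branch_name(env: Mapping[str, str] = os.environ, default: str = "main") -> str:
    """Pick the first non-empty branch variable, falling back to a default."""
    # Order matters: on pull requests GITHUB_REF_NAME is the merge ref,
    # so the explicit BRANCH_NAME and GITHUB_HEAD_REF are checked first.
    for var in ("BRANCH_NAME", "GITHUB_HEAD_REF", "GITHUB_REF_NAME"):
        value = env.get(var, "").strip()
        if value:
            return value
    return default


# Pull request run: BRANCH_NAME comes from github.head_ref.
print(resolve_branch_name({"BRANCH_NAME": "feature/new-header"}))
# Push run: BRANCH_NAME is empty, so the ref name is used instead.
print(resolve_branch_name({"BRANCH_NAME": "", "GITHUB_REF_NAME": "main"}))
```

The result can then be assigned to `eyes.branch_name` in place of a bare `os.environ.get` call, which would otherwise return an empty string instead of the default when the variable is set but blank.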
Percy in GitHub Actions
# .github/workflows/visual-percy.yml
name: Visual Tests - Percy
on: [pull_request]
jobs:
  percy-visual:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          npm install -g @percy/cli
          pip install pytest selenium percy-selenium
      - name: Install Chrome
        uses: browser-actions/setup-chrome@latest
      - name: Run Percy visual tests
        env:
          PERCY_TOKEN: ${{ secrets.PERCY_TOKEN }}
        run: npx percy exec -- pytest tests/visual/ -v
Tool Comparison — Applitools vs Percy vs BackstopJS vs Chromatic
| Feature | Applitools Eyes | Percy | BackstopJS | Chromatic |
|---|---|---|---|---|
| Comparison Engine | AI (Visual AI) | Pixel diff + rendering | Pixel diff (Resemble.js) | Pixel diff (Storybook-native) |
| Pricing (free tier) | Free: 1 user, limited checkpoints | Free: 5,000 screenshots/month | Free (self-hosted) | Free: 5,000 snapshots/month |
| Framework Support | Selenium, Playwright, Cypress, WebdriverIO, Appium | Selenium, Playwright, Cypress, WebdriverIO, Storybook | Puppeteer, Playwright, Selenium | Storybook (primary), Playwright |
| Cross-browser Cloud | Yes — Ultrafast Grid | Yes — BrowserStack cloud | No — local browsers only | Limited (Chrome/Firefox) |
| CI Integration | GitHub, GitLab, Jenkins, CircleCI | GitHub, GitLab, Bitbucket (native PR comments) | Any (local report generation) | GitHub, GitLab (Storybook PRs) |
| Dynamic Content Handling | Excellent — Layout/Content match levels | Good — ignore regions in config | Manual — ignore regions in JSON config | Limited — best for component isolation |
| Best For | Enterprise, full application visual QA | Teams on BrowserStack, PR-centric review | Budget-conscious teams, self-hosted | Storybook component libraries |
Best Practices
Visual testing brings significant value but also unique challenges. These practices come from running visual test suites in production CI pipelines across multiple projects.
1. Use Layout match level for dynamic content
Any page with timestamps, user-generated content, live prices, or advertisements should use Layout match level. Strict comparison on dynamic content produces a constant stream of false positives that will erode team trust in the visual suite within weeks.
2. Define ignore regions for truly unavoidable dynamic elements
# Applitools: ignore specific elements
from applitools.selenium import MatchLevel, Target

self.eyes.check(
    "Dashboard",
    Target.window()
    .ignore(self.driver.find_element("id", "live-price-ticker"))
    .ignore(self.driver.find_element("css selector", ".ad-banner"))
    .match_level(MatchLevel.STRICT)
)
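Conceptually, an ignore region masks the same rectangle in both images before they are compared. The sketch below shows that idea on a toy pixel grid; it is an illustration under my own assumptions (images as rows of (r, g, b) tuples, regions as non-negative x, y, width, height), not the mechanism either tool uses internally.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Region = Tuple[int, int, int, int]  # x, y, width, height


def mask_regions(image: Image, regions: Sequence[Region],
                 fill: Pixel = (0, 0, 0)) -> Image:
    """Return a copy of the image with every region painted a flat color."""
    masked = [list(row) for row in image]
    for x, y, width, height in regions:
        for row in range(y, min(y + height, len(masked))):
            for col in range(x, min(x + width, len(masked[row]))):
                masked[row][col] = fill
    return masked


def equal_ignoring(baseline: Image, candidate: Image, regions: Sequence[Region]) -> bool:
    """Compare two images after masking the ignored regions in both."""
    return mask_regions(baseline, regions) == mask_regions(candidate, regions)


# A 4x2 white page whose top-right pixel is a "live ticker" that changed.
page = [[(255, 255, 255)] * 4 for _ in range(2)]
page_new_ticker = [list(row) for row in page]
page_new_ticker[0][3] = (200, 0, 0)

print(equal_ignoring(page, page_new_ticker, regions=[]))
print(equal_ignoring(page, page_new_ticker, regions=[(3, 0, 1, 1)]))
```

Anything that changes inside a masked rectangle goes unnoticed, including real bugs, so keep ignore regions as small as the dynamic element itself.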
3. Run visual tests on every PR, not just main
Visual regressions are easiest to attribute and fix at PR time. If you only run visual tests on main after merge, you will spend significant time bisecting commits to find which change caused the regression. Catching it on the PR that introduced it costs 10 minutes; finding it after merge can cost hours.
4. Maintain separate baselines per environment
Your staging environment may have different test data, dark mode settings, or feature flags than production. Comparing screenshots from staging against a production baseline will produce false failures. Use Applitools' branch/environment configuration, or a separate Percy project per environment, to maintain separate baselines for dev, staging, and production.
5. Review visual diffs before approving PRs
Make visual review part of your PR review checklist — not just code review. A PR that changes CSS should have its Percy or Applitools dashboard link checked by a reviewer with design context, not just a developer looking at code diffs.
6. Integrate visual test results into your definition of done
Visual tests should be a required status check for merging PRs — alongside unit tests and integration tests. Treat an unresolved visual change the same as a failing unit test: the PR does not merge until it is reviewed and either fixed or intentionally accepted as a design change.