Docker for QA Engineers — Containerised Test Environments

Honnesh Muppala May 5, 2026 15 min read

Why QA Engineers Need Docker

"It works on my machine" is the phrase that ends QA credibility. A test that passes locally but fails in CI — or passes in CI but fails in the developer's environment — is a test no one trusts. Docker eliminates this class of problem by packaging the test runner, its dependencies, and its runtime environment into a single portable unit called a container.

For QA engineers specifically, Docker solves four recurring problems:

Learning Docker as a QA engineer does not require deep platform knowledge. You need to understand a handful of concepts and about a dozen commands — this guide covers all of them.

Architecture Diagram

[Diagram: the Docker Client (docker CLI) talks to the Docker Daemon (dockerd) on the local host. The daemon manages images, containers, networks, and volumes, and exchanges images with a container registry (Docker Hub / ECR).]

The Docker Client (the docker command you type) sends instructions to the Docker Daemon running on your machine. The Daemon manages local images (blueprints), running containers (instances of images), networks (how containers communicate), and volumes (persistent storage). Images are pulled from or pushed to a container registry such as Docker Hub.

Core Concepts

Image — A read-only template that defines the container's filesystem: the OS base layer, installed packages, application code, and startup command. Think of an image as a snapshot. It is immutable — running a container from it never changes the image itself.

Container — A running instance of an image. Multiple containers can run from the same image simultaneously, each with its own isolated filesystem, network, and process space. Because a container shares the host kernel rather than booting an operating system, it typically starts in about a second or less and can be stopped and deleted just as quickly.

Dockerfile — A text file containing instructions for building a custom image. Each instruction creates a layer. Docker caches layers, so if you change only your application code, only the final layers are rebuilt — making iterative builds fast.
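
The caching rule is easier to reason about with a toy model. The sketch below is not Docker's actual implementation: it assumes each layer's cache key is a hash of the parent key, the instruction text, and a digest of any files the instruction copies. The names `layer_keys` and `rebuilt_layers` and the digest strings are hypothetical.

```python
import hashlib
from typing import List, Tuple

# (instruction text, digest of the files it copies, "" if it copies none)
Instruction = Tuple[str, str]


def layer_keys(instructions: List[Instruction]) -> List[str]:
    """Chain a hash through the instructions so each key depends on every earlier layer."""
    keys = []
    parent = ""
    for text, content_digest in instructions:
        parent = hashlib.sha256(f"{parent}|{text}|{content_digest}".encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_layers(old: List[Instruction], new: List[Instruction]) -> List[str]:
    """Return the instruction text of every layer in `new` that misses the cache left by `old`."""
    cached = set(layer_keys(old))
    return [text for (text, _), key in zip(new, layer_keys(new)) if key not in cached]


before = [
    ("FROM python:3.12-slim", ""),
    ("COPY requirements.txt .", "req-v1"),
    ("RUN pip install -r requirements.txt", ""),
    ("COPY . .", "src-v1"),
]
# Only the test code changed: the dependency layers stay cached
code_change = before[:3] + [("COPY . .", "src-v2")]
# requirements.txt changed: everything from that COPY onward is rebuilt
deps_change = [before[0], ("COPY requirements.txt .", "req-v2")] + before[2:]
```

In this model, `rebuilt_layers(before, code_change)` contains only the final `COPY . .`, while `rebuilt_layers(before, deps_change)` contains every instruction after `FROM`. That is why Dockerfiles usually copy the dependency manifest before the rest of the source.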

docker-compose — A tool for defining and running multi-container Docker applications. A single docker-compose.yml file declares all services (app, database, cache), their configurations, and how they connect to each other. One command (docker compose up) starts the entire stack.

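Volume — Storage that persists beyond the container's lifetime. A named volume is a directory managed by Docker; a bind mount maps a host directory into the container. Use a bind mount to write test reports or logs where the host machine or CI system can read them, and a named volume for data, such as database files, that only containers need.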

Network — By default, containers in the same Compose project are placed on a shared bridge network and can reach each other by service name as a hostname. Your test runner can connect to a database at postgres:5432 simply because the database service is named postgres in docker-compose.yml.

Essential Docker CLI Commands for QA

# Pull an image from Docker Hub
docker pull python:3.12-slim

# Run a container interactively (for debugging)
docker run -it python:3.12-slim /bin/bash

# Run your test container and remove it after exit (--rm prevents clutter)
docker run --rm qa-tests

# List running containers
docker ps

# List all containers including stopped ones
docker ps -a

# View logs from a running or stopped container
docker logs my-container-name
docker logs -f my-container-name   # -f = follow (live stream)

# Execute a command inside a running container (e.g. for debugging)
docker exec -it my-container-name /bin/bash

# Stop a running container
docker stop my-container-name

# Remove a stopped container
docker rm my-container-name

# Remove an image from local storage
docker rmi python:3.12-slim

# Inspect a container's environment and config
docker inspect my-container-name

# See all local images
docker images

# Clean up all stopped containers, dangling images, unused networks, and unused build cache
docker system prune

Dockerfile for a Python Test Runner

Here is a production-ready Dockerfile for a pytest-based test suite. It follows best practices: slim base image, non-root user, layer caching for fast rebuilds.

# Dockerfile
FROM python:3.12-slim

# Metadata
LABEL maintainer="Honnesh Muppala <hello@honneshraju.com>"
LABEL description="QA test runner for API and integration tests"

# Set working directory inside container
WORKDIR /app

# Install system dependencies (for browsers, drivers, etc.)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    wget \
    gnupg \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for layer caching
# If requirements.txt hasn't changed, pip install is skipped on rebuild
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the test suite
COPY . .

# Create reports directory
RUN mkdir -p /app/reports

# Run as non-root user for security
RUN useradd -m testrunner && chown -R testrunner:testrunner /app
USER testrunner

# Default command: run all tests and write HTML report
CMD ["pytest", "tests/", "--html=reports/report.html", "--self-contained-html", "-v"]

A matching requirements.txt:

pytest==8.1.1
pytest-html==4.1.1
requests==2.31.0
selenium==4.20.0
webdriver-manager==4.0.1

Building and Running Your Test Container

# Build the image and tag it
docker build -t qa-tests:latest .

# Run the tests (container auto-removes after exit)
docker run --rm qa-tests:latest

# Mount a local reports/ directory to extract HTML report from container
docker run --rm \
  -v "$(pwd)/reports:/app/reports" \
  qa-tests:latest

# Pass environment variables (target URL, credentials)
docker run --rm \
  -e BASE_URL=https://staging.example.com \
  -e API_KEY=abc123 \
  -v "$(pwd)/reports:/app/reports" \
  qa-tests:latest

# Override the default CMD to run a specific test file
docker run --rm qa-tests:latest pytest tests/test_login.py -v
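
When the same docker run variations are launched from a script rather than typed by hand, it can help to build the argument list in one place. The following is a minimal sketch; `docker_run_args` is a hypothetical helper, not part of any Docker SDK. It relies on one real CLI behaviour: `-e NAME` without a value forwards the variable from the host environment, which keeps secrets such as API_KEY out of the command line.

```python
from typing import Dict, List, Optional, Sequence


def docker_run_args(
    image: str,
    env: Optional[Dict[str, Optional[str]]] = None,
    volumes: Optional[Dict[str, str]] = None,
    command: Sequence[str] = (),
) -> List[str]:
    """Build the argv for `docker run --rm` (hypothetical helper)."""
    args = ["docker", "run", "--rm"]
    for name, value in (env or {}).items():
        # A value of None emits `-e NAME`, which forwards the host's value
        args += ["-e", name if value is None else f"{name}={value}"]
    for host_path, container_path in (volumes or {}).items():
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    # Anything after the image name overrides the image's default CMD
    args.extend(command)
    return args
```

The resulting list can be passed to `subprocess.run(args).returncode`, which avoids shell quoting problems and returns the container's exit code.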

From my DevOps work at Viasat: We containerised our entire API test suite early in the project. Before Docker, "onboarding" a new QA engineer onto the test suite took a full day — installing the right Python version, resolving pip conflicts with other projects, configuring environment variables. After containerising, onboarding took five minutes: git clone, docker build, docker run. The container held everything. This was especially valuable because our team was spread across the US and Europe — environment inconsistencies across time zones were a constant source of false failures that Docker eliminated completely.

Docker Compose for Test Stacks

Real integration tests require real services: databases, caches, message brokers. Docker Compose starts the entire test stack with a single command and tears it all down when you are done.

# docker-compose.yml — minimal structure
services:
  app:
    image: myapp:latest
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgresql://testuser:testpass@postgres:5432/testdb
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: testuser
      POSTGRES_PASSWORD: testpass
      POSTGRES_DB: testdb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U testuser"]
      interval: 5s
      timeout: 5s
      retries: 5

  tests:
    build: .
    environment:
      BASE_URL: http://app:8000
    volumes:
      - ./reports:/app/reports
    depends_on:
      - app
    command: pytest tests/ --html=reports/report.html -v

Run everything with:

docker compose up --build --exit-code-from tests

The --exit-code-from tests flag makes Compose exit with the test container's exit code — zero for pass, non-zero for fail — which propagates correctly to CI.
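
A wrapper script that starts the stack usually also has to tear it down without losing that exit code. Here is one way this might look in Python; `run_test_stack` is a hypothetical name, and the `runner` parameter exists only so the logic can be exercised without Docker installed.

```python
import subprocess
from typing import Callable, List


def run_test_stack(
    compose_file: str,
    test_service: str,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> int:
    """Start the stack, return the test service's exit code, and always tear down."""
    base = ["docker", "compose", "-f", compose_file]
    try:
        # `up` returns the exit code of `test_service` because of --exit-code-from
        return runner(base + ["up", "--build", "--exit-code-from", test_service])
    finally:
        # Teardown runs even when tests fail; its own exit code is deliberately ignored
        runner(base + ["down", "--volumes"])
```

A CI entry point could then call `sys.exit(run_test_stack("docker-compose.yml", "tests"))` so that a failing test run still fails the job after cleanup.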

Full Compose Example: FastAPI + PostgreSQL + pytest

# docker-compose.test.yml
services:

  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: qa
      POSTGRES_PASSWORD: qa_secret
      POSTGRES_DB: qa_testdb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U qa"]
      interval: 5s
      retries: 10

  fastapi-app:
    build:
      context: ./app
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgresql://qa:qa_secret@postgres:5432/qa_testdb
      ENV: test
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      # Assumes curl is installed in the app image and the app serves /health
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 5s
      retries: 10

  qa-tests:
    build:
      context: ./tests
      dockerfile: Dockerfile
    environment:
      BASE_URL: http://fastapi-app:8000
      DB_HOST: postgres
      DB_PORT: 5432
      DB_USER: qa
      DB_PASS: qa_secret
      DB_NAME: qa_testdb
    volumes:
      - ./test-reports:/app/reports
    depends_on:
      fastapi-app:
        condition: service_healthy
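    # Keep continuation lines at the same indent: in a folded scalar (>),
    # more-indented lines keep their line breaks instead of being joined
    command: >
      pytest tests/integration/
      --html=/app/reports/integration-report.html
      --self-contained-html
      -v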
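
On the test side, the environment variables declared for qa-tests have to be read somewhere. The sketch below shows one possible way to collect them in a single place, for example in a conftest.py. `StackSettings` and `load_settings` are hypothetical names, and the fallback values are placeholders for local runs, not values from this article's stack.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StackSettings:
    """Connection details for the services the tests talk to."""
    base_url: str
    db_host: str
    db_port: int
    db_user: str
    db_pass: str
    db_name: str

    @property
    def database_url(self) -> str:
        return (
            f"postgresql://{self.db_user}:{self.db_pass}"
            f"@{self.db_host}:{self.db_port}/{self.db_name}"
        )


def load_settings(env: Mapping[str, str] = os.environ) -> StackSettings:
    """Read the variables set in the Compose file, with placeholder local defaults."""
    return StackSettings(
        base_url=env.get("BASE_URL", "http://localhost:8000"),
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5432")),  # env values are always strings
        db_user=env.get("DB_USER", "qa"),
        db_pass=env.get("DB_PASS", ""),
        db_name=env.get("DB_NAME", "qa_testdb"),
    )
```

Inside the Compose network, `load_settings().db_host` resolves to the service name postgres, so the same test code runs unchanged on a laptop and in CI.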

Selenium Grid in Docker

Docker is the easiest way to run Selenium Grid — no manual driver installation, no Chrome version mismatches, clean teardown after every run.

# docker-compose.selenium.yml
services:

  selenium-hub:
    image: selenium/hub:4.20.0
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"

  chrome-node:
    image: selenium/node-chrome:4.20.0
    shm_size: '2gb'
    depends_on:
      - selenium-hub
    environment:
      SE_EVENT_BUS_HOST: selenium-hub
      SE_EVENT_BUS_PUBLISH_PORT: 4442
      SE_EVENT_BUS_SUBSCRIBE_PORT: 4443
    deploy:
      replicas: 3    # 3 parallel Chrome nodes

  firefox-node:
    image: selenium/node-firefox:4.20.0
    shm_size: '2gb'
    depends_on:
      - selenium-hub
    environment:
      SE_EVENT_BUS_HOST: selenium-hub
      SE_EVENT_BUS_PUBLISH_PORT: 4442
      SE_EVENT_BUS_SUBSCRIBE_PORT: 4443
    deploy:
      replicas: 2    # 2 parallel Firefox nodes

Connect your Selenium tests to the Grid with RemoteWebDriver:

from selenium import webdriver

# Connect to the Grid hub; the hub routes the session to a free Chrome node
driver = webdriver.Remote(
    command_executor="http://localhost:4444/wd/hub",
    options=webdriver.ChromeOptions(),
)

try:
    driver.get("https://example.com")
    print(driver.title)
finally:
    # Always release the Grid session, even if the test raises
    driver.quit()
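
Nodes take a few seconds to register with the hub after the containers start, so a session requested too early can fail or queue. A minimal sketch of a readiness check follows, assuming the hub is published on localhost:4444 as in the Compose file above. It polls the Grid 4 `/status` endpoint and reads `value.ready` from the JSON response. `wait_for_grid` and `fetch_status` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status(url: str) -> dict:
    """GET the Grid status endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def wait_for_grid(
    status_url: str = "http://localhost:4444/status",
    timeout: float = 60.0,
    interval: float = 1.0,
    fetch: Callable[[str], dict] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the hub reports ready; return False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch(status_url).get("value", {}).get("ready") is True:
                return True
        except (OSError, ValueError):
            pass  # hub not listening yet, or the body is not valid JSON yet
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_grid()` once in a session-scoped pytest fixture, before the first `webdriver.Remote(...)`, is one place this could live.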

Volumes for Test Artifacts

By default, files written inside a container disappear when the container exits. Mount a host directory as a volume to persist test reports, screenshots, and logs:

# Mount local ./reports to container's /app/reports
docker run --rm \
  -v "$(pwd)/reports:/app/reports" \
  qa-tests:latest

# In Compose
volumes:
  - ./reports:/app/reports       # Test HTML reports
  - ./screenshots:/app/screenshots   # Selenium failure screenshots
  - ./logs:/app/logs             # Application and test logs
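
Inside the container, the test code only has to write into the mounted path. As an illustration, a helper like the one below could produce a unique, filesystem-safe file name for each failure screenshot. `artifact_path` and the `ARTIFACT_DIR` environment variable are assumptions made for this sketch; the default matches the /app/screenshots mount above.

```python
import os
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def artifact_path(
    test_name: str,
    suffix: str,
    root: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Path:
    """Return a path under the mounted artifact directory for this test."""
    base = Path(root or os.environ.get("ARTIFACT_DIR", "/app/screenshots"))
    # pytest node ids contain "/", "::" and "[...]", which are awkward in file names
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "_", test_name).strip("_")
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    return base / f"{safe}-{stamp}{suffix}"
```

A failure hook would then create the directory with `path.parent.mkdir(parents=True, exist_ok=True)` and call `driver.save_screenshot(str(path))`; the file appears in ./screenshots on the host.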

Named volumes persist across container restarts and are managed by Docker (not tied to a host directory path):

volumes:
  postgres_data:     # Named volume for database

services:
  postgres:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data

Docker in GitHub Actions

GitHub Actions has built-in Docker support. Use services: to spin up containers alongside your test runner job:

# .github/workflows/test.yml
name: Integration Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:15-alpine
        env:
          POSTGRES_USER: qa
          POSTGRES_PASSWORD: qa_secret
          POSTGRES_DB: qa_testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 5s
          --health-timeout 5s
          --health-retries 10
        ports:
          - 5432:5432

      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run integration tests
        env:
          DATABASE_URL: postgresql://qa:qa_secret@localhost:5432/qa_testdb
          REDIS_URL: redis://localhost:6379
        run: pytest tests/integration/ -v --html=reports/report.html

      - name: Upload test report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: test-report
          path: reports/

Docker for Mobile Testing

Running Android emulators in Docker is technically possible (using nested virtualisation or software rendering) but comes with significant limitations: performance is poor, GPU acceleration is usually unavailable, and the setup is fragile. In practice, Docker works well for mobile testing that needs little or no emulation, but not for workflows that depend on a full Android emulator.

Realistic options by use case:

From Fire TV testing at Amazon: We ran Appium tests against Fire TV devices using a hybrid approach — the Appium server ran in a Docker container on a test node, but it connected to physical Fire TV devices via USB using Android Debug Bridge over the network. Docker gave us clean Appium server environments and easy version management, but the devices themselves were always physical hardware. Attempting to emulate Fire OS in Docker was never a serious option — the overhead was too high and the device-specific behaviour too difficult to replicate.

Useful Docker Images for QA

These official images should be in every QA engineer's toolkit:

Comparison: Docker vs Other Approaches

Aspect | Docker | Virtual Machines | Local Env | CI Cloud Runners
Isolation | Process-level (namespaces) | Full OS isolation (hypervisor) | None | Full (ephemeral VMs)
Startup speed | About a second or less | Minutes | Instant | 30–90 seconds
Resource overhead | Minimal (shared kernel) | High (full OS per VM) | None | Managed by provider
Reproducibility | Exact: pinned image versions | High: full OS snapshot | Low: varies by machine | High: fresh runner each run
Setup complexity | Low: Dockerfile + Compose | High: VM provisioning | Medium: manual install | Low: provider managed
Cost | Free (Docker Desktop for enterprise: paid) | Free software, high hardware cost | Free | Per-minute billing
Best for QA | All test types | OS-specific tests, legacy apps | Rapid local development | CI pipelines, scale

Best Practices for Docker in QA

1. Always pin image versions

Never use :latest in a Dockerfile or Compose file that runs in CI. A floating tag such as python:latest may resolve to 3.12 today and to 3.13 after the next upstream release. Pin to python:3.12-slim and update deliberately, not accidentally. Pinning keeps silent upstream changes from breaking your tests weeks after you last touched the image reference.
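
A rule like this is easy to enforce with a small lint step. The following is a rough sketch, not a full Dockerfile or YAML parser: it scans `FROM` lines and Compose `image:` keys line by line and reports references that have no tag or use `latest`. `unpinned_images` is a hypothetical name, and the check will also flag `FROM scratch` and references to earlier build stages, which a real linter would need to exclude.

```python
import re
from typing import List

# Matches "FROM [--flag ...] <ref>" in Dockerfiles and "image: <ref>" in Compose files
IMAGE_LINE = re.compile(
    r"^\s*(?:FROM(?:\s+--\S+)*|image:)\s+(?P<ref>[^\s#]+)", re.IGNORECASE
)


def unpinned_images(text: str) -> List[str]:
    """Return image references that are untagged or tagged `latest`."""
    found = []
    for line in text.splitlines():
        match = IMAGE_LINE.match(line)
        if not match:
            continue
        ref = match.group("ref").strip("\"'")
        if "@sha256:" in ref:
            continue  # pinned by digest, which is stricter than a tag
        # Look for the tag only after the last "/", so a registry port is not mistaken for one
        tag = ref.rsplit("/", 1)[-1].partition(":")[2]
        if tag in ("", "latest"):
            found.append(ref)
    return found


sample = """FROM python:latest
FROM python:3.12-slim AS builder
services:
  app:
    image: myapp:latest
  db:
    image: postgres:15-alpine
  cache:
    image: redis
"""
```

Run against `sample`, the function reports python:latest, myapp:latest, and redis, and leaves the two pinned references alone. A CI step could fail the build whenever the returned list is non-empty.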

2. Use --rm for test container cleanup

Always pass --rm when running ad-hoc test containers. Without it, every docker run leaves a stopped container consuming disk space. Run docker system prune periodically to clean up orphaned images and networks.

3. Multi-stage builds for smaller test images

Use multi-stage Dockerfiles to separate build dependencies from runtime. Your test image does not need gcc or build-essential at runtime — only at pip install time. A smaller image pulls faster in CI and has a smaller attack surface.

# Multi-stage: build stage installs, runtime stage runs tests
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.12-slim AS runtime
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["pytest", "tests/"]

4. Never run tests against production containers

Production containers should be locked down with read-only filesystems, non-root users, and minimal permissions. Test containers often need write access to reports directories, writable temp space, and debug logging. Keep them separate — run tests against dedicated staging or CI-specific containers, never the production image with test flags bolted on.

5. Use healthcheck and depends_on: condition: service_healthy

Race conditions where the test runner starts before the database is ready are one of the most common Docker Compose CI failures. Always define a healthcheck for every service that other services depend on, and use condition: service_healthy in depends_on. This eliminates the class of failures caused by sleep 5 hacks in CI scripts.
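
Where a Compose healthcheck is not available, for instance when the tests run on the host against published ports, a bounded retry loop is a safer fallback than a fixed sleep. This is a minimal sketch under that assumption; `wait_for_port` is a hypothetical helper. An open TCP port only shows that something is listening, so it is a weaker signal than a real healthcheck such as pg_isready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Retry a TCP connection until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connection refused, timeouts and DNS failures are all OSError subclasses
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Unlike sleep 5, this returns as soon as the service is reachable and fails with a clear result when it never comes up, for example `assert wait_for_port("localhost", 5432)` at the start of a test session.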


From Experience — Viasat: At Viasat, our IFC test automation pipeline runs inside Kubernetes-based simulator environments. Every software release triggers a Jenkins pipeline that builds the test container, deploys it against the simulated airline environment, and runs the full regression suite before any build reaches a lab rack. When connectivity gaps appeared in the Kubernetes network config, the Jenkins pipeline was the first signal — tests started failing on specific network paths that looked fine from manual inspection. The CI/CD pipeline caught a network misconfiguration that would have caused customer-facing failures on delivery day.