Performance Testing with Locust — Python Load Testing Guide

What is Locust?

Locust is an open-source Python-based load-testing framework where you define user behaviour in plain Python code. There are no XML configuration files, no proprietary GUIs to fight with, no DSL to learn — just Python classes and functions that any developer or QA engineer already understands.

The name comes from the locust insect: a swarm of locusts descending on a system. Each virtual user in Locust is a Python greenlet (lightweight coroutine via the gevent library), which means a single machine can simulate thousands of concurrent users without the thread-per-user overhead of tools like JMeter.

Locust ships with a built-in web UI that lets you start and stop tests, adjust user counts in real time, and watch live charts of request rates, response times, and failure counts. For CI pipelines, a headless mode produces CSV output that scripts can threshold-check automatically.

Key strengths of Locust: it runs everywhere Python runs, it is trivially extensible (custom protocols, custom reporters, event hooks), and its test code lives in your git repository alongside application code — making it the natural choice for Python shops.

Architecture Diagram

Locust Master (orchestrator) → Locust Workers (greenlets) → HTTP Client (requests session) → Target API (system under test) → Results Collector → CSV / Web UI Dashboard

In single-machine mode the master and worker are the same process. In distributed mode the master coordinates multiple worker machines, aggregating their metrics into a unified view. Each worker runs a pool of greenlets — one per virtual user — executing task functions concurrently via cooperative multitasking.
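
The cooperative multitasking described above can be illustrated without gevent. The following is a minimal sketch that uses plain Python generators as a stand-in for greenlets. The names virtual_user and run_round_robin are hypothetical and not part of Locust, and real greenlets hand over control implicitly while waiting on network I/O rather than at an explicit yield.

```python
from collections import deque


def virtual_user(name, steps, log):
    """Stand-in for one virtual user; each yield marks a point where it would wait on I/O."""
    for step in range(1, steps + 1):
        log.append(f"{name}: request {step}")
        yield  # hand control back to the scheduler


def run_round_robin(users):
    """Resume each user in turn until every one of them has finished."""
    queue = deque(users)
    while queue:
        user = queue.popleft()
        try:
            next(user)
            queue.append(user)  # not finished yet, schedule it again
        except StopIteration:
            pass  # this user has completed all of its steps


log = []
run_round_robin([virtual_user("user-1", 2, log), virtual_user("user-2", 2, log)])
for line in log:
    print(line)
```

The requests of the two users interleave on a single thread, which is why many virtual users can share one process cheaply.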

Installation

Locust requires Python 3.9 or later. Install it via pip:

# Install Locust
pip install locust

# Verify installation
locust --version
# locust 2.x.x

# Add to your requirements.txt
locust==2.31.0

No additional dependencies are needed for basic HTTP testing. For advanced use cases (gRPC, WebSocket, custom protocols), additional libraries can be installed and integrated via Locust's event hooks.

First Locustfile

A locustfile is just a Python file. By convention it is named locustfile.py. Here is a complete example that tests a REST API with login and browse actions:

# locustfile.py
from locust import HttpUser, task, between
from locust.exception import StopUser

class WebUser(HttpUser):
    """Simulates a single virtual user interacting with the web application."""

    # Wait between 1 and 3 seconds between tasks (think time)
    wait_time = between(1, 3)

    def on_start(self):
        """Called once when a virtual user starts. Use for login/auth."""
        response = self.client.post(
            "/api/auth/login",
            json={"email": "testuser@example.com", "password": "TestPass123"},
            name="POST /auth/login [on_start]"
        )
        if response.status_code == 200:
            token = response.json().get("token")
            # Set auth header for all subsequent requests
            self.client.headers.update({"Authorization": f"Bearer {token}"})
        else:
            # Stop only this user if login fails; runner.quit() would end the whole test run
            raise StopUser()

    @task(3)
    def browse_products(self):
        """Weight 3 — runs 3x more often than add_to_cart."""
        self.client.get("/api/v1/products?page=1&limit=20")

    @task(2)
    def search_products(self):
        """Weight 2 — runs 2x as often as add_to_cart."""
        self.client.get("/api/v1/products?q=laptop&category=electronics")

    @task(1)
    def add_to_cart(self):
        """Weight 1 — least frequent action."""
        self.client.post(
            "/api/v1/cart/items",
            json={"productId": "P-1234", "quantity": 1}
        )

    def on_stop(self):
        """Called when a virtual user stops. Use for cleanup / logout."""
        self.client.post("/api/auth/logout")

Run it with:

locust -f locustfile.py --host https://api.example.com

Then open http://localhost:8089 in your browser to access the web UI.

Task Weighting — Realistic User Mix

In real applications, not all actions are equally common. Users browse far more than they purchase. The @task(weight) decorator lets you control how often Locust executes each task relative to others.

class EcommerceUser(HttpUser):
    wait_time = between(2, 5)

    @task(10)
    def view_homepage(self):
        self.client.get("/")

    @task(7)
    def browse_category(self):
        self.client.get("/category/electronics")

    @task(4)
    def view_product(self):
        self.client.get("/products/B09XYZ123")

    @task(2)
    def add_to_wishlist(self):
        self.client.post("/api/wishlist", json={"productId": "B09XYZ123"})

    @task(1)
    def checkout(self):
        self.client.post("/api/orders", json={"cartId": "cart-456"})

In this example, the homepage is viewed about 10 times for every checkout on average, because Locust picks the next task at random in proportion to the weights. This ratio mirrors a typical e-commerce conversion funnel and produces realistic server-side load patterns, including representative cache hit ratios and database query distributions.
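
To see what the weights mean in practice, the expected share of each task can be worked out directly. The sketch below copies the weights from the EcommerceUser example. The helper names expected_shares and simulate_picks are hypothetical, and the simulation only approximates Locust's weighted random selection with random.choices.

```python
import random
from collections import Counter

# Weights copied from the EcommerceUser example
WEIGHTS = {
    "view_homepage": 10,
    "browse_category": 7,
    "view_product": 4,
    "add_to_wishlist": 2,
    "checkout": 1,
}


def expected_shares(weights):
    """Return the long-run fraction of task executions for each task."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def simulate_picks(weights, picks, seed=42):
    """Draw `picks` tasks at random, weighted like the @task decorators."""
    rng = random.Random(seed)
    names = list(weights)
    chosen = rng.choices(names, weights=[weights[n] for n in names], k=picks)
    return Counter(chosen)


for name, share in expected_shares(WEIGHTS).items():
    print(f"{name:<16} {share:6.1%}")

print(simulate_picks(WEIGHTS, picks=10_000).most_common())
```

With these weights the homepage accounts for 10 of every 24 task executions and checkout for 1 of every 24, so the simulated counts should land close to those fractions.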

TaskSets — Grouping Related Actions

TaskSet lets you group related tasks together and assign them to user classes. This is useful when you want to model different user personas — for example, admin users who perform different actions than regular users.

from locust import HttpUser, TaskSet, task, between

class AdminTaskSet(TaskSet):

    def on_start(self):
        self.client.post("/admin/login", json={
            "username": "admin",
            "password": "AdminPass123"
        })

    @task(3)
    def view_dashboard(self):
        self.client.get("/admin/dashboard")

    @task(2)
    def list_users(self):
        self.client.get("/admin/users?page=1")

    @task(1)
    def export_report(self):
        self.client.get("/admin/reports/export?format=csv")

    def on_stop(self):
        self.client.post("/admin/logout")


class AdminUser(HttpUser):
    tasks = [AdminTaskSet]
    wait_time = between(3, 8)
    weight = 1   # Relative weight: 1 admin per 10 regular users if the regular user class sets weight = 10

From my API testing work at Virtusa: We used separate TaskSet classes to model three distinct user personas in our REST API load tests — anonymous public users, authenticated retail customers, and internal API integrations. Each persona had different endpoint access patterns and authentication methods. Separating them into TaskSets made the locustfile readable and made it easy to adjust the ratio of each persona type (via weight) to match our observed production traffic split.

Sequential Tasks — User Journeys

Random task selection is realistic for browsing, but some flows are inherently sequential: you cannot checkout before you add to cart. SequentialTaskSet executes tasks in the order they are defined.

from locust import HttpUser, SequentialTaskSet, task, between

class CheckoutJourney(SequentialTaskSet):

    @task
    def step_1_browse(self):
        self.client.get("/products")

    @task
    def step_2_view_item(self):
        self.client.get("/products/B09XYZ123")

    @task
    def step_3_add_to_cart(self):
        response = self.client.post("/api/cart", json={"productId": "B09XYZ123"})
        self.cart_id = response.json().get("cartId")

    @task
    def step_4_view_cart(self):
        self.client.get(f"/api/cart/{self.cart_id}")

    @task
    def step_5_checkout(self):
        self.client.post("/api/orders", json={
            "cartId": self.cart_id,
            "paymentMethod": "card"
        })
        self.interrupt()   # End this journey, start fresh


class ShopUser(HttpUser):
    tasks = [CheckoutJourney]
    wait_time = between(1, 4)

Wait Times — Simulating Think Time

Wait times control how long a virtual user pauses between tasks. Skipping think time produces unrealistic burst load and inflated throughput numbers. Locust provides several built-in strategies:

from locust import between, constant, constant_pacing

# Random wait between 1 and 5 seconds (most realistic for user browsing)
wait_time = between(1, 5)

# Fixed 2-second pause after every task
wait_time = constant(2)

# Constant pacing: each task + wait = 4 seconds, as long as the task itself finishes within 4 seconds
# Good for modelling a fixed transactions-per-second rate
wait_time = constant_pacing(4)

# Custom wait time via a method on the user class (e.g., normally distributed think time)
import random
def wait_time(self):
    return max(0.5, random.gauss(2.0, 0.5))  # Mean 2s, stdev 0.5s, never below 0.5s

constant_pacing is particularly useful when you want to target a specific throughput. If your task takes 0.5 seconds and you set constant_pacing(2), Locust waits 1.5 seconds — making the cycle exactly 2 seconds per user, which means 0.5 transactions per second per user. Scale users accordingly.
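
The pacing arithmetic above can be turned into a quick sizing calculation. This is a minimal sketch with hypothetical helper names (pacing_wait, users_for_throughput). It assumes every task finishes within the pacing interval, because a slower task leaves no wait time and lowers the achieved rate.

```python
import math


def pacing_wait(pacing_s, task_duration_s):
    """Seconds a user waits after a task so that the full cycle lasts pacing_s."""
    return max(0.0, pacing_s - task_duration_s)


def users_for_throughput(target_tps, pacing_s):
    """Users needed to reach target_tps when each user completes one task per pacing_s seconds."""
    return math.ceil(target_tps * pacing_s)


print(pacing_wait(2, 0.5))          # wait after a 0.5 s task with constant_pacing(2)
print(users_for_throughput(50, 2))  # users needed for 50 transactions per second
```

With constant_pacing(2), each user contributes 0.5 transactions per second, so a target of 50 transactions per second needs 100 users.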

Web UI

The Locust web UI is served at http://localhost:8089 by default. To bind it to a specific interface:

locust -f locustfile.py \
  --host https://api.example.com \
  --web-host 0.0.0.0 \
  --web-port 8089

From the web UI you can:

Set user count and spawn rate — e.g. ramp to 200 users at 10 users/second.
Watch live charts — Requests per second and response time percentiles update every second.
Inspect failure details — The Failures tab lists each unique failure with its error message and count.
Adjust user count mid-run — Increase or decrease concurrency without stopping the test.
Download CSV exports — Raw statistics, request history, and failure data available for offline analysis.

Headless / CLI Mode

For CI pipelines, run Locust without the web UI using --headless:

# Run headless — 100 users, spawn at 10/s, run for 2 minutes
locust -f locustfile.py \
  --host https://api.example.com \
  --headless \
  --users 100 \
  --spawn-rate 10 \
  --run-time 2m \
  --csv=results/locust_report

# Output files:
# results/locust_report_stats.csv          - per-endpoint statistics
# results/locust_report_stats_history.csv  - metrics over time
# results/locust_report_failures.csv       - failure details
# results/locust_report_exceptions.csv     - exceptions raised inside task code

After the run, check thresholds in a Python script:

import csv
import sys

FAILURE_THRESHOLD = 1.0   # Max 1% failures
P95_THRESHOLD = 500       # Max 500 ms at 95th percentile

with open("results/locust_report_stats.csv", newline="") as f:
    # The "Aggregated" row summarises all endpoints
    aggregated = next(
        (row for row in csv.DictReader(f) if row["Name"] == "Aggregated"), None
    )

# Fail loudly instead of passing when there is nothing to check
if aggregated is None:
    print("FAIL: no Aggregated row found in stats CSV")
    sys.exit(1)

total = int(aggregated["Request Count"])
failures = int(aggregated["Failure Count"])
if total == 0:
    print("FAIL: no requests were recorded")
    sys.exit(1)

p95 = float(aggregated["95%"])
failure_pct = (failures / total) * 100

print(f"Failure rate: {failure_pct:.2f}% | p95: {p95}ms")

if failure_pct > FAILURE_THRESHOLD:
    print(f"FAIL: failure rate {failure_pct:.2f}% exceeds {FAILURE_THRESHOLD}%")
    sys.exit(1)
if p95 > P95_THRESHOLD:
    print(f"FAIL: p95 {p95}ms exceeds {P95_THRESHOLD}ms")
    sys.exit(1)

print("PASS: all thresholds met")
sys.exit(0)

Distributed Mode

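As a rough guide, a single Locust process can simulate somewhere between 500 and 2000 concurrent users before it becomes CPU-bound; the exact figure depends on how much work each task does. A Locust process uses only one CPU core, so start one worker process per core. For higher loads, run workers on multiple machines coordinated by a master.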

# Start the master (no users run here, only coordination)
locust -f locustfile.py --master --host https://api.example.com

# Start worker machines (repeat on each worker box)
locust -f locustfile.py --worker --master-host 192.168.1.10

Docker Compose setup for local distributed testing:

# docker-compose.yml
version: "3.8"
services:
  master:
    image: locustio/locust:2.31.0
    ports:
      - "8089:8089"
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --master --host https://api.example.com

  worker:
    image: locustio/locust:2.31.0
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --worker --master-host master
    depends_on:
      - master
    deploy:
      replicas: 4     # 4 worker containers, each running greenlets

Run with docker compose up --scale worker=4 to spin up four worker containers. The master web UI aggregates statistics from all workers into a single unified view.
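
For sizing a distributed run, the per-worker range quoted in this section gives a first estimate of how many worker processes to start. The sketch below is only that arithmetic. The function name workers_needed is hypothetical, and the capacity figure is an assumption that should be replaced with a number measured for your own locustfile.

```python
import math


def workers_needed(target_users, users_per_worker):
    """Worker processes required if each one can drive users_per_worker virtual users."""
    if users_per_worker <= 0:
        raise ValueError("users_per_worker must be positive")
    return math.ceil(target_users / users_per_worker)


# 5000 users at the pessimistic and optimistic ends of the 500-2000 range
print(workers_needed(5000, 500))
print(workers_needed(5000, 2000))
```

Planning for the pessimistic end leaves headroom, so the load generator is less likely to become the bottleneck in place of the system under test.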

Custom Load Shapes

For advanced scenarios — realistic ramp-up, stepped load, spike-and-recover — subclass LoadTestShape to define a custom profile:

from locust import LoadTestShape

class StepLoadShape(LoadTestShape):
    """
    Step load: increase users every 30 seconds, then hold at peak for 2 minutes,
    then ramp down.
    """
    step_time = 30       # Seconds per step
    step_load = 50       # Users added each step
    spawn_rate = 10      # Users spawned per second
    time_limit = 300     # Total test duration

    stages = [
        {"duration": 30,  "users": 50},
        {"duration": 60,  "users": 100},
        {"duration": 90,  "users": 150},
        {"duration": 210, "users": 150},  # Hold at 150 for 2 minutes
        {"duration": 240, "users": 75},
        {"duration": 300, "users": 0},    # Ramp down
    ]

    def tick(self):
        run_time = self.get_run_time()
        for stage in self.stages:
            if run_time < stage["duration"]:
                return stage["users"], self.spawn_rate
        return None   # Return None to stop the test
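
Before running a shaped test, it can help to preview the schedule that the stage table produces. The sketch below mirrors the lookup in tick() as a plain function so it runs without Locust. The function name users_at is hypothetical, and the stage values are copied from the example above.

```python
# Stage table copied from StepLoadShape; "duration" is the elapsed time at which a stage ends
STAGES = [
    {"duration": 30, "users": 50},
    {"duration": 60, "users": 100},
    {"duration": 90, "users": 150},
    {"duration": 210, "users": 150},
    {"duration": 240, "users": 75},
    {"duration": 300, "users": 0},
]


def users_at(run_time, stages=STAGES):
    """Return the target user count at run_time seconds, or None once the test is over."""
    for stage in stages:
        if run_time < stage["duration"]:
            return stage["users"]
    return None


# Print the target user count every 30 seconds of the run
for t in range(0, 301, 30):
    print(f"t={t:>3}s -> {users_at(t)}")
```

The preview shows that the target drops to 0 users at 240 seconds and that the test stops at 300 seconds, when the lookup returns None.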

GitHub Actions CI Integration

# .github/workflows/locust.yml
name: Locust Performance Test

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'    # Nightly at 2am UTC
  workflow_dispatch:

jobs:
  locust:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'

      - name: Install Locust
        run: pip install locust==2.31.0

      - name: Run Locust headless
        run: |
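          mkdir -p results
          # The CSV prefix must match the path that scripts/check_thresholds.py reads.
          # --exit-code-on-error 0 leaves the pass/fail decision to the threshold check;
          # by default Locust exits with code 1 if any request failed.
          locust -f locustfile.py \
            --host ${{ secrets.PERF_TARGET_URL }} \
            --headless \
            --users 100 \
            --spawn-rate 10 \
            --run-time 3m \
            --exit-code-on-error 0 \
            --csv=results/locust_report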

      - name: Check thresholds
        run: python scripts/check_thresholds.py

      - name: Upload CSV results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: locust-results-${{ github.run_number }}
          path: results/
          retention-days: 30

Practical tip from QA experience: At Virtusa, we ran our Locust performance suite nightly against the staging environment and compared p95 values with the previous run's CSV output. A simple Python script flagged any regression of 15% or more: if yesterday's p95 was 300 ms and today's was 345 ms or higher, the build was marked as unstable and the script pinged the Slack channel. This gave us early warning of performance regressions before they reached production, without requiring a manual review of every report.
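
A comparison of this kind could look roughly like the following. It is a sketch, not the script used on that project. The function names read_p95 and is_regression are hypothetical, and it treats a change of 15% or more as a regression, in line with the 300 ms to 345 ms example.

```python
import csv
import io


def read_p95(csv_file):
    """Return the aggregated 95th-percentile response time (ms) from a Locust stats CSV."""
    for row in csv.DictReader(csv_file):
        if row["Name"] == "Aggregated":
            return float(row["95%"])
    raise ValueError("no Aggregated row found")


def is_regression(previous_p95, current_p95, threshold=0.15):
    """True when p95 grew by at least `threshold` (0.15 = 15%) relative to the previous run."""
    if previous_p95 <= 0:
        raise ValueError("previous p95 must be positive")
    return (current_p95 - previous_p95) / previous_p95 >= threshold


# Tiny in-memory CSVs standing in for yesterday's and today's *_stats.csv files
yesterday = io.StringIO("Type,Name,95%\n,Aggregated,300\n")
today = io.StringIO("Type,Name,95%\n,Aggregated,345\n")

if is_regression(read_p95(yesterday), read_p95(today)):
    print("UNSTABLE: p95 regressed by 15% or more")
else:
    print("OK: p95 within tolerance")
```

In a real pipeline the two file objects would come from open(path, newline="") on the previous and current stats files, and the result would drive the build status and the notification.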

Tool Comparison: Locust vs Alternatives

Feature	Locust	JMeter	k6	Gatling
Scripting language	Python	XML / Groovy	JavaScript	Java / Scala DSL
Scripting ease	Very easy (Python)	Moderate (XML verbose)	Easy (modern JS)	Medium (JVM required)
Built-in web UI	Yes — live charts	Yes — desktop GUI	No (needs Grafana)	No (HTML report only)
Distributed mode	Yes — built-in master/worker	Yes — built-in	k6 Cloud / manual	Enterprise edition only
CI integration	Excellent — CSV + exit codes	Good — JTL files	Excellent — JSON output	Excellent — Maven plugin
Concurrency model	Greenlets (gevent)	OS threads	Go goroutines	Akka actors
Protocol support	HTTP + custom via Python	HTTP, JDBC, FTP, MQTT	HTTP, WebSocket, gRPC	HTTP, WebSocket, gRPC
Learning curve for Python devs	Minimal — standard Python	High — XML + Java	Medium — requires JS	High — JVM + DSL

Best Practices for Locust Performance Testing

1. Start with realistic think times

Use between(1, 5) as a minimum starting point. Review your application analytics to find the actual distribution of time users spend between actions. A checkout flow with no think time can generate many times the throughput that the same number of real users would produce, which misrepresents the load and can mask the bottlenecks that matter in production.

2. Use `on_start` for authentication

Never embed credentials in task functions. The on_start method runs once per virtual user at startup, making it the right place to authenticate and store session tokens. This keeps your task functions clean and ensures every virtual user is properly authenticated before generating load.

3. Check both p95 and error rate — never just the mean

The mean response time is often a misleading metric. A system under load can show a low mean while 5% of requests time out. Define thresholds for both the 95th percentile and the failure rate. If your SLA is "99% of requests under 500 ms," your threshold must be p99 < 500, not the mean.
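
The gap between the mean and the tail is easy to demonstrate with synthetic numbers. The data below is invented purely for illustration and is not a measurement. The percentile helper is a simple nearest-rank implementation and is not the calculation Locust uses internally.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Synthetic data: 94 fast responses and 6 requests that hit a 5-second timeout
latencies_ms = [100] * 94 + [5000] * 6

mean_ms = statistics.mean(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms  p95={p95_ms} ms  p99={p99_ms} ms")
```

Here the mean is 394 ms and would pass a 500 ms threshold, while p95 and p99 both sit at the 5000 ms timeout. A threshold on the mean alone would miss the problem.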

4. Run baselines before feature deployments

Run your Locust suite against staging before every significant feature deployment. Save the CSV output with a timestamp and commit reference. This gives you a before/after comparison and makes it possible to pinpoint exactly which deployment introduced a regression.
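
One way to keep baselines traceable is to file each run under a name built from the timestamp and the commit. The sketch below shows only the naming step. The function name baseline_dir, the baselines directory, and the example commit hash are all hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def baseline_dir(root, commit_sha, when=None):
    """Build a directory path of the form <root>/<UTC timestamp>_<short commit>."""
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return Path(root) / f"{stamp}_{commit_sha[:7]}"


# Fixed timestamp and made-up commit hash, used only to show the resulting name
example = baseline_dir(
    "baselines",
    "abc1234def5678",
    datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
)
print(example.name)
```

After a run, the CSV files from the results directory can be copied into this directory, for example with shutil.copy, so that any two deployments can be compared later.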

5. Keep locustfiles in version control with your application code

Performance tests are first-class code. They should live in the same repository as the application, be reviewed in pull requests, and evolve as the API evolves. A locustfile that tests endpoints that no longer exist is misleading — stale performance tests are worse than no performance tests.


From Experience, Virtusa: During a major e-commerce client engagement at Virtusa, JMeter was our load testing tool for checkout flows. We ran baseline tests at 50 concurrent users and ramped gradually to 500. The checkout API consistently degraded at 300 users due to database connection pool exhaustion, a failure mode that unit tests or functional tests would never have surfaced. Catching it in load testing rather than on a peak sales day was a significant win. The fix was a one-line connection pool config change; finding the root cause was the hard part.