Performance Testing with k6

What is k6?

k6 is an open-source, developer-centric performance testing tool maintained by Grafana Labs. Unlike older tools such as JMeter that rely on XML test plans and a GUI, k6 lets you write your performance tests in plain JavaScript, the same language most modern development teams already use daily.

k6 runs as a single static binary from the command line, produces clean terminal output, and integrates naturally into Git workflows and CI/CD pipelines. It is designed to be used by developers and QA engineers alike, with a shallow learning curve for anyone who already knows JavaScript.

At its core, k6 simulates virtual users (VUs) that execute your test script concurrently against a target system. It collects detailed metrics — response times, error rates, throughput — and can enforce pass/fail criteria called thresholds to gate deployments in CI.

k6 vs Alternatives

Feature	k6	JMeter	Gatling	Locust
Language	JavaScript (ES6)	XML / GUI	Scala / DSL	Python
CLI-first	Yes	Partial	Yes	Yes
Git-friendly	Yes (plain JS files)	Poor (verbose XML test plans)	Yes	Yes
Learning curve	Low (JS)	Medium (GUI)	High (Scala)	Low (Python)
Browser support	k6 Browser (experimental)	Yes (plugins)	No	No
Cloud execution	k6 Cloud (native)	BlazeMeter (3rd party)	Gatling Enterprise	Locust Cloud
Real-time dashboards	Grafana (native)	Requires plugin	Built-in HTML	Built-in web UI
Resource usage	Very low (Go runtime)	High (JVM)	Medium (JVM)	Medium (Python)

The most important distinction is resource efficiency. k6 is written in Go and runs your JavaScript in Goja, a JavaScript engine embedded in the Go binary, so there is no Node.js or JVM process to start. It can simulate thousands of virtual users on a single laptop without the memory overhead of JVM-based tools. JMeter requires a full Java runtime and can consume gigabytes of RAM for large-scale tests; k6 typically generates comparable load with a fraction of the resources.

Architecture

Understanding the k6 execution pipeline helps you design tests correctly and interpret results accurately.

Test Script (JS file) → k6 Engine (Go runtime) → Virtual Users (VU 1…N) → Target API / App (HTTP requests) → Metrics Collection (built-in + custom) → k6 Cloud / Grafana, or JSON / InfluxDB output

Each virtual user runs the default function in your script in a loop for the duration of the test. VUs are isolated: they do not share state, which mirrors real-world concurrent users. The engine aggregates metrics across all VUs, evaluates thresholds periodically while the test runs, and evaluates them a final time at the end.

Installation

k6 distributes as a single binary with no runtime dependencies. Choose the method for your OS:

macOS (Homebrew)

brew install k6

Ubuntu / Debian

sudo gpg -k
sudo gpg --no-default-keyring \
  --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 \
  --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69

echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
  https://dl.k6.io/deb stable main" \
  | sudo tee /etc/apt/sources.list.d/k6.list

sudo apt-get update
sudo apt-get install k6

Windows (Chocolatey)

choco install k6

Verify the installation with:

k6 version
# k6 v0.50.0 (go1.22.1, darwin/arm64)

First k6 Script

A k6 test is a JavaScript module with a default exported function. Every virtual user calls this function repeatedly for the test duration.

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
    vus: 10,          // 10 concurrent virtual users
    duration: '30s',  // run for 30 seconds
};

export default function () {
    const res = http.get('https://test.k6.io');

    check(res, {
        'status is 200': (r) => r.status === 200,
        'response time < 500ms': (r) => r.timings.duration < 500,
        'body contains welcome': (r) => r.body.includes('Collection of simple web-pages'),
    });

    sleep(1); // think time between iterations
}

Run it with:

k6 run load-test.js

k6 shows a live progress bar in your terminal as the test runs, followed by a full metrics summary when it completes. The key metrics in the output are:

http_req_duration: response time statistics (avg, min, med, max, p(90), p(95))
http_req_failed: rate of failed requests
vus: active virtual users at any point
iterations: total number of times the default function executed

Virtual Users and Duration

The simplest way to control load in k6 is with the vus and duration options. This runs a flat load — the specified number of VUs for the entire duration.

export const options = {
    vus: 50,
    duration: '2m',
};
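
Before picking a VU count, it can help to estimate the request rate that count will actually produce. The following is a rough sketch, not part of k6: it assumes each iteration makes one request and takes roughly the average response time plus the sleep time. The function name and the numbers are illustrative.

```javascript
// Rough closed-model estimate: each VU completes one iteration every
// (response time + think time) seconds, so N VUs complete N times as many.
function estimateIterationsPerSecond(vus, avgResponseSeconds, sleepSeconds) {
    const iterationSeconds = avgResponseSeconds + sleepSeconds;
    if (iterationSeconds <= 0) {
        throw new Error('iteration time must be positive');
    }
    return vus / iterationSeconds;
}

// 50 VUs, 250 ms average response time, 1 s sleep per iteration
const estimatedRate = estimateIterationsPerSecond(50, 0.25, 1);
console.log(estimatedRate); // 40
```

If the system slows down under load, iterations take longer and the achieved rate drops below this estimate, which is worth keeping in mind when comparing runs.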

You can also use the iterations option instead of duration to run a fixed total number of iterations, shared across all VUs:

export const options = {
    vus: 10,
    iterations: 100,  // 100 total iterations across all VUs
};
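
Because the iterations come from one shared pool, VUs do not necessarily run the same number of them: a VU that finishes quickly simply picks up the next one. The sketch below is a simplified simulation of that idea, not k6's actual scheduler; the per-VU iteration times are hypothetical inputs.

```javascript
// Simulates a shared pool of iterations: whichever VU is free first takes the next one.
// iterationSeconds[i] is the (hypothetical) time VU i needs per iteration.
function simulateSharedIterations(totalIterations, iterationSeconds) {
    const freeAt = iterationSeconds.map(() => 0);    // when each VU is next idle
    const completed = iterationSeconds.map(() => 0); // iterations taken per VU

    for (let i = 0; i < totalIterations; i++) {
        // Pick the VU that becomes idle earliest (lowest index wins ties)
        let next = 0;
        for (let vu = 1; vu < freeAt.length; vu++) {
            if (freeAt[vu] < freeAt[next]) next = vu;
        }
        freeAt[next] += iterationSeconds[next];
        completed[next] += 1;
    }
    return completed;
}

// A fast VU (1 s per iteration) and a slow VU (3 s) sharing 8 iterations
console.log(simulateSharedIterations(8, [1, 3])); // [6, 2]
```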

For more precise control over how VUs are distributed across time, use executors. The default executor is shared-iterations when iterations are set, constant-vus when duration is set, and ramping-vus when stages are set.

Stages and Ramp-Up

Real-world traffic does not spike from zero to full load instantly. Stages let you define a ramp-up, sustain, and ramp-down profile that mimics realistic traffic patterns — and more importantly, gives your system time to warm up so you do not hit cold-start penalties in your baseline.

// stages-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
    stages: [
        { duration: '1m', target: 20 },   // ramp up to 20 VUs over 1 minute
        { duration: '3m', target: 20 },   // sustain 20 VUs for 3 minutes
        { duration: '1m', target: 100 },  // spike to 100 VUs
        { duration: '2m', target: 100 },  // sustain spike
        { duration: '1m', target: 0 },    // ramp down to 0
    ],
};

export default function () {
    const res = http.get('https://api.example.com/products');
    check(res, { 'status 200': (r) => r.status === 200 });
    sleep(1);
}
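
To sanity-check a stages profile before running it, you can compute how many VUs should be active at a given moment. This is an approximation written for illustration (the helper names are made up, and k6 itself adjusts VU counts in whole-number steps), assuming a linear ramp within each stage:

```javascript
// Converts simple k6-style durations such as '30s', '1m', '1m30s' or '4h' to seconds.
function durationToSeconds(text) {
    const unitSeconds = { h: 3600, m: 60, s: 1, ms: 0.001 };
    let total = 0;
    for (const [, amount, unit] of text.matchAll(/(\d+)(ms|h|m|s)/g)) {
        total += Number(amount) * unitSeconds[unit];
    }
    return total;
}

// Approximate VU count at a given elapsed time, interpolating linearly within each stage.
function vusAt(stages, elapsedSeconds, startVUs = 0) {
    let stageStart = 0;
    let from = startVUs;
    for (const stage of stages) {
        const length = durationToSeconds(stage.duration);
        if (elapsedSeconds <= stageStart + length) {
            const progress = length === 0 ? 1 : (elapsedSeconds - stageStart) / length;
            return Math.round(from + (stage.target - from) * progress);
        }
        stageStart += length;
        from = stage.target;
    }
    return from; // past the final stage
}

const profile = [
    { duration: '1m', target: 20 },
    { duration: '3m', target: 20 },
    { duration: '1m', target: 100 },
];
console.log(vusAt(profile, 30));  // 10, halfway up the first ramp
console.log(vusAt(profile, 270)); // 60, halfway through the spike from 20 to 100
```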

For more control, you can use the ramping-vus executor explicitly, which supports startVUs and graceful ramp-down:

export const options = {
    scenarios: {
        ramp_load: {
            executor: 'ramping-vus',
            startVUs: 0,
            stages: [
                { duration: '2m', target: 50 },
                { duration: '5m', target: 50 },
                { duration: '2m', target: 0 },
            ],
            gracefulRampDown: '30s',
        },
    },
};

The gracefulRampDown period allows in-flight iterations to complete naturally before VUs are killed, avoiding false connection-reset errors in your metrics.

Thresholds

Thresholds are the pass/fail criteria of your k6 test. If a threshold is breached, k6 exits with a non-zero status code — which causes CI pipelines to fail the build. This is the mechanism that turns performance tests into quality gates.

export const options = {
    stages: [
        { duration: '2m', target: 50 },
        { duration: '5m', target: 50 },
        { duration: '1m', target: 0 },
    ],
    thresholds: {
        // 95th percentile response time must be under 500ms,
        // and 99th percentile must be under 1.5 seconds
        'http_req_duration': ['p(95)<500', 'p(99)<1500'],

        // Error rate must stay below 1%
        'http_req_failed': ['rate<0.01'],

        // Checks must pass at least 99% of the time
        'checks': ['rate>0.99'],
    },
};
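
To make an expression such as 'p(95)<500' concrete, here is a minimal sketch of how a percentile threshold could be evaluated against raw samples. It is an illustration only: the function names are invented, and the linear interpolation used here is an assumption that may not match k6's internal percentile calculation exactly.

```javascript
// Linear-interpolation percentile over a list of samples (p in 0..100).
function percentileOf(samples, p) {
    const sorted = [...samples].sort((a, b) => a - b);
    const rank = (p / 100) * (sorted.length - 1);
    const lower = Math.floor(rank);
    const upper = Math.ceil(rank);
    return sorted[lower] + (sorted[upper] - sorted[lower]) * (rank - lower);
}

// Evaluates expressions of the form 'p(95)<500' against raw samples.
function evaluatePercentileThreshold(expression, samples) {
    const pattern = /^p\((\d+(?:\.\d+)?)\)\s*(<=|>=|<|>)\s*(\d+(?:\.\d+)?)$/;
    const match = pattern.exec(expression.trim());
    if (!match) throw new Error(`unsupported threshold: ${expression}`);
    const observed = percentileOf(samples, Number(match[1]));
    const limit = Number(match[3]);
    const outcomes = {
        '<': observed < limit,
        '<=': observed <= limit,
        '>': observed > limit,
        '>=': observed >= limit,
    };
    return { observed, ok: outcomes[match[2]] };
}

// 100 samples: 1 ms, 2 ms, ... 100 ms
const durations = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(evaluatePercentileThreshold('p(95)<500', durations).ok); // true
console.log(evaluatePercentileThreshold('p(95)<90', durations).ok);  // false
```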

You can apply thresholds to specific URLs or tags rather than the aggregate by tagging requests:

export default function () {
    http.get('https://api.example.com/health', {
        tags: { name: 'health_check' },
    });
    http.get('https://api.example.com/search?q=test', {
        tags: { name: 'search' },
    });
}

export const options = {
    thresholds: {
        // Only apply this threshold to the search endpoint
        'http_req_duration{name:search}': ['p(95)<800'],
        'http_req_duration{name:health_check}': ['p(95)<100'],
    },
};

From Experience at Viasat: During satellite link performance validation, we set aggressive thresholds for our API gateway endpoints — p95 under 800ms and error rate under 0.5%. These were non-negotiable because satellite communication latency already adds overhead at the network layer; any additional application latency compounded the user experience problem. Integrating k6 thresholds into our Jenkins pipeline meant that a code deployment that degraded API response time was automatically blocked before it reached production. This caught two regressions in a single quarter that would have reached customers under the old manual performance testing process.

Checks vs Assertions

k6's check() function is different from traditional test assertions. A failed check does not stop the test or abort the VU iteration — it records a failure and continues. This is intentional: performance tests should keep running under load even when individual requests fail, so you get a complete picture of system behaviour under stress.

import { check } from 'k6';
import http from 'k6/http';

export default function () {
    const loginRes = http.post('https://api.example.com/auth/login', {
        email: 'user@example.com',
        password: 'Password123',
    });

    const loginPassed = check(loginRes, {
        'login status 200':       (r) => r.status === 200,
        'token present':          (r) => JSON.parse(r.body).token !== undefined,
        'response time < 300ms':  (r) => r.timings.duration < 300,
    });

    // Only proceed to protected endpoint if login succeeded
    if (loginPassed) {
        const token = JSON.parse(loginRes.body).token;
        const profileRes = http.get('https://api.example.com/user/profile', {
            headers: { Authorization: `Bearer ${token}` },
        });

        check(profileRes, {
            'profile status 200':  (r) => r.status === 200,
            'profile has name':    (r) => JSON.parse(r.body).name !== undefined,
        });
    }
}
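
One caveat with the script above: JSON.parse throws if the body is not valid JSON, which is common under load when a proxy returns an HTML error page. As far as I know, an exception thrown inside a check callback surfaces as a script error for that iteration rather than as a recorded check failure. A small helper, sketched here with a hypothetical name, keeps the check a plain true/false:

```javascript
// Returns the named field from a JSON body, or undefined if the body is not valid JSON.
function jsonField(body, field) {
    try {
        const parsed = JSON.parse(body);
        return parsed === null || typeof parsed !== 'object' ? undefined : parsed[field];
    } catch (err) {
        return undefined; // e.g. an HTML error page returned under load
    }
}

console.log(jsonField('{"token":"abc"}', 'token'));        // abc
console.log(jsonField('<html>502 Bad Gateway</html>', 'token')); // undefined
```

In the login check this would read 'token present': (r) => jsonField(r.body, 'token') !== undefined.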

Aspect	check()	Traditional assertion (throw)
On failure	Records failure, continues	Stops execution immediately
Good for	Load & performance tests	Functional / unit tests
CI gate	Via thresholds on checks rate	Via exit code on first failure
Metrics	checks rate metric, tagged with each check's name	Pass / fail only
Multiple conditions	All evaluated per iteration	Stops at first failure
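
The behaviour in the table can be illustrated with a small emulation. This is a simplified stand-in, not k6's implementation, and the names runChecks and checkPassRate are invented for the sketch: every condition is evaluated, results are tallied, and the caller gets a single boolean back.

```javascript
// Minimal emulation of check(): evaluate every condition, tally results, never stop early.
function runChecks(value, conditions, tally) {
    let allPassed = true;
    for (const [name, predicate] of Object.entries(conditions)) {
        const passed = Boolean(predicate(value));
        const entry = tally[name] || (tally[name] = { passes: 0, fails: 0 });
        if (passed) {
            entry.passes += 1;
        } else {
            entry.fails += 1;
        }
        allPassed = allPassed && passed;
    }
    return allPassed;
}

// Overall pass rate, comparable to a threshold such as 'checks': ['rate>0.99']
function checkPassRate(tally) {
    let passes = 0;
    let total = 0;
    for (const entry of Object.values(tally)) {
        passes += entry.passes;
        total += entry.passes + entry.fails;
    }
    return total === 0 ? 0 : passes / total;
}

const tally = {};
const fakeResponse = { status: 500, duration: 120 }; // stand-in for a k6 response
const ok = runChecks(fakeResponse, {
    'status is 200': (r) => r.status === 200,
    'response time < 300ms': (r) => r.duration < 300,
}, tally);
console.log(ok);                   // false, but both conditions were evaluated
console.log(checkPassRate(tally)); // 0.5
```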

Custom Metrics

k6 provides four custom metric types that let you track application-specific data beyond the built-in HTTP metrics:

import http from 'k6/http';
import { Counter, Gauge, Rate, Trend } from 'k6/metrics';
import { sleep } from 'k6';

// Counter: monotonically increasing count
const loginErrors = new Counter('login_errors');

// Rate: fraction of added values that are true / non-zero (0-1)
const cacheHitRate = new Rate('cache_hit_rate');

// Trend: time-series of values (percentiles calculated)
const itemLoadTime = new Trend('item_load_time', true); // true = milliseconds

// Gauge: current value at any point in time
const activeCartItems = new Gauge('active_cart_items');

export default function () {
    const res = http.post('https://api.example.com/auth/login', {
        email: 'perf-test@example.com',
        password: 'TestPass123',
    });

    if (res.status !== 200) {
        loginErrors.add(1);
    }

    // Record whether we got a cache hit from response header
    cacheHitRate.add(res.headers['X-Cache'] === 'HIT');

    const itemRes = http.get('https://api.example.com/items/42');
    itemLoadTime.add(itemRes.timings.duration);

    activeCartItems.add(Math.floor(Math.random() * 5) + 1);

    sleep(1);
}

export const options = {
    thresholds: {
        'login_errors': ['count<10'],       // fewer than 10 login errors total
        'cache_hit_rate': ['rate>0.80'],    // cache hit rate above 80%
        'item_load_time': ['p(95)<400'],    // p95 item load under 400ms
    },
};
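
The difference between the four types is mostly in what each one keeps from the values you add. The classes below are simplified stand-ins written for illustration, not the k6/metrics implementations (the real Gauge, for example, also tracks minimum and maximum):

```javascript
// Simplified stand-ins for the four metric types; each keeps only what its summary needs.
class SketchCounter {
    constructor() { this.count = 0; }
    add(n) { this.count += n; }          // running total
}

class SketchGauge {
    constructor() { this.value = 0; }
    add(n) { this.value = n; }           // keeps only the latest value
}

class SketchRate {
    constructor() { this.hits = 0; this.total = 0; }
    add(v) { this.total += 1; if (v) this.hits += 1; }
    get rate() { return this.total === 0 ? 0 : this.hits / this.total; }
}

class SketchTrend {
    constructor() { this.values = []; }
    add(n) { this.values.push(n); }      // keeps every sample for statistics
    get avg() { return this.values.reduce((a, b) => a + b, 0) / this.values.length; }
    get max() { return Math.max(...this.values); }
}

const errors = new SketchCounter();
const cartSize = new SketchGauge();
const cacheHits = new SketchRate();
const loadTime = new SketchTrend();

[120, 80, 100].forEach((ms) => loadTime.add(ms));
[true, false, true, true].forEach((hit) => cacheHits.add(hit));
errors.add(1); errors.add(1);
cartSize.add(3); cartSize.add(5);

console.log(errors.count, cartSize.value, cacheHits.rate, loadTime.avg); // 2 5 0.75 100
```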

Environment Variables

Hard-coding URLs, credentials, or environment names in test scripts is bad practice. k6 provides the __ENV object and the --env CLI flag for injecting configuration at runtime:

// env-test.js
import http from 'k6/http';
import { check } from 'k6';

const BASE_URL = __ENV.BASE_URL || 'https://staging.api.example.com';
const API_KEY  = __ENV.API_KEY  || '';

export default function () {
    const res = http.get(`${BASE_URL}/health`, {
        headers: { 'X-API-Key': API_KEY },
    });
    check(res, { 'healthy': (r) => r.status === 200 });
}

Run with environment variables:

# Staging
k6 run --env BASE_URL=https://staging.api.example.com \
        --env API_KEY=stg_key_abc123 \
        env-test.js

# Production (read-only load test)
k6 run --env BASE_URL=https://api.example.com \
        --env API_KEY=$PROD_API_KEY \
        env-test.js

In GitHub Actions, pass secrets as environment variables so they are never committed to source control:

- name: Run k6 performance test
  env:
    BASE_URL: https://staging.api.example.com
    API_KEY: ${{ secrets.STAGING_API_KEY }}
  run: k6 run --env BASE_URL=$BASE_URL --env API_KEY=$API_KEY load-test.js
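
A missing or mistyped variable is easier to diagnose if the script fails immediately with a clear message instead of sending requests with an empty key. The helper below is a sketch of that idea; loadConfig is a hypothetical name, and in a k6 script you would call it once at the top level as loadConfig(__ENV).

```javascript
// Builds test configuration from an environment object.
// In a k6 script you would call loadConfig(__ENV).
function loadConfig(env) {
    // Fall back to staging and strip trailing slashes so `${baseUrl}/health` stays clean
    const baseUrl = (env.BASE_URL || 'https://staging.api.example.com').replace(/\/+$/, '');
    if (!/^https?:\/\//.test(baseUrl)) {
        throw new Error(`BASE_URL must start with http:// or https://, got "${baseUrl}"`);
    }
    if (!env.API_KEY) {
        throw new Error('API_KEY is required; pass it with --env API_KEY=...');
    }
    return { baseUrl, apiKey: env.API_KEY };
}

const config = loadConfig({ BASE_URL: 'https://staging.api.example.com/', API_KEY: 'stg_key_abc123' });
console.log(config.baseUrl); // https://staging.api.example.com
```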

k6 Cloud

k6 Cloud is the managed cloud execution platform from Grafana Labs. It lets you run tests distributed across multiple geographic regions, view real-time results in a web dashboard, and store historical test runs for trend analysis — without managing your own infrastructure.

# Authenticate once
k6 login cloud --token <your-k6-cloud-token>

# Run your existing script on k6 Cloud
# (newer k6 releases spell this: k6 cloud run load-test.js)
k6 cloud load-test.js

You can also configure cloud-specific options in your script:

export const options = {
    cloud: {
        projectID: 3456789,
        name: 'Checkout API Load Test - Sprint 42',
        distribution: {
            'amazon:us:ashburn':   { loadZone: 'amazon:us:ashburn',   percent: 50 },
            'amazon:eu:dublin':    { loadZone: 'amazon:eu:dublin',    percent: 30 },
            'amazon:ap:singapore': { loadZone: 'amazon:ap:singapore', percent: 20 },
        },
    },
    stages: [
        { duration: '2m', target: 200 },
        { duration: '5m', target: 200 },
        { duration: '1m', target: 0 },
    ],
    thresholds: {
        'http_req_duration': ['p(95)<600'],
        'http_req_failed':   ['rate<0.01'],
    },
};

Output Formats

k6 supports multiple output destinations for metrics, making it easy to feed data into your existing observability stack:

JSON output

k6 run --out json=results.json load-test.js
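
The JSON output is written as one JSON object per line rather than a single document. The sketch below shows one way to post-process it, assuming the line format used by k6's JSON output, where sample lines have type "Point", a metric name, and the measurement under data.value. The function name is illustrative.

```javascript
// Averages the values of one metric from k6's line-delimited JSON output.
function averageMetric(ndjsonText, metricName) {
    let sum = 0;
    let count = 0;
    for (const line of ndjsonText.split('\n')) {
        if (line.trim() === '') continue;      // skip blank lines
        const entry = JSON.parse(line);
        // "Metric" lines declare a metric; only "Point" lines carry samples
        if (entry.type === 'Point' && entry.metric === metricName) {
            sum += entry.data.value;
            count += 1;
        }
    }
    return count === 0 ? null : sum / count;
}

const sample = [
    '{"type":"Metric","metric":"http_req_duration","data":{"type":"trend","contains":"time"}}',
    '{"type":"Point","metric":"http_req_duration","data":{"time":"2024-01-01T00:00:00Z","value":100,"tags":{"status":"200"}}}',
    '{"type":"Point","metric":"http_req_duration","data":{"time":"2024-01-01T00:00:01Z","value":300,"tags":{"status":"200"}}}',
    '{"type":"Point","metric":"vus","data":{"time":"2024-01-01T00:00:01Z","value":10,"tags":{}}}',
].join('\n');

console.log(averageMetric(sample, 'http_req_duration')); // 200
```

For a real run, read results.json from disk in Node.js and pass its contents as the first argument. Large files are better processed line by line as a stream.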

InfluxDB + Grafana stack

# Run k6 with InfluxDB output
k6 run --out influxdb=http://localhost:8086/k6 load-test.js

# docker-compose.yml for local stack
version: '3'
services:
  influxdb:
    image: influxdb:1.8
    ports: ['8086:8086']
    environment:
      INFLUXDB_DB: k6

  grafana:
    image: grafana/grafana:latest
    ports: ['3000:3000']
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: 'true'
      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin

Import the widely used k6 Load Testing Results dashboard (ID: 2587) into Grafana and point it at the InfluxDB data source to get visualisations of response time percentiles, VU ramp, error rate, and throughput without building panels by hand.

GitHub Actions CI Integration

The most impactful thing you can do with k6 is run it automatically on every deployment to a staging environment. This catches performance regressions before they reach production.

# .github/workflows/performance.yml
name: Performance Tests

on:
  push:
    branches: [main, staging]
  pull_request:
    branches: [main]

jobs:
  k6-load-test:
    name: Run k6 Load Test
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Run k6 local load test
        uses: grafana/k6-action@v0.3.1
        with:
          filename: tests/performance/load-test.js
          flags: >-
            --env BASE_URL=https://staging.api.example.com
            --out json=results.json
        env:
          API_KEY: ${{ secrets.STAGING_API_KEY }}

      - name: Upload k6 results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: k6-results
          path: results.json

      - name: Comment PR with results summary
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            // Parse results.json for summary metrics and post as PR comment
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '### k6 Performance Test Results\nSee artifacts for full details.'
            });
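
The PR comment in the workflow above is a fixed string. One way to make it informative is to generate a short Markdown summary from the data object that k6 passes to handleSummary(). The sketch below assumes that object's usual shape (metrics keyed by name, each with a values map and an optional thresholds map whose entries carry an ok flag); the function name is hypothetical.

```javascript
// Builds a short Markdown summary from the object k6 passes to handleSummary().
function formatSummaryMarkdown(data) {
    const lines = ['### k6 Performance Test Results', '', '| Metric | Value |', '| --- | --- |'];
    const duration = data.metrics.http_req_duration;
    const failed = data.metrics.http_req_failed;
    if (duration) {
        lines.push(`| p95 response time | ${duration.values['p(95)'].toFixed(1)} ms |`);
    }
    if (failed) {
        lines.push(`| Failed requests | ${(failed.values.rate * 100).toFixed(2)}% |`);
    }

    // Collect every threshold that did not pass
    const breached = [];
    for (const [name, metric] of Object.entries(data.metrics)) {
        for (const [expression, result] of Object.entries(metric.thresholds || {})) {
            if (!result.ok) breached.push(`${name}: ${expression}`);
        }
    }
    lines.push('');
    lines.push(breached.length ? `Thresholds breached: ${breached.join(', ')}` : 'All thresholds passed.');
    return lines.join('\n');
}
```

In the k6 script, an exported handleSummary(data) function could return { 'summary.md': formatSummaryMarkdown(data) }, and the workflow step could then read summary.md and use it as the comment body.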

From Experience at Amazon: When I was working on IoT device validation workflows, we used k6 to load-test the device registration API that handled onboarding of Fire TV and Echo devices. Each device model launch meant tens of thousands of concurrent registration requests in the first hour of availability. By running k6 in our pre-release pipeline with a 500 VU ramp test against staging, we identified a database connection pool exhaustion issue under load — an issue that would have caused registration failures for real customers on launch day. The fix was a two-line configuration change; finding it via load testing saved what could have been a significant incident.

Best Practices

1. Always ramp up and down

Never start a test at full load. A sudden spike of virtual users creates an unrealistic cold-start scenario. Use stages to simulate realistic traffic growth.

2. Use think time (sleep) between iterations

Real users pause between actions. Without sleep(), your VUs hammer the API as fast as possible, which is not realistic for user simulation tests. A sleep(1) to sleep(3) is typical. For pure throughput tests (stress testing API limits), omit sleep.
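
A fixed sleep(1) can also make VUs fire in synchronised waves. A randomised think time spreads requests out more like real traffic. A minimal sketch, with a hypothetical helper name and an injectable random source so it can be tested:

```javascript
// Returns a think time in seconds, uniformly distributed between min and max.
// The random source is injectable so the function can be tested deterministically.
function thinkTime(minSeconds, maxSeconds, random = Math.random) {
    return minSeconds + random() * (maxSeconds - minSeconds);
}

console.log(thinkTime(1, 3, () => 0.5)); // 2
```

In a test script this would be used as sleep(thinkTime(1, 3)).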

3. Parametrize test data

Do not use the same user credentials or IDs for every VU iteration. Use k6's SharedArray for efficient data loading:

import { SharedArray } from 'k6/data';
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';

const users = new SharedArray('users', function () {
    return papaparse.parse(open('./test-users.csv'), { header: true }).data;
});

export default function () {
    const user = users[__VU % users.length];
    // use user.email, user.password
}
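
Indexing by __VU alone means each VU reuses the same row on every iteration. If rows should also rotate between iterations, the index can combine the VU number and the iteration counter. The helper below is a sketch with an invented name; it assumes vu is 1-based like __VU and iter is 0-based like __ITER, and that maxVUs is the peak VU count of the test.

```javascript
// Picks a row so that VUs running the same iteration number never share one,
// provided rows.length >= maxVUs. vu is 1-based (like __VU), iter is 0-based (like __ITER).
function pickRow(rows, vu, iter, maxVUs) {
    const index = (vu - 1 + iter * maxVUs) % rows.length;
    return rows[index];
}

const rows = ['a', 'b', 'c', 'd', 'e', 'f'];
console.log(pickRow(rows, 1, 0, 3)); // a
console.log(pickRow(rows, 3, 1, 3)); // f
```

In the script above this would be called as pickRow(users, __VU, __ITER, 10) for a test with at most 10 VUs.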

4. Set meaningful thresholds before you run the test

Define your SLA requirements upfront. If you set thresholds after seeing results, you are measuring, not enforcing. Agree on acceptable p95 response times with your team before writing the test.

5. Tag requests for granular analysis

Tag every distinct API endpoint so you can apply per-endpoint thresholds and see granular performance breakdown in dashboards.

6. Test in an environment that mirrors production

Performance test results on a shared, undersized staging environment are misleading. Either use a dedicated performance environment sized to match production, or account for the difference when interpreting results.

7. Run soak tests for memory leaks

A soak test runs at moderate load (50–70% of peak) for an extended period (1–8 hours). This reveals memory leaks, connection pool exhaustion, and slow disk fill-up that short tests never catch.

export const options = {
    stages: [
        { duration: '5m',  target: 50 },  // ramp up
        { duration: '4h',  target: 50 },  // soak at 50 VUs for 4 hours
        { duration: '5m',  target: 0 },   // ramp down
    ],
};

8. Store scripts in version control

k6 scripts are plain JavaScript files. Commit them alongside your application code. This means performance test changes go through the same code review process as application changes — catching regressions in your test scripts before they produce misleading results.
