BDD with Behave and Python — Complete Guide

Honnesh Muppala May 5, 2026 14 min read

What is Behave?

Behave is one of the most widely used BDD frameworks for Python. It reads Gherkin feature files, the same plain-English syntax used by Cucumber (Java) and SpecFlow (.NET), which makes it a natural choice for Python teams that want behavior-driven testing without switching languages.

Like all Gherkin-based frameworks, Behave bridges the gap between business requirements and automated tests by expressing scenarios in Given / When / Then language that product owners, developers, and QA engineers can all read, write, and validate together. The Gherkin feature files serve as living documentation: when the tests pass, the documentation is accurate.

How Behave compares to alternatives

Architecture diagram: Feature Files (Gherkin / .feature) -> Behave Runner (behave CLI) -> Step Definitions (@given @when @then) -> Page Objects (Selenium / Requests) -> App / Browser (Web / API / Mobile).

The context object flows through every layer, carrying shared state between steps and hooks.

Installation & Project Structure

# Install Behave and required libraries
pip install behave selenium webdriver-manager

# Or install from requirements.txt
pip install -r requirements.txt

Behave requires a specific directory layout. Feature files and their step definitions are linked by the runner — no explicit import is needed:

my-project/
├── features/
│   ├── login.feature            # Gherkin feature files
│   ├── checkout.feature
│   ├── steps/
│   │   ├── login_steps.py       # Step definitions for login.feature
│   │   ├── checkout_steps.py
│   │   └── common_steps.py      # Shared steps used by multiple features
│   └── environment.py           # Hooks (before_all, before_scenario, etc.)
├── pages/
│   ├── login_page.py            # Page Object Model classes
│   └── checkout_page.py
├── utils/
│   └── driver_factory.py        # WebDriver setup logic
├── requirements.txt
└── behave.ini                   # Behave configuration

Configure Behave with a behave.ini file in the project root:

[behave]
format = pretty
outfiles = reports/behave-results.txt
; Exclude work-in-progress scenarios by default
tags = ~@wip
; Continue running after the first failure
stop = false
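
One detail worth checking in any behave.ini: to my knowledge Behave reads the file with Python's standard configparser, and that parser only treats full-line comments as comments unless it is told otherwise. The sketch below uses configparser directly (not Behave itself) to show what happens to text that trails a value on the same line.

```python
import configparser

INI_TEXT = """
[behave]
; a full-line comment like this one is ignored
stop = false   ; trailing text like this is NOT stripped by default
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# The trailing "; ..." text is kept as part of the value.
raw_value = parser.get("behave", "stop")
print(repr(raw_value))

# Because of that, the value no longer parses as a boolean.
try:
    parser.getboolean("behave", "stop")
    parsed_as_boolean = True
except ValueError:
    parsed_as_boolean = False
print("parsed as boolean:", parsed_as_boolean)
```

Keeping each comment on its own line avoids the problem regardless of how the file is parsed.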

Feature Files

Behave feature files use the same Gherkin syntax as Cucumber. They live in the features/ directory with a .feature extension:

# features/login.feature
Feature: User Login
  As a registered user
  I want to log in to the application
  So that I can access my account

  Background:
    Given the application is running
    And the browser is open to the login page

  @smoke
  Scenario: Successful login
    When the user enters username "alice@example.com"
    And the user enters password "SecurePass123"
    And the user clicks "Login"
    Then the user should be on the dashboard
    And the page title should contain "Dashboard"

  @regression @auth
  Scenario: Failed login with wrong password
    When the user enters username "alice@example.com"
    And the user enters password "WrongPassword"
    And the user clicks "Login"
    Then an error alert should display "Invalid email or password"

  @regression @auth
  Scenario Outline: Login with multiple user roles
    When the user enters username "<username>"
    And the user enters password "<password>"
    And the user clicks "Login"
    Then the user should land on "<destination>"

    Examples:
      | username              | password     | destination       |
      | alice@example.com     | AlicePass1   | /dashboard        |
      | admin@example.com     | AdminPass99  | /admin            |
      | viewer@example.com    | ViewPass7    | /dashboard        |
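
To make the Scenario Outline above more concrete, here is a small plain-Python illustration of the substitution idea: each Examples row produces one concrete scenario by replacing the <placeholders> in every step. The expand_outline helper is hypothetical and only mimics the concept; it is not how Behave is implemented.

```python
def expand_outline(step_templates, header, rows):
    """Yield one list of concrete steps per Examples row."""
    for row in rows:
        values = dict(zip(header, row))
        concrete = []
        for template in step_templates:
            # Replace every <name> placeholder with the value from this row.
            for name, value in values.items():
                template = template.replace(f"<{name}>", value)
            concrete.append(template)
        yield concrete


steps = [
    'When the user enters username "<username>"',
    'Then the user should land on "<destination>"',
]
header = ["username", "destination"]
rows = [
    ["alice@example.com", "/dashboard"],
    ["admin@example.com", "/admin"],
]

scenarios = list(expand_outline(steps, header, rows))
for scenario_steps in scenarios:
    print(scenario_steps)
```

Each generated scenario runs independently, so one failing row does not stop the other rows from being reported.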

Step Definitions

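Step definitions live in features/steps/. They are decorated with @given, @when, @then, or the universal @step. The decorator string is a pattern that must match the Gherkin step text. By default Behave uses its parse matcher, where {name} placeholders capture arguments; regular expressions are available after calling use_step_matcher("re"):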

# features/steps/login_steps.py
from behave import given, when, then, step
from pages.login_page import LoginPage

@given('the application is running')
def step_app_is_running(context):
    # context.base_url is set in environment.py
    assert context.base_url, "Base URL must be configured"

@given('the browser is open to the login page')
def step_open_login_page(context):
    context.login_page = LoginPage(context.browser)
    context.login_page.open(context.base_url)

@when('the user enters username "{username}"')
def step_enter_username(context, username):
    context.login_page.enter_username(username)

@when('the user enters password "{password}"')
def step_enter_password(context, password):
    context.login_page.enter_password(password)

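@when('the user clicks "{button_text}"')
def step_click_button(context, button_text):
    # The LoginPage in the full example below only exposes click_login(),
    # so this step supports the "Login" button only.
    assert button_text == "Login", f"Unsupported button: {button_text}"
    context.login_page.click_login()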

@then('the user should be on the dashboard')
def step_on_dashboard(context):
    assert '/dashboard' in context.browser.current_url, \
        f"Expected /dashboard, got: {context.browser.current_url}"

@then('the page title should contain "{expected}"')
def step_page_title_contains(context, expected):
    assert expected in context.browser.title, \
        f"Expected '{expected}' in title, got: '{context.browser.title}'"

@then('an error alert should display "{expected_message}"')
def step_error_alert_displays(context, expected_message):
    actual = context.login_page.get_error_text()
    assert actual == expected_message, \
        f"Expected '{expected_message}', got '{actual}'"

@then('the user should land on "{path}"')
def step_user_lands_on(context, path):
    assert context.browser.current_url.endswith(path), \
        f"Expected URL ending with '{path}', got: {context.browser.current_url}"

The @step decorator

Use @step when a step is used in different positions (Given / When / Then) across different scenarios. This avoids duplicating the same step definition for each keyword:

@step('the "{page}" page is displayed')
def step_page_displayed(context, page):
    assert page.lower() in context.browser.current_url.lower()

environment.py — Hooks

The environment.py file in the features/ directory is where you define Behave's lifecycle hooks. It is the equivalent of Cucumber's @Before / @After hooks and pytest's conftest.py:

# features/environment.py
import os

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

SCREENSHOT_DIR = "reports/screenshots"

def before_all(context):
    """Runs once before all tests. Initialize global config."""
    context.base_url = "https://staging.example.com"
    context.implicit_wait = 10

def before_scenario(context, scenario):
    """Runs before each scenario. Start a fresh browser session."""
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--window-size=1280,720")

    context.browser = webdriver.Chrome(
        service=Service(ChromeDriverManager().install()),
        options=options
    )
    context.browser.implicitly_wait(context.implicit_wait)

def after_scenario(context, scenario):
    """Runs after each scenario. Capture screenshot on failure, quit browser."""
    if scenario.status == "failed":
        # save_screenshot does not create missing directories, so create it first
        os.makedirs(SCREENSHOT_DIR, exist_ok=True)
        screenshot_name = f"{SCREENSHOT_DIR}/{scenario.name.replace(' ', '_')}.png"
        # save_screenshot returns False when the file could not be written
        if context.browser.save_screenshot(screenshot_name):
            print(f"Screenshot saved: {screenshot_name}")

    context.browser.quit()

def after_all(context):
    """Runs once after all tests. Final cleanup."""
    print("All scenarios complete.")
    # Close any remaining sessions, write summary logs, etc.
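
The screenshot name in after_scenario is derived from the scenario name, which can contain characters that are awkward in file names (for example, Scenario Outline rows typically get a suffix such as "-- @1.2"). A minimal sketch of a safer path builder follows; screenshot_path is a hypothetical helper name, and the default directory simply mirrors the one used above.

```python
import re
from pathlib import Path


def screenshot_path(scenario_name, directory="reports/screenshots"):
    """Build a filesystem-friendly PNG path from a scenario name."""
    # Collapse every run of characters outside [A-Za-z0-9._-] into one underscore.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", scenario_name).strip("_")
    # Fall back to a fixed name if nothing usable is left.
    return Path(directory) / f"{slug or 'scenario'}.png"


path = screenshot_path("Login with multiple user roles -- @1.2 ")
print(path.name)
```

In the hook, the result could be used as path.parent.mkdir(parents=True, exist_ok=True) followed by context.browser.save_screenshot(str(path)).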

The Context Object

The context object is the central mechanism for sharing state in Behave. It is passed as the first argument to every step function and every hook. You can set any attribute on it and read that attribute in later steps, but how long the attribute lives depends on where it was set:

# Set in environment.py (before_all):
context.base_url = "https://staging.example.com"

# Set in environment.py (before_scenario):
context.browser = webdriver.Chrome(...)

# Set in a Given step:
context.login_page = LoginPage(context.browser)

# Set in a When step:
context.user = context.login_page.get_logged_in_user()

# Read in a Then step:
assert context.user['name'] == 'Alice'

# Scope rules (Behave removes attributes when the level that set them ends):
# set in before_all                 : available for the whole run
# set in before_feature             : available until the current feature ends
# set in before_scenario or a step  : available until the current scenario ends

From Experience at Viasat: The context object pattern in Behave is elegant but requires discipline. At Viasat, we established a team convention: all browser and page object references were set in before_scenario (never in steps), and steps only read from context or set scenario-specific data. This prevented the most common Behave pitfall: steps that work in isolation but fail when run together because a previous step polluted the context with unexpected state.
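
The layered lifetime of context attributes can be pictured with a stack of dictionaries. The sketch below uses collections.ChainMap from the standard library as a stand-in; it only mimics the behaviour and is not Behave's actual Context class.

```python
from collections import ChainMap

# One dict per layer; lookups go from the innermost layer outwards.
run_layer = {"base_url": "https://staging.example.com"}  # like before_all
context = ChainMap(run_layer)

# Entering a scenario pushes a fresh layer on top.
context = context.new_child()
context["browser"] = "<webdriver instance>"  # like before_scenario
context["user"] = {"name": "Alice"}          # like a When step

inside = ("user" in context, context["base_url"])

# Leaving the scenario drops that layer, and everything set in it.
context = context.parents

after = ("user" in context, "browser" in context, context["base_url"])
print(inside)
print(after)
```

This is why data stored by a step cannot leak into the next scenario, while values set in before_all stay readable everywhere.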

Scenario Outline for Data-Driven Tests

Scenario Outline combined with an Examples table makes data-driven testing clean and readable in Behave:

# features/search.feature
Feature: Search

  @regression
  Scenario Outline: Search returns relevant results
    Given the user is on the search page
    When the user searches for "<query>"
    Then at least <min_results> results should be displayed
    And the first result title should contain "<expected_text>"

    Examples:
      | query          | min_results | expected_text     |
      | laptop         | 5           | Laptop            |
      | wireless mouse | 3           | Mouse             |
      | hdmi cable     | 10          | HDMI              |

# Step definition handling integer parameter
from behave import then
from selenium.webdriver.common.by import By

@then('at least {min_results:d} results should be displayed')
def step_results_count(context, min_results):
    result_items = context.browser.find_elements(By.CSS_SELECTOR, '[data-testid="result-item"]')
    assert len(result_items) >= min_results, \
        f"Expected at least {min_results} results, got {len(result_items)}"

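Behave's default matcher is built on the parse library, so step patterns accept its type specifiers: {name:d} converts the match to an int, {name:f} to a float, and a bare {name} captures the text as a string. The double quotes in patterns such as "{username}" are literal characters in the step text, not part of the placeholder syntax.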
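
To show what a typed placeholder does mechanically, here is a deliberately simplified re-implementation using the standard re module. Behave itself delegates this work to the parse library; compile_pattern and match_step are hypothetical names, and only the d and f specifiers are handled.

```python
import re

# Regex fragment and converter per type specifier; anything else stays text.
CONVERTERS = {"d": (r"[-+]?\d+", int), "f": (r"[-+]?\d+\.\d+", float)}


def compile_pattern(pattern):
    """Translate a parse-style step pattern into a regex plus field converters."""
    converters = {}
    regex = ""
    pos = 0
    for field in re.finditer(r"\{(\w+)(?::(\w))?\}", pattern):
        name, kind = field.group(1), field.group(2)
        field_regex, convert = CONVERTERS.get(kind, (r".+?", str))
        converters[name] = convert
        # Literal text before the placeholder is escaped; the placeholder becomes a named group.
        regex += re.escape(pattern[pos:field.start()]) + f"(?P<{name}>{field_regex})"
        pos = field.end()
    regex += re.escape(pattern[pos:])
    return re.compile(f"^{regex}$"), converters


def match_step(pattern, step_text):
    """Return the converted arguments for a step, or None if it does not match."""
    regex, converters = compile_pattern(pattern)
    found = regex.match(step_text)
    if found is None:
        return None
    return {name: converters[name](value) for name, value in found.groupdict().items()}


print(match_step("at least {min_results:d} results should be displayed",
                 "at least 5 results should be displayed"))
```

The practical consequence is the same as in Behave: a step function that declares {min_results:d} receives an int, so comparisons such as len(items) >= min_results work without manual conversion.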

Tags: Filtering and Organising Scenarios

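Tags in Behave work at the Feature, Scenario, or Scenario Outline level, and a scenario inherits the tags of its feature. They can be combined with boolean logic on the CLI. The and / or / not expressions below are the newer tag-expression syntax; older Behave releases such as 1.2.6 expect the legacy form instead (--tags=~@wip to exclude, a comma for OR, repeated --tags options for AND):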

# Feature file tags
@smoke
Feature: Login

  @regression @auth
  Scenario: Valid login

  @wip
  Scenario: New OAuth flow (not ready for CI)

# Run commands
# Only @smoke tests
behave --tags @smoke

# @regression but not @wip
behave --tags "@regression and not @wip"

# Either @smoke or @critical
behave --tags "@smoke or @critical"

# All tests tagged @slow (for nightly run)
behave --tags @slow

# Exclude @slow for fast feedback
behave --tags "not @slow"
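
The boolean semantics of these expressions can be illustrated with a tiny evaluator. This is a hand-rolled sketch, not Behave's tag-expression parser; scenario_selected is a hypothetical name, and scenario_tags stands for a scenario's own tags plus those inherited from its feature.

```python
def scenario_selected(expression, scenario_tags):
    """Evaluate an 'and / or / not' tag expression against a set of tag names."""
    tokens = expression.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence
        value = parse_and()
        while peek() == "or":
            take()
            right = parse_and()
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "and":
            take()
            right = parse_not()
            value = value and right
        return value

    def parse_not():  # highest precedence: not, parentheses, a single @tag
        token = take()
        if token == "not":
            return not parse_not()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if not token.startswith("@"):
            raise ValueError(f"unexpected token: {token!r}")
        return token[1:] in scenario_tags

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


# "Valid login" above carries @regression and @auth, and inherits @smoke from its feature.
print(scenario_selected("@regression and not @wip", {"smoke", "regression", "auth"}))
```

Reading a CI command through this lens makes it easier to predict which scenarios a given expression will pick up before running the suite.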

Full Example: Feature + Steps + Page Object

Here is a complete working example tying a feature file, step definitions, and a Page Object class together for a login flow:

# features/auth.feature
Feature: Authentication

  @smoke
  Scenario: Admin can log in and see the admin panel
    Given the user navigates to the login page
    When the user logs in as "admin@example.com" with password "AdminPass99"
    Then the user should see the admin panel heading

# pages/login_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class LoginPage:
    URL_PATH = "/login"

    EMAIL_INPUT    = (By.ID, "email")
    PASSWORD_INPUT = (By.ID, "password")
    LOGIN_BTN      = (By.CSS_SELECTOR, "[data-testid='login-btn']")
    ERROR_MSG      = (By.CSS_SELECTOR, "[data-testid='error-alert']")
    ADMIN_HEADING  = (By.CSS_SELECTOR, "[data-testid='admin-heading']")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def open(self, base_url):
        self.driver.get(base_url + self.URL_PATH)

    def enter_username(self, email):
        field = self.wait.until(EC.visibility_of_element_located(self.EMAIL_INPUT))
        field.clear()
        field.send_keys(email)

    def enter_password(self, password):
        field = self.driver.find_element(*self.PASSWORD_INPUT)
        field.clear()
        field.send_keys(password)

    def click_login(self):
        self.driver.find_element(*self.LOGIN_BTN).click()

    def get_error_text(self):
        return self.wait.until(
            EC.visibility_of_element_located(self.ERROR_MSG)
        ).text

    def is_admin_panel_visible(self):
        return self.wait.until(
            EC.visibility_of_element_located(self.ADMIN_HEADING)
        ).is_displayed()

# features/steps/auth_steps.py
from behave import given, when, then
from pages.login_page import LoginPage

@given('the user navigates to the login page')
def step_navigate_login(context):
    context.login_page = LoginPage(context.browser)
    context.login_page.open(context.base_url)

@when('the user logs in as "{email}" with password "{password}"')
def step_login_as(context, email, password):
    context.login_page.enter_username(email)
    context.login_page.enter_password(password)
    context.login_page.click_login()

@then('the user should see the admin panel heading')
def step_admin_panel_visible(context):
    assert context.login_page.is_admin_panel_visible(), \
        "Admin panel heading was not visible after login"

Reporting

Behave supports multiple output formats. The most useful for CI integration are JSON (machine-readable) and the behave-html-formatter for human-readable reports:

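# Run and output JSON (machine-readable, for parsing in CI)
behave --format json --outfile reports/results.json

# Run with multiple formats simultaneously
# (each --outfile pairs with the --format in the same position; a format without one prints to the console)
behave --format json --outfile reports/results.json --format pretty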

# Install the HTML formatter
pip install behave-html-formatter

# Run with HTML output
behave --format behave_html_formatter:HTMLFormatter --outfile reports/report.html

# Allure integration
pip install allure-behave
behave -f allure_behave.formatter:AllureFormatter -o allure-results/
allure serve allure-results/

From Experience at Amazon: At Amazon, our Behave suite used the Allure formatter to generate test reports that were published to S3 after every CI run. The Allure report's timeline view and category breakdown (by feature, tag, and step) made it easy for the whole team, including non-technical product managers, to understand the test results without reading Python code. Allure bridged the gap between automated test output and business-level reporting, which is one of the core promises of BDD.
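
The JSON report written by --format json can also be summarised with a few lines of Python, for example to print pass/fail counts in a CI log. The sketch below assumes the report is a list of features, each with an "elements" list whose entries carry "type" and "status" keys; that shape is my understanding of Behave's JSON formatter and should be checked against a real report. The sample data is hand-written, not real test output.

```python
import json
from collections import Counter


def summarise(features):
    """Count scenario statuses in an already-parsed Behave JSON report."""
    counts = Counter()
    for feature in features:
        for element in feature.get("elements", []):
            # Backgrounds are listed next to scenarios but are not scenarios themselves.
            if element.get("type") == "background":
                continue
            counts[element.get("status", "unknown")] += 1
    return counts


# Hand-written sample in the assumed report shape.
sample_report = json.loads("""
[
  {
    "name": "User Login",
    "elements": [
      {"type": "background", "name": ""},
      {"type": "scenario", "name": "Successful login", "status": "passed"},
      {"type": "scenario", "name": "Failed login with wrong password", "status": "failed"}
    ]
  }
]
""")

summary = summarise(sample_report)
print(dict(summary))
```

Against a real run, the same function could be fed json.load(open("reports/results.json")).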

GitHub Actions CI

# .github/workflows/behave.yml
name: Behave BDD Tests

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  behave-tests:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: pip

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Install Chrome
        uses: browser-actions/setup-chrome@latest

      - name: Run Behave smoke tests
        run: behave --tags "@smoke and not @wip" --format json --outfile reports/results.json

      - name: Run full regression
        if: github.ref == 'refs/heads/main'
        run: behave --tags "not @wip" --format json --outfile reports/full-results.json

      - name: Upload test report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: behave-reports
          path: reports/

      - name: Upload screenshots on failure
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: failure-screenshots
          path: reports/screenshots/

Framework Comparison

| Feature            | Behave                          | pytest-bdd                   | Cucumber (Java)                        |
|--------------------|---------------------------------|------------------------------|----------------------------------------|
| Language           | Python                          | Python                       | Java                                   |
| Setup complexity   | Very low (pip install behave)   | Low (pip install pytest-bdd) | Medium (Maven + POM)                   |
| Gherkin compliance | Full (all keywords)             | Full (all keywords)          | Full (all keywords)                    |
| State sharing      | context object                  | pytest fixtures              | Dependency injection (PicoContainer)   |
| pytest integration | None (standalone)               | Native (is a pytest plugin)  | Via junit platform                     |
| Reporting          | JSON, HTML, Allure              | pytest HTML, Allure          | HTML, JSON, Allure, JUnit              |
| Parallel execution | Via behave-parallel             | Via pytest-xdist             | Via JUnit 5 parallel                   |
| Best for           | Python teams, simple BDD setup  | Teams already using pytest   | Java enterprise teams                  |

Best Practices

1. Write business-readable steps, not technical ones

The defining quality of a good Gherkin step is that a non-technical stakeholder can read it and understand what the system is supposed to do. Steps that expose implementation details break this contract:

# Bad — implementation detail leaking into Gherkin
When I find element by CSS ".btn-primary" and click it

# Good — business intent is clear
When the user submits the registration form

2. Keep step definitions thin, page objects thick

Step definitions should read like a translation layer between Gherkin and Python — each step should contain one or two method calls to a page object, nothing more. All the Selenium interaction (locators, waits, assertions) belongs in the page object:

# Thin step definition (correct)
@when('the user logs in as "{email}" with password "{password}"')
def step_login(context, email, password):
    context.login_page.login(email, password)   # All logic in page object

# Fat step definition (wrong — Selenium in the step)
@when('the user logs in as "{email}" with password "{password}"')
def step_login(context, email, password):
    context.browser.find_element(By.ID, "email").send_keys(email)
    context.browser.find_element(By.ID, "password").send_keys(password)
    context.browser.find_element(By.ID, "login-btn").click()

3. Treat the Gherkin file as the specification, not the afterthought

Write the feature file before writing any code — step definitions or page objects. This forces you to think about the user journey and the expected outcomes in business terms before thinking about implementation. This is the core promise of BDD: specification by example.

4. Use Background sparingly

Background steps run before every scenario in a feature file. Use them only for truly universal preconditions (like "the app is running" or "the browser is open to the login page"). If a Background step only applies to 3 out of 5 scenarios, move it to those scenarios explicitly.

5. Exclude @wip scenarios from CI

Scenarios tagged @wip should always be excluded from automated CI runs. Use behave --tags "not @wip" as your default CI command. @wip scenarios are for local development iteration — they will fail, and they should not block pull requests.


From Experience — Virtusa: Leading a team of 270 testers at Virtusa, we standardised on Appium for real Android device testing and Selenium WebDriver for web regression. The biggest challenge wasn't the tooling — it was consistency across a team that size. We enforced a strict Page Object Model convention and a pre-merge locator review checklist. Within two sprints, flaky test rates dropped significantly and the team achieved a 20% efficiency gain across regression cycles. At that scale, test architecture decisions matter far more than individual test quality.