
Appium Mobile Automation with Python — Complete Guide

Honnesh Muppala May 5, 2026 18 min read

What is Appium?

Appium is an open-source, cross-platform mobile automation framework that lets you write automated tests for native, hybrid, and mobile web applications on Android and iOS using the same WebDriver protocol that Selenium uses for web browsers. The name comes from "App" + "Selenium" — and the analogy is accurate: just as Selenium drives web browsers, Appium drives mobile devices and emulators.

What makes Appium fundamentally different from device-specific automation tools is that it requires no modification to the application under test. You do not need access to the app's source code, you do not need to recompile it with test hooks, and you do not need to embed any testing library into the APK or IPA. Appium uses the platform's native accessibility APIs — UIAutomator2 on Android, XCUITest on iOS — to interact with the same accessibility layer that screen readers use. This means Appium tests against the exact production binary that end users will run.

Appium 2.x, the current major version, introduced a modular plugin-driver architecture that changed how Appium is installed and managed. Instead of Appium shipping all browser and platform drivers in a single monolithic package, in Appium 2.x you install the Appium server separately and then install only the drivers you need. The UIAutomator2 driver for Android and the XCUITest driver for iOS are installed as separate npm packages via the Appium CLI. This modular approach reduces attack surface, improves startup time, and allows driver versions to be updated independently of the core server.

Appium supports an impressive range of platforms beyond just Android and iOS. The Windows App Driver extends Appium to native Windows applications. There are community-maintained drivers for macOS native applications, for Flutter apps (appium-flutter-driver), and for Samsung's Tizen OS. The common thread is the W3C WebDriver protocol — any platform that has a WebDriver-compliant server can be controlled from any WebDriver-compliant client, including the Appium Python Client.

My personal mobile automation experience spans two companies. At Virtusa, I built an Appium + Python framework for testing a fintech Android app — covering account creation flows, transaction history pagination, biometric authentication prompts, and push notification handling. The framework used pytest fixtures for session management and ran on a real Samsung Galaxy device connected to a Linux CI server. At Viasat, I worked on automation for inflight entertainment apps running on Amazon Fire OS tablets (a forked Android), which required specific capability configuration to target Fire OS's accessibility service rather than standard UIAutomator2. Testing on Fire OS devices taught me the importance of capability flags and how Appium's architecture allows it to adapt to Android variants without code changes.

Appium Architecture

Understanding Appium's layered architecture is essential for configuring it correctly, diagnosing connection failures, and understanding why certain operations behave differently on Android versus iOS. The communication flow spans five distinct layers, each with its own protocol and responsibility.

pytest / Test Script (Python code) → Appium Python Client (W3C WebDriver HTTP) → Appium Server (port 4723) → UIAutomator2 / XCUITest (platform driver) → Android / iOS Device (real or emulated)

Starting from the left: your Python test script imports the Appium Python Client library and creates a webdriver.Remote session by sending an HTTP POST request to the Appium Server. This request includes a JSON body describing the desired capabilities — which platform, which device, which app to launch. The Appium Server (a Node.js process) receives this request and acts as the translation layer. It parses the capabilities, determines which driver to use (UIAutomator2 for Android, XCUITest for iOS), starts a session with that driver, and returns a session ID to your Python client.
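
To make the session request concrete, here is a minimal sketch of the JSON body that ends up in the POST to /session. The Appium Python Client builds this for you from the options object, so the helper name build_new_session_payload is hypothetical and exists only for illustration. It assumes the W3C rule that any capability outside the standard set needs a vendor prefix, which for Appium is appium:.

```python
import json

# Capability names defined by the W3C WebDriver spec; anything else needs a vendor prefix.
W3C_STANDARD_CAPS = {
    "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
    "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
    "strictFileInteractability", "unhandledPromptBehavior",
}


def build_new_session_payload(caps: dict) -> dict:
    """Return a JSON-serialisable body for POST /session, prefixing Appium capabilities."""
    always_match = {}
    for name, value in caps.items():
        if name in W3C_STANDARD_CAPS or ":" in name:
            always_match[name] = value              # standard or already prefixed
        else:
            always_match[f"appium:{name}"] = value  # vendor-specific capability
    return {"capabilities": {"alwaysMatch": always_match, "firstMatch": [{}]}}


payload = build_new_session_payload({
    "platformName": "Android",
    "automationName": "UiAutomator2",
    "deviceName": "emulator-5554",
    "appPackage": "com.example.myapp",  # placeholder package used throughout this article
    "appActivity": ".MainActivity",
})
print(json.dumps(payload, indent=2))
```

Seeing the raw payload is mostly useful when reading Appium server logs, where capabilities appear in this prefixed form.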

The driver layer (UIAutomator2 or XCUITest) communicates directly with the device or emulator using platform-specific mechanisms. UIAutomator2 uses Android's Accessibility Service — a system service that provides a structured tree of all UI elements on screen, their properties, and the ability to send synthetic interaction events. XCUITest is Apple's own testing framework, distributed as part of Xcode, which hooks into iOS's accessibility infrastructure. Neither framework requires the app to be modified because they operate at the OS level, reading the accessibility hierarchy that every app exposes automatically.

This architecture has important implications. Because Appium speaks W3C WebDriver protocol, your Appium Python Client code looks very similar to Selenium code — find_element, send_keys, click, WebDriverWait — all the same APIs. The differences emerge in locator strategies (mobile has ACCESSIBILITY_ID, ANDROID_UIAUTOMATOR, IOS_PREDICATE) and in gesture APIs (swipe, scroll, multi-touch) that have no web equivalent.

Installation & Setup

Appium 2.x setup involves more steps than typical Python packages because it spans Node.js (for the Appium server), Android SDK (for UIAutomator2 and the emulator), and Python (for the client). Follow these steps in order to ensure a complete working environment.

Step 1: Install the Appium 2.x Server

Appium runs as a Node.js process. Install Node.js first (v18 or later), then install Appium globally via npm:

# Check Node.js version (18+ required)
node --version

# Install Appium 2.x server globally
npm install -g appium@latest

# Verify installation
appium --version

Step 2: Install Platform Drivers

In Appium 2.x, drivers are installed separately. Install only the drivers you need to keep the installation lean and avoid version conflicts:

# Install UIAutomator2 driver for Android automation
appium driver install uiautomator2

# Install XCUITest driver for iOS automation (macOS only)
appium driver install xcuitest

# Verify installed drivers
appium driver list --installed

Step 3: Set up Android SDK

The Android SDK is required even for real device testing — the UIAutomator2 driver communicates with devices via ADB (Android Debug Bridge), which is part of the SDK platform-tools:

# Install Android Studio (includes SDK) or install SDK tools standalone
# Set environment variables (add to ~/.zshrc or ~/.bashrc).
# The path below is the macOS default; on Linux the SDK usually lives in $HOME/Android/Sdk.
export ANDROID_HOME=$HOME/Library/Android/sdk
export PATH=$PATH:$ANDROID_HOME/platform-tools
export PATH=$PATH:$ANDROID_HOME/emulator

# Verify ADB is working
adb version

# List connected devices (emulator or real device)
adb devices
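
If you want to check device availability from Python (for example in a pytest session fixture), the output of adb devices is easy to parse. The sketch below is an assumption about how you might do it, not part of Appium; parse_adb_devices and the sample serial R58M123ABC are made up for illustration.

```python
def parse_adb_devices(output: str) -> list[tuple[str, str]]:
    """Parse `adb devices` output into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip the header, blank lines, and "* daemon ..." status messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def ready_serials(output: str) -> list[str]:
    """Return only the serials whose state is 'device' (connected and authorised)."""
    return [serial for serial, state in parse_adb_devices(output) if state == "device"]


sample_output = """List of devices attached
emulator-5554\tdevice
R58M123ABC\tunauthorized

"""
print(ready_serials(sample_output))
```

In a real fixture you would feed this function the stdout of subprocess.run(["adb", "devices"], capture_output=True, text=True) and skip or fail early when no device is ready.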

Step 4: Install Python Appium Client

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install Appium Python client and test dependencies
pip install Appium-Python-Client
pip install pytest
pip install pytest-html

pip freeze > requirements.txt

Step 5: Start Appium Server

# Start Appium server with verbose logging (useful for debugging)
appium server --port 4723 --log-level debug

# For local development, relaxed security allows more flexibility
appium --relaxed-security --port 4723

# Verify server is running — open in browser:
# http://localhost:4723/status

The Appium Inspector application (available from the Appium Inspector GitHub releases page) provides a GUI for element discovery against a running server. The older Appium Desktop application, which bundled a server GUI with the Inspector, is deprecated and is not compatible with Appium 2.x. For CI environments, always use the command-line server started as a background process or Docker container.
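
The /status check can also be scripted, which is handy as a CI gate before the test run starts. This is a minimal sketch using only the standard library. The function names are hypothetical, and the sample payload reflects the W3C status shape (a value object with ready and message fields) with a placeholder version string.

```python
import json
import urllib.request


def fetch_status(base_url: str = "http://127.0.0.1:4723", timeout: float = 5.0) -> dict:
    """GET /status from the Appium server and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_server_ready(status: dict) -> bool:
    """True when the status payload reports that the server accepts new sessions."""
    return bool(status.get("value", {}).get("ready", False))


# Example payload shape; the version string is a placeholder, not real output.
sample_status = {
    "value": {
        "ready": True,
        "message": "The server is ready to accept new connections",
        "build": {"version": "2.x"},
    }
}
print(is_server_ready(sample_status))
```

With a server running locally, is_server_ready(fetch_status()) performs the same check as opening the URL in a browser.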

Step 6: Run the Doctor

Appium provides a diagnostic tool called appium-doctor (installed separately) that checks your environment for all required dependencies and configurations:

npm install -g @appium/doctor
appium-doctor --android  # Check Android setup
appium-doctor --ios      # Check iOS setup (macOS only)

AppiumOptions & Capabilities

Capabilities tell the Appium server what kind of session you want to create — which platform, which device, which app, and how to configure the session. In Appium 2.x, the modern approach is to use the typed Options classes (UiAutomator2Options, XCUITestOptions) rather than the deprecated DesiredCapabilities dictionary. The typed options classes provide IDE autocomplete, validation, and clear documentation of available settings.

Android Capabilities with UiAutomator2Options

from appium.options.android import UiAutomator2Options

options = UiAutomator2Options()

# --- Required capabilities ---
options.platform_name = "Android"
options.device_name = "emulator-5554"       # adb device serial or emulator name
options.automation_name = "UIAutomator2"    # Must be UIAutomator2 for Android

# --- App specification (choose ONE approach) ---
# Option A: Launch an installed app by package and activity
options.app_package = "com.example.myapp"
options.app_activity = ".MainActivity"
# Option B: Install and launch from an APK file path
# options.app = "/path/to/your/app.apk"

# --- Session management ---
options.no_reset = True           # Keep app data between sessions (faster)
options.full_reset = False        # Don't uninstall/reinstall app
options.new_command_timeout = 300 # Kill session if no command for 5 minutes

# --- Android-specific conveniences ---
options.auto_grant_permissions = True  # Auto-accept Android permission dialogs
options.unicode_keyboard = True        # Use Appium keyboard for special chars
options.reset_keyboard = True          # Restore original keyboard after test

# --- Appium server URL ---
APPIUM_SERVER = "http://127.0.0.1:4723"

iOS Capabilities with XCUITestOptions

from appium.options.ios import XCUITestOptions

options = XCUITestOptions()

# --- Required capabilities ---
options.platform_name = "iOS"
options.device_name = "iPhone 15"           # Simulator name or real device name
options.automation_name = "XCUITest"

# --- App specification ---
options.bundle_id = "com.example.myapp"     # For installed apps on real device
# options.app = "/path/to/MyApp.app"        # For simulator builds

# --- Code signing (required for real iOS devices) ---
options.xcode_org_id = "YOUR_APPLE_TEAM_ID"
options.xcode_signing_id = "iPhone Developer"

# --- Session management ---
options.no_reset = True
options.new_command_timeout = 300
options.launch_with_idb = False  # Use standard XCUITest launch (not idb)

# --- Simulator-specific ---
options.platform_version = "17.0"  # iOS version for simulator selection

One important distinction between Android and iOS capabilities: Android identifies devices by their ADB serial number (visible in adb devices output, typically emulator-5554 for emulators or an alphanumeric string for real devices). With UiAutomator2, deviceName is informational and does not select a device; when more than one device is attached, pass the serial in the udid capability (options.udid). iOS simulators are selected by device name and platform version, while XCUITest uses the UDID for real devices. You can find the UDID in Xcode's Devices and Simulators window or with xcrun xctrace list devices.

Your First Android Test

Let us write a complete Android test that opens the Sauce Demo mobile app, navigates to the login screen through the hamburger menu, enters credentials, and verifies a successful login. This test is production-quality — it uses explicit waits, clear variable names, proper setup and teardown, and a meaningful assertion.

import pytest
from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

APPIUM_SERVER = "http://127.0.0.1:4723"

class TestAndroidLogin:

    def setup_method(self):
        """Create a new Appium session before each test method."""
        options = UiAutomator2Options()
        options.platform_name = "Android"
        options.device_name = "emulator-5554"
        options.app_package = "com.saucelabs.mydemoapp.android"
        options.app_activity = ".MainActivity"
        options.automation_name = "UIAutomator2"
        options.no_reset = True
        options.auto_grant_permissions = True

        self.driver = webdriver.Remote(APPIUM_SERVER, options=options)
        self.wait = WebDriverWait(self.driver, 15)

    def test_login_with_valid_credentials(self):
        """A registered user should see their name after logging in."""

        # Tap the hamburger menu icon to open navigation drawer
        menu_btn = self.wait.until(
            EC.element_to_be_clickable(
                (AppiumBy.ACCESSIBILITY_ID, "open menu")
            )
        )
        menu_btn.click()

        # Tap 'Log In' from the navigation menu
        login_menu_item = self.wait.until(
            EC.element_to_be_clickable(
                (AppiumBy.XPATH,
                 "//android.widget.TextView[@text='Log In']")
            )
        )
        login_menu_item.click()

        # Enter username
        username_field = self.wait.until(
            EC.visibility_of_element_located(
                (AppiumBy.ACCESSIBILITY_ID, "Username input field")
            )
        )
        username_field.send_keys("bob@example.com")

        # Enter password
        self.driver.find_element(
            AppiumBy.ACCESSIBILITY_ID, "Password input field"
        ).send_keys("10203040")

        # Tap login button
        self.driver.find_element(
            AppiumBy.ACCESSIBILITY_ID, "Login button"
        ).click()

        # Verify login succeeded — welcome text contains the user's first name
        welcome_text = self.wait.until(
            EC.presence_of_element_located(
                (AppiumBy.XPATH,
                 "//android.widget.TextView[@text='Bob']")
            )
        )
        assert welcome_text.is_displayed(), (
            "Login failed — expected to see 'Bob' welcome message on the home screen"
        )

    def test_login_with_invalid_password_shows_error(self):
        """Wrong password should show an error alert."""
        menu_btn = self.wait.until(
            EC.element_to_be_clickable(
                (AppiumBy.ACCESSIBILITY_ID, "open menu")
            )
        )
        menu_btn.click()

        self.wait.until(
            EC.element_to_be_clickable(
                (AppiumBy.XPATH, "//android.widget.TextView[@text='Log In']")
            )
        ).click()

        self.wait.until(
            EC.visibility_of_element_located(
                (AppiumBy.ACCESSIBILITY_ID, "Username input field")
            )
        ).send_keys("bob@example.com")

        self.driver.find_element(
            AppiumBy.ACCESSIBILITY_ID, "Password input field"
        ).send_keys("wrongpassword")

        self.driver.find_element(
            AppiumBy.ACCESSIBILITY_ID, "Login button"
        ).click()

        # Verify error message appears
        error_msg = self.wait.until(
            EC.visibility_of_element_located(
                (AppiumBy.XPATH,
                 "//android.widget.TextView[contains(@text, 'Provided credentials')]")
            )
        )
        assert error_msg.is_displayed(), "Expected error message for invalid credentials"

    def teardown_method(self):
        """Close the Appium session after each test method."""
        if hasattr(self, "driver") and self.driver:
            self.driver.quit()

Note the use of AppiumBy rather than Selenium's By for mobile locators. AppiumBy extends Selenium's By and adds mobile-specific strategies like ACCESSIBILITY_ID, ANDROID_UIAUTOMATOR, IOS_PREDICATE, and IOS_CLASS_CHAIN. For strategies that exist in both (XPath, ID, CSS), either AppiumBy or By works interchangeably.

iOS Testing Basics

iOS automation with Appium follows the same pattern as Android but with crucial differences in locator strategies and some capability configurations. XCUITest — Apple's official UI testing framework — is more restrictive than UIAutomator2 in how it exposes the accessibility tree, which affects which locators you can reliably use.

from appium import webdriver
from appium.options.ios import XCUITestOptions
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

APPIUM_SERVER = "http://127.0.0.1:4723"

class TestiOSLogin:

    def setup_method(self):
        options = XCUITestOptions()
        options.platform_name = "iOS"
        options.device_name = "iPhone 15"
        options.platform_version = "17.0"
        options.bundle_id = "com.example.myapp"
        options.automation_name = "XCUITest"
        options.no_reset = True
        options.new_command_timeout = 300

        self.driver = webdriver.Remote(APPIUM_SERVER, options=options)
        self.wait = WebDriverWait(self.driver, 15)

    def test_login_ios(self):
        # Accessibility ID works on iOS the same as Android
        username_field = self.wait.until(
            EC.visibility_of_element_located(
                (AppiumBy.ACCESSIBILITY_ID, "usernameField")
            )
        )
        username_field.send_keys("testuser@example.com")

        # iOS Predicate String — powerful iOS-specific selector
        password_field = self.driver.find_element(
            AppiumBy.IOS_PREDICATE,
            "type == 'XCUIElementTypeSecureTextField' AND name == 'passwordField'"
        )
        password_field.send_keys("Password123")

        # iOS Class Chain — another powerful iOS-specific selector
        login_button = self.driver.find_element(
            AppiumBy.IOS_CLASS_CHAIN,
            "**/XCUIElementTypeButton[`name == 'loginButton'`]"
        )
        login_button.click()

        # Verify navigation to home screen
        home_header = self.wait.until(
            EC.presence_of_element_located(
                (AppiumBy.ACCESSIBILITY_ID, "homeTitle")
            )
        )
        assert home_header.is_displayed(), "Expected home screen after login"

    def teardown_method(self):
        if hasattr(self, "driver") and self.driver:
            self.driver.quit()

iOS predicate strings use Apple's NSPredicate syntax — the same syntax used in Core Data queries. They allow filtering by element type, name, label, value, and boolean properties with operators like AND, OR, CONTAINS, BEGINSWITH. iOS Class Chain is a lighter-weight alternative that traverses the accessibility hierarchy using a path syntax. Both are significantly more reliable than XPath on iOS, where XPath queries can be extraordinarily slow because XCUITest has to construct the full accessibility tree to evaluate them.
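
Because predicate strings are plain text, it is easy to break them with an unescaped quote in a label. One way to reduce that risk is to build them with small helpers. The functions below are hypothetical conveniences, not part of the Appium client, and the backslash escaping of single quotes is an assumption about NSPredicate string literals that is worth verifying against your own app.

```python
def quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and embedded single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def attr_equals(attribute: str, value: str) -> str:
    return f"{attribute} == {quote(value)}"


def attr_contains(attribute: str, value: str) -> str:
    return f"{attribute} CONTAINS {quote(value)}"


def attr_begins_with(attribute: str, value: str) -> str:
    return f"{attribute} BEGINSWITH {quote(value)}"


def all_of(*clauses: str) -> str:
    """Join clauses with AND."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Join clauses with OR, parenthesised so the result can be nested inside all_of."""
    return "(" + " OR ".join(clauses) + ")"


password_predicate = all_of(
    attr_equals("type", "XCUIElementTypeSecureTextField"),
    attr_equals("name", "passwordField"),
)
print(password_predicate)
```

The resulting string is passed unchanged as the second argument to find_element with AppiumBy.IOS_PREDICATE.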

Mobile Locator Strategies

Mobile locator strategies differ meaningfully from web locators because mobile apps do not have HTML, CSS, or JavaScript — they have native UI frameworks (Android Views / Jetpack Compose on Android, UIKit / SwiftUI on iOS). The accessibility tree, rather than the DOM, is the structure that Appium queries.

Strategy | Android | iOS | Reliability
ACCESSIBILITY_ID | contentDescription | accessibilityIdentifier | High: preferred for both platforms
ID | resource-id (com.pkg:id/name) | Not applicable | High: Android only
XPATH | Both platforms | Both platforms | Low: slow and brittle; avoid
CLASS_NAME | android.widget.Button | XCUIElementTypeButton | Medium: useful when few elements of that type
ANDROID_UIAUTOMATOR | UiSelector / UiScrollable | Not applicable | High: powerful Android-specific selector
IOS_PREDICATE | Not applicable | NSPredicate string | High: fast and powerful iOS selector
IOS_CLASS_CHAIN | Not applicable | Class hierarchy path | High: fastest iOS selector for deep hierarchies

ACCESSIBILITY_ID should be your first choice for both platforms. On Android, it maps to the contentDescription attribute of a View. On iOS, it maps to the accessibilityIdentifier property, which XCUITest exposes as the element's name attribute; when no identifier is set, name falls back to the accessibilityLabel. When your development team sets meaningful content descriptions and accessibility identifiers, your test locators stay stable across UI redesigns, and the same labelling work benefits screen reader users.

The UiAutomator2 UiSelector API is particularly powerful for Android, and the related UiScrollable class supports scrollable container traversal, which is essential for finding elements that are not currently on screen:

# By resource-id (most stable Android selector after contentDescription)
element = self.driver.find_element(
    AppiumBy.ID, "com.example.myapp:id/loginButton"
)

# By contentDescription (ACCESSIBILITY_ID)
element = self.driver.find_element(
    AppiumBy.ACCESSIBILITY_ID, "Login button"
)

# UIAutomator2 UISelector — very powerful, Android only
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiSelector().resourceId("com.example.myapp:id/usernameInput").instance(0)'
)

# UIAutomator2 — find by text (case-sensitive)
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiSelector().text("Sign In")'
)

# UIAutomator2 — find by text containing string
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiSelector().textContains("Sign")'
)

# UIScrollable — scroll to find an element not on screen
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiScrollable(new UiSelector().scrollable(true))'
    '.scrollIntoView(new UiSelector().text("Privacy Settings"))'
)
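
UiSelector expressions are Java snippets passed as strings, so a stray double quote in the target text breaks the selector. A small builder keeps the quoting in one place. This is a sketch under that assumption; java_string, ui_selector_text, and scroll_into_view are hypothetical helper names, not Appium APIs.

```python
def java_string(value: str) -> str:
    """Render a Python string as a double-quoted Java string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def ui_selector_text(text: str) -> str:
    """UiSelector matching an element by its exact (case-sensitive) text."""
    return f"new UiSelector().text({java_string(text)})"


def ui_selector_resource_id(resource_id: str) -> str:
    """UiSelector matching an element by its full resource-id."""
    return f"new UiSelector().resourceId({java_string(resource_id)})"


def scroll_into_view(target_selector: str) -> str:
    """UiScrollable expression that scrolls the first scrollable container to the target."""
    return (
        "new UiScrollable(new UiSelector().scrollable(true))"
        f".scrollIntoView({target_selector})"
    )


expression = scroll_into_view(ui_selector_text("Privacy Settings"))
print(expression)
```

The string produced here is the same expression as the hand-written UiScrollable example above and is passed to find_element with AppiumBy.ANDROID_UIAUTOMATOR.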

XPath on mobile is the locator of last resort. Unlike web XPath which evaluates against the HTML DOM (which browsers optimise heavily), mobile XPath evaluates against a serialised XML representation of the accessibility tree that Appium constructs on demand. On a complex screen with 200+ elements, this can take 5-10 seconds per XPath query. Use XPath only when nothing else works — specifically for text matching when ACCESSIBILITY_ID is not available and UISelector text matching does not fit the pattern.
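
Before rewriting locators, it helps to measure where the time actually goes. The wrapper below is a minimal sketch of one way to do that: it delegates to any object with a find_element(by, value) method and accumulates wall-clock time per locator strategy. LookupTimer is a hypothetical name, and the injectable clock parameter exists only to make the class easy to test.

```python
import time
from collections import defaultdict


class LookupTimer:
    """Accumulate time spent in find_element calls, grouped by locator strategy."""

    def __init__(self, driver, clock=time.perf_counter):
        self._driver = driver
        self._clock = clock
        self.totals = defaultdict(float)  # strategy -> total seconds
        self.counts = defaultdict(int)    # strategy -> number of lookups

    def find_element(self, by, value):
        start = self._clock()
        try:
            return self._driver.find_element(by, value)
        finally:
            # Record the time even when the lookup raises NoSuchElementException.
            self.totals[by] += self._clock() - start
            self.counts[by] += 1

    def report(self):
        """Return (strategy, total_seconds) pairs, slowest strategy first."""
        return sorted(self.totals.items(), key=lambda item: item[1], reverse=True)
```

In a test you would wrap the real driver once (timer = LookupTimer(driver)), route lookups through timer.find_element, and print timer.report() at the end of the run to see which strategies dominate.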

From the Field — Virtusa: When I joined the mobile automation project at Virtusa, the existing Appium scripts relied almost entirely on XPath. A test suite of 80 tests was taking over 40 minutes to execute — on a powerful MacBook Pro with a connected real Samsung device. After profiling Appium Server logs, I found that XPath element lookups were consuming over 60% of execution time. Converting the locators to ACCESSIBILITY_ID and ANDROID_UIAUTOMATOR UISelector reduced the suite runtime to under 18 minutes. The lesson: mobile locator strategy choice has a much larger performance impact than on web, because accessibility tree serialisation is computationally expensive compared to DOM querying.

Gestures — Swipe, Scroll, Tap

Mobile interactions go beyond clicking and typing — users swipe to navigate, scroll through lists, long-press for context menus, and pinch to zoom. Appium supports all these gestures through the W3C Actions API, which replaced the deprecated TouchAction API in Appium 2.x. The W3C Actions API models touch input as a pointer device with move, down, and up events, combined with optional pauses for timing control.

from selenium.webdriver.common.actions import interaction
from selenium.webdriver.common.actions.action_builder import ActionBuilder
from selenium.webdriver.common.actions.pointer_input import PointerInput


def swipe_up(driver, start_x_pct=0.5, start_y_pct=0.8, end_y_pct=0.2, duration_ms=600):
    """Swipe from bottom to top to scroll content upward."""
    size = driver.get_window_size()
    start_x = int(size['width'] * start_x_pct)
    start_y = int(size['height'] * start_y_pct)
    end_y   = int(size['height'] * end_y_pct)

    touch = PointerInput(interaction.POINTER_TOUCH, "touch")
    # duration (ms) is how long each pointer move takes, so it sets the swipe speed.
    # Holding the pointer still before moving can register as a long press instead.
    actions = ActionBuilder(driver, mouse=touch, duration=duration_ms)
    actions.pointer_action.move_to_location(start_x, start_y)
    actions.pointer_action.pointer_down()
    actions.pointer_action.move_to_location(start_x, end_y)
    actions.pointer_action.pointer_up()
    actions.perform()


def swipe_down(driver, start_x_pct=0.5, start_y_pct=0.2, end_y_pct=0.8, duration_ms=600):
    """Swipe from top to bottom to pull-to-refresh or navigate back."""
    size = driver.get_window_size()
    start_x = int(size['width'] * start_x_pct)
    start_y = int(size['height'] * start_y_pct)
    end_y   = int(size['height'] * end_y_pct)

    touch = PointerInput(interaction.POINTER_TOUCH, "touch")
    # duration (ms) is how long each pointer move takes, so it sets the swipe speed.
    # Holding the pointer still before moving can register as a long press instead.
    actions = ActionBuilder(driver, mouse=touch, duration=duration_ms)
    actions.pointer_action.move_to_location(start_x, start_y)
    actions.pointer_action.pointer_down()
    actions.pointer_action.move_to_location(start_x, end_y)
    actions.pointer_action.pointer_up()
    actions.perform()


def swipe_left(driver, start_y_pct=0.5, start_x_pct=0.8, end_x_pct=0.2):
    """Swipe left to navigate to next item in a carousel or dismiss."""
    size = driver.get_window_size()
    y = int(size['height'] * start_y_pct)
    start_x = int(size['width'] * start_x_pct)
    end_x   = int(size['width'] * end_x_pct)

    touch = PointerInput(interaction.POINTER_TOUCH, "touch")
    actions = ActionBuilder(driver, mouse=touch)
    actions.pointer_action.move_to_location(start_x, y)
    actions.pointer_action.pointer_down()
    actions.pointer_action.move_to_location(end_x, y)
    actions.pointer_action.pointer_up()
    actions.perform()


def tap(driver, x, y):
    """Tap at absolute screen coordinates."""
    touch = PointerInput(interaction.POINTER_TOUCH, "touch")
    actions = ActionBuilder(driver, mouse=touch)
    actions.pointer_action.move_to_location(x, y)
    actions.pointer_action.pointer_down()
    actions.pointer_action.pointer_up()
    actions.perform()


def long_press(driver, element, duration_ms=1500):
    """Long press on an element to trigger context menu or selection mode."""
    loc_x = element.location['x'] + element.size['width'] // 2
    loc_y = element.location['y'] + element.size['height'] // 2

    touch = PointerInput(interaction.POINTER_TOUCH, "touch")
    actions = ActionBuilder(driver, mouse=touch)
    actions.pointer_action.move_to_location(loc_x, loc_y)
    actions.pointer_action.pointer_down()
    actions.pointer_action.pause(duration_ms / 1000)
    actions.pointer_action.pointer_up()
    actions.perform()
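
Under the hood, helpers like these are serialised into a W3C actions payload and posted to the session's /actions endpoint. The sketch below builds that JSON structure by hand for a single-finger swipe, purely to show the pointer model (move, down, move, up). The function name and the 1080x2400 screen size are assumptions for illustration; in real tests the client library generates the payload for you.

```python
import json


def swipe_actions_payload(start, end, duration_ms=600, pointer_id="touch"):
    """Build a W3C actions payload for a one-finger swipe between two viewport points."""
    start_x, start_y = start
    end_x, end_y = end
    return {
        "actions": [{
            "type": "pointer",
            "id": pointer_id,
            "parameters": {"pointerType": "touch"},
            "actions": [
                # Position the finger without touching the screen.
                {"type": "pointerMove", "duration": 0, "x": start_x, "y": start_y, "origin": "viewport"},
                {"type": "pointerDown", "button": 0},
                # The move duration controls how fast the swipe is.
                {"type": "pointerMove", "duration": duration_ms, "x": end_x, "y": end_y, "origin": "viewport"},
                {"type": "pointerUp", "button": 0},
            ],
        }]
    }


# Swipe up on a hypothetical 1080x2400 screen: from 80% to 20% of the height.
width, height = 1080, 2400
swipe_payload = swipe_actions_payload(
    start=(int(width * 0.5), int(height * 0.8)),
    end=(int(width * 0.5), int(height * 0.2)),
)
print(json.dumps(swipe_payload, indent=2))
```

Multi-touch gestures such as pinch follow the same shape, with one pointer entry per finger in the outer actions list.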

For Android, UIAutomator2 provides a higher-level scroll-to-element API that is more reliable than coordinate-based swiping, because it adapts to the actual scroll distance needed rather than swiping a fixed amount and hoping the element is now visible:

# UIScrollable: scroll until a specific text is visible (Android only)
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiScrollable(new UiSelector().scrollable(true))'
    '.scrollIntoView(new UiSelector().text("Privacy & Security"))'
)
element.click()

# Scroll to an element by its resource-id
element = self.driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiScrollable(new UiSelector().scrollable(true)).scrollIntoView('
    'new UiSelector().resourceId("com.example.app:id/settingsButton"))'
)

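# "mobile: scroll" with these arguments is the XCUITest (iOS) form;
# on Android, UiAutomator2 provides "mobile: scrollGesture" instead.
# element_to_scroll_inside is a scrollable container located earlier in the test.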
self.driver.execute_script("mobile: scroll", {
    "direction": "down",
    "element": element_to_scroll_inside.id
})

pytest Integration

Structuring Appium tests with pytest fixtures gives you the same benefits as with Selenium — clean setup/teardown, fixture sharing, and parametrize for data-driven scenarios. The key decision is the fixture scope. Because starting an Appium session (launching the app, connecting to the device, starting UIAutomator2 server) takes 10-30 seconds, function-scoped sessions are very expensive. Class-scoped fixtures with per-test state reset are the practical compromise.
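
A quick back-of-the-envelope calculation shows why scope matters. The numbers below are hypothetical (80 tests in 8 classes, 20 seconds per session start, which is the midpoint of the 10-30 second range above); substitute your own suite's figures.

```python
def session_startup_overhead(session_count: int, startup_seconds: float) -> float:
    """Total time spent only on creating Appium sessions."""
    return session_count * startup_seconds


# Hypothetical suite: adjust these to match your project.
tests, classes, startup = 80, 8, 20.0

function_scoped = session_startup_overhead(tests, startup)   # one session per test
class_scoped = session_startup_overhead(classes, startup)    # one session per class

print(f"function-scoped: {function_scoped:.0f}s, class-scoped: {class_scoped:.0f}s")
```

Under these assumptions, function scope spends about 27 minutes on session creation alone, while class scope spends under 3.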

# conftest.py
import pytest
from appium import webdriver
from appium.options.android import UiAutomator2Options

APPIUM_SERVER = "http://127.0.0.1:4723"

@pytest.fixture(scope="class")
def driver():
    """
    Class-scoped driver: Appium session starts once per test class.
    Individual tests are responsible for resetting app state via reset_app_state fixture.
    """
    options = UiAutomator2Options()
    options.platform_name = "Android"
    options.device_name = "emulator-5554"
    options.app_package = "com.example.app"
    options.app_activity = ".MainActivity"
    options.automation_name = "UIAutomator2"
    options.no_reset = True
    options.auto_grant_permissions = True
    options.new_command_timeout = 300

    driver = webdriver.Remote(APPIUM_SERVER, options=options)
    driver.implicitly_wait(0)  # Always disable implicit waits
    yield driver
    driver.quit()


@pytest.fixture(autouse=True)
def reset_app_state(driver):
    """
    Auto-use fixture: runs before and after every test method.
    Terminates the app after each test to ensure a clean starting state.
    """
    # Ensure app is in foreground before test
    driver.activate_app("com.example.app")
    yield
    # Terminate app after test to reset state
    try:
        driver.terminate_app("com.example.app")
    except Exception:
        pass  # App may have already been terminated by the test


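@pytest.hookimpl(tryfirst=True, hookwrapper=True)
def pytest_runtest_makereport(item, call):
    """Store each phase's report on the test item (rep_setup, rep_call, rep_teardown)."""
    outcome = yield
    report = outcome.get_result()
    setattr(item, f"rep_{report.when}", report)


@pytest.fixture(autouse=True)
def take_screenshot_on_fail(driver, request, reset_app_state):
    """
    Function-scoped fixture for failure screenshots.
    Depends on reset_app_state so its teardown runs first, while the app is still open.
    """
    yield
    report = getattr(request.node, "rep_call", None)
    if report is not None and report.failed:
        import os
        import time
        os.makedirs("screenshots", exist_ok=True)
        filename = f"screenshots/{request.node.name}_{int(time.time())}.png"
        driver.save_screenshot(filename)
        print(f"\nScreenshot: {filename}")

# tests/test_login.py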
import pytest
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

@pytest.mark.usefixtures("driver")
class TestLogin:

    def test_valid_login(self, driver):
        wait = WebDriverWait(driver, 15)
        wait.until(
            EC.element_to_be_clickable((AppiumBy.ACCESSIBILITY_ID, "open menu"))
        ).click()
        wait.until(
            EC.element_to_be_clickable(
                (AppiumBy.XPATH, "//android.widget.TextView[@text='Log In']")
            )
        ).click()
        wait.until(
            EC.visibility_of_element_located(
                (AppiumBy.ACCESSIBILITY_ID, "Username input field")
            )
        ).send_keys("bob@example.com")
        driver.find_element(
            AppiumBy.ACCESSIBILITY_ID, "Password input field"
        ).send_keys("10203040")
        driver.find_element(AppiumBy.ACCESSIBILITY_ID, "Login button").click()
        welcome = wait.until(
            EC.presence_of_element_located(
                (AppiumBy.XPATH, "//android.widget.TextView[@text='Bob']")
            )
        )
        assert welcome.is_displayed()

Run the tests exactly as you would any pytest suite:

# Run all mobile tests
pytest tests/ -v

# Run with HTML report
pytest tests/ --html=reports/mobile_report.html --self-contained-html

# Run specific class
pytest tests/test_login.py::TestLogin -v

Real Device vs Emulator

The choice between testing on real devices and emulators/simulators is not binary — in practice, most mature mobile testing strategies use both, for different purposes at different stages of the development cycle. Understanding the trade-offs of each environment helps you make an informed decision about when to use which.

Aspect | Real Device | Emulator / Simulator
Performance accuracy | Accurate (real hardware) | May be faster or slower than target device
Cost | High (device purchase + maintenance or cloud fees) | Free (local) or low-cost (cloud simulators)
Network testing | Real cellular, WiFi, carrier-specific behaviour | Simulated; cannot test carrier features
Camera / GPS / sensors | Real hardware sensors | Simulated (limited accuracy)
Battery testing | Yes, real battery drain data | Limited to emulated battery state
Crash reproduction | Exact reproduction of real user crashes | May not reproduce hardware-specific crashes
CI/CD integration | Complex (USB hubs, device farms) | Simple; AVD and Simulator start headlessly
Best used for | Release testing, exploratory testing, performance | Development, CI/CD regression, fast iteration

For CI/CD pipelines, Android emulators (AVDs, or Android Virtual Devices) started via the Android Emulator command-line tool are the most practical option. The reactivecircus/android-emulator-runner GitHub Action makes it straightforward to start an AVD, run tests, and shut it down within a workflow. For iOS, simulators run only on macOS. They cannot run on Linux or Windows GitHub Actions runners, which means iOS CI requires macOS runners (GitHub-hosted or self-hosted) or a cloud CI service such as Bitrise or Xcode Cloud.

# List all connected devices and running emulators
adb devices

# Get the model name of a connected device
adb -s SERIAL shell getprop ro.product.model

# Get Android version of a connected device
adb -s SERIAL shell getprop ro.build.version.release

# Start a specific AVD from the command line
$ANDROID_HOME/emulator/emulator -avd "Pixel_8_API_34" -no-window -no-audio &

# Wait for the AVD to come online, then check whether boot has finished
adb wait-for-device
adb shell getprop sys.boot_completed  # Prints "1" once fully booted; repeat until it does
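A single getprop call only reports the boot state at one moment, so CI scripts usually poll it until it reads "1". The following is a minimal Python sketch of that loop, assuming adb is on the PATH. The helper name wait_for_boot and its parameters are illustrative and not part of Appium or the Android SDK.

```python
import subprocess
import time


def wait_for_boot(serial=None, timeout=300.0, poll_interval=2.0, run=subprocess.run):
    """Poll sys.boot_completed until it reads "1" or the timeout expires.

    `run` is injectable so the loop can be exercised without a device.
    Returns True if the device finished booting, False on timeout.
    """
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]  # target one device when several are attached
    cmd += ["shell", "getprop", "sys.boot_completed"]

    deadline = time.monotonic() + timeout
    while True:
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip() == "1":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Calling wait_for_boot("emulator-5554") after adb wait-for-device and before starting the Appium session avoids sessions that fail because the launcher is not ready yet. The serial shown is just an example value.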

Cloud device farms — BrowserStack, Sauce Labs, LambdaTest — offer real device access in the cloud. You connect your Appium scripts to their Appium server endpoint instead of your local server, pass cloud-specific capabilities, and your tests run on real physical devices in their data centres. This is the recommended approach for pre-release testing across multiple device/OS version combinations that would be impractical to maintain locally.
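Switching from a local server to a cloud farm mostly comes down to a different server URL plus one provider-specific capability block. The sketch below shows that shape only. The hub URL, the vendor_key value, and the fields inside vendor_options are placeholders, so take the real ones from your provider's documentation. The function names are hypothetical.

```python
def build_cloud_session(hub_url, app_id, device_name, platform_version,
                        vendor_key, vendor_options):
    """Return (server_url, capabilities) for a remote Appium session.

    `vendor_key` is the provider's namespaced capability block. It is a
    placeholder here, not a real provider key.
    """
    capabilities = {
        "platformName": "Android",
        "appium:automationName": "UiAutomator2",
        "appium:deviceName": device_name,
        "appium:platformVersion": platform_version,
        "appium:app": app_id,
        vendor_key: dict(vendor_options),  # copy so callers can reuse their dict
    }
    return hub_url, capabilities


def start_remote_driver(server_url, capabilities):
    """Open the session against the remote endpoint.

    Appium is imported lazily so the builder above works without it installed.
    """
    from appium import webdriver
    from appium.options.common.base import AppiumOptions

    options = AppiumOptions().load_capabilities(capabilities)
    return webdriver.Remote(server_url, options=options)
```

Keeping the capability builder separate from the driver factory lets the same tests run locally or in the cloud by swapping only the URL and the vendor block, for example from environment variables set in CI.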

Using Appium Inspector

Appium Inspector is a GUI tool for exploring the accessibility tree of your mobile app — the mobile equivalent of a browser's DevTools Elements panel. It connects to a running Appium server, launches a session with your specified capabilities, and renders a visual representation of the app alongside the accessibility tree. You click any element to see all its attributes: resource-id, contentDescription, class, bounds, text, enabled, and more.

Getting Started with Appium Inspector

Download the latest release from the Appium Inspector GitHub releases page. It is available as a standalone app for macOS, Windows, and Linux. After installing, the setup is straightforward:

1. Start your Appium server: appium server --port 4723

2. Open Appium Inspector and configure the server host as 127.0.0.1 and port as 4723.

3. In the "Desired Capabilities" panel (or use the JSON editor), enter your capabilities — at minimum: platformName, deviceName, appPackage, appActivity, and automationName.

4. Click "Start Session". Appium Inspector launches the app on your device or emulator and displays a live screenshot alongside the element tree.

5. Click any element in the screenshot or the tree. The right panel shows all attributes. The "Locator Suggestion" area generates a recommended locator based on the element's attributes — typically preferring ACCESSIBILITY_ID or ID when available.

6. Use the "Search for element" feature to test a locator before writing it into your test code. Enter your strategy and value, and Inspector highlights which elements match — ensuring your locator is both correct and unique before you ever run a test.
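The find-then-confirm check from step 6 can also be enforced in test code, so a locator that starts matching several elements fails loudly instead of silently picking the first one. This is a minimal sketch. assert_unique is a hypothetical helper, and it relies only on the driver's find_elements method.

```python
def assert_unique(driver, by, value):
    """Return the single element matching the locator, or fail.

    Raises AssertionError when the locator matches zero or several elements.
    """
    matches = driver.find_elements(by, value)
    if len(matches) != 1:
        raise AssertionError(
            f"Expected exactly 1 match for ({by!r}, {value!r}), found {len(matches)}"
        )
    return matches[0]
```

With the Appium Python Client this would be called as assert_unique(driver, AppiumBy.ACCESSIBILITY_ID, "submitButton"). The strategy constant is a plain string ("accessibility id"), so the helper needs no Appium import of its own.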

From the Field (Virtusa Mobile Testing): In my mobile testing work at Virtusa, Appium Inspector saved hours of locator hunting every sprint. The workflow I standardised: open Inspector, navigate to the screen you need to automate, click the element, copy the resource-id or contentDescription from the Attributes panel, then immediately use the Search feature to confirm it finds exactly one element. This two-step verification (find, then confirm uniqueness) prevents the most common mobile locator bugs before they ever make it into test code. Without Inspector, you are essentially guessing locators from APK source code or from adb shell uiautomator dump output, which is far slower and more error-prone.

Preventing Flaky Mobile Tests

Mobile tests are more prone to flakiness than web tests for several reasons: apps perform more complex asynchronous operations, OS-level dialogs appear unexpectedly, animations cause timing issues, and hardware sensors introduce variability. Understanding the common causes and their solutions is essential for building a stable mobile automation suite.

Timing Issues — Always Use Explicit Waits

The most common cause of mobile test flakiness is interacting with elements before the app has finished rendering them. Always use explicit waits:

from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 15)

# Wait for element visibility before interacting
element = wait.until(
    EC.visibility_of_element_located((AppiumBy.ACCESSIBILITY_ID, "submitButton"))
)
element.click()

Keyboard Covering Elements

When a text input has focus, the software keyboard slides up and may cover other elements. Always dismiss the keyboard before interacting with elements below it:

# Hide software keyboard (works on both Android and iOS)
driver.hide_keyboard()

# On Android, you can also press the back button to dismiss keyboard
driver.press_keycode(4)  # Android keycode for Back button
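Depending on the platform and driver version, calling hide_keyboard() when no keyboard is on screen may raise an error. On Android, pressing Back in that situation navigates away from the screen. A small guard avoids both problems. The sketch below assumes the driver exposes is_keyboard_shown() and hide_keyboard(), as the Appium Python Client does. dismiss_keyboard is a hypothetical helper name.

```python
def dismiss_keyboard(driver):
    """Hide the software keyboard only if it is currently shown.

    Returns True if a hide was attempted, False if nothing was needed.
    """
    if not driver.is_keyboard_shown():
        return False
    driver.hide_keyboard()
    return True
```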

Permission Dialogs

Android apps frequently show permission dialogs (camera, location, notifications). Unexpected dialogs intercept taps and cause failures. Set the autoGrantPermissions capability (auto_grant_permissions=True on the Python client's UiAutomator2Options) to let Appium grant the permissions declared in the app manifest up front, so the dialogs never appear. For more granular control, run adb shell pm grant commands in your setup to grant specific permissions directly without dialogs.
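The granular approach can be scripted from Python test setup. The sketch below runs adb shell pm grant for each permission, assuming adb is on the PATH and the app is already installed. The helper name grant_permissions and the package name in the usage note are illustrative.

```python
import subprocess


def grant_permissions(package, permissions, serial=None, run=subprocess.run):
    """Grant runtime permissions with `adb shell pm grant` before the session starts.

    `run` is injectable for testing. Returns the list of commands executed.
    """
    base = ["adb"] + (["-s", serial] if serial else [])
    commands = []
    for permission in permissions:
        cmd = base + ["shell", "pm", "grant", package, permission]
        run(cmd, check=True)  # raises CalledProcessError if adb reports failure
        commands.append(cmd)
    return commands
```

A typical call would be grant_permissions("com.example.app", ["android.permission.CAMERA", "android.permission.ACCESS_FINE_LOCATION"]). Note that pm grant only works for runtime permissions that the app declares in its manifest.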

App Not Fully Launched

Even after Appium starts the session and returns control to your script, the app may still be loading its initial state — fetching data, checking authentication, playing a splash animation. Always wait for a specific element that indicates the app is ready before running the first test action:

# Wait for the main navigation element that appears only when app is fully ready
wait = WebDriverWait(driver, 30)  # Generous timeout for app launch
wait.until(
    EC.presence_of_element_located(
        (AppiumBy.ACCESSIBILITY_ID, "bottomNavigationBar")
    )
)

Retry on Flaky Tests

For tests that are intermittently flaky despite correct waits — often due to device-level variability — the pytest-rerunfailures plugin provides automatic retry functionality:

# Install the retry plugin
pip install pytest-rerunfailures

# Run tests with automatic retry on failure
pytest tests/ --reruns 2 --reruns-delay 3

# Retry only specific tests marked as flaky (in the test module)
import pytest

@pytest.mark.flaky(reruns=3, reruns_delay=5)
def test_camera_permission_flow(driver):
    ...

Best Practices

Mobile automation presents unique challenges that web automation does not — hardware variability, OS-level interruptions, animation timing, and the constraint of working with a single shared device in many CI setups. These ten practices are drawn from real experience building and maintaining production Appium suites.

1. Use ACCESSIBILITY_ID as your primary locator strategy. It maps to contentDescription on Android and accessibilityLabel on iOS — both properties that developers set anyway for screen reader support. When these are meaningful and unique, your tests are the most stable they can be.

2. Always use explicit waits; never use time.sleep() or implicit waits. Mobile operations are inherently asynchronous. A tap that triggers a network request may complete in 50 ms or take 3 seconds depending on network conditions. Explicit waits adapt; fixed sleeps do not.

3. Test on multiple Android API versions. API 28 (Android 9), API 30 (Android 11), API 33 (Android 13) all have different permission models, UI behaviour changes, and gesture navigation differences. A test that passes on API 33 may fail on API 28 due to permission dialog differences. Maintain at least two AVD configurations with different API levels.

4. Test both portrait and landscape orientations if the app supports rotation. Many layout bugs appear only in landscape mode, and Appium makes it trivial to rotate: driver.orientation = "LANDSCAPE". Add an orientation test at the end of critical flows.

5. Handle system-level interruptions explicitly. Battery low dialogs, incoming calls (on real devices), "Your app stopped" crash dialogs, and notification shade can appear during tests. Design your test framework to detect and dismiss these when they appear unexpectedly, rather than failing cryptically when the next interaction cannot find its target element.

6. Test on real devices before release, not just emulators. Emulators are excellent for development-time testing and CI regression, but real device hardware — GPU variations, memory constraints, touch sensor calibration differences — can reveal bugs that emulators mask. Run at least a smoke suite on a physical device before each release.

7. Use noReset=true (no_reset=True in the Python client options) during development, and leave fullReset off, to speed up test execution. Reinstalling the app for every test adds 30-60 seconds per session. During development, keeping the installed app and data between sessions is fine. Save full resets for the nightly full regression where clean state matters.

8. Assert on the actual business outcome, not just that a click happened. element.click() succeeding does not mean the action worked. After clicking "Place Order", verify the order confirmation number appears. After tapping "Send", verify the message appears in the conversation. Superficial assertions produce tests that pass even when the app is broken.

9. Log device info in test output for easier failure diagnosis. At the start of each test session, log the device model, OS version, app version, and Appium server version. When a failure report comes in from CI, knowing exactly which combination failed eliminates a lot of guesswork. The helper below logs the fields available from the session capabilities:

def log_device_info(driver):
    """Log device and app info at the start of a test session."""
    info = driver.capabilities
    print(f"Platform: {info.get('platformName')} {info.get('platformVersion')}")
    print(f"Device: {info.get('deviceName')}")
    print(f"App: {info.get('app') or info.get('appPackage')}")
    print(f"Automation: {info.get('automationName')}")

10. Use BrowserStack or Sauce Labs for cross-device coverage. Cloud device farms let you run the same test on 10 different device/OS combinations in the time it would take to run on one local device. For pre-release testing, this coverage is invaluable — your app may work perfectly on a Pixel 8 running Android 14 and crash on a Samsung Galaxy A13 running Android 12, and you will not know until real users report it unless you test on representative devices.
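Practice 5 above (handling system-level interruptions) can be sketched as a small sweep that runs before a step or after a failed lookup. The locator list is an example only: the crash-dialog id shown should be confirmed in Appium Inspector on your target devices, and dismiss_interruptions is a hypothetical helper.

```python
# (strategy, value) pairs for dialogs worth dismissing. Example values only;
# verify the real ids for your devices and OS versions in Appium Inspector.
INTERRUPTION_LOCATORS = [
    ("id", "android:id/aerr_close"),  # assumed id of the crash dialog's close button
]


def dismiss_interruptions(driver, locators=INTERRUPTION_LOCATORS):
    """Tap the first visible element for each known interruption locator.

    Returns the list of locators that were dismissed, which is useful to log.
    """
    dismissed = []
    for by, value in locators:
        for element in driver.find_elements(by, value):
            if element.is_displayed():
                element.click()
                dismissed.append((by, value))
                break  # one tap per dialog type is enough
    return dismissed
```

Logging the returned list makes it visible in CI output when a test only passed because a dialog was swept away, which is worth knowing even when the run is green.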

From Experience — Virtusa: Leading a team of 270 testers at Virtusa, we standardised on Appium for real Android device testing and Selenium WebDriver for web regression. The biggest challenge wasn't the tooling — it was consistency across a team that size. We enforced a strict Page Object Model convention and a pre-merge locator review checklist. Within two sprints, flaky test rates dropped significantly and the team achieved a 20% efficiency gain across regression cycles. At that scale, test architecture decisions matter far more than individual test quality.