How QA Fits in Agile
Traditional software development placed QA at the end of the process. Development built the software for weeks or months, and QA received it just before the release deadline to find as many defects as possible before shipping. This model treated quality as a filter — something applied at the exit to catch what development missed.
Agile inverts this relationship. In a well-functioning Agile team, QA is not a phase that happens after development; it is a mindset that runs throughout the entire sprint. The QA engineer participates from the moment a story is conceived, challenges requirements for testability before a single line of code is written, tests alongside development as features are built, and contributes to retrospectives to improve the process for the next sprint.
This shift from gatekeeper to quality owner is the most important conceptual change for QA engineers moving from Waterfall to Agile. The gatekeeper model gives QA power through approval authority. The quality owner model gives QA influence through early collaboration. The quality owner model is far more effective — and far more demanding. It requires QA engineers to have strong communication skills, domain knowledge, and the ability to work at the speed of the sprint.
The continuous testing mindset means testing never stops. Between sprint ceremonies, during development, in pull request review, in CI pipelines, in exploratory sessions — quality verification is constant rather than periodic. This requires automation for repetitive regression, manual exploratory sessions for new functionality, and API testing woven into the development cycle itself.
Scrum Cycle — Architecture Diagram
The Scrum cycle is typically a two-week loop, though Scrum allows sprints of up to one month. QA participates actively in every phase:
The critical detail in this diagram is "Dev + QA parallel." In Agile, QA does not wait for all stories to be complete before testing begins. As soon as a story is marked ready for testing, QA picks it up — often while development is still working on the next story. This parallel workflow compresses the testing timeline and surfaces bugs while the developer still has the context to fix them quickly.
Scrum Ceremonies for QA
Sprint Planning
Sprint Planning is where the team selects stories from the backlog and commits to completing them in the upcoming sprint. QA's specific contributions in Sprint Planning:
- Effort estimation input: QA estimates testing effort alongside development. A story that looks like 3 points of development work might be 5 points total when testing complexity is included. Complex integration scenarios, test data setup, and environment dependencies all add testing time that is invisible unless QA speaks up in planning.
- Testability risk flagging: "This story requires third-party API access that is not available in our test environment" — surfacing this in planning prevents a sprint-end surprise where a story cannot be verified.
- Dependency identification: Which stories must be completed before others can be tested? QA sees integration dependencies that may not be obvious from a development perspective.
Backlog Refinement
Refinement is where QA delivers the most upstream value. In a refinement session, QA challenges stories for testability — the property that makes a story verifiable in a time-constrained sprint environment.
- Challenging requirements: "This story says the page should load quickly. What is the measurable definition of quickly? Under 2 seconds on a 4G connection? Under 5 seconds on 3G?" Vague requirements cannot be tested.
- Identifying missing acceptance criteria: "What happens if the user's session expires mid-checkout? What should the system show if the payment gateway is down?" These edge cases need to be in the story's acceptance criteria before development begins.
- Raising testability issues: "This story requires a user with admin + read-only permissions simultaneously. That combination does not exist in our test data. We need to resolve that before we can accept this into a sprint."
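The "what is the measurable definition of quickly?" challenge above can be partly automated as a first-pass check during refinement. A minimal sketch, assuming a hand-picked list of vague terms; the term list, the function name `find_vague_terms`, and the sample criteria are illustrative, not a standard:

```python
import re

# Hypothetical starter list; a team would tune this to its own backlog.
VAGUE_TERMS = ("quickly", "fast", "user-friendly", "easy", "intuitive", "robust")


def find_vague_terms(criterion: str) -> list[str]:
    """Return the vague terms that appear as whole words in one acceptance criterion."""
    lowered = criterion.lower()
    return [
        term for term in VAGUE_TERMS
        # Lookarounds stop "fast" from matching inside "faster" or "breakfast".
        if re.search(rf"(?<![\w-]){re.escape(term)}(?![\w-])", lowered)
    ]


criteria = [
    "The page should load quickly.",
    "The page loads in under 2 seconds on a 4G connection.",
]
for text in criteria:
    hits = find_vague_terms(text)
    status = f"needs a measurable definition: {hits}" if hits else "no vague terms found"
    print(f"{text!r} -> {status}")
```

A check like this only flags candidates for discussion. Deciding what the measurable threshold should be remains a conversation between QA and the product owner.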
Daily Standup
QA's standup answers three questions with a QA lens: what did I test yesterday (and what were the results), what am I testing today, and what is blocking me? The key QA-specific items to surface: stories blocked waiting for a bug fix (don't test around a known bug and report false confidence), environment issues that affect multiple stories, and bugs severe enough to warrant immediate developer attention outside the normal sprint flow.
Sprint Review
In the Sprint Review, QA contributes by presenting the test results for the sprint — what passed, what failed, what was not reached, and what bugs were found. This is not just a green/red dashboard; it is context. "We executed 47 test cases, 43 passed, 4 are in 'In Progress' status pending bug fixes. We found 11 bugs; 8 are resolved, 2 are deferred to next sprint, 1 is a P1 being addressed in a hotfix."
Retrospective
Retrospectives are where QA can advocate for process improvements: testing environments that were unstable (action: dedicated environment stability sprint), stories that lacked acceptance criteria (action: mandatory AC before refinement sign-off), automation that would have caught a regression earlier (action: add to automation backlog with priority). QA's retrospective contributions should be concrete and actionable, not complaints.
Acceptance Criteria — The QA Foundation
Acceptance criteria (AC) define the conditions under which a story can be considered done from a product perspective. Well-written AC is the single most important enabler of effective sprint testing. Without clear AC, QA is testing against an implicit standard that may differ between the developer's, QA's, and the product owner's mental models.
Given/When/Then format
BDD-style acceptance criteria use the Given/When/Then structure:
- Given [precondition or starting state]
- When [action the user takes]
- Then [expected outcome]
Example — User login story:
- Given I am on the login page and I have a valid account, When I enter correct credentials and click Login, Then I am redirected to the dashboard and my username is shown in the header.
- Given I am on the login page, When I enter an incorrect password three times, Then my account is locked and I see a message telling me to contact support.
- Given I am logged in on device A, When I log in on device B simultaneously, Then my session on device A is invalidated within 60 seconds.
Testable vs untestable AC
Untestable: "The system should be user-friendly." — No measurable criterion. Cannot be verified.
Testable: "A new user with no prior experience should complete the registration flow in under 3 minutes on first attempt." — Specific, measurable, verifiable.
Untestable: "Checkout should be fast." — Subjective, no benchmark.
Testable: "The checkout completion API call should respond within 2 seconds for 95% of requests under 100 concurrent users." — Measurable, can be tested with performance tooling.
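The checkout criterion above can be verified mechanically once response times have been collected. A minimal sketch, assuming the timings (in seconds) were recorded by whatever load tool the team uses while 100 concurrent users were simulated; the function names and sample values are illustrative:

```python
import math


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def meets_checkout_criterion(response_times_s: list[float], limit_s: float = 2.0) -> bool:
    """True when 95% of requests responded within the limit."""
    return percentile_nearest_rank(response_times_s, 95) <= limit_s


# Illustrative data: 19 fast responses and one slow outlier.
timings = [0.8] * 19 + [3.5]
print(meets_checkout_criterion(timings))
```

Because the criterion names a percentile rather than an average, a single slow outlier does not fail the check, while a second one in the same 20 samples would.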
Definition of Done (DoD)
The Definition of Done is the team's agreement on what conditions must be met before a story is considered complete. It is distinct from acceptance criteria — AC is story-specific, DoD applies to every story. QA's contributions to the DoD typically include:
- Unit tests written and passing (developer responsibility, but QA verifies coverage is adequate)
- API contracts verified by passing integration tests
- All acceptance criteria tested and passing
- At least one exploratory testing session completed for the feature
- Regression suite run; no new failures introduced
- Code reviewed and merged to main branch
- Deployed to staging environment
- Documentation updated if the feature changes user-facing behaviour
- Zero P1 or P2 bugs open against this story
The DoD should be visible to the entire team — posted in the team wiki, in the Jira board description, or on the team's physical wall. When a developer says a story is "done," it means it meets every item on the DoD list, not just that coding is complete.
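One way to make "done means every DoD item" concrete is to treat the list as data and check each story against it. A minimal sketch; the item keys paraphrase the list above, and the `Story` class and the story key `SHOP-101` are hypothetical:

```python
from dataclasses import dataclass, field

# Checklist items paraphrased from the DoD list above; the keys are hypothetical.
DEFINITION_OF_DONE = (
    "unit_tests_passing",
    "api_contract_integration_tests_passing",
    "acceptance_criteria_tested",
    "exploratory_session_completed",
    "regression_suite_clean",
    "code_reviewed_and_merged",
    "deployed_to_staging",
    "docs_updated_if_needed",
    "no_open_p1_p2_bugs",
)


@dataclass
class Story:
    key: str
    completed_items: set[str] = field(default_factory=set)


def missing_dod_items(story: Story) -> list[str]:
    """Return the DoD items the story has not met, in checklist order."""
    return [item for item in DEFINITION_OF_DONE if item not in story.completed_items]


def is_done(story: Story) -> bool:
    """A story is done only when no DoD item is missing."""
    return not missing_dod_items(story)


story = Story("SHOP-101", {"unit_tests_passing", "code_reviewed_and_merged"})
print(is_done(story), missing_dod_items(story))
```

Returning the missing items, rather than a bare true/false, gives the team something specific to discuss at standup.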
Shift-Left Testing
Shift-left means moving testing activities earlier in the development lifecycle — toward the left side of the timeline, where requirements and design live, rather than the right side where QA has traditionally sat.
Concrete shift-left practices:
- Requirements review: QA participates in requirements review sessions, flagging ambiguity, missing edge cases, and untestable statements before any development begins.
- Design review: QA reviews technical design documents for testability — are there seams in the architecture where unit testing is possible? Are the API contracts defined clearly enough to test independently?
- Test case drafting during refinement: High-level test scenarios are drafted in the refinement session, before the sprint starts. When development picks up the story, QA already has a draft of what will be tested — this reduces the "what exactly should I test?" ambiguity at the end of the sprint.
- Code review participation: QA reviews pull requests with a testing lens — not for code quality, but for testability, error handling, and logging adequacy. "This function catches the exception but swallows it — when it fails, we won't know why."
- Unit test review: QA verifies that developer-written unit tests cover the business-critical paths, not just the implementation-easy paths.
Three Amigos — The Pre-Sprint Alignment Session
The Three Amigos session (also called "3As" or "Story Kickoff") brings together three perspectives before development begins on a story: the developer who will build it, the QA engineer who will test it, and the product owner who defined it. Each sees the story from a fundamentally different angle, and alignment requires all three perspectives in the same conversation.
What each participant brings:
- Product Owner: The business intent — what user problem does this solve? What does success look like from the user's perspective? What are the non-negotiable behaviours?
- Developer: Technical constraints and implementation approach — what is easy to implement vs complex? What existing systems does this interact with? What edge cases arise from the technical approach?
- QA: Testing perspective — what scenarios need to be covered? What data states need to be tested? What are the failure modes? Are there testability concerns with the proposed implementation?
Running an effective 3As meeting: time-box to 30 minutes, prepare examples and counter-examples before the meeting, end with agreed acceptance criteria that all three parties sign off on, and document the output in the story's AC field immediately.
BDD in Practice — Gherkin and Cucumber
Behaviour-Driven Development (BDD) extends the Given/When/Then acceptance criteria format into executable specifications. The process: acceptance criteria written in Given/When/Then become Gherkin feature files, which are linked to step definitions (code that implements the test logic), which are executed by a framework like Cucumber (JVM/Ruby), Behave (Python), or SpecFlow (.NET).
Writing Gherkin from acceptance criteria is a translation exercise, not a technical skill. Product owners should be able to read (and ideally write) Gherkin. Example conversion:
Feature: User Login

  Scenario: Successful login with valid credentials
    Given I am on the login page
    And I have a valid account with username "user@example.com"
    When I enter my credentials and click Login
    Then I should be redirected to the dashboard
    And I should see "Welcome back" in the header

  Scenario: Account lock after three failed attempts
    Given I am on the login page
    When I enter an incorrect password 3 times
    Then I should see "Account locked" message
    And I should not be able to attempt login again
Involving the PO in feature file review ensures the executable specification matches the intended behaviour. If the PO cannot read the Gherkin and confirm it represents the feature correctly, the feature file is too technical and should be rewritten in plain language. BDD's primary value is the living documentation it produces; the automation it enables is a secondary benefit, though still a real one.
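The link between Gherkin steps and step definitions can be illustrated without any framework. The sketch below is a toy step runner in plain Python, not the Cucumber or Behave API: the `step` decorator, `run_scenario`, and the in-memory lockout model are all assumptions made for illustration. It runs the first three steps of the account-lock scenario above.

```python
import re

STEP_DEFINITIONS = []  # (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the step definition for matching step text."""
    def register(func):
        STEP_DEFINITIONS.append((re.compile(pattern), func))
        return func
    return register


@step(r"I am on the login page")
def open_login_page(ctx):
    # Stand-in for driving a real application: start with a clean state.
    ctx["failed_attempts"] = 0
    ctx["locked"] = False


@step(r"I enter an incorrect password (\d+) times")
def enter_wrong_password(ctx, count):
    ctx["failed_attempts"] += int(count)
    ctx["locked"] = ctx["failed_attempts"] >= 3


@step(r'I should see "Account locked" message')
def see_locked_message(ctx):
    assert ctx["locked"], "expected the account to be locked"


def run_scenario(lines):
    """Execute each Gherkin step by finding the first matching definition."""
    ctx = {}
    for line in lines:
        text = re.sub(r"^(Given|When|Then|And|But)\s+", "", line.strip())
        for pattern, func in STEP_DEFINITIONS:
            match = pattern.fullmatch(text)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step definition for: {text}")
    return ctx


result = run_scenario([
    "Given I am on the login page",
    "When I enter an incorrect password 3 times",
    'Then I should see "Account locked" message',
])
print(result)
```

Real BDD frameworks follow the same idea with much more machinery: the plain-language step text stays readable for the PO, while the matching function carries the technical detail.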
Test Pyramid in Agile
The test pyramid (coined by Mike Cohn) defines the ideal balance of test types: many unit tests at the base, fewer integration tests in the middle, and a small number of end-to-end tests at the top. In Agile, this balance directly affects sprint velocity.
A team that inverts the pyramid — many slow E2E tests, few unit tests — suffers in Agile: slow CI pipelines mean developers wait 45 minutes for feedback on a code change. Flaky E2E tests create false failures that erode trust in the test suite. The CI pipeline becomes a bottleneck rather than an accelerator.
- Unit tests (base): Fast, isolated, run in milliseconds. Developers write these. QA advocates for coverage of business logic branches, not just happy paths. Target: 70%+ of test volume.
- Integration tests (middle): Test API contracts, database interactions, and service-to-service communication. QA typically owns these. Target: 20% of test volume.
- E2E tests (top): Full user journey through the UI. Slow, fragile, expensive to maintain. Reserve for the 10–15 most critical user journeys that cannot be covered lower in the pyramid. Target: 10% of test volume.
In a sprint context, the practical rule: if a test can be written at a lower level of the pyramid, it should be. A login validation test that lives at the unit level exercises the same validation logic as an E2E login test, runs orders of magnitude faster, and does not require a live environment. Keep a small number of E2E tests to confirm that the pieces are wired together correctly.
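To show what a login validation test at the unit level might look like, here is a minimal sketch. The validation rules, the function `validate_login_input`, and the error messages are hypothetical; the point is only that the tests need no browser, server, or test environment.

```python
import re

# Deliberately loose email shape check; real rules would come from the story's AC.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_login_input(username: str, password: str) -> list[str]:
    """Return validation errors for a login form submission (empty list means valid)."""
    errors = []
    if not EMAIL_PATTERN.fullmatch(username):
        errors.append("username must be an email address")
    if not password:
        errors.append("password is required")
    return errors


# pytest-style unit tests: plain functions with assertions.
def test_valid_input_has_no_errors():
    assert validate_login_input("user@example.com", "s3cret") == []


def test_malformed_username_is_rejected():
    assert validate_login_input("not-an-email", "s3cret") == [
        "username must be an email address"
    ]


def test_empty_password_is_rejected():
    assert validate_login_input("user@example.com", "") == ["password is required"]
```

A runner such as pytest would collect the `test_` functions automatically. Each one runs in well under a millisecond, which is what makes it practical to run them on every commit.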
Bug Workflow in Agile
Not every bug has the same lifecycle in Agile. Where a bug lives depends on its severity and when it is found:
- Fix in sprint: P1 bugs (system crash, data loss, critical feature broken) found during the current sprint should be fixed before the sprint closes, even if that means reducing scope. A sprint that ships with a known P1 is a failed sprint.
- Next sprint backlog: P2 bugs found late in the sprint that would require significant rework to fix in the current sprint. Add to the product backlog and prioritise for next sprint. Inform the PO for priority decision.
- Tech debt backlog: Minor bugs, UX issues, performance edge cases that do not affect primary user journeys. Log, label as tech debt, and address in a dedicated tech debt sprint or alongside related features.
Severity classification in Agile context:
- P1 — Blocker: Core user journey broken, no workaround. Sprint must not close without resolution.
- P2 — Critical: Major feature broken, workaround exists but is painful. Address before next release.
- P3 — Major: Feature partially broken, workaround is acceptable. Address within two sprints.
- P4 — Minor: Cosmetic issue, edge case, low-impact. Log and address opportunistically.
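The routing rules above can be written down as a small decision function, which is useful for spotting gaps in the policy. A minimal sketch; the names are illustrative, and two branches are assumptions the text does not state: a P2 that can be fixed cheaply stays in the sprint, and a P3 goes to the next sprint backlog to meet the two-sprint window.

```python
from enum import Enum


class Destination(Enum):
    FIX_IN_SPRINT = "fix in current sprint"
    NEXT_SPRINT = "next sprint backlog"
    TECH_DEBT = "tech debt backlog"


def route_bug(priority: str, late_with_significant_rework: bool = False) -> Destination:
    """Map a bug to a backlog following the workflow described above."""
    if priority == "P1":
        # P1 is always fixed before the sprint closes, even at the cost of scope.
        return Destination.FIX_IN_SPRINT
    if priority == "P2":
        # Assumption: a P2 is deferred only when found late and costly to fix.
        if late_with_significant_rework:
            return Destination.NEXT_SPRINT
        return Destination.FIX_IN_SPRINT
    if priority == "P3":
        # Assumption: next sprint backlog satisfies "address within two sprints".
        return Destination.NEXT_SPRINT
    if priority == "P4":
        return Destination.TECH_DEBT
    raise ValueError(f"unknown priority: {priority!r}")


for label, late in [("P1", False), ("P2", True), ("P2", False), ("P3", False), ("P4", False)]:
    print(label, late, "->", route_bug(label, late).value)
```

Writing the rules this way forces the team to decide the cases the prose leaves open, such as who confirms that a P2 fix really would require significant rework.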
Quality Metrics in Agile
Agile QA metrics should be actionable and sprint-cadenced — not annual or quarterly reports, but weekly indicators that guide next sprint decisions:
- Test coverage %: Percentage of in-scope stories whose documented acceptance criteria have all been tested. Target: 100%.
- Bug escape rate: Bugs found in production divided by all bugs found (production plus QA). Should trend toward zero. Any bug found in production is a QA process improvement opportunity.
- Bugs found per sprint by severity: Track P1/P2/P3/P4 counts. A spike in P1 bugs suggests a systemic issue — rushed development, unstable environment, inadequate unit testing.
- Automation coverage growth: Percentage of regression covered by automation, tracked sprint-over-sprint. Target: grow 5–10% per sprint in early automation maturity phases.
- Mean time to detect (MTTD): Average time from a bug being introduced to it being found in QA. Lower MTTD comes from shift-left practices and faster CI runs.
- Test execution time (CI pipeline): How long the full test suite takes to run. If this exceeds roughly 15 minutes, developers tend to stop running the suite before committing; invest in parallelisation or test optimisation.
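Several of these metrics reduce to simple arithmetic over counts a team already has. A minimal sketch, with assumptions worth stating: the escape rate is computed here against all bugs found (production plus QA) so that it stays between 0 and 1, and the `SprintStats` fields and sample dates are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SprintStats:
    bugs_found_in_qa: int
    bugs_found_in_production: int
    regression_cases_total: int
    regression_cases_automated: int


def bug_escape_rate(stats: SprintStats) -> float:
    """Share of all known bugs that escaped to production (0.0 when no bugs were found)."""
    total = stats.bugs_found_in_qa + stats.bugs_found_in_production
    return stats.bugs_found_in_production / total if total else 0.0


def automation_coverage(stats: SprintStats) -> float:
    """Fraction of the regression suite that is automated."""
    if stats.regression_cases_total == 0:
        return 0.0
    return stats.regression_cases_automated / stats.regression_cases_total


def mean_time_to_detect(introduced_and_found: list[tuple[datetime, datetime]]) -> timedelta:
    """Average gap between when a bug was introduced and when QA found it."""
    if not introduced_and_found:
        raise ValueError("no bugs to average")
    gaps = [found - introduced for introduced, found in introduced_and_found]
    return sum(gaps, timedelta()) / len(gaps)


stats = SprintStats(bugs_found_in_qa=9, bugs_found_in_production=1,
                    regression_cases_total=200, regression_cases_automated=50)
detections = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15)),  # found after 6 hours
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 3, 9)),   # found after 24 hours
]
print(bug_escape_rate(stats), automation_coverage(stats), mean_time_to_detect(detections))
```

The trend across sprints matters more than any single value, so these numbers are most useful when recorded at every sprint close and compared over time.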
Mob Testing
Mob testing extends the bug bash concept into a structured, facilitated session where the entire team tests together simultaneously. Unlike a bug bash (which is event-based and often pre-release), mob testing can be run at any point in the sprint and has a stronger facilitation structure.
Setup for a mob testing session:
- One driver operates the keyboard and mouse, following the mob's direction
- One navigator directs the driver — "navigate to the settings page, change the notification preference"
- All others observe, note findings, and suggest next actions
- Roles rotate every 15 minutes
- All bugs are filed in real time by observers
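The 15-minute rotation is easy to plan ahead of the session. A minimal sketch, assuming a simple round-robin in which each navigator becomes the next driver and names are unique; the function name, the minimum of three participants, and the sample names are all illustrative choices rather than part of any mob testing standard:

```python
def rotation_schedule(team: list[str], session_minutes: int, slot_minutes: int = 15):
    """Yield (start_minute, driver, navigator, observers) for each rotation slot."""
    if len(team) < 3:
        raise ValueError("need at least a driver, a navigator, and one observer")
    for slot in range(session_minutes // slot_minutes):
        driver = team[slot % len(team)]
        navigator = team[(slot + 1) % len(team)]
        observers = [member for member in team if member not in (driver, navigator)]
        yield slot * slot_minutes, driver, navigator, observers


for start, driver, navigator, observers in rotation_schedule(["Ana", "Ben", "Chi", "Dev"], 60):
    print(f"+{start:02d} min  driver={driver}  navigator={navigator}  observers={observers}")
```

Sharing the schedule before the session means rotations happen on time instead of being renegotiated every 15 minutes.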
Mob testing benefits: knowledge sharing is the primary value — junior team members observe expert testers' mental models in action; developers see how their features are used in ways they did not anticipate; product owners see real user journey friction they had not considered. The secondary benefit is bug finding — mob testing typically has a 20–30% higher bug-find rate per hour than individual testing on the same feature.
Waterfall QA vs Agile QA vs DevOps QA
| Dimension | Waterfall QA | Agile QA | DevOps QA |
|---|---|---|---|
| When testing happens | After development phase completes — one test phase | Throughout the sprint — parallel with development | Continuously — automated testing in every CI pipeline run |
| Documentation | Extensive — formal test plan, test cases, sign-off documents | Lightweight — acceptance criteria, DoD, sprint test report | Minimal formal docs — test code is the documentation |
| Release cadence | Months — big-bang releases after full test cycle | Sprints (2 weeks) — potentially releasable each sprint | Continuous — multiple releases per day in mature teams |
| Automation reliance | Low — manual testing dominates | Medium-High — automation for regression, manual for new features | Very High — automation is the primary quality gate |
| Bug discovery timing | Late — bugs found weeks after code written | Fast — bugs found same sprint as development | Very fast — bugs found within hours of commit |
| QA's relationship to dev | Separate phase — QA receives a build from dev | Collaborative — QA and dev work in same sprint | Integrated — QA practices embedded in dev workflow |
Best Practices for QA in Agile
- Be in the Three Amigos: If your team does not run 3As sessions, advocate for them. QA participation at story conception reduces rework more than any other single practice.
- Challenge every story without AC: No acceptance criteria means the story is not ready for development. Do not let it enter the sprint. This is a quality advocacy act, not obstruction.
- Automate within the same sprint as development: If a story is developed in Sprint 5, the automation should land in Sprint 5 or, at the latest, Sprint 6. Automation debt compounds rapidly: stories from six months ago are much harder to automate because the context is gone and the codebase has changed.
- Make the Definition of Done visible: Post it on the team wiki, in the Jira board, in the team Slack channel. If developers do not know what "done" means, they cannot meet the standard.
- Treat automation failures as P1 bugs: A failing automated test that the team ignores is worse than no automated test. Address failures immediately — triage as genuine failure (bug) or flakiness (engineering debt to fix).
- Communicate in sprint, not at sprint end: If QA is blocked — environment down, story not testable, bug blocking further testing — surface it in standup, not in the sprint review. Surprises at sprint review indicate a communication process problem.
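The "genuine failure or flakiness" triage mentioned above is often started by rerunning the failed test on the same commit. A minimal sketch of that classification; rerunning as the triage technique, the function name, and the result labels are assumptions, since the text only says that failures must be triaged:

```python
def triage_failure(rerun_results: list[bool]) -> str:
    """Classify a failed automated test from the pass/fail results of its reruns.

    Each entry is True when a rerun on the same commit passed.
    """
    if not rerun_results:
        raise ValueError("rerun the test at least once before triaging")
    if not any(rerun_results):
        # Fails every time: the failure is reproducible, so treat it as a bug.
        return "genuine failure: file a bug"
    # The original run failed but at least one rerun passed: the result is unstable.
    return "flaky: log engineering debt and fix the test"


print(triage_failure([False, False, False]))
print(triage_failure([True, False, True]))
```

Either outcome produces a tracked item, which is the point of the practice: a red test is never simply rerun until green and forgotten.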