Flaky Test Management with Cypress

Intermediate Workshop

Cecelia Martinez

Technical Account Manager

@ceceliacreates

Getting Started

  1. Clone the repository or download a ZIP of the flake-demo branch.
  2. In the project directory, run `yarn install`.
  3. Once installation is complete, run `yarn dev`. This starts the app on localhost:3000.
  4. In a new terminal window, run `yarn cypress:open`. This opens the Cypress Test Runner.
  5. Raise your hand in Zoom once you are done!

Delete/comment out line 204 in cypress/support/commands.ts

What is Flake?

What is a flaky test?

A test is considered to be flaky when it can pass and fail across multiple retry attempts without any code changes.

 

For example, a test is executed and fails, then the test is executed again, without any change to the code, but this time it passes.

✅ ? ❌

Flaky Test in Cypress Real World App

github.com/cypress-io/cypress-realworld-app

// Backend source code for "like" functionality
// creates an arbitrary delay on the server to simulate inconsistent response times

   const delay = Math.random() * 5500;
   setTimeout(() => {
     res.sendStatus(200);
   }, delay);
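
To get a feel for how often this delay alone could cause a failure, the arithmetic can be sketched as follows. It assumes Cypress's default command timeout of 4000 ms is unchanged and ignores every other source of latency, so treat the result as a rough estimate. The function and constant names are made up for illustration.

```javascript
// Rough estimate of how often the random server delay outlasts a timeout.
// Assumes the delay is uniform on [0, maxDelayMs), as Math.random() * 5500 is.
const MAX_DELAY_MS = 5500; // from the backend snippet above
const DEFAULT_COMMAND_TIMEOUT_MS = 4000; // Cypress default, assumed unchanged

function chanceDelayExceeds(timeoutMs, maxDelayMs) {
  if (timeoutMs >= maxDelayMs) return 0; // the delay can never outlast the timeout
  const clampedTimeoutMs = Math.max(timeoutMs, 0);
  return (maxDelayMs - clampedTimeoutMs) / maxDelayMs;
}

const failureChance = chanceDelayExceeds(DEFAULT_COMMAND_TIMEOUT_MS, MAX_DELAY_MS);
console.log(`${(failureChance * 100).toFixed(1)}% of responses arrive after the timeout`);
```

Under these assumptions a little more than one response in four arrives too late, which is consistent with a test that usually passes but fails often enough to be noticed.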

Test case: User A likes a transaction of User B; User B gets a notification that User A liked the transaction

✅ ? ❌

cy.contains(likesCountSelector, 0);
cy.getBySelLike("like-button").click();
// a successful "like" should disable the button and increment
// the number of likes
cy.getBySelLike("like-button").should("be.disabled");
cy.contains(likesCountSelector, 1);

How Flake Impacts a Test Suite

  • Longer deployment process
    • Have to rerun tests or restart CI builds if flake causes a failure
  • Longer debugging process
    • Have to determine if a failure is a "real" failure or flake
  • Reduced confidence
    • Do failures actually represent regressions?
    • Is flake hiding underlying issues in my application or test suite?

Causes of Flake

Flake is caused when:

  • The functionality you are testing is inconsistent
  • Something unrelated to the functionality you are testing is inconsistent

DOM-related Flake

  • Occurs when there is inconsistency with how elements are rendered to the DOM (or how quickly they are rendered)

Examples:

  • Sometimes an element loads within the cy.get() timeout, other times it doesn't
  • Cypress gets an element, but when it tries to click, the element is disabled because the state hasn't updated yet
  • The DOM re-renders between a cy.get() and an action command, causing the selected element to be detached from the DOM
  • Cypress types in an input field, but the application is slow to process the keypress events and the field value doesn't update completely before clicking 'Submit'

Network-related Flake

  • Occurs when a network request responds inconsistently, whether it goes to an internal API or to a third-party or serverless endpoint

Examples:

  • A slow API response results in a DOM element loading outside of the default timeout (like in our example)
  • A slow API response from a third-party login provider causes tests to fail
  • A microservices endpoint has a cold start/start delay, so it fails the first time but passes on subsequent attempts

Environment-related Flake

Examples:

  • Inaccessible or inaccurate environment variables
  • Running tests across machines of different sizes
  • A failed dependency install in the environment
  • Inconsistent data across staging/QA/dev environments

Managing Flake

Network-related Flake

Test-writing Best Practices

🧠 cy.intercept() can spy and stub all network requests

💡 Wait for long network requests before proceeding

cy.intercept('GET', '/api/accounts/*').as('getAccount')
cy.visit('/accounts/123')

// Wait up to 30 seconds for request to complete
cy.wait('@getAccount', { timeout: 30000 })
cy.get(selector)
  .should('have.text', accountName)

💡 Stub inconsistent or unneeded network requests

// Intercept and stub the response as successful
cy.intercept({ method: 'POST', url: '/inconsistent' }, { success: true })

Declaring a spy with cy.intercept()

  • Can pass a URL, method, or routeMatcher
  • If no method is passed, any method will match
  • routeMatcher is an object used to match which HTTP requests will be handled
  • All routeMatcher properties are optional:
    • auth, headers, hostname, https, method, middleware, path, pathname, port, query, times, url

cy.intercept(url)
cy.intercept(method, url)
cy.intercept(routeMatcher)

// with routeMatcher
cy.intercept({
  url: 'http://example.com/search*',
  query: { q: 'expected terms' },
}).as('search')

Aliasing a spy with .as()

Save the intercepted request to use throughout your test code

cy.intercept({
  url: 'http://example.com/search*',
  query: { q: 'expected terms' },
}).as('search')

Waiting for a request with cy.wait()

After declaring an intercept, pass its alias to cy.wait() to pause the test until the request occurs

cy.intercept('GET', '/users').as('getUsers')

// test code

cy.wait('@getUsers')

// test code continues after request occurs

Flake is caused when:

  • Something unrelated to the functionality you are testing is inconsistent

If a problematic network request is unrelated to the functionality you are testing, stub it out.

Activity 1: Slow network response

(~15 minutes)

  • This test is intentionally flaky because the "like" API has an inconsistent response time
    • Run the test until failure and note the command that fails
    • Refactor the test so it is resistant to flake using the following strategies:
      • Option 1: Increase the default timeout on the flaky command
      • Option 2: Identify the slow POST request and wait for it to occur using cy.intercept() and cy.wait()
        • Hint: The code causing the flake is in backend/like-routes.ts
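
One possible shape for each option is sketched below as helper functions meant to be called from inside a test. The 10000 ms timeout, the '/likes/*' route pattern, and the postLike alias are assumptions for illustration; check backend/like-routes.ts and the Test Runner's command log for the real route. getBySelLike is the app's custom command from the earlier snippet and is assumed here to forward its options to cy.get().

```javascript
// Assumed budget: comfortably above the server's maximum delay of 5500 ms.
const LIKE_TIMEOUT_MS = 10000;

// Option 1: give the flaky command more time than the default timeout.
function likeWithLongerTimeout(likesCountSelector) {
  cy.contains(likesCountSelector, 0);
  cy.getBySelLike('like-button').click();
  // The assertion inherits the longer timeout from the query it follows.
  cy.getBySelLike('like-button', { timeout: LIKE_TIMEOUT_MS }).should('be.disabled');
  cy.contains(likesCountSelector, 1);
}

// Option 2: wait explicitly for the slow POST request.
function likeAndWaitForRequest(likesCountSelector) {
  // Register the spy before the click so the request cannot slip past it.
  cy.intercept('POST', '/likes/*').as('postLike'); // hypothetical route pattern
  cy.contains(likesCountSelector, 0);
  cy.getBySelLike('like-button').click();
  cy.wait('@postLike', { timeout: LIKE_TIMEOUT_MS });
  // The response has arrived, so the default timeout is enough from here on.
  cy.getBySelLike('like-button').should('be.disabled');
  cy.contains(likesCountSelector, 1);
}
```

Option 2 ties the wait to the event that actually causes the delay, so the rest of the test keeps its default timeouts and a genuinely broken button still fails quickly.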

DOM-related Flake

Cypress Test Runner

Query Command Retryability

it('adds two items', () => {
  cy.get('.new-todo')
    .type('todo A{enter}')
    .type('todo B{enter}')

  cy.get('.todo-list li')
    .should('have.length', 2)
})

cy.get('.todo-list li') // command
  .should('have.length', 2) // assertion

✅ If the assertion that follows the cy.get() command passes, the command finishes successfully.

❌ If the assertion that follows the cy.get() command fails, cy.get() requeries the application's DOM again and again until the assertion passes or the cy.get() command timeout is reached.
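
The retry behavior can be pictured with a small simulation. This is a simplified model on a simulated clock, not how Cypress is implemented; the 4000 ms timeout matches Cypress's default command timeout, while the 50 ms retry interval, the function names, and the todo data are assumptions for illustration.

```javascript
// Simplified model of query retrying; this is NOT Cypress's implementation.
// queryAt(elapsedMs) returns what the query would find at that moment.
// assertion(subject) returns true when the assertion would pass.
function retryQuery(queryAt, assertion, { timeoutMs = 4000, intervalMs = 50 } = {}) {
  for (let elapsedMs = 0; elapsedMs <= timeoutMs; elapsedMs += intervalMs) {
    const subject = queryAt(elapsedMs); // requery the "DOM" on every attempt
    if (assertion(subject)) {
      return { passed: true, elapsedMs };
    }
  }
  return { passed: false, elapsedMs: timeoutMs };
}

// Hypothetical todo list whose second item renders after 1200 ms.
const todoItemsAt = (elapsedMs) => (elapsedMs >= 1200 ? ['todo A', 'todo B'] : ['todo A']);
const hasTwoItems = (items) => items.length === 2;

// Passes as soon as the second item appears.
console.log(retryQuery(todoItemsAt, hasTwoItems));
// Fails because the timeout runs out before the item renders.
console.log(retryQuery(todoItemsAt, hasTwoItems, { timeoutMs: 1000 }));
```

The point of the model is that the query and the assertion are evaluated together on each attempt, so a slow render only fails the test when it takes longer than the timeout.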

Test-writing Best Practices

🧠 Only the last query command is retried

💡 Use a single query command instead of chaining

// cy.get() is not retried
cy.get(selector).contains(text)
  .should()

// The entire query is retried
cy.contains(selector, text)
  .should()

💡 Alternate commands and assertions

// The test retries cy.get() until the assertion passes
cy.get(selector)
  .should('have.length', 3)
  // Now we know we have the correct element before proceeding
  .parent()
  .should('contain', text)
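
The same alternating pattern helps with the slow-input case listed under DOM-related flake, where a field has not finished updating before 'Submit' is clicked. The helper below is a minimal sketch: typeAndConfirm is a hypothetical name, and it assumes plain text without special key sequences such as {enter}, since those would not appear in the field's value.

```javascript
// Hypothetical helper: type into a field and confirm the value before moving on.
function typeAndConfirm(inputSelector, text) {
  cy.get(inputSelector)
    .type(text)
    // Retries until the field holds the full text, so a slow app cannot
    // leave a half-typed value behind when the next command runs.
    .should('have.value', text);
}
```

Calling typeAndConfirm before clicking the submit button means the click only happens once the application has processed every keypress.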

Activity 2: Detached from DOM

(~15 minutes)

  • This test is intentionally flaky and demonstrates an example of when an element can become detached from the DOM
    • Run the test until failure and identify which command fails
    • Note that there is a network request occurring before the failed command. This request is causing the DOM to refresh after we get the element.
    • Refactor the test to be flake resistant using cy.intercept() and cy.wait()
  • Leverage https://www.cypress.io/blog/2020/07/22/do-not-get-too-detached/#detached-elements if needed

All types of Flake

Test Retries

Cypress retries an individual failed test up to a specified number of times before marking it as failed

// cypress.json
{
  "retries": 1
}

// configure by mode
{
  "retries": {
    // Configure retry attempts for `cypress run`
    "runMode": 2,
    // Configure retry attempts for `cypress open`
    "openMode": 0
  }
}
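
To see why even a small number of retries changes how often a flaky test fails a build, the effect can be sketched with simple probability. The model assumes every attempt fails independently with the same chance, which real flake rarely does, and the 25% figure and function name are illustrative only.

```javascript
// Chance that a test still fails after all attempts, assuming each attempt
// fails independently with the same probability (a simplification).
function chanceAllAttemptsFail(failureChancePerAttempt, retries) {
  const attempts = retries + 1; // the first run plus each retry
  return Math.pow(failureChancePerAttempt, attempts);
}

// Illustrative flake rate of 25% per attempt.
for (const retries of [0, 1, 2]) {
  const percent = chanceAllAttemptsFail(0.25, retries) * 100;
  console.log(`retries=${retries}: fails in about ${percent.toFixed(2)}% of runs`);
}
```

Retries lower the failure rate but leave the underlying inconsistency in place, so they work best alongside the test-writing practices above.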

Cypress Dashboard Flake Detection

  • Identifies flaky tests AND flake severity based on number of times a test is retried
  • Test retries must be enabled in the Cypress Test Runner to use flake detection
  • See artifacts from failed attempts on flaky runs
  • See most common errors, historical flake and fail rates, and test definition for flaky test cases with Analytics
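
The idea of detecting flake from retry attempts can be sketched as a small classifier. This is a hypothetical illustration of the definition used in this workshop (a test that fails and then passes without code changes), not the Dashboard's actual implementation; the function name and the outcome strings are assumptions.

```javascript
// Hypothetical classifier: not the Dashboard's actual implementation.
// attempts lists the outcome of each attempt in order, e.g. ['failed', 'passed'].
function classifyTest(attempts) {
  if (attempts.length === 0) {
    throw new Error('at least one attempt is required');
  }
  const finalOutcome = attempts[attempts.length - 1];
  const failedAttempts = attempts.filter((outcome) => outcome === 'failed').length;

  if (finalOutcome === 'failed') {
    return { status: 'failed', failedAttempts };
  }
  // A pass that needed one or more retries is flaky rather than clean.
  return { status: failedAttempts > 0 ? 'flaky' : 'passed', failedAttempts };
}

console.log(classifyTest(['passed']));
console.log(classifyTest(['failed', 'passed']));
console.log(classifyTest(['failed', 'failed', 'failed']));
```

The count of failed attempts gives a rough sense of severity: a test that needs two retries to pass is a stronger signal than one that needs a single retry.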

Cypress Dashboard Flake Management

Coming Soon: Test Burn-In

 

  • Identify if new tests are flaky before introducing them to your test suite
  • Cypress Dashboard will identify new tests and automatically retry multiple times to check for flake
  • Fix flaky test cases before they become an issue

GitHub & Slack Flake Alerts

Cypress CI Recommendations

Parallelization recommendations in Cypress Dashboard

Test Data Management Best Practices

  • Tests should be independent of each other
  • When possible, use seeded data
  • When possible, leverage an endpoint specifically for test data
  • When possible, use your API or front-end state management to set and clean up data for tests
  • When possible, leverage network stubbing and fixtures when you don’t need to hit your real API for testing
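
As a minimal sketch of the last point, an intercept can answer a request from a fixture file instead of the real API. The route reuses the accounts example from earlier; the fixture file name account.json (which would live in cypress/fixtures) and the helper name are assumptions for illustration.

```javascript
// Hypothetical helper: answer account requests from a fixture, not the real API.
function stubAccountWithFixture() {
  // Assumes cypress/fixtures/account.json exists with a representative account.
  cy.intercept('GET', '/api/accounts/*', { fixture: 'account.json' }).as('getAccount');
}
```

Calling the helper before cy.visit(), for example in a beforeEach hook, gives every test the same account data regardless of what the backing environment contains.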

Q&A

Flaky Test Management with Cypress Workshop TestJS Summit

By Cecelia Martinez

This workshop is for Cypress users who want to step up their game against flake in their test suites. Leveraging the Cypress Real World App, we’ll cover the most common causes of flake, code through some examples of how to make tests more flake resistant, and review best practices for detecting and mitigating flake to increase confidence and reliability. (Video requires conference subscription, slides public)
