Eliminating flaky Cypress E2E tests and reducing CI times by 70%
Customer Case Study: Optimizing Flaky Cypress E2E Tests and Improving Suite Performance
End-to-end (E2E) testing is an essential part of modern software development, ensuring that user flows work as intended.
However, when these tests are flaky, meaning their outcomes are inconsistent, it can erode trust in the suite and make continuous integration (CI) pipelines unreliable.
In this case study, we detail how we tackled a client's Cypress E2E test suite in which roughly 40% of the tests were flaky, and how we dramatically improved its reliability and execution time.
Starting Point: A Flaky End-to-End Test Suite
The test suite in question suffered from flakiness, with approximately 40% of tests failing unpredictably. This made it almost impossible to pass CI reliably. Upon analysis, we discovered a common culprit: the overuse of cy.wait().
Here’s a simplified example of the problematic pattern:
// Approach 1: an arbitrary wait time that guesses how long the backend call will take
cy.visit('/dashboard');
cy.wait(2000);
cy.get('.user-card').should('be.visible');

// Approach 2: waiting for the intercepted backend call to resolve
cy.intercept('GET', '/api/v1/data').as('getData');
cy.wait('@getData');
cy.get('.data-card').should('be.visible');
Using an arbitrary value for the wait time is clearly problematic, because we do not actually know how long the backend call will take.
But why is the second approach also problematic? After all, we wait until the backend call has finished, right? Not quite. Although we do wait for the backend call to finish, cy.wait('@getData') is itself subject to a timeout, so the test still fails when the call takes too long. This was exactly the problem we faced at the client.
The problem was amplified because some environments ran on slow networks, which made the backend calls take much longer than they were supposed to.
The Solution: Think more like a user
The solution is to think and act more like a user when writing end-to-end tests. Does a user watch the network tab and check whether a certain HTTP request has finished? No! They usually don’t even know that a request is being made. What do they do instead? They look for actual progress in the UI: loading spinners appearing and disappearing, elements becoming disabled or enabled, and so on.
And this is exactly how we can get rid of cy.wait() and take a good step towards a more reliable test suite.
So we can actually rewrite the above example like this:
cy.visit('/dashboard');
cy.get('loading-spinner').should('exist');
cy.get('loading-spinner').should('not.exist');
cy.get('.data-card').should('be.visible');
This is much more reliable because we wait for the UI to update rather than for the backend call to finish. It also reflects the actual user flow more closely, which is what we want to test.
However, this alone won’t solve the problem completely. We should also add fine-grained, frequent assertions along the user flow. If the test case involves intermediate HTTP requests, we should assert on their visible effects before we reach our “final assertions”. Each assertion gets its own retry window, so several small waits are less likely to time out than one long one.
Another measure to consider is increasing the Cypress timeouts. This is especially needed if your CI environments run on (much) slower networks than your production environment.
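A minimal sketch of what such intermediate assertions could look like. The function name and the selectors `.filter-button`, `.filter-panel`, `.apply-filter` and `.result-row` are hypothetical and not taken from the client's suite; only `loading-spinner` and `.data-card` come from the example above.

```javascript
// Hypothetical flow: open the dashboard, apply a filter, check the results.
// Each step asserts on visible UI state before the next action starts.
function filterDashboardData() {
  cy.visit('/dashboard');

  // Intermediate assertion 1: the initial load has finished.
  cy.get('loading-spinner').should('not.exist');
  cy.get('.data-card').should('be.visible');

  // Intermediate assertion 2: the filter panel (assumed selector) has opened.
  cy.get('.filter-button').click();
  cy.get('.filter-panel').should('be.visible');

  // Applying the filter triggers another backend call; wait for the UI again.
  cy.get('.apply-filter').click();
  cy.get('loading-spinner').should('not.exist');

  // Final assertion: the thing this test case is actually about.
  cy.get('.result-row').should('have.length.greaterThan', 0);
}
```

If the run fails, the failing step also tells us where the flow got stuck, instead of only reporting that the last element never appeared.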
Reducing CI times
As a side effect, we also reduced the CI execution time of the whole test suite by roughly 70%.
We used no special tricks, we only optimized how the test cases are structured. We found a lot of what we call “e2e unit tests”: tests that, like a unit test, cover a single thing (for example, that a modal is open) but carry the full overhead of an e2e test.
We eliminated these by folding their checks (where it made sense) into existing e2e test cases.
The result was far fewer tests and a faster suite overall. This also follows the general guidance on e2e tests more closely: they should cover whole user flows and are typically larger (see https://docs.cypress.io/app/core-concepts/best-practices#Creating-Tiny-Tests-With-A-Single-Assertion for reference).
We also emphasized creating reusable functions that return the selectors for elements, which makes tests more readable and maintainable.
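Such selector functions could look like the sketch below. The `data-testid` attribute convention and all names (`byTestId`, `selectors`, `userCard`, `modal`) are assumptions for illustration, not the client's actual helpers.

```javascript
// Build an attribute selector from a test id
// (assumes the elements carry a data-testid attribute).
function byTestId(id) {
  return `[data-testid="${id}"]`;
}

// One place that names every element the tests touch (hypothetical names).
const selectors = {
  loadingSpinner: () => 'loading-spinner',
  dataCard: () => '.data-card',
  userCard: (userId) => byTestId(`user-card-${userId}`),
  modal: (name) => `${byTestId('modal')}[data-name="${name}"]`,
};
```

A test then reads `cy.get(selectors.userCard(42)).should('be.visible')`, and a markup change only has to be fixed in one place.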
The results
- With the measures described above, we were able to reduce the execution time of the test suite by roughly 70%.
- We could remove the flakiness completely 🎉.
- Thanks to the reusable selector functions, the tests also became more readable.
What we learned
- Replace cy.wait() wherever possible with an assertion on the UI, e.g. that some elements are visible or interactive. In this case, cy.wait() was the biggest source of flaky e2e tests.
- Make many assertions, especially intermediate assertions on the way to the actual assertions your test case is about. Intermediate assertions improve the stability of your e2e tests because they make it less likely to run into timeouts.
- Having fewer but bigger e2e tests can have a huge impact on total execution time, because every e2e test has some overhead to start the application.
- Cypress can fail to detect fast responses (< 50ms), see GitHub issue https://github.com/cypress-io/cypress/issues/30599
- You might need to increase the Cypress timeouts if you have long-running backend calls or a slow CI environment.
- It might not be possible to get rid of all flakiness, but we should strive to minimize it.
- When debugging your Cypress tests locally with Cypress Studio, try to emulate slow networks as well to check the timeout behavior.
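One way to emulate a slow backend from inside a test, as a sketch: a `cy.intercept()` handler can hold back the real response with `res.setDelay()`. The helper name `slowDownDataEndpoint` and the 3-second default are assumptions; the route is the one from the example earlier in this article.

```javascript
// Delays the real response of GET /api/v1/data by delayMs milliseconds.
// Call it before cy.visit() so the intercept is registered in time.
function slowDownDataEndpoint(delayMs = 3000) {
  cy.intercept('GET', '/api/v1/data', (req) => {
    // Forward the request to the real backend, then hold back its response.
    req.continue((res) => {
      res.setDelay(delayMs);
    });
  });
}
```

With the delay in place, a local run quickly shows which assertions only pass because the local backend happens to be fast.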