Fixing unreliable dark mode tests in Cypress

Photo by Viral Chaudhari on Pixahive

While learning Cypress by implementing end-to-end tests for this website, I encountered an interesting problem: my dark mode tests would fail after passing earlier on the same day. I was creating a test to check that the dark mode toggle button worked correctly. I did this by checking that the dark mode class was added to the html element when the toggle button was clicked.

it('should be able to toggle dark mode', () => {
  cy.visit('/')

  cy.get('html').should('not.have.class', 'dark')

  // Query the button each time; Cypress command chains should not be stored in variables
  cy.get('[aria-label="Toggle dark mode"]').click()

  cy.get('html').should('have.class', 'dark')

  cy.get('[aria-label="Toggle dark mode"]').click()

  cy.get('html').should('not.have.class', 'dark')
})

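The test visits the home page, confirms that the page starts in light mode, then clicks the toggle button twice and checks that the dark class is added and then removed again.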

Some background on dark mode in Tailwind CSS

My dark mode toggle button switches the website's theme between light and dark. The default theme comes from the prefers-color-scheme media query: if the user's operating system is set to dark mode, the website starts in dark mode, and otherwise it starts in light mode.

As an example, here's how this works using the simple CSS below.

body {
  background-color: white;
}

@media (prefers-color-scheme: dark) {
  body {
    background-color: black;
  }
}

Tailwind CSS offers a few ways to achieve this. I used the class strategy (darkMode: 'class' in the Tailwind config), which applies dark styles whenever the dark class is present on an ancestor element, and I toggle that class manually with JavaScript. For more on this, read their documentation. The toggling script lives in the _document.tsx file of my Next.js project. Here's a simplified version of it.

const darkModeMediaQuery = window.matchMedia('(prefers-color-scheme: dark)')

function updateMode() {
  const isSystemDarkMode = darkModeMediaQuery.matches

  if (isSystemDarkMode) {
    document.documentElement.classList.add('dark')
  } else {
    document.documentElement.classList.remove('dark')
  }
}

updateMode()
darkModeMediaQuery.addEventListener('change', updateMode)

Note: This snippet follows only the system preference and ignores any manual choice made by the user. The full version is a bit more complex, because it also has to handle the toggle button, which the Cypress test relies on. This simplified piece is all you need to understand to follow the rest of the article.
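For readers curious about what a fuller version might involve, here is a minimal sketch, not the site's actual code. It assumes the manual choice is kept in localStorage under a hypothetical 'theme' key, and the names resolveTheme, applyTheme and toggleTheme are made up for illustration. The idea is that an explicit choice wins, and the system preference is only a fallback.

```javascript
// Hypothetical storage key; the real site may store the choice differently.
const STORAGE_KEY = 'theme'

// Pure decision: an explicit 'dark' or 'light' choice wins,
// otherwise fall back to the system preference.
function resolveTheme(storedPreference, systemPrefersDark) {
  if (storedPreference === 'dark' || storedPreference === 'light') {
    return storedPreference
  }
  return systemPrefersDark ? 'dark' : 'light'
}

// Adds or removes the "dark" class on the given root element.
function applyTheme(root, theme) {
  root.classList.toggle('dark', theme === 'dark')
}

// Flips the current theme, remembers the choice, and returns the new theme.
function toggleTheme(root, storage) {
  const next = root.classList.contains('dark') ? 'light' : 'dark'
  storage.setItem(STORAGE_KEY, next)
  applyTheme(root, next)
  return next
}

// Browser-only wiring; skipped when there is no window (e.g. during server rendering).
if (typeof window !== 'undefined') {
  const query = window.matchMedia('(prefers-color-scheme: dark)')

  const update = () =>
    applyTheme(
      document.documentElement,
      resolveTheme(window.localStorage.getItem(STORAGE_KEY), query.matches)
    )

  update()
  query.addEventListener('change', update)
}
```

Under these assumptions, the toggle button's click handler would call toggleTheme(document.documentElement, window.localStorage). Keeping the decision in a pure function makes it easy to check both branches without a browser.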

The problem

While I was writing this test, it passed as you would expect. Later that day, I made a few small changes to the website and pushed them to GitHub. My CI/CD pipeline with GitHub Actions kicked in and ran the tests. The test failed. I was confused, because I hadn't changed anything that would affect the test. I ran the tests locally and they suddenly started failing too. I was even more confused. They worked earlier that day, didn't they?!

It turns out that the test was failing because the system preference was now set to dark. Between the time I wrote the test and the time I pushed my small updates to GitHub, my operating system had automatically switched to dark mode. As a result, the dark mode class was already present on the html element when the test started, so the very first assertion failed. I'm not sure how I overlooked this glaring issue, but I did.

cy.visit('/')

// The control assertion failed, not even reaching the toggling section.
// The dark mode class is already present on the html element
// because the system dark mode preference is set to dark
cy.get('html').should('not.have.class', 'dark')

The solution

I needed to find a way to set the system dark mode preference to light before the test started. My first idea was to mock the time of day using the Cypress clock commands, since the time of day seemed to be what triggered the failure. I took a few minutes to look at the documentation before stopping to think, "What am I really trying to test here?"

I'm not trying to test whether the website theme changes based on the time of day; the operating system already handles that for me. I'm trying to test whether dark mode is applied when the user's system preference is set to dark. The test should have nothing to do with the time of day.

The real issue was that the dark mode class was already present on the html element because prefers-color-scheme: dark matched. What the test needs to control is the result of that media query, not the clock. Luckily, Cypress makes this pretty easy: you can stub the matchMedia method when visiting the website.

// Change the visit method to use the onBeforeLoad option
cy.visit('/', {
  onBeforeLoad(win) {
    cy.stub(win, 'matchMedia')
      .withArgs('(prefers-color-scheme: dark)')
      .returns({
        matches: false,
        addEventListener: () => {},
      })
  },
})

This stubs the matchMedia method so that the (prefers-color-scheme: dark) query reports matches: false, forcing the website into light mode. The no-op addEventListener is there because the page calls it on the returned object. Would you look at that, the test passes! The test now checks what it should be checking, not the time of day, and it's no longer flaky.

Since there are two branches of logic that the website can take, the final step was to add a second test to cover the other branch: the system starts off in dark mode and the toggle should switch the website to light mode. This only required changing the matches property to true in the stub for the second test.

Other than that, the second test does exactly the same thing as the first: it clicks the toggle button and checks the html element for the correct class.
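To avoid repeating the stub in both tests, the shared steps could be pulled into small helpers. The following is a sketch under assumptions rather than the exact code from this site: the helper names fakeMediaQueryList, visitWithSystemDarkMode and expectToggleRoundTrip are hypothetical, and the removeEventListener no-op is added only in case the page ever calls it.

```javascript
// Hypothetical helper: the minimal MediaQueryList-like object the page needs.
function fakeMediaQueryList(matches) {
  return {
    matches,
    media: '(prefers-color-scheme: dark)',
    // No-ops: the stubbed preference never changes during a test.
    addEventListener: () => {},
    removeEventListener: () => {},
  }
}

// Hypothetical helper: visit the home page with a forced system preference.
function visitWithSystemDarkMode(prefersDark) {
  cy.visit('/', {
    onBeforeLoad(win) {
      cy.stub(win, 'matchMedia')
        .withArgs('(prefers-color-scheme: dark)')
        .returns(fakeMediaQueryList(prefersDark))
    },
  })
}

// Hypothetical helper: click the toggle twice and check the class after each step.
function expectToggleRoundTrip(startsDark) {
  const initial = startsDark ? 'have.class' : 'not.have.class'
  const toggled = startsDark ? 'not.have.class' : 'have.class'

  visitWithSystemDarkMode(startsDark)
  cy.get('html').should(initial, 'dark')

  cy.get('[aria-label="Toggle dark mode"]').click()
  cy.get('html').should(toggled, 'dark')

  cy.get('[aria-label="Toggle dark mode"]').click()
  cy.get('html').should(initial, 'dark')
}
```

Inside a spec file, the two tests would then reduce to something like it('toggles when the system starts in light mode', () => expectToggleRoundTrip(false)) and it('toggles when the system starts in dark mode', () => expectToggleRoundTrip(true)).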

Takeaway points

Flaky tests are a pain to deal with and can be hard to track down. However, planning out your tests can help you avoid them. Most importantly, you have to think carefully about what you're trying to test and what you're actually testing.

If I had written the test by mocking out the time of day and it had passed, it would have given me a false sense of security about the actual functionality of the website. I hope this article inspires you to put extra thought into not only what you're testing, but also how and why you're testing it.

A useless test is worse than no test at all.
