Dealing with cryptic Selenium::WebDriver::Error::InvalidSessionIdError errors

We believe in delivering high-quality software for our clients, so it will probably come as no surprise that we follow best practices such as test-driven development and maintain a strong testing culture.

On one project, we started getting a very cryptic error message when we ran the test suite:

Selenium::WebDriver::Error::InvalidSessionIdError:
           invalid session id

Frustratingly, the issue was also intermittent: sometimes the test suite would fail and sometimes it wouldn’t.

This particular project used Ruby on Rails, RSpec and Capybara. Like many of our projects, it runs in a Docker Compose stack.

The Culprit: Shared Memory (/dev/shm)

On Linux there is a shared memory space called /dev/shm, which is typically a tmpfs filesystem held in RAM. Any process can create files within /dev/shm when it wants to share memory with another process. This is often done for performance, because cooperating processes can exchange data without copying it through disk or a socket.

Web browsers such as Chrome often use this shared memory space, including when they’re being driven by Selenium WebDriver. In our case, that was exactly what was happening.

Our test suite is made up of a blend of unit tests, integration tests and feature tests. For the feature tests we use Capybara to orchestrate a headless Chrome browser using Selenium. This enables our feature tests to emulate the actions of a user as closely as possible, utilising the full application stack.

We’re running all of this within a Docker Compose stack, and the problem is that, by default, a Docker container only has 64MB available for /dev/shm.

As a result, headless Chrome needed more than 64MB of shared memory on more complex pages (often pages with more JavaScript or interactive functionality). The Chrome process crashed when it tried to use more shared memory than was available, and because the browser session died with it, the next command Selenium sent failed with the cryptic error:

Selenium::WebDriver::Error::InvalidSessionIdError:
           invalid session id

As a quick sanity check of this hypothesis, we used a volume mount to map /dev/shm on the host machine to /dev/shm in the container. We ran the test suite numerous times and didn’t get a single failure.
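
A sanity check like this can be sketched as a small compose fragment. This is only an illustration, not our exact configuration: it assumes the service that runs the tests is called `web`, and the comment lines stand in for the rest of the file.

```yaml
services:
  web:
    # ...other settings for the service stay unchanged
    volumes:
      # Temporary sanity check only: bind-mount the host's /dev/shm
      # over the container's own 64MB /dev/shm.
      - /dev/shm:/dev/shm
```

With the mount in place, running `df -h /dev/shm` inside the container (for example via `docker compose exec web df -h /dev/shm`, again assuming the `web` service name) should report the host's shared memory size rather than 64M. Because the mount shares the host's shared memory with the container, it works well for confirming the hypothesis but is less suitable as a permanent fix.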

Confident that we’d identified the issue, we proceeded to specify a more sensible `shm_size` in our compose file.

version: '3.8'
services:
  ...
  web:
    shm_size: '256mb'
    ...
  ...

The end result: no more flaky tests caused by the browser running out of shared memory!