For stable end-to-end (E2E) tests, we need an environment that is as isolated from the outside as possible.
Flaky tests are tests that fail for reasons unrelated to your code. They make it difficult to use E2E as a reliable check of the application's correctness. In extreme cases, flaky tests teach your team to ignore E2E results, which can kill the effort to automate quality assurance (QA).
The approach presented here addresses two major sources of potential issues in your test runs:
- backend state shared between different test jobs, and
- external services that you have no control over.
There are other sources of flakiness that this approach does not address:
- real issues in the application that appear at random, or
- issues in the application caused by the unnatural speed at which E2E tests interact with it.
Before my team and I ran our tests against a dedicated backend container, we used one of our non-production servers. This approach was fine at the experimentation stage of our E2E solution: with only a few tests, we could not use the test results for making decisions anyway. However, as we added more tests and execution times grew, this approach started falling apart. The main issues were as follows:
- Even though we tried to revert the data changes inside the test, some test failures were leaving behind unexpected changes.
- Parallel jobs were colliding. Different test jobs were changing the same state, which often caused one of the jobs to fail.
A solution to this problem was to dedicate a separate backend server and database for each test job. This approach would be very challenging if it weren’t for Docker. Docker containers are a perfect tool to create a contained environment with everything that is needed by an application to run:
- the right operating system (or, rather, the right Linux distribution),
- the system dependencies, such as image manipulation libraries, and
- the correct version of language interpreters (Python, Node, etc.) or database servers.
For your tests, you can prepare a dedicated database container that comes with predictable test data. This way, you can reproduce exactly the same starting point in each E2E execution, which makes your tests more stable. You can use different tags of your Docker image to version the test database. The same test database can be used in a development environment as well, because manual tests in development need example entities similar to those used by the automated tests.
If you already use Docker for deploying your backend, it is pretty easy to reuse the same image for running your E2E tests. In my team, we deploy the backend as containers, and we provide the database URL and credentials as environment variables. The very same container version can be deployed in production or used in continuous integration (CI) for running tests: each environment provides the right values for connecting to the DB.
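To illustrate how one image can serve every environment, here is a minimal sketch of a backend reading its database settings from environment variables. The backend's language and variable names are not given above, so TypeScript and the names `DATABASE_URL`, `DATABASE_USER`, and `DATABASE_PASSWORD` are assumptions made purely for illustration.

```typescript
// Hypothetical configuration shape; the real backend may need other fields.
interface DatabaseConfig {
  url: string;
  user: string;
  password: string;
}

type Env = { [key: string]: string | undefined };

// Assumed variable names, chosen for this example only.
const REQUIRED_VARIABLES = ["DATABASE_URL", "DATABASE_USER", "DATABASE_PASSWORD"];

// Read the connection settings and fail fast when one is missing, so a
// misconfigured CI job stops at startup instead of failing in the middle of a test.
function readDatabaseConfig(env: Env): DatabaseConfig {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return {
    url: env["DATABASE_URL"] as string,
    user: env["DATABASE_USER"] as string,
    password: env["DATABASE_PASSWORD"] as string,
  };
}
```

In a Node process you would call `readDatabaseConfig(process.env)`. The CI job then only has to point the URL at the test database container, while production supplies its own values.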
For the frontend, depending on your deployment strategy, you could do one of the following:
- Use the containers you build as a part of the frontend build.
- Get the compiled files and make sure they are available via HTTP for the tests.
In our case, we use the second option: we deploy the application as static files, so we just created a dedicated container to serve the built files during E2E job runs.
Job services in GitLab
We use GitLab as a platform to run our CI. Each job in GitLab is run inside a container with an image of your choice. Besides the main container, you can define services: additional containers running alongside your tests. The configuration is as easy as:
    <job-name>:
      services:
        - name: <image>
          alias: <container-url>
The available options are similar to what you have in Docker Compose, but they're more limited.
One “gotcha” in the GitLab configuration: you need to set the variable FF_NETWORK_PER_BUILD to 1 if you want services to be able to access each other during the test run.
Consider ad hoc data for in-job isolation
At some point, we were running all the tests in parallel inside one job. At that time, it was necessary to enforce even stronger isolation, because each test was using the same backend and database. To work around this issue, we changed our tests to depend primarily on random data that we inject inside the before section of the tests. This allowed tests to run unaffected by changes happening in other threads. This approach can be a bit tricky at first, but it can make sense depending on your circumstances.
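The idea of ad hoc data can be sketched as a small factory that gives every test its own uniquely named entity. The `TestCustomer` shape, the helper names, and the `example.test` domain are hypothetical; your application will have its own entities and its own way of creating them.

```typescript
// Hypothetical entity created by a test; not taken from a real project.
interface TestCustomer {
  name: string;
  email: string;
}

// Counter that keeps values unique within a single process.
let sequence = 0;

// Combine a timestamp, a per-process counter, and a random part so that
// tests running in parallel are very unlikely to produce the same suffix.
function uniqueSuffix(): string {
  sequence += 1;
  const time = Date.now().toString(36);
  const random = Math.random().toString(36).slice(2, 10);
  return time + "-" + sequence + "-" + random;
}

// Build an entity that no other test knows about.
function buildTestCustomer(label: string): TestCustomer {
  const suffix = uniqueSuffix();
  const slug = label.toLowerCase().replace(/\s+/g, "-");
  return {
    name: label + " " + suffix,
    email: slug + "-" + suffix + "@example.test",
  };
}
```

A test would call `buildTestCustomer("Customer")` in its before section, create that customer through the application or its API, and then search and assert only on that unique name rather than on shared seed data.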
Clean up after each test
Even though we start a fresh database for each test job, we still try to make our tests leave the application in the same state as they found it. Maybe it’s a bit of a leftover from the period when we were running tests in a shared environment. It’s not crucial anymore, but it can still help during test development in the following cases:
- When you run only one test—so the state the test encounters is not different from when you run all the tests in the file
- When you rerun the same test locally over and over again—so the database is not affected by the earlier runs
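One way to keep cleanup reliable is to register an undo step right when a test creates something, and to run all registered steps in the after section. The sketch below is a simplified, synchronous version of that idea. The `CleanupRegistry` name is hypothetical, and real cleanup steps would usually be asynchronous calls to the application or its API.

```typescript
type CleanupStep = () => void;

// Collects undo steps during a test and replays them afterwards.
class CleanupRegistry {
  private steps: CleanupStep[] = [];

  // Register an undo step immediately after the test creates something.
  add(step: CleanupStep): void {
    this.steps.push(step);
  }

  // Run the steps in reverse order of registration. A failing step is
  // recorded but does not stop the remaining steps, so one broken undo
  // does not leave all the other data behind.
  runAll(): Error[] {
    const errors: Error[] = [];
    while (this.steps.length > 0) {
      const step = this.steps.pop() as CleanupStep;
      try {
        step();
      } catch (error) {
        errors.push(error instanceof Error ? error : new Error(String(error)));
      }
    }
    return errors;
  }
}
```

Running the steps in reverse order means dependent entities are removed before the ones they depend on, for example an order before the customer it belongs to.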
Mock external services
There are cases when moving services to a container is not an option. For example:
- external servers that the application uses directly or via some backend proxy, or
- servers you own that cannot run inside a container because of technical issues.
In both cases, to isolate the test runs, you can mock the requests that go to those services. This will keep unpredictable external services from affecting your test results. A downside to this approach is that your tests become disconnected from the context in which your application operates. With mocks in place, your tests will not catch cases where changes in those services affect your application.
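Most E2E tools offer some hook for intercepting outgoing requests. The details differ per tool, so the sketch below only shows the part that is tool-independent: a table of canned responses and the lookup an interception hook could consult. The route, the `geo.example.test` host, and all names are hypothetical.

```typescript
// Canned response returned instead of calling the real service.
interface MockResponse {
  status: number;
  body: unknown;
}

// One mocked endpoint: HTTP method plus a pattern for the URL.
// Avoid the "g" flag on the pattern, because it makes RegExp.test stateful.
interface MockRoute {
  method: string;
  urlPattern: RegExp;
  response: MockResponse;
}

// Hypothetical external service the application depends on.
const mockRoutes: MockRoute[] = [
  {
    method: "GET",
    urlPattern: /^https:\/\/geo\.example\.test\/lookup/,
    response: { status: 200, body: { city: "Testville" } },
  },
];

// Return the canned response for a request, or undefined when the request
// is not mocked and should go through to the real destination.
function findMock(routes: MockRoute[], method: string, url: string): MockResponse | undefined {
  for (const route of routes) {
    if (route.method === method.toUpperCase() && route.urlPattern.test(url)) {
      return route.response;
    }
  }
  return undefined;
}
```

Keeping the canned responses in one table also gives you a single place to update when the real service changes its contract, which softens the downside described above.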