Complete your Go Testing Pyramid: Single-Service E2E Tests

An often overlooked level of the testing pyramid sits between full-scale e2e tests and component integration tests. These tests could be called single-service e2e tests, or they could be classified as integration tests (as they test the integration of a service's components). In my opinion this intermediate level strikes a good balance between thoroughness and speed. In Go, it consists of building the binary as close as possible to how it is built for production and executing it locally, with local or mocked external dependencies (databases, APIs, …).

I will demonstrate how to implement single-service end-to-end (e2e) tests for a Go application. The approach uses Go's built-in testing capabilities alongside some utilities from the Ginkgo ecosystem (Gomega's gexec package and GinkgoWriter) for building, running and interacting with the application binary during testing.

NB: the following is a personal, incomplete and probably incorrect definition of various types of tests.

Unit Tests

These tests are characterized by their speed, ease of implementation, low cost of execution, and a correspondingly lower level of confidence in the overall system's correctness. Unit tests focus on verifying the behavior of individual components or functions in isolation, without dependencies on other parts of the system or external resources.

Especially when applying TDD, unit tests are an integral part of the development process, providing rapid feedback on code changes. While essential for catching localized bugs and ensuring the correct functioning of individual units, their isolated nature means they offer limited insight into how these units will interact within the larger application.

Integration tests

Integration tests verify how multiple components work together. These tests typically utilize part of the application's real wiring (assuming dependency injection), while mocking or excluding the most complex and expensive integrations or components.

They are crucial for ensuring that the integration and configuration of various components are thoroughly tested. Reusing a portion of the application's actual wiring is vital to ensure that the genuine application setup is being tested, rather than just mocks.

Full-scale end-to-end (E2E) tests

Full-scale end-to-end (E2E) tests validate the entire system by simulating real user scenarios. They are slow, complex to implement, and costly, but they can provide the highest confidence: they verify integrations, detect system-wide issues, and validate user journeys, which in turn reassures stakeholders. Their slowness and complexity call for a strategic approach that focuses on critical user paths and complements lower-level tests. Over-reliance on E2E tests can slow feedback loops and increase costs.

Single-service E2E tests

Single-service end-to-end (E2E) tests offer a valuable balance within the Go testing pyramid. Compared to broader, multi-service E2E tests, they are significantly faster and cheaper to execute, while still providing a high degree of confidence in the functionality and integration of a single service's components. Although more complex to set up than unit and integration tests, they offer a more holistic view by verifying interactions between different layers within the service, such as handlers, business logic, and data access.

However, it's important to acknowledge that single-service E2E tests do not provide full system-wide confidence, as they do not validate interactions with other dependent services. Their relatively contained scope allows them to be run efficiently on developers' machines, facilitating quicker feedback loops during development. If these tests can be made sufficiently fast, they can even be effectively integrated into a Test-Driven Development (TDD) workflow, guiding the design and implementation of the service. While requiring more initial effort than unit tests, the enhanced confidence and integration validation offered by single-service E2E tests make them a crucial layer in a comprehensive testing strategy.

The test subject

The test subject is a simple web service that, upon starting, prints Starting server on :8080 on stderr and exposes a single endpoint, /health, that returns a response with status code 200.
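
The source of the service is not shown in this article, so the following is only a minimal sketch of what such a main.go could look like. The function names healthHandler and main match the coverage report shown later; newServer, the graceful-shutdown logic and the 5-second shutdown timeout are assumptions made for illustration. Handling the interrupt signal is included because the shutdown test and the coverage collection both rely on the process exiting by itself.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"time"
)

// healthHandler answers every request with 200 OK.
func healthHandler(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// newServer wires the single /health route into an http.Server.
// (Hypothetical helper, introduced only to keep main short.)
func newServer(addr string) *http.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", healthHandler)
	return &http.Server{Addr: addr, Handler: mux}
}

func main() {
	// ctx is cancelled on the first interrupt signal (Ctrl+C, SIGINT).
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	srv := newServer(":8080")

	// The log package writes to stderr by default.
	log.Println("Starting server on :8080")

	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		// The listener failed before any interrupt (e.g. port already in use).
		log.Fatalf("server error: %v", err)
	case <-ctx.Done():
	}

	// Give in-flight requests a bounded amount of time to complete,
	// then return normally from main.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("shutdown: %v", err)
	}
}
```

Running ListenAndServe in a goroutine and calling Shutdown from main keeps the process alive until the shutdown has completed, so main returns normally after an interrupt.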

The test plan

These are the cases that we want to test on our subject:

  • The service can be built
  • The service starts
  • The response for a GET request to /health has status code 200
  • The service shuts down when receiving an interrupt signal

The service can be built

We are going to use gexec.Build, from Gomega's gexec package, to build a binary of the project. pathToBin will be the path of the built executable.

pathToBin, err := gexec.Build("github.com/carlo-colombo/test-pyramid")
if err != nil {
  t.Fatalf("build failure: %s", err)
}

Using the gexec package solves a bunch of problems, such as locating the go binary, managing a temporary path where the compiled artefact can live, and later cleaning everything up.

The service starts

We again leverage gexec to simplify running the service, directing stdout and stderr to GinkgoWriter. This io.Writer implementation captures all command output, displaying it during verbose testing (-test.v) or upon test failure, prefixed with [out] or [err] for easier debugging.

Subsequently, a deferred function ensures the termination of all processes and the removal of generated artefacts. This cleanup mechanism is crucial for preventing the accumulation of unnecessary files and processes on the test execution environment, and to avoid misleading errors (e.g. a mismatch between the test and the version running).

cmd := exec.Command(pathToBin)

session, err := gexec.Start(cmd,
  gexec.NewPrefixedWriter("[out] ", core.GinkgoWriter),
  gexec.NewPrefixedWriter("[err] ", core.GinkgoWriter))

// Kill every process started through gexec and delete the built binaries,
// whatever the outcome of the test.
defer func() {
  gexec.Kill()
  gexec.CleanupBuildArtifacts()
}()

if err != nil {
  t.Fatalf("service failed to run: %s", err)
}

The response for a GET request to /health has status code 200

We check that the service is up and running by sending requests until the server answers or the timeout expires. The assertion passes only if a request succeeded and returned a 200.

assert.EventuallyWithT(t, func(c *assert.CollectT) {
	resp, err := http.Get("http://localhost:8080/health")

	require.NoError(c, err, "cannot perform request")
	defer resp.Body.Close() // release the connection on every attempt
	require.Equal(c, 200, resp.StatusCode, "expected 200 status code")
}, 2*time.Second, 10*time.Millisecond, "service did not answer in time")

The service stops

Now we need to check that the service stops cleanly. session.Interrupt sends an interrupt signal to the process, and session.Exited is a channel that is closed when the process exits.

session.Interrupt()
assert.Eventually(t, func() bool {
	select {
	case <-session.Exited:
		return true
	default:
		return false
	}
}, 2*time.Second, 10*time.Millisecond,
	"process did not exit in time after interrupt")

And do not forget your coverage

Instrumenting the binary under test and measuring code coverage are vital practices that complement single-service E2E tests. Code coverage helps identify untested areas, revealing gaps in your test suite and ensuring that critical parts of your application are exercised. This insight is crucial for maintaining high code quality and reducing the risk of regressions, ultimately boosting confidence in your service's robustness.

Starting with Go 1.20, it is possible to collect coverage profiles for larger integration tests: more heavy-weight, complex tests that perform multiple runs of a given application binary.

Instrumenting the binary under test is straightforward: simply pass -cover to gexec.Build, which accepts additional build arguments after the package path. When the executable runs, it expects a GOCOVERDIR environment variable pointing to the directory where coverage data files will be written when the binary exits; a warning is printed if it's not set.

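Note that the coverage data is only written when the executable exits cleanly, that is, when it calls os.Exit() or returns normally from main.main. A process that is killed by a signal it does not handle leaves no coverage data behind, so the service needs to handle the interrupt signal and shut down by itself.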

pathToBin, err := gexec.Build("github.com/carlo-colombo/test-pyramid", "-cover")

When the binary is executed with GOCOVERDIR set to an existing and writable directory, Go generates coverage profiles in a binary format. These profiles can then be used to generate a report using Go's covdata tool.

❯ mkdir -p .coverdir
❯ GOCOVERDIR=.coverdir go test .
❯ go tool covdata func -i=.coverdir

github.com/carlo-colombo/test-pyramid/main.go:18:       healthHandler   100.0%
github.com/carlo-colombo/test-pyramid/main.go:24:       main            89.5%
total                                                   (statements)    90.9%

To collect a coverage profile for the unit tests as well, execute go test ./... -test.gocoverdir="$PWD/.coverdir". Note: due to the behavior of Go's testing tools, it is not possible to generate coverage for both binaries and unit tests simultaneously. Generating an HTML report is a two-step process: first, create a cover profile in text format using go tool covdata textfmt -i=.coverdir -o=c.out, and then generate the HTML report with go tool cover -html=c.out.

Putting it all together

We can consolidate everything into a single file, employing utility functions to minimise redundancy and enhance readability. The test is organised using Go's subtesting feature to clearly delineate its individual steps.

package main_test

import (
	"net/http"
	"os"
	"os/exec"
	"testing"
	"time"

	"github.com/onsi/ginkgo/v2/dsl/core"
	"github.com/onsi/gomega/gexec"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func Test_myservice(t *testing.T) {
	var session, cleanup = buildAndRun(t)
	defer cleanup()

	t.Run("answers to the health endpoint", func(t *testing.T) {
		assert.EventuallyWithT(t, func(c *assert.CollectT) {
			resp, err := http.Get("http://localhost:8080/health")

			require.NoError(c, err, "cannot perform request")
			defer resp.Body.Close() // release the connection on every attempt
			require.Equal(c, 200, resp.StatusCode,
				"expected 200 status code")
		}, 2*time.Second, 10*time.Millisecond,
			"service did not answer in time")
	})

	t.Run("shuts down with interrupt signal", func(t *testing.T) {
		session.Interrupt()
		assert.Eventually(t, func() bool {
			select {
			case <-session.Exited:
				return true
			default:
				return false
			}
		}, 2*time.Second, 10*time.Millisecond,
			"process did not exit in time after interrupt")
	})
}

// buildAndRun builds and runs the test subject.
// It returns a session that can be used to interact with the service (stdin) or
// to check its output, and a cleanup function that should be deferred by the
// calling test.
func buildAndRun(t *testing.T) (*gexec.Session, func()) {
	t.Helper()

	//change the following to your root module
	pathToBin, err := gexec.Build("github.com/carlo-colombo/test-pyramid", "-cover")

	require.NoError(t, err, "failed to build the service")

	cmd := exec.Command(pathToBin)

	//passing additional environment variables when running the binary
	cmd.Env = append(cmd.Env, os.Environ()...)

	session, err := gexec.Start(cmd,
		gexec.NewPrefixedWriter("[out] ", core.GinkgoWriter),
		gexec.NewPrefixedWriter("[err] ", core.GinkgoWriter))

	require.NoError(t, err, "failed to start service")

	return session, func() {
		gexec.Kill()
		gexec.CleanupBuildArtifacts()
	}
}


Conclusions

Confidence

Adding e2e tests to your service improves confidence that your binary builds correctly, that it runs, and that its components integrate correctly. Running them locally speeds up the feedback loop, allowing you to catch issues earlier than with full-scale e2e tests.

The missing parts (draw the rest of the owl)

The test expects the service to start on port 8080. A more robust approach would be to start on a random port, or to identify an available port and have the service listen on it.

The example is minimal and the service and endpoint do not reach a database or an external API, but real-world applications require them. In a more complex application, stub API services and a database should be provided for the e2e test to work. Starting end-to-end testing early allows for iterative development, where dependencies are added incrementally, preventing the complexity of integrating everything at once.

Variations in how the binary is built by end-to-end (E2E) tests (e.g., via gexec.Build) compared to the CI/CD pipeline can lead to subtle bugs. This also necessitates keeping the build commands synchronized, for instance, by adding flags or environment variables to both. A more robust solution, especially when the binary's build process requires additional configuration, involves reusing existing scripts or tasks (such as Makefiles, Taskfiles, or other scripts) that are already employed to create the final production artifact.

Additional resources