Automate testing of poor network conditions with Shopify's Toxiproxy in Go

Posted on Sunday, 5th December 2021

What is Toxiproxy?

Toxiproxy is a TCP proxy, built by Shopify and written in Go, designed to simulate a wide range of network conditions.

Built specifically to work in testing, continuous integration and development environments, Toxiproxy’s strength comes from the fact that it provides a dynamic API that allows you to easily add, remove and configure the type of network conditions you want to test against on the fly.

Toxiproxy comes in two parts:

  1. A TCP proxy written in Go. This acts as a server that proxies TCP traffic between the host machine and the upstream service you wish to simulate network conditions against.
  2. A client used for communicating with the Toxiproxy server over HTTP. There is a wide range of Toxiproxy clients available, from .NET, Ruby and PHP to Rust, Elixir and Haskell.
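
To make the client/server split concrete, here is a minimal sketch of the kind of HTTP request a client sends when it asks the server to create a proxy. It assumes the server’s API is reachable at http://localhost:8474 and that proxies are created with a POST to /proxies carrying name, listen and upstream fields, as described in the Toxiproxy README. The proxySpec type, the helper names and the example addresses are hypothetical, and the request is only built, never sent.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// proxySpec is a hypothetical type mirroring the proxy fields
// (name, listen, upstream) that the Toxiproxy HTTP API is assumed to accept.
type proxySpec struct {
	Name     string `json:"name"`
	Listen   string `json:"listen"`
	Upstream string `json:"upstream"`
}

// proxyJSON encodes a proxy definition as a JSON request body.
func proxyJSON(p proxySpec) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// newCreateProxyRequest builds, but does not send, a POST request to the
// /proxies endpoint of the Toxiproxy API at apiURL.
func newCreateProxyRequest(apiURL string, p proxySpec) (*http.Request, error) {
	body, err := proxyJSON(p)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, apiURL+"/proxies", strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Example values only: listen on :5050 and forward to a service on :8080.
	spec := proxySpec{Name: "upstream_api", Listen: "localhost:5050", Upstream: "localhost:8080"}

	req, err := newCreateProxyRequest("http://localhost:8474", spec)
	if err != nil {
		panic(err)
	}
	body, err := proxyJSON(spec)
	if err != nil {
		panic(err)
	}

	// Print what would be sent to the server.
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(body)
}
```

In practice you would rarely build these requests by hand, since the client libraries wrap them for you.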

Let’s get started

Starting the Toxiproxy server

Before we can begin we need to download and run the Toxiproxy server. You can either run the server in a Docker container (really helpful for automated tests) or directly via an executable.

Head over to the Toxiproxy installation section to find your preferred approach.

Once running you should see the following output in your terminal:

$ INFO[0000] API HTTP server starting                      host=localhost port=8474 version=git

With the Toxiproxy server now running, let’s take a look at writing our first test.

Writing our first tests using Toxiproxy

In this example I’m going to use the TestMain(m *testing.M) function to set up and tear down Toxiproxy.

Looking at the setup code below you’ll see I first connect to the Toxiproxy server (in our case running on port 8474 as per the above terminal output), then add my first proxy. A proxy is a service you wish to control network conditions for such as PostgreSQL, a Redis cache or an HTTP API.

When creating the proxy you’ll notice we tell Toxiproxy to listen on localhost:5050. Toxiproxy then forwards any network traffic it receives on port :5050 to whatever is listening on port :8080 (in this case a simple web service).

package main

import (
    "net/http" // used by the test method further down
    "os"
    "testing"
    "time" // used by the test method further down

    toxiproxy "github.com/Shopify/toxiproxy/v2/client"
)

var client *toxiproxy.Client

func TestMain(m *testing.M) {
    upstreamService := "localhost:8080"
    listen := "localhost:5050"

    // Connect to the Toxiproxy server and create a proxy which we'll configure in individual test methods.
    client = toxiproxy.NewClient("localhost:8474")
    proxy, err := client.CreateProxy("upstream_api", listen, upstreamService)
    if err != nil {
        panic(err)
    }

    // Run all tests, then clean up the proxy.
    // os.Exit does not run deferred calls, so the proxy is deleted explicitly rather than with defer.
    code := m.Run()
    proxy.Delete()

    os.Exit(code)
}

Next we’ll add our test method to simulate latency on the upstream request (Toxiproxy enables you to introduce latency on either the upstream or downstream network traffic). We do this by adding what Toxiproxy calls a “Toxic”.

What’s a Toxic?

A Toxic is a network condition such as latency, bandwidth limitation, connection reset etc. There are a number of toxics available out of the box but you can even create your own if desired.

// Note: this test needs "net/http" and "time" in the file's import block.
func TestSlowConnection(t *testing.T) {

    // Arrange

    // Use the client to get our proxy configured on test setup
    proxy, err := client.Proxy("upstream_api")
    if err != nil {
        t.Fatalf("fetching proxy 'upstream_api': %v", err)
    }

    // Add a 'toxic' to introduce 3 seconds of latency into the upstream request
    _, err = proxy.AddToxic("latency", "latency", "upstream", 1.0, toxiproxy.Attributes{
        "latency": 3000,
    })
    if err != nil {
        t.Fatalf("adding latency toxic: %v", err)
    }

    // Remove the toxic after use for test isolation
    defer proxy.RemoveToxic("latency")

    // Act
    start := time.Now()

    // Make our request to our dependent service via the proxy
    resp, err := http.Get("http://" + proxy.Listen)
    if err != nil {
        t.Fatalf("requesting via proxy: %v", err)
    }
    defer resp.Body.Close()
    elapsed := time.Since(start)

    // Assert
    if elapsed < 3000*time.Millisecond {
        t.Fatalf("request completed sooner than expected")
    }

    if resp.StatusCode != http.StatusOK {
        t.Fatalf("wanted 200, got %v", resp.StatusCode)
    }
}

As you’ll see from the test code above, we first get the proxy (named upstream_api) we wish to introduce network issues into, then add the latency toxic, which introduces 3000 milliseconds of latency. We defer the removal of the toxic to ensure it doesn’t affect subsequent tests using that proxy, then finish up by asserting that our request took the expected length of time to respond.

Wrapping up

Hopefully the above example has provided you with a good introduction to Toxiproxy and how you can use it to create automated tests that verify expected behaviour of services under various network conditions.