Type-Safe Testing in Backends-for-Frontends

Integration, Unit, and E2E testing aren’t mutually exclusive, but they sure are a pain to implement in a BFF system. Could WunderGraph’s type-safe testing and mocking library make this easier?

With the Backends-for-frontends (BFF) pattern, you build servers that are an intermediary aggregation layer, orchestrating multiple API calls to provide one specific API, tailored to the needs of a specific client.

Why build BFFs? Well, if you’re building even moderately sized apps, you’re going to have to deal with a whole bunch of underlying domain services — whether they’re internal microservices, CRM/content APIs, third-party APIs that are business critical, or just databases — and you’ll need to bring them together somehow. That’s a lot of API calls.

10 times out of 10, I’m going to aggregate those calls on a BFF server layer, which talks to my services over a ~10 Gbps interconnect with much lower latency, rather than on my client, which gets anywhere from ~10–100 Mbps (half that on mobile) over bigger, slower, wildly unpredictable hops.

But because BFFs can aggregate and map downstream data however you like, from various sources, each with its own architecture (and idiosyncrasies), testing becomes even more critical.

Why? And how should we approach testing for BFFs, then? Let’s talk about it. I’ve been using WunderGraph — an open-source BFF framework — to build BFFs for data-heavy apps for a while now, and its integrated testing library that lets me write framework-agnostic integration tests (mocking data when needed) has been a godsend.

But first, let’s answer the one burning question most developers have:

What is it about BFFs that makes testing so important?

TL;DR: with the Backends-for-frontends pattern you have multiple points of failure.

For any other app, if you’re using TypeScript and performing runtime validation with Zod, sure, maybe you can get away with no testing. But what happens when you need to orchestrate multiple downstream calls on the BFF server, across dozens of data sources?

Because of the inherently inconsistent nature of so many disparate underlying services, building and maintaining your BFF around them becomes a pain. You need to know your BFF does what you’ve coded it to do, and so, testing becomes crucial. You need to:

Make sure the BFF server works correctly and can handle requests from its client(s), and that those clients can successfully reach the corresponding backend services through it, including any necessary data transformations, API calls, or business logic.
Make sure each of the downstream services (APIs, data sources, etc.) is working and returning data according to the agreed-upon API specification.
Make sure the BFF aggregates and processes the data returned from downstream services correctly, and returns data in the format the client needs. In other words, verify the contract.
Simulate failure scenarios to evaluate the error handling and resilience mechanisms of the BFF, making sure it gracefully handles failures, retries, timeouts, or degraded service.
Simulate exceeded limits and resource constraints to evaluate the performance and scalability of the BFF, making sure the system can handle these edge cases.

Unit tests help, sure, but integration tests are the name of the game when it comes to BFFs because they are specialized for evaluating component interaction — exactly what you want with the Backends-for-frontends pattern.

WunderGraph — A Primer

Using WunderGraph, you define a number of heterogeneous data dependencies — internal microservices, databases, as well as third-party APIs — as config-as-code, and it will introspect each, aggregating and abstracting them into a namespaced virtual graph.

https://github.com/wundergraph/wundergraph/

First of all, you’ll have to name your APIs and services as data dependencies, much like you would add project dependencies to a package.json file.

💡 You could just use the actual URL, but defining data sources as ENV variables here allows you to replace them with a mock server URL when testing. We’ll get to that later.
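
To see why that environment variable indirection matters, here is a minimal, library-free sketch. The resolveDataSourceUrl helper and both URLs are hypothetical placeholders, not WunderGraph APIs. The only point is that the same configuration can resolve to the real service or to a mock server, depending on the environment.

```typescript
// Hypothetical helper (not part of WunderGraph): resolve a data source URL
// from an environment map, falling back to a default when the variable is unset.
type Env = Record<string, string | undefined>;

function resolveDataSourceUrl(env: Env, name: string, fallback: string): string {
  const value = env[name];
  return value !== undefined && value !== "" ? value : fallback;
}

// Placeholder URLs, assumed for illustration only.
const DEFAULT_COUNTRIES_URL = "https://countries.example.com/graphql";
const MOCK_SERVER_URL = "http://localhost:4000";

// Normal run: COUNTRIES_URL is unset, so the real URL is used.
const productionUrl = resolveDataSourceUrl({}, "COUNTRIES_URL", DEFAULT_COUNTRIES_URL);

// Test run: COUNTRIES_URL points at a mock server instead.
const testUrl = resolveDataSourceUrl(
  { COUNTRIES_URL: MOCK_SERVER_URL },
  "COUNTRIES_URL",
  DEFAULT_COUNTRIES_URL
);
```

Nothing in the BFF code has to change between the two runs; only the environment does.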

You can then write GraphQL operations (queries, mutations, subscriptions) or async resolver functions in TypeScript — or a combination of the two — to aggregate your downstream data in whatever fashion you want, process them, and compose them into the final response for the client your BFF is serving.

To get a country’s capital:

./.wundergraph/operations/CountryByCode.graphql

To get the weather by city:

./.wundergraph/operations/WeatherByCity.graphql

Combine the two by writing an async function in TypeScript, to get the weather of a given country’s capital.

./.wundergraph/operations/WeatherByCountryCapital.ts
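
As a rough illustration of that aggregation logic, here is a library-free sketch. The Downstream interface, the stubbed data, and the function signature are assumptions made for illustration; a real WunderGraph TypeScript operation would call the two GraphQL operations rather than injected functions.

```typescript
// Library-free sketch: the two downstream calls are injected as functions,
// so the aggregation logic can be exercised without any network access.
interface Weather {
  summary: string;
  temperatureCelsius: number;
}

interface Downstream {
  capitalByCountryCode(code: string): Promise<string | undefined>;
  weatherByCity(city: string): Promise<Weather | undefined>;
}

interface CapitalWeather {
  country: string;
  capital: string;
  weather: Weather;
}

async function weatherByCountryCapital(
  code: string,
  downstream: Downstream
): Promise<CapitalWeather> {
  // Step 1: look up the capital for the given country code.
  const capital = await downstream.capitalByCountryCode(code);
  if (!capital) {
    throw new Error(`No capital found for country code "${code}"`);
  }
  // Step 2: use the capital as the input of the weather lookup.
  const weather = await downstream.weatherByCity(capital);
  if (!weather) {
    throw new Error(`No weather data found for "${capital}"`);
  }
  // Step 3: compose the response shape the client needs.
  return { country: code, capital, weather };
}

// Stubbed downstream services returning fixed, made-up data.
const stubDownstream: Downstream = {
  capitalByCountryCode: async (code) => (code === "DE" ? "Berlin" : undefined),
  weatherByCity: async (city) =>
    city === "Berlin" ? { summary: "Clear", temperatureCelsius: 21 } : undefined,
};
```

Each step is a separate point of failure, which is why the sketch fails loudly instead of passing an empty city name on to the second call.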

Now you can just call this resolver client-side to get the data you want, using a fully type-safe client that WunderGraph generates for you if you’re using a React-based framework, or a data fetching library like SWR or TanStack Query.

If you aren’t, WunderGraph always mounts each operation as its own endpoint by default, serving data as JSON over RPC, so regardless of framework, you could just use your library of choice to make a regular HTTP GET request to the WunderGraph BFF server:

http://localhost:9991/operations/WeatherByCountryCapital?code=INSERT_COUNTRY_ISO_CODE_HERE

…to get the data you want, in JSON.
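
If you are building that request by hand, a small helper keeps the query string properly encoded. This is a sketch under the assumption that inputs are passed as plain query parameters, as in the URL above; operationUrl is a hypothetical helper, not something WunderGraph ships.

```typescript
// Builds the GET URL for an operation endpoint, following the
// <base>/operations/<OperationName>?<input> shape shown above.
type OperationInput = Record<string, string | number | boolean>;

function operationUrl(
  baseUrl: string,
  operationName: string,
  input: OperationInput = {}
): string {
  const query = Object.entries(input)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");
  // Drop trailing slashes so the path never contains "//operations".
  const path = `${baseUrl.replace(/\/+$/, "")}/operations/${operationName}`;
  return query ? `${path}?${query}` : path;
}

const weatherUrl = operationUrl("http://localhost:9991", "WeatherByCountryCapital", {
  code: "DE",
});
// weatherUrl is "http://localhost:9991/operations/WeatherByCountryCapital?code=DE"
```

From there, any HTTP client will do, for example fetch(weatherUrl).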

All good, but as said before, the more data dependencies you have, the more important testing becomes. Whenever you’re interacting with multiple APIs and using disparate data to craft a final client response, you’re going to have multiple potential points of failure in your app. Which brings us to testing.

WunderGraph’s Type-safe Testing Library

WunderGraph’s testing library makes writing tests for all of your data sources (whether they’re GraphQL, REST, databases, Apollo Federations, and more) in a single test suite dead simple — setting up a testing server for you, with full typesafe access to your data. It comes with Jest out of the box, but you could use it with any testing framework at all.

Let’s make sure each of our downstream calls works right, first. We can test integration points after.

createTestServer() returns a WunderGraphTestServer object that wraps the test server and the type-safe client WunderGraph auto-generated for you, so you’ll still have autocomplete when writing Jest assertions for your data structure.
You call the WunderGraph-generated client by calling testServer.client() within a test, and choosing the query to run (including its inputs, if any).
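
To make the "type-safe client" idea concrete, here is a minimal sketch of a typed facade over an untyped transport. The Operations map, createClient, and stubTransport are all hypothetical stand-ins; WunderGraph generates its own client and types, and this only illustrates why operationName and input get autocomplete and compile-time checks.

```typescript
// Hypothetical operation map: each operation name maps to its input and data types.
interface Operations {
  CountryByCode: { input: { code: string }; data: { capital: string | null } };
  WeatherByCity: { input: { city: string }; data: { temperatureCelsius: number } };
}

// The untyped layer, standing in for the HTTP call to the test server.
type Transport = (operationName: string, input: unknown) => Promise<unknown>;

function createClient(transport: Transport) {
  return {
    // operationName is limited to known operations; input and data types follow from it.
    async query<K extends keyof Operations>(options: {
      operationName: K;
      input: Operations[K]["input"];
    }): Promise<{ data: Operations[K]["data"] }> {
      const data = (await transport(options.operationName, options.input)) as Operations[K]["data"];
      return { data };
    },
  };
}

// Stub transport with made-up data, so the sketch runs without a server.
const stubTransport: Transport = async (operationName, input) => {
  if (operationName === "CountryByCode") {
    const { code } = input as { code: string };
    return { capital: code === "DE" ? "Berlin" : null };
  }
  throw new Error(`Unknown operation: ${operationName}`);
};
```

With this shape, a typo in operationName or a missing input field is a compile error, long before any assertion runs.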

Next, let’s test our main integration point: the BFF. This API’s response, and its ability to aggregate data, is critical for the client, and needs to be tested.

These are mostly happy-path tests, but now that you know the framework is there, and how easy setting up and tearing down test servers is, you could apply these principles to implement fuzzing, limit testing, whatever you want.

Any time any of your data sources change, WunderGraph will regenerate the client (with wunderctl generate, which runs every time you start up the BFF server and your frontend), and your assertions will fail on the next npm test, letting you know immediately.

All of this is great, but you don’t want to actually call the individual services/APIs every single time you iterate in development, right? Or, another scenario: what if those services are things you know will exist come production time, but ones the backend team hasn’t finished building yet?

This makes for the perfect segue into…

Mocking

While writing tests, you’ll often need to fake, or simulate data. The purpose of this “mocking” is to control the behavior of dependencies/external functions, making it easier to isolate and verify the correctness of the actual code you’re testing — without messing with your actual datasources. Otherwise, it’s way too easy to write tests that accidentally manipulate data, end up running 10–20x slower, and still pass (because they’re technically correct).

WunderGraph’s testing library provides a createTestAndMockServer() function that works much the same way as the createTestServer() we used before, wrapping a test server and the auto-generated type-safe client. It also lets you intercept calls to HTTP data sources (the ones you defined as environment variables in wundergraph.config.ts; see why that was needed?) and mock their responses.

Including our two data dependencies — COUNTRIES_URL and WEATHER_URL — in the mockURLEnvs array tells WunderGraph’s test server, “Capture all requests made to these two URLs within each test, and mock their responses instead.”

Then you can set up that mock with mock(), watching for a matching HTTP request and writing a handler function to return the data you want. The argument to the mock() function is an object with the following properties:

times — The number of times the mock should be called. Defaults to 1.
persist — If true, the mock will not be removed after any number of calls, and you’ll have to do it manually with testServer.mockServer.reset() after tests are done. Defaults to false.
match — A function that returns true if the HTTP request for this test is a match
handler — A function that mocks data when it is a match, and either returns the response or throws an error.
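
The semantics of those four properties can be sketched with a tiny in-memory registry. This is a simplified model, not WunderGraph's implementation: MockRegistry and its types are hypothetical, the handler is synchronous for brevity, and times is treated here as "how many matching requests the mock answers".

```typescript
interface MockRequest {
  method: string;
  url: string;
}

interface MockResponse {
  status?: number;
  body: unknown;
}

interface MockOptions {
  times?: number; // matching requests this mock answers (default 1)
  persist?: boolean; // keep answering until reset() is called (default false)
  match: (request: MockRequest) => boolean;
  handler: (request: MockRequest) => MockResponse;
}

class MockRegistry {
  private mocks: { options: MockOptions; remaining: number }[] = [];

  mock(options: MockOptions): void {
    this.mocks.push({ options, remaining: options.times ?? 1 });
  }

  // Returns the mocked response, or undefined when no active mock matches.
  handle(request: MockRequest): MockResponse | undefined {
    const entry = this.mocks.find((m) => m.options.match(request));
    if (!entry) return undefined;
    const response = entry.options.handler(request);
    if (!entry.options.persist) {
      entry.remaining -= 1;
      if (entry.remaining <= 0) {
        // The mock has been used up, so remove it.
        this.mocks = this.mocks.filter((m) => m !== entry);
      }
    }
    return response;
  }

  reset(): void {
    this.mocks = [];
  }
}

const registry = new MockRegistry();
registry.mock({
  match: (request) => request.method === "POST" && request.url === "/",
  handler: () => ({ body: { capital: "Berlin" } }), // made-up response data
});

const firstResponse = registry.handle({ method: "POST", url: "/" }); // mocked response
const secondResponse = registry.handle({ method: "POST", url: "/" }); // undefined: mock used up
```

A persistent mock would keep answering until reset() is called, which is why the cleanup step matters when persist is true.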

With the mock set up and HTTP requests being intercepted, you can simply call the real mounted endpoint for this operation using the WunderGraph-generated client again and get back the mocked response, one that never makes the actual request to the data source at all.

Here’s the full test.

WunderGraph’s generated typesafe client being accessible here (by calling .client().query()) means that you’re not limited to just GraphQL operations for this. You can use TypeScript operations for your tests and mocks too: just pass in the namespaced operationName and you’re golden.

E2E Testing

Finally, for End-to-End (E2E) testing, WunderGraph does not provide a dedicated library, but it is fully compatible with Playwright. Just make sure playwright.config.ts starts WunderGraph’s BFF server and the frontend first (WunderGraph’s default npm run start script does this for you).
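
A config along these lines would do it. Treat this as a sketch rather than the canonical setup: the test directory, the frontend address on port 3000, and the decision to skip starting servers when WG_NODE_URL is already set are all assumptions. Only defineConfig and the webServer options come from Playwright itself.

```typescript
// playwright.config.ts (sketch; paths, ports, and the skip condition are assumptions)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e", // hypothetical location of the E2E tests
  // Assumption: if WG_NODE_URL is already set, the servers are managed
  // externally, so Playwright should not start them itself.
  webServer: process.env.WG_NODE_URL
    ? undefined
    : {
        command: "npm run start", // starts the BFF server and the frontend
        url: "http://localhost:3000", // assumed frontend address to wait for
        reuseExistingServer: true,
      },
  use: {
    baseURL: "http://localhost:3000", // assumed frontend address
  },
});
```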

Where WG_NODE_URL is a default WunderGraph Environment Variable pointing to the base URL for the WunderGraph server (http://localhost:9991 by default).

Where to go from here?

Hopefully, now you have a better idea of why testing is so critical for building better, more maintainable Backends-for-frontends.

Using WunderGraph’s built-in testing library makes it easier to write better tests for pretty much any integration point you want. It also lets you mock responses, so you can test your app and BFF implementation during development without ever calling the actual data sources, using up your quota, or blowing past rate limits and getting throttled before your app is even in production.

WunderGraph can do far more than just testing, though. It’s a full-featured BFF framework. Head on over to their docs or their Discord to learn more.
