Benchmark: Apollo Federation Gateway v1 vs v2 vs WunderGraph vs mercurius-js

Jens Neuse

Apollo just released their newest version of the Apollo Federation Gateway (v2), so I was curious how it performs against v1 and the WunderGraph implementation.

Both Apollo Gateway v1 and v2 are implemented using NodeJS, while the WunderGraph Gateway is written in Go.

Until recently, I believed WunderGraph was the only Federation implementation aside from Apollo. It was then brought to my attention that mercurius-js also implements the Federation specification. It's built on top of Fastify (NodeJS). Thanks to @matteocollina for pointing this out! If you know of any other implementation, please let me know!

TLDR

WunderGraph achieves up to 271x more requests per second than Apollo Gateway v1 and 132x more than v2; its 99th percentile latency is 292x lower than v1's and 54x lower than v2's. Apollo Gateway v2 achieves 2x more rps than v1, but its 99th percentile latency is 5.6x higher than v1's. While Apollo Gateway v1 had problems with timeout errors, v2 solved this problem.

Apollo Federation with Subscriptions

In contrast to Apollo Gateway, WunderGraph supports Subscriptions. This is possible because Go has green threads (goroutines), which let a service scale easily across all cores of a machine. Each Subscription can run in its own goroutine, whose stack starts at only a few kilobytes, so this approach scales efficiently.

That said, getting the architecture right for federated subscriptions is a complex problem. Most if not all GraphQL server implementations "interpret" GraphQL Operations at runtime. This means they parse the Operation into an AST at runtime and walk this AST to resolve the Operation.

WunderGraph takes a different approach. We've built a Query Compiler that divides resolving a GraphQL Operation into multiple stages. On a high level, we differentiate between the planning phase and the execution phase. During planning, we evaluate the AST and build an optimized execution Plan, hence the name "Query Compiler". This execution Plan can be cached, which makes the approach very efficient. But efficiency is not everything. More importantly, this approach allows us to solve complex problems like resolving federated GraphQL Operations with a multi-step compiler, combined with an optimized execution engine.
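The split between planning and execution can be sketched roughly as follows. This is a toy model and not the graphql-go-tools API: `Plan`, `PlanCache`, and the whitespace-splitting "planner" are assumptions made up for illustration. The point is only that planning runs once per distinct Operation, while execution reuses the cached Plan.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Plan is a stand-in for a prepared execution plan: here it is just the
// ordered list of fields to fetch.
type Plan struct {
	Fields []string
}

// PlanCache runs the planning stage once per distinct operation and reuses
// the result afterwards.
type PlanCache struct {
	mu      sync.Mutex
	plans   map[string]*Plan
	planned int // how many times the planner actually ran
}

func NewPlanCache() *PlanCache {
	return &PlanCache{plans: make(map[string]*Plan)}
}

// plan is a toy planner: it treats the operation as whitespace-separated
// field names. A real planner would parse and analyze a GraphQL AST.
func plan(operation string) *Plan {
	return &Plan{Fields: strings.Fields(operation)}
}

// Get returns the cached Plan for an operation, planning it on first use.
func (c *PlanCache) Get(operation string) *Plan {
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.plans[operation]; ok {
		return p
	}
	p := plan(operation)
	c.plans[operation] = p
	c.planned++
	return p
}

// Execute is the execution stage: it only walks the prepared Plan and never
// touches the operation text.
func Execute(p *Plan, data map[string]string) []string {
	out := make([]string, 0, len(p.Fields))
	for _, f := range p.Fields {
		out = append(out, f+"="+data[f])
	}
	return out
}

func main() {
	cache := NewPlanCache()
	data := map[string]string{"name": "Table", "price": "899"}
	for i := 0; i < 3; i++ {
		fmt.Println(Execute(cache.Get("name price"), data))
	}
	fmt.Println("planner runs:", cache.planned)
}
```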

By the way, this Query Compiler and Execution engine is open source under an MIT license. It's used by more and more companies in production. We're very proud that the developers of Khan Academy recently joined the ranks of maintainers.

One last word on open source: graphql-go-tools, the library we're building WunderGraph upon, has some amazing contributors. Amongst them is Vasyl Domanchuk from Ukraine, who contributed the DataLoader implementation that plays an important role in making the engine so fast. This implementation solves the N+1 problem when resolving nested federated GraphQL Operations.

Thank you Vasyl, your work is highly appreciated!
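For readers unfamiliar with the N+1 problem, the batching idea behind a DataLoader can be sketched like this. It is a simplified illustration, not the implementation from graphql-go-tools; `Loader`, `BatchFunc`, and the author lookup are hypothetical. Instead of one upstream request per nested entity, keys are collected, deduplicated, and resolved in a single batched request.

```go
package main

import "fmt"

// BatchFunc fetches many keys in a single upstream request.
type BatchFunc func(keys []string) map[string]string

// Loader collects keys first and resolves them with one batched call.
type Loader struct {
	fetch   BatchFunc
	pending []string
	seen    map[string]bool
}

func NewLoader(fetch BatchFunc) *Loader {
	return &Loader{fetch: fetch, seen: make(map[string]bool)}
}

// Load queues a key; duplicate keys are queued only once.
func (l *Loader) Load(key string) {
	if !l.seen[key] {
		l.seen[key] = true
		l.pending = append(l.pending, key)
	}
}

// Dispatch sends all queued keys in one batch and resets the queue.
func (l *Loader) Dispatch() map[string]string {
	keys := l.pending
	l.pending = nil
	l.seen = make(map[string]bool)
	if len(keys) == 0 {
		return map[string]string{}
	}
	return l.fetch(keys)
}

func main() {
	calls := 0
	// Hypothetical upstream that resolves author IDs to names.
	fetchAuthors := func(ids []string) map[string]string {
		calls++
		out := make(map[string]string, len(ids))
		for _, id := range ids {
			out[id] = "user-" + id
		}
		return out
	}

	loader := NewLoader(fetchAuthors)
	reviewAuthorIDs := []string{"1", "2", "1", "3", "2"} // one ID per review
	for _, id := range reviewAuthorIDs {
		loader.Load(id)
	}
	authors := loader.Dispatch()
	fmt.Println(len(authors), "authors resolved in", calls, "upstream call(s)")
}
```

Without batching, the five reviews above would trigger five upstream requests; with it, the three distinct authors are fetched in one.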

Benchmarking methodology

I set up the basic Federation demo (more info at the end of the post). For benchmarking, I used the CLI tool "hey" with a concurrency of 50 over 10 seconds.

Results - Apollo Federation Gateway vs WunderGraph

Requests per second (small Query)

| Gateway | Requests per second |
| --- | --- |
| Apollo Gateway v1 | 195 |
| Apollo Gateway v2 | 400 |
| WunderGraph | 53000 |
| Mercurius-js | 1883 |

[Chart: RPS small payload]

Requests per second (large Query)

| Gateway | Requests per second |
| --- | --- |
| Apollo Gateway v1 | 186 |
| Apollo Gateway v2 | 293 |
| WunderGraph | 43834 |
| Mercurius-js | 208 |

[Chart: RPS large payload]

Latency (small Query)

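| Gateway | Avg (ms) | 99th percentile (ms) | 95th percentile (ms) |
| --- | --- | --- | --- |
| Apollo Gateway v1 | 81 | 108 | 97 |
| Apollo Gateway v2 | 97.1 | 194 | 127 |
| WunderGraph | 0.9 | 4.4 | 1.9 |
| Mercurius-js | 26 | 40 | 34 |

[Chart: Latency small]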

Latency (large Query)

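| Gateway | Avg (ms) | 99th percentile (ms) | 95th percentile (ms) |
| --- | --- | --- | --- |
| Apollo Gateway v1 | 115 | 287 | 114 |
| Apollo Gateway v2 | 170 | 1609 | 614 |
| WunderGraph | 0.9 | 4.2 | 2.1 |
| Mercurius-js | 238 | 344 | 269 |

[Chart: Latency large]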

Observations

Apollo Gateway v1 consistently produced timeout errors under high load. The newer version (v2) fixed this problem. However, v2 does not seem mature yet, as requests per second ranged from 10 to 400 in some test runs.

I've also found that Apollo now configures their gateway to use Apollo Studio by default. As an alternative, they provide a code snippet for using curl. Additionally, there's a link to the docs explaining how to re-enable the Playground running on your local machine.

mercurius-js is written in NodeJS, like Apollo's gateway. For the server, it uses the Fastify framework, and that shows in the results: on small payloads, it beats Apollo by almost 5x in terms of rps. It only seems to struggle with the large Query. This could be due either to processing more data in general or to the higher number of network requests the gateway has to make. Something must be going on here that makes mercurius fall behind Apollo on the large Query.

Conclusion

NodeJS still cannot match Go in terms of performance. While the new version of the Apollo Gateway doesn't throw timeout errors anymore, it's visible that it doesn't scale well when GraphQL Operations become deeply nested.

Comparing the latencies of Apollo v2 for the small and large payload, it's observable that the numbers skyrocket when Operations become more nested.

WunderGraph, on the other hand, is not yet saturated by this workload. We could probably increase the nesting further before it has to give up.

If you want a fast, Federation-compatible Gateway solution, WunderGraph can save you a lot of money on hosting while increasing the security of your API.

What makes the difference?

It's mainly two things. For one, WunderGraph is written in Go, a language that is much more capable when it comes to concurrent workloads like implementing an HTTP server. The second aspect is the architecture of WunderGraph. Instead of "interpreting" Operations, WunderGraph works with a Query Compiler that prepares the execution of an Operation at deployment time, removing all the complexity of working with the GraphQL AST at runtime.

If you want to learn more on this topic, have a look at the overview on the Query Compiler.

Demo Setup

WunderGraph: https://github.com/wundergraph/wundergraph-demo

Apollo: https://github.com/StevenACoffman/federation-demo

In both cases, I used upstreams implemented with gqlgen to rule out performance issues on the upstreams.

If you want to reproduce the results, just clone the repos and use hey or similar tools to benchmark yourself.

Test Query Small

Test Query Large
