WunderGraph Cloud Early Access
Before we get into the blog post: WunderGraph Cloud is being released very soon, and we're looking for Alpha and Beta testers.
Testers will receive early access to WunderGraph Cloud and three months of Cloud Pro for free.
Apollo just released their newest version of the Apollo Federation Gateway (v2), so I was curious how it performs against v1 and the WunderGraph implementation.
Both Apollo Gateway v1 and v2 are implemented in NodeJS, while the WunderGraph Gateway is written in Go.
I initially believed WunderGraph was the only Federation implementation aside from Apollo. It has since been brought to my attention that mercurius-js also implements the Federation specification. It's built on top of Fastify (NodeJS). Thanks to @matteocollina for pointing this out! If you know of any other implementation, please let me know!
WunderGraph achieves up to 271x (132x) more requests per second than Apollo Gateway v1 (v2), and its 99th percentile latency is 292x (54x) lower. Apollo Gateway v2 achieves 2x more requests per second (rps) than v1, but its 99th percentile latency is 5.6x higher than v1's. While Apollo Gateway v1 had problems with timeout errors, v2 solved this problem.
Apollo Federation with Subscriptions
In contrast to Apollo Gateway, WunderGraph supports Subscriptions. This is possible because Go has green threads (goroutines) which enable services to scale easily across all cores of a computer. Each Subscription can run in its own goroutine, which only takes up a few kilobytes of memory in stack size, so scaling this solution is quite efficient.
That said, getting the architecture right for federated subscriptions is a complex problem. Most, if not all, GraphQL server implementations "interpret" GraphQL Operations at runtime. This means they parse the Operation into an AST at runtime and walk this AST to resolve the Operation.
WunderGraph takes a different approach. We've built a Query Compiler that divides resolving a GraphQL Operation into multiple stages. At a high level, we differentiate between the planning phase and the execution phase. During planning, we evaluate the AST and build an optimized execution Plan, hence the name "Query Compiler". This execution Plan can be cached, which makes the approach very efficient. But efficiency is not everything. More importantly, this approach allows us to solve complex problems, like resolving federated GraphQL Operations, with a multi-step compiler combined with an optimized execution engine.
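The split between planning and execution, with a cached Plan in between, can be sketched roughly as follows. This is a toy illustration and not the graphql-go-tools API: `Plan`, `Planner`, `PlanFor`, and `Execute` are hypothetical names, and "planning" here just splits the operation text into tokens instead of parsing and validating GraphQL.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Plan is a hypothetical pre-computed execution plan: the ordered steps
// needed to resolve one Operation.
type Plan struct {
	Steps []string
}

// Planner caches one Plan per distinct Operation text.
type Planner struct {
	mu       sync.Mutex
	cache    map[string]*Plan
	Compiled int // how often the (expensive) planning phase actually ran
}

func NewPlanner() *Planner {
	return &Planner{cache: make(map[string]*Plan)}
}

// PlanFor returns the cached Plan for an operation, planning it only on the
// first request. The planning step is a placeholder (assumption): a real
// compiler would parse, validate, and optimize the Operation here.
func (p *Planner) PlanFor(operation string) *Plan {
	p.mu.Lock()
	defer p.mu.Unlock()
	if plan, ok := p.cache[operation]; ok {
		return plan
	}
	p.Compiled++
	plan := &Plan{Steps: strings.Fields(operation)}
	p.cache[operation] = plan
	return plan
}

// Execute is the execution phase: it only walks the prepared steps and never
// touches the original Operation text again.
func Execute(plan *Plan) string {
	return strings.Join(plan.Steps, " -> ")
}

func main() {
	planner := NewPlanner()
	for i := 0; i < 3; i++ {
		fmt.Println(Execute(planner.PlanFor("fetchUser fetchReviews fetchProducts")))
	}
	fmt.Println("planning runs:", planner.Compiled) // planning runs: 1
}
```

The point of the design is that the cost of planning is paid once per distinct Operation, while every subsequent request only pays for execution.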
By the way, this Query Compiler and execution engine is open source under an MIT license. It's used in production by more and more companies. We're very proud that the developers of Khan Academy recently joined the ranks of maintainers.
One last word on open source: graphql-go-tools, the library WunderGraph is built upon, has some amazing contributors. Among them is Vasyl Domanchuk from Ukraine, who contributed the DataLoader implementation that plays an important role in making the engine so fast. This implementation solves the N+1 problem when resolving nested federated GraphQL Operations.
Thank you Vasyl, your work is highly appreciated!
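For readers unfamiliar with the N+1 problem, the following sketch shows the general batching idea behind a DataLoader. It is not Vasyl's implementation and not the graphql-go-tools API: the `Upstream` type, `FetchReviews`, and the sample data are hypothetical and only count round trips.

```go
package main

import "fmt"

// Upstream is a hypothetical subgraph client that counts round trips.
type Upstream struct {
	Requests int
	reviews  map[string][]string
}

// FetchReviews resolves the reviews for many product IDs in one round trip.
func (u *Upstream) FetchReviews(ids []string) map[string][]string {
	u.Requests++
	out := make(map[string][]string, len(ids))
	for _, id := range ids {
		out[id] = u.reviews[id]
	}
	return out
}

// resolveNaive issues one request per product: the N+1 pattern.
func resolveNaive(u *Upstream, ids []string) map[string][]string {
	out := make(map[string][]string, len(ids))
	for _, id := range ids {
		for k, v := range u.FetchReviews([]string{id}) {
			out[k] = v
		}
	}
	return out
}

// resolveBatched deduplicates the keys and issues a single request.
func resolveBatched(u *Upstream, ids []string) map[string][]string {
	seen := make(map[string]bool, len(ids))
	unique := make([]string, 0, len(ids))
	for _, id := range ids {
		if !seen[id] {
			seen[id] = true
			unique = append(unique, id)
		}
	}
	return u.FetchReviews(unique)
}

func main() {
	data := map[string][]string{"1": {"great"}, "2": {"ok"}, "3": {"bad"}}
	ids := []string{"1", "2", "3"}

	naive := &Upstream{reviews: data}
	resolveNaive(naive, ids)

	batched := &Upstream{reviews: data}
	resolveBatched(batched, ids)

	fmt.Printf("naive: %d requests, batched: %d request\n", naive.Requests, batched.Requests)
	// prints "naive: 3 requests, batched: 1 request"
}
```

With deeply nested federated Operations, every level of nesting multiplies the number of keys, so collapsing them into one request per level is where most of the savings come from.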
I've set up the basic Federation demo; more info is at the end of the post. For benchmarking, I've used the CLI tool "hey" with a concurrency of 50 over 10 seconds.
Results - Apollo Federation Gateway vs WunderGraph
Requests per second (small Query)
| Gateway | Requests per second |
|---|---|
| Apollo Gateway v1 | 195 |
| Apollo Gateway v2 | 400 |
Requests per second (large Query)
| Gateway | Requests per second |
|---|---|
| Apollo Gateway v1 | 186 |
| Apollo Gateway v2 | 293 |
Latency (small Query)
| Gateway | Avg | 99th percentile | 95th percentile |
|---|---|---|---|
| Apollo Gateway v1 | 81 | 108 | 97 |
| Apollo Gateway v2 | 97.1 | 194 | 127 |
Latency (large Query)
| Gateway | Avg | 99th percentile | 95th percentile |
|---|---|---|---|
| Apollo Gateway v1 | 115 | 287 | 114 |
| Apollo Gateway v2 | 170 | 1609 | 614 |
Apollo Gateway v1 always has timeout errors under high load. The newer version (v2) fixed this problem. However, v2 does not seem mature yet, as requests per second ranged from 10 to 400 in some test runs.
I've also found that Apollo now configures their gateway to use Apollo Studio by default. As an alternative, they provide a code snippet for querying the gateway with curl. Additionally, there's a link to the docs on how to re-enable the Playground running on your local machine.
mercurius-js is written in NodeJS, like Apollo's gateway. For the server, it uses the Fastify framework, which shows in the results. On small payloads, it beats Apollo by almost 5x in terms of rps. It only seems to struggle with the large Query. This could be due either to processing more data in general or to the higher number of network requests the gateway has to make. Something must be going on here that makes mercurius fall behind Apollo on the large Query.
In terms of performance, NodeJS still cannot match Go. While the new version of the Apollo Gateway no longer throws timeout errors, it clearly doesn't scale well when GraphQL Operations become deeply nested.
Comparing the latencies of Apollo v2 for the small and large payload, it's observable that the numbers skyrocket when Operations become more nested.
WunderGraph, on the other hand, is not yet saturated by this workload. We could probably increase the nesting further before it has to give up.
If you want a fast, Federation-compatible Gateway solution, WunderGraph can save you a lot of money on hosting while increasing the security of your API.
What makes the difference?
It's mainly two things. For one, WunderGraph is written in Go, a language that is much more capable when it comes to concurrent workloads like implementing an HTTP server. The second aspect is the architecture of WunderGraph. Instead of "interpreting" Operations, WunderGraph works with a Query Compiler that prepares the execution of an Operation at deployment time, removing all the complexity of working with the GraphQL AST at runtime.
If you want to learn more on this topic, have a look at the overview on the Query Compiler.
In both cases, I used upstreams implemented with gqlgen to rule out performance issues on the upstream side.
If you want to reproduce the results, just clone the repos and use hey or similar tools to benchmark yourself.