WunderGraph exposes GraphQL Subscriptions over SSE (Server-Sent Events) or Fetch (as a fallback). This post explains why we've decided to take this approach and why we think it's better than using WebSockets.
What is a GraphQL Subscription?
GraphQL Subscriptions allow a client to subscribe to changes. Instead of polling for changes, the client can receive updates in real-time.
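A simple example from our GraphQL Federation Demo is a subscription that streams price updates for products, together with their reviews and the review authors.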
This is based on Apollo Federation. Once the "product" microservice has a price update, WunderGraph joins the data with the reviews from the "review" microservice, does an additional join with some user information from the "user" microservice and sends the data back to the client.
The client gets this as a stream of data. This way, the user interface can be updated in real-time.
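As a rough sketch, a subscription like the one described above could look as follows. The field names (`updatedPrice`, `upc`, `reviews`, `author`) are assumptions for illustration and may not match the demo's actual schema.

```typescript
// Hypothetical subscription document; all field names are illustrative only.
const priceUpdatesOperation: string = `
subscription PriceUpdates {
  updatedPrice {      # resolved by the "product" microservice
    upc
    name
    price
    reviews {         # joined from the "review" microservice
      id
      body
      author {        # joined from the "user" microservice
        id
        name
      }
    }
  }
}`;

// Assumed shape of a single event that the client receives on the stream.
interface PriceUpdateEvent {
  data: {
    updatedPrice: {
      upc: string;
      name: string;
      price: number;
      reviews: { id: string; body: string; author: { id: string; name: string } }[];
    };
  };
}

console.log(priceUpdatesOperation.trim().split("\n")[0]);
```

Every time the price changes, the client would receive one more event of this shape.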
Traditional ways of implementing GraphQL Subscriptions
The most widely adopted way of implementing GraphQL Subscriptions is to use WebSockets. A WebSocket connection is established through an HTTP/1.1 handshake, and the WebSocket API is supported by virtually all modern browsers. (According to caniuse.com, 94.22% of all browsers support the WebSockets API.)
First, the client sends an HTTP Upgrade request, asking the server to upgrade the connection to a WebSocket. Once the server upgrades the connection, both the client and the server can send and receive data by passing messages over the WebSocket.
Let's now discuss the Problems with WebSockets
The WebSocket API is an HTTP 1.1 standard
Most websites nowadays use HTTP/2 or even HTTP/3 to speed up the web. HTTP/2 allows for multiplexing multiple requests over a single TCP connection. This means that the client can send multiple requests at the same time. HTTP/3 improves this even further, but that's not the point of this post.
What's problematic is that if your website is mixing both HTTP/1.1 and HTTP/2, the client will have to open multiple TCP connections to the server.
Clients can easily multiplex up to 100 HTTP/2 requests over a single TCP connection, whereas with WebSockets, you're forced to open a new TCP connection for each WebSocket.
If a user opens multiple tabs to your website, each tab will open a new TCP connection to the server. Using HTTP/2, multiple tabs can share the same TCP connection.
So, the first problem with WebSockets is that they depend on an HTTP/1.1 handshake. They can't share the HTTP/2 connection your website already uses, which causes extra TCP connections.
WebSockets are stateful
Another problem with WebSockets is that the client and the server have to keep track of the state of the connection. If we look at the principles of REST, one of them states that requests should be stateless.
Stateless in this context means that each request should contain all the required information to be able to process it.
Let's look at a few scenarios for how you could use GraphQL Subscriptions with WebSockets:
1. Send an Authorization header with the Upgrade Request
As we've learned above, each WebSocket connection starts with an HTTP Upgrade request. What if we send an Authorization header with the Upgrade Request? It's possible, but it also means that when we "subscribe" using a WebSocket message, that "subscription" is no longer stateless, as it relies on the Authorization header that we've previously sent.
What if the user logged out in the meantime, but we forgot to close the WebSocket connection?
Another problem with this approach is that the WebSocket Browser API doesn't allow us to set Headers on the Upgrade Request. This is only possible using custom WebSocket clients.
So, in reality, this way of implementing GraphQL Subscriptions is not very practical.
2. Send an Auth Token with the "connection_init" WebSocket Message
Another approach is to send an Auth Token with the "connection_init" WebSocket Message. This is the way it's done by Reddit. If you go to reddit.com, open Chrome DevTools, click on the network tab and filter for "ws", you'll see a WebSocket connection where the client sends a Bearer token with the "connection_init" message.
This approach is also stateful. You can copy this token and use any other WebSocket client to subscribe to the GraphQL Subscription. You can then log out on the website without the WebSocket connection being closed.
Subsequent subscribe messages will also rely on the context that was set by the initial "connection_init" message, just to underscore the fact that it's still stateful.
That said, there's a much bigger problem with this approach. As you saw, the client sent a Bearer token with the "connection_init" message. This means that at some point in time, the client had access to said token. Any token that JavaScript running in the browser can read can also be stolen, for example through a cross-site scripting (XSS) vulnerability.
A better solution is to always keep such tokens in a secure location. We'll come to this later.
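To make the statefulness more tangible, here's a minimal sketch of the two messages involved, loosely following the message shapes of the `graphql-ws` protocol. The token value, the `Authorization` payload key and the query are placeholders we made up for illustration; servers are free to expect something else.

```typescript
// Message shapes loosely modelled on the graphql-ws protocol.
interface ConnectionInitMessage {
  type: "connection_init";
  payload?: Record<string, unknown>;
}

interface SubscribeMessage {
  id: string;
  type: "subscribe";
  payload: { query: string; variables?: Record<string, unknown> };
}

// First message on the socket: carries the token exactly once.
// The "Authorization" key is an assumption, not part of the protocol.
function connectionInit(token: string): ConnectionInitMessage {
  return { type: "connection_init", payload: { Authorization: `Bearer ${token}` } };
}

// Later messages carry no credentials at all. The server has to remember
// who sent "connection_init", which is what makes the connection stateful.
function subscribeMessage(id: string, query: string): SubscribeMessage {
  return { id, type: "subscribe", payload: { query } };
}

const init = connectionInit("<token>");
const sub = subscribeMessage("1", "subscription { updatedPrice { upc price } }");
console.log(JSON.stringify(init));
console.log(JSON.stringify(sub));
```

Note that the "subscribe" message on its own says nothing about who the user is. It's only meaningful in combination with the earlier "connection_init" message.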
3. Send an Auth Token with the "subscribe" WebSocket Message
Another approach would be to send an Auth Token with the "subscribe" WebSocket Message. This would make our GraphQL Subscription stateless again, as all information to process the request is contained in the "subscribe" message.
However, this approach creates a bunch of other problems.
First, it would mean that we have to allow clients to anonymously open WebSocket connections without checking who they are. As we want to keep our GraphQL Subscription stateless, the first time we'd send an Authorization token would be when we send the "subscribe" message.
What happens if millions of clients open WebSocket connections to your GraphQL server without ever sending a "subscribe" message? Upgrading WebSocket connections can be quite expensive, and you also have to have CPU and Memory to keep the connections around. When should you cut off a "malicious" WebSocket connection? What if you have false positives?
Another issue with this approach is that you're more or less re-inventing HTTP over WebSockets. If you're sending "Authorization Metadata" with the "subscribe" message, you're essentially re-implementing HTTP Headers. Why not just use HTTP instead?
We'll discuss a better approach (SSE/Fetch) later.
WebSockets allow for bidirectional communication
The next issue with WebSockets is that they allow for bidirectional communication. Clients can send arbitrary messages to the server.
If we revisit the GraphQL specification, we'll see that no bidirectional communication is required to implement Subscriptions. Clients subscribe once. After that, it's only the server that sends messages to the client. If you use a protocol (WebSockets) that allows clients to send arbitrary messages to the server, you have to somehow throttle the number of messages that the client can send.
What if a malicious client sends a lot of messages to the server? The server will usually spend CPU time and memory while parsing and dismissing the messages.
Wouldn't it be better to use a protocol that denies clients from sending arbitrary messages to the server?
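To give an idea of the extra work this means, here's a minimal sketch of a per-connection token bucket that a WebSocket server could use to throttle inbound messages. This is a generic technique, not something taken from our implementation, and the class name and parameters are made up for illustration.

```typescript
// Minimal token-bucket limiter for inbound messages of one connection.
class TokenBucket {
  private capacity: number;     // maximum burst size
  private refillPerSec: number; // sustained messages per second
  private tokens: number;
  private last: number;         // timestamp of the last check, in ms

  constructor(capacity: number, refillPerSec: number, now: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.last = now;
  }

  // Returns true if the message may be processed, false if it should be dropped.
  allow(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.last) / 1000;
    // Refill according to the elapsed time, but never above the capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage: a burst of 2 messages is fine, the third one at the same instant is dropped.
const bucket = new TokenBucket(2, 1, 0);
console.log(bucket.allow(0), bucket.allow(0), bucket.allow(0));
```

Even with such a limiter in place, the server still has to read each frame before it can decide to drop it, so the cost doesn't go away entirely.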
WebSockets are not ideal for SSR (server-side-rendering)
Another issue we faced is the usability of WebSockets when doing SSR (server-side-rendering).
One of the problems we've recently solved is to allow "Universal Rendering" (SSR) with GraphQL Subscriptions. We were looking for a nice way to be able to render a GraphQL Subscription on the server as well as in the browser.
Why would you want to do this? Imagine you're building a website that should always show the latest price for a stock or an article. You definitely want the website to be (near) real-time, but you also want to render the content on the server for SEO and usability reasons.
Here's an example from our GraphQL Federation demo:
This (NextJS) page is first rendered on the server, and then re-hydrated on the client, which continues with the Subscription.
We'll talk about this in more detail in a bit. Let's focus on the challenge with WebSockets first.
If the server had to render this page, it would have to first start a WebSocket connection to the GraphQL Subscription server. It'll then have to wait until the first message is received from the server. Only then, it could continue rendering the page.
While technically possible, there's no simple "async await" API to solve this problem, hence nobody is really doing this as it's way too expensive, not robust, and complicated to implement.
Summary of the Problems with GraphQL Subscriptions over WebSockets
- WebSockets make your GraphQL Subscriptions stateful
- WebSockets cause the browser to fall back to HTTP/1.1
- WebSockets cause security problems by exposing Auth Tokens to the client
- WebSockets allow for bidirectional communication
- WebSockets are not ideal for SSR (server-side-rendering)
To sum up the previous section, GraphQL Subscriptions over WebSockets cause a few problems with performance, security and usability. If we're building tools for the modern web, we should consider better solutions.
Why we chose SSE (Server-Sent Events) / Fetch to implement GraphQL Subscriptions
Let's go through the problems one by one and discuss how we've solved them.
Keep in mind that the approach we've chosen is only possible if you use a "GraphQL Operation Compiler". By default, GraphQL clients have to send all the information to the server to be able to initiate a GraphQL Subscription.
Thanks to our GraphQL Operation Compiler, we're in a unique position that allows us to only send the "Operation Name" as well as the "Variables" to the server. This approach makes our GraphQL API much more secure as it hides it behind a JSON-RPC API. You can check out an example here, and we're also open sourcing the solution soon.
So, why did we choose SSE (Server-Sent Events) / Fetch to implement GraphQL Subscriptions?
SSE (Server-Sent Events) / Fetch is stateless
Both SSE and Fetch are stateless APIs and very easy to use. Simply make a GET request with the name of the Operation and the Variables as query parameters.
Each request contains all the information required to initiate the Subscription. When the browser talks to the server, it can use the SSE API or fall back to the Fetch API if the browser doesn't support SSE.
Here's an example request (fetch):
The response looks like this:
It's a stream of JSON objects, delimited by two newline characters.
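Consuming such a stream takes only a few lines. Here's a minimal sketch of a client that splits the response on two newline characters and parses each part as JSON. The names `JsonStreamParser` and `streamSubscription` are made up for this example, and the URL is whatever your subscription endpoint is.

```typescript
// Splits a chunked text stream into JSON objects delimited by "\n\n".
// A chunk may end in the middle of an object, so the remainder is buffered.
class JsonStreamParser {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const parts = this.buffer.split("\n\n");
    // The last part is either empty or an incomplete object; keep it for later.
    this.buffer = parts.pop() ?? "";
    return parts.filter((part) => part.trim() !== "").map((part) => JSON.parse(part));
  }
}

// Reads the response body chunk by chunk and calls onEvent for each JSON object.
async function streamSubscription(
  url: string,
  onEvent: (event: unknown) => void,
): Promise<void> {
  const response = await fetch(url);
  if (!response.body) throw new Error("response has no body");
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const parser = new JsonStreamParser();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const event of parser.push(decoder.decode(value, { stream: true }))) {
      onEvent(event);
    }
  }
}
```

The buffering matters because the network doesn't respect message boundaries: one chunk can contain several events, or only half of one.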
Alternatively, we could also use the SSE API:
The response looks very similar to the Fetch response, just prefixed with "data":
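In the browser, the `EventSource` API handles this format for you. To show what "prefixed with data" means on the wire, here's a simplified sketch of a parser for a complete SSE response body. It only handles `data:` fields and ignores the other field types the SSE format defines (`event:`, `id:`, `retry:`), so treat it as an illustration rather than a full implementation.

```typescript
// Extracts the payloads of "data:" fields from an SSE response body.
// Events are separated by a blank line; several data lines within
// one event are joined with "\n".
function parseSseData(body: string): string[] {
  const events: string[] = [];
  for (const block of body.split(/\r?\n\r?\n/)) {
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      // Strip the field name and the single optional space after the colon.
      .map((line) => line.slice("data:".length).replace(/^ /, ""));
    if (data.length > 0) events.push(data.join("\n"));
  }
  return events;
}

// In the browser you wouldn't parse this yourself:
//   const source = new EventSource(url, { withCredentials: true });
//   source.onmessage = (e) => console.log(JSON.parse(e.data));

const sample = 'data: {"data":{"price":1}}\n\ndata: {"data":{"price":2}}\n\n';
console.log(parseSseData(sample).map((payload) => JSON.parse(payload)));
```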
SSE (Server-Sent Events) / Fetch can leverage HTTP/2
Both SSE and Fetch can leverage HTTP/2. Actually, you should avoid using SSE/Fetch for GraphQL Subscriptions when HTTP/2 is not available, as using it with HTTP/1.1 will cause the browser to create a lot of TCP connections, quickly exhausting the maximum number of concurrent TCP connections that a browser can open to the same origin (typically six).
Using SSE/Fetch with HTTP/2 means that you get a modern, easy to use API that's also very fast. In rare cases where you have to fall back to HTTP/1.1, you can still use SSE/Fetch though.
SSE (Server-Sent Events) / Fetch can easily be secured
We've implemented the "Token Handler Pattern" to make our API secure. The Token Handler Pattern is a way of handling Auth Tokens on the server, not on the client.
First, you redirect the user to an identity provider, e.g. Keycloak. Once the login is complete, the user is redirected back to the "WunderGraph Server" with an auth code. This auth code is then exchanged for a token.
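Exchanging the auth code for a token happens on the back channel, so the browser has no way to know about it. Instead of handing the token to the browser, the server sets a cookie (typically secure, HTTP-only and encrypted) that JavaScript running in the browser cannot read.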
Once this cookie is set, each SSE/Fetch request is automatically authenticated. If the user signs out, the cookie is deleted and no further subscriptions are possible. Each subscription request always contains all the information required to initiate the Subscription (stateless).
SSE (Server-Sent Events) / Fetch prevent the client from sending arbitrary data
Server-Sent Events (SSE), as the name indicates, is an API to send events from the server to the client. Once initiated, the client can receive events from the server, but this channel cannot be used to communicate back.
Combined with the "Token Handler Pattern", this means that we can reject unauthenticated requests immediately after reading the HTTP headers.
The same goes for the Fetch API as it's very similar to SSE.
Fetch can easily be used to implement SSR (Server-Side Rendering) for GraphQL Subscriptions
A core part of our implementation of Subscriptions over SSE/Fetch is the "HTTP Flusher". After each event is written to the response buffer, we have to "flush" the connection to send the data to the client.
In order to support Server-Side Rendering (SSR), we've added a very simple trick. When using the "PriceUpdates" API on the server, we append a query parameter to the URL:
The flag "wg_subscribe_once" tells the server to only send one event to the client and then close the connection. So, instead of flushing the connection and then waiting for the next event, we simply close it.
Additionally, we only send the following headers if the flag is not set:
In case of "wg_subscribe_once", we simply omit those headers and set the content type to "application/json". This way, node-fetch can easily deal with this API when doing server-side rendering.
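Here's a minimal sketch of both sides of this trick. Only the parameter name "wg_subscribe_once" and the "application/json" content type come from the description above. The parameter value `true`, the streaming header values and the URL in the usage example are assumptions for illustration.

```typescript
// Client side (SSR): append the flag to the subscription URL.
// The value "true" is an assumption; the text only names the parameter.
function subscribeOnceUrl(url: string): string {
  const u = new URL(url);
  u.searchParams.set("wg_subscribe_once", "true");
  return u.toString();
}

// Server side: pick the response headers depending on the flag.
// The streaming header values are assumed, not taken from our implementation.
function responseHeaders(subscribeOnce: boolean): Record<string, string> {
  if (subscribeOnce) {
    // One event, then close: an ordinary JSON response.
    return { "Content-Type": "application/json" };
  }
  return { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" };
}

// With the flag set, server-side rendering becomes a single awaitable request.
async function fetchInitialData(url: string): Promise<unknown> {
  const response = await fetch(subscribeOnceUrl(url));
  return response.json();
}

// Hypothetical endpoint, for illustration only.
console.log(subscribeOnceUrl("http://localhost:9991/api/main/operations/PriceUpdates"));
```

This is what gives us the simple "async await" API that's missing when rendering a WebSocket-based Subscription on the server.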
Implementing GraphQL Subscriptions over SSE/Fetch gives us a modern, easy to use API with great usability. It's performant, secure, and allows us to also implement SSR (Server-Side Rendering) for GraphQL Subscriptions. It's so simple that you can even consume it using curl.
WebSockets, on the other hand, come with a lot of problems regarding security and performance.
According to caniuse.com, 93.76% of all browsers support HTTP/2, 94.65% of all browsers support the EventSource API (SSE), and 93.62% support the Fetch API.
I think it's about time to migrate over from WebSocket to SSE/Fetch for GraphQL Subscriptions. If you'd like to get some inspiration, here's a demo that you can run locally: https://github.com/wundergraph/wundergraph-demo
We're also going to open source our implementation very soon. Sign up with your Email if you'd like to be notified when it's ready.
What do you think about this approach? How do you implement GraphQL Subscriptions yourself? Join us on Discord and share your thoughts!