What every GraphQL user should know about HTTP and REST

Jens Neuse

10min read

GraphQL is usually praised to the moon and beyond, while REST seems to be an old-school way of doing things.

I keep on hearing from Developer Advocates about how great GraphQL is and how much better it is than REST. I keep reading blog posts that compare GraphQL and REST APIs where GraphQL is always much more powerful and flexible than REST, with no disadvantages obviously.

I think these comparisons fail to show the real value of following the constraints of REST. I believe that both GraphQL and REST are great tools to build powerful API solutions when used together. It's not a question of either/or, but rather of how well they can work together.

I've recently posted on this blog about the idea of putting a REST API in front of a GraphQL API. This is one of the responses I've got back:

I'm trying to understand. You cover graphql with rest. So you're loosing possibility to for example select only subset of fields. It means efficiency will be horrible. No caching, no batching.

The assumptions above are not correct. Putting a REST (JSON RPC) API in front of GraphQL is actually a very good idea and widely used.

If you visit websites like Facebook, Twitter or Twitch and open Chrome DevTools, you'll see that these companies are wrapping their GraphQL API layer with a REST API / JSON RPC API.

The question to ask is, why are these early adopters of GraphQL wrapping their APIs with another API layer? Why are they not directly exposing their GraphQL API, like most of the GraphQL community does?

But let's not get ahead of ourselves. We should start with the basics of HTTP and REST.

A simple model to think about REST

There's the dissertation by Roy Fielding, there's the Richardson Maturity Model, there's Hypermedia, URLs, HTTP verbs, HTTP headers, HTTP status codes, and more. The topic can be quite overwhelming.

The older readers will find it tedious to read about the topic again and again. But the reality is, a lot of young developers skip the basics and don't learn a lot about the fundamentals of the web.

To make the topic more approachable, I'd like to propose a simpler model to think about REST.

If you build a RESTful service, it's compatible with the REST of the web.

If you don't care much about REST, your service will be less compatible with the web. It's as simple as that.

It's not a goal to build something in a RESTful way, but doing so means that your service fits very well with the existing infrastructure of the web.

Here's another quote I've read recently:

Once you tried GraphQL you can never go back to REST, the developer experience is just too amazing

GraphQL is a Query language. The GraphQL specification doesn't mention the word HTTP a single time.

REST, on the other hand, is a set of constraints that, if you follow them, make your service compatible with the web.

When you're using GraphQL over HTTP, you're actually using REST, just a very limited version of REST, because you're not following a lot of the constraints.

Why GraphQL enthusiasts keep bashing REST

So this whole quote is a bit misleading and is the core of the problem. Most GraphQL enthusiasts see REST as bad, old-fashioned and outdated. They believe that GraphQL is the successor of REST.

This just doesn't make sense. If you want to use GraphQL on the web, you have to use HTTP and that means you're in REST territory.

The only differentiator is that you can either accept REST and try to follow the constraints, or you can ignore them and use GraphQL in a way that is not really leveraging the existing infrastructure of the web.

That's all I'm trying to say.

Don't ignore the web when building APIs for the web.

It's OK to send read requests over HTTP POST with a Query in the JSON body. It's just that you're violating a fundamental principle of the web, making it very hard for Browsers and Caches to understand what you're trying to do.

I think it would help the GraphQL community if we accepted REST for what it is and stopped fighting against it.

The URL, the most fundamental component of the web

We all know what a URL is. It's a piece of text that points to a resource on the web. Ideally, a URL uniquely identifies a resource on the web. This is because browsers, CDNs, Caches, Proxies and many other components of the web follow a set of rules around the concept of the URL.

Concepts like Caching (Cache-Control header) and Cache Revalidation (ETag header) only work when we use a unique URL for each resource.
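
To make this more concrete, here's a minimal sketch in TypeScript of how a server might answer a GET for a resource that has its own URL. The names (`etagFor`, `respond`), the `max-age` value and the use of a simple string hash as the ETag are assumptions for illustration, not something a particular server or spec prescribes.

```typescript
// Hypothetical response shape for illustration; not tied to any framework.
interface CachedResponse {
  status: 200 | 304;
  headers: Record<string, string>;
  body: string;
}

// Derive an ETag from the body with a simple non-cryptographic hash
// (djb2 variant). A real server would likely use a stronger hash.
function etagFor(body: string): string {
  let hash = 5381;
  for (let i = 0; i < body.length; i++) {
    hash = ((hash * 33) ^ body.charCodeAt(i)) >>> 0;
  }
  return '"' + hash.toString(16) + '"';
}

// Answer a GET for one resource. `ifNoneMatch` is the value of the
// If-None-Match request header, if the client sent one.
function respond(body: string, ifNoneMatch?: string): CachedResponse {
  const etag = etagFor(body);
  const headers: Record<string, string> = {
    "Cache-Control": "public, max-age=60", // assumed lifetime of 60 seconds
    ETag: etag,
  };
  if (ifNoneMatch === etag) {
    // The client's cached copy is still valid, so no body is sent.
    return { status: 304, headers, body: "" };
  }
  return { status: 200, headers, body };
}
```

The important part is that the ETag belongs to one URL. If every read goes to the same URL, a cache has nothing to attach it to.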

As mentioned earlier, the GraphQL specification doesn't mention HTTP. That's because it simply describes the Query language. From the point of view of the GraphQL specification, GraphQL is not tied to any transport.

To be more specific, GraphQL is not defined in a way to be used with a transport at all. That's what I mean when I say that GraphQL is not meant to be exposed over the Internet. As we know, you can use GraphQL on the web, but the specification doesn't say anything about it.

So how do we do GraphQL over HTTP? We're following the rules set by companies like Apollo. We're sending a POST request to the "/graphql" endpoint.

This means we are not able to use a unique URL for different resources, represented by GraphQL types.

The consequence is that we're not able to use HTTP layer caching and ETag headers.

There's a GraphQL-over-HTTP specification in the official "graphql" repository from the foundation, describing how to send Queries via HTTP GET.

However, this specification still allows using HTTP POST for read requests, so it's not ideal.
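
As a rough sketch of what sending a Query via GET could look like, the following TypeScript helper builds the URL. The `query` and `variables` parameter names follow the GraphQL-over-HTTP draft; the helper name `buildGetUrl`, the example endpoint and the key sorting are my own assumptions.

```typescript
// Build a GET URL for a GraphQL Query. Variables are JSON-encoded into a
// single `variables` query parameter.
function buildGetUrl(
  endpoint: string,
  query: string,
  variables: Record<string, unknown> = {}
): string {
  // Sort the top-level variable names so that identical requests always
  // produce the same URL, because caches use the URL as the cache key.
  const sorted: Record<string, unknown> = {};
  Object.keys(variables)
    .sort()
    .forEach((key) => {
      sorted[key] = variables[key];
    });
  const params = [
    "query=" + encodeURIComponent(query),
    "variables=" + encodeURIComponent(JSON.stringify(sorted)),
  ];
  return endpoint + "?" + params.join("&");
}

// Example: both calls produce the same URL, regardless of key order.
const urlA = buildGetUrl("https://api.example.com/graphql", "{ me { id } }", { b: 2, a: 1 });
const urlB = buildGetUrl("https://api.example.com/graphql", "{ me { id } }", { a: 1, b: 2 });
```

Note that only the top-level keys are sorted here, and whitespace differences in the Query still lead to different URLs. That's one reason why giving each Operation its own URL is the more robust option.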

API Requests should be stateless

Aside from the URL, there's another very important constraint of RESTful APIs: Every API Request should be stateless.

Stateless in this context means that each request contains all the information needed to process it. There's no server-side state that is shared between requests, no history, no session.

Stateless APIs scale very well because you can scale your backend systems horizontally. Because all information is sent in each request, it doesn't matter which server you talk to.

There's a problem though with GraphQL. When using Subscriptions, we're usually using WebSockets as the transport. WebSockets are initiated via an HTTP Upgrade request. Once the Upgrade request is successful, the WebSocket connection is established, which is essentially just a TCP connection.

Once the WebSocket connection is established, client and server can send and receive messages.

What's wrong with this? Go to your favorite subreddit on reddit.com and make sure you're logged in. Open the Chrome DevTools, go to the Network tab and filter for "WS". You will see that a WebSocket connection is initiated using this URL: "wss://gql-realtime.reddit.com/query"

The message sent from the client to the server looks like this:

{"type":"connection_init","payload":{"Authorization":"Bearer XXX"}}

The Reddit engineers are using this message to authenticate the user. You might be asking why they are not sending a Header with the Upgrade Request. That's because the browser's WebSocket API doesn't let you set custom Headers when initiating a connection.

It's possible to use Cookies though. However, this would mean that the Bearer token would first have to be set by the server, which makes this flow more complicated. But even if you use Cookies, what if the cookie was deleted server-side but the WebSocket connection still remains?

What's also notable is that sending a Bearer token in a WebSocket message is essentially re-inventing HTTP over WebSockets.
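
Here's a minimal sketch in TypeScript of what this means for the server. It's not Reddit's implementation. The `ConnectionState` type, the `handleMessage` function and the "subscribe" message type are placeholders I made up to illustrate the point: the server has to remember the token for the lifetime of the connection.

```typescript
// Hypothetical per-connection state the server must keep once authentication
// travels inside a WebSocket message instead of an HTTP header.
interface ConnectionState {
  authorization: string | null;
}

// Handle one raw text message received on a connection. Returns true if the
// message is accepted, false if it should be rejected.
function handleMessage(state: ConnectionState, raw: string): boolean {
  const message = JSON.parse(raw) as {
    type?: string;
    payload?: { Authorization?: string };
  };
  if (message.type === "connection_init") {
    const token = message.payload && message.payload.Authorization;
    if (!token) return false;
    state.authorization = token; // remembered for every later message
    return true;
  }
  // Every other message depends on state set by an earlier message,
  // which is exactly what a stateless request would avoid.
  return state.authorization !== null;
}
```

A "subscribe" message sent before "connection_init" is rejected, while the same message sent afterwards is accepted. The outcome depends on the history of the connection, not on the message itself.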

There's another issue with this approach that is not immediately obvious. When the client is able to send a Bearer token as a WebSocket message, it means that the client-side JavaScript has access to this token. We know how vulnerable the npm ecosystem is. If you can, you should always try to keep Bearer/JWT tokens away from the client / JavaScript.

This can be achieved by using a server-side authentication flow, e.g. using an OpenID Connect Provider. Once the flow is completed, the Claims of the user can be securely stored in an encrypted, HTTP only cookie.

Claims are name-value pairs of information about the user.

This way, you could also just send GraphQL Subscriptions over HTTP/2 Streams. Each Subscription Request contains all the information needed to process it, no additional protocols have to be implemented on top.

HTTP/2 allows us to multiplex many Subscriptions over the same TCP connection. So it's not just easier to handle, it's also more efficient. If you're already making Query Requests to "api.example.com", a TCP connection is already established.

Requests should be cacheable

It's funny that the person mentioned above thinks that by putting a REST API in front of a GraphQL API, you're losing the ability for caching and batching.

In reality, the opposite is the case. We're gaining a lot by exposing REST instead of GraphQL without losing the capabilities of GraphQL.

Think of it like this: By exposing REST instead of GraphQL, we're simply moving the "GraphQL client" out of the client (Browser) and into the server behind the REST API.

Each REST API Endpoint is a GraphQL Operation essentially. Parameters are being mapped from the REST API to the GraphQL Query.

Give each GraphQL Operation a unique URL, and we're able to use GraphQL, but with Caching at the HTTP layer.
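
A minimal sketch of this mapping in TypeScript could look like the following. The path layout (`/operations/UserByID`), the `UserByID` Operation and the function name `toGraphQLRequest` are assumptions for illustration, not the API of a specific product.

```typescript
// A stored GraphQL Operation together with the variables it accepts.
interface StoredOperation {
  document: string;
  variableNames: string[];
}

// Hypothetical registry: one URL path per GraphQL Operation.
const operations: Record<string, StoredOperation> = {
  "/operations/UserByID": {
    document: "query UserByID($id: ID!) { user(id: $id) { id name } }",
    variableNames: ["id"],
  },
};

// Translate an incoming REST call into the GraphQL request the server would
// execute internally. Unknown paths or parameters are rejected outright.
function toGraphQLRequest(
  path: string,
  params: Record<string, string>
): { query: string; variables: Record<string, string> } | null {
  const operation = Object.prototype.hasOwnProperty.call(operations, path)
    ? operations[path]
    : undefined;
  if (!operation) return null;
  const unexpected = Object.keys(params).filter(
    (name) => operation.variableNames.indexOf(name) === -1
  );
  if (unexpected.length > 0) return null;
  return { query: operation.document, variables: params };
}
```

A call to `/operations/UserByID?id=1` would then be turned into the stored Query with `{ id: "1" }` as variables, while any path that isn't in the registry never reaches the GraphQL layer at all.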

The GraphQL Community has been trying to solve "Caching" for many years now by adding normalized client-side caches. These solutions are very smart and perform well. Kudos to the engineers for coming up with this solution.

However, if we were to use a REST API instead of GraphQL, we would not have to solve the problem at all. Browsers, CDNs, Proxies, API Gateways and Cache Servers are able to cache REST requests.

By exposing GraphQL with a REST-incompatible (HTTP POST for reads) API, you're forcing yourself to write "smart" GraphQL Clients with normalized caching.

I'll repeat myself here: If you're building for the web, don't ignore the web.

Don't dismiss REST if you're using GraphQL, make them work together instead

GraphQL is a joy to work with, it's a fantastic Query Language. I see GraphQL as THE API Integration language.

However, the current state of how most of us use GraphQL is just plain wrong and not optimal.

GraphQL Developer Advocates should stop dismissing REST.

If we want to make GraphQL scale, we need to make it work with REST.

The discussions about "REST vs GraphQL" should end. Instead, we should be talking about how we get the most out of both, the flexibility of GraphQL and the performance of REST.

If we were to move GraphQL from the client to the server, we could save ourselves so much time and effort.

Tools that shouldn't exist

If you think about this "paradigm shift", a lot of tools shouldn't exist in the first place.

A lot of really smart engineers have spent years building tools that might not be needed anymore.

GraphQL Client Libraries

Think about all the super smart GraphQL clients and their normalized Caches. If we move GraphQL to the server, we can leverage the Browser's Cache to store the results of the Query. Cache-Control headers are very capable and allow us to define granular invalidation rules.

GraphQL CDNs

Some super smart folks have put JavaScript and Rust code on the edge so that GraphQL POST Requests can be cached. They went as far as implementing ways to invalidate the Cache when a mutation affects the same data, using smart correlation algorithms.

If we move GraphQL to the server, you can use any CDN or Cache to do the same thing with no setup at all. It just works.

You can also just use the popular Varnish Cache (used by Fastly); it works well with REST APIs.

GraphQL Analytics, Logging and Monitoring

Thanks to GraphQL breaking multiple constraints of REST, we don't just need GraphQL clients, Caches and CDNs, we also have to rethink how we're going to monitor and log our GraphQL APIs.

One of the constraints of REST is to use a layered architecture. If we're exposing REST instead of GraphQL, you can actually use all the existing infrastructure for analytics, monitoring and logging.

Monitoring REST APIs is a solved Problem. There's lots of competition in the market and the tooling is very mature.

GraphQL Security vs. REST Security

Any Web Application Firewall (WAF) can easily protect REST APIs. With GraphQL APIs, that's a lot harder because the WAF has to understand the GraphQL Operation.

Security Experts will love you for putting a REST API in front of your GraphQL API because you're taking away a lot of headaches from them.

How GraphQL and REST can play nicely together

So how can this work?

You might be thinking that this is a drastic shift, but on the surface the changes will be very small.

Imagine we're using the GraphQL Playground on GitHub.com.

You're writing your GraphQL Query as usual. Once you hit the "run" button, we'd send an HTTP POST request to GitHub, but not to execute the Operation.

Instead, we're simply "registering" the GraphQL Document. GitHub will then parse the Document and create a REST Endpoint for us. Aside from just returning the Endpoint to us, we will also get information on the Complexity of the Operation, how much "Budget" it will cost to execute it, and what the estimated Rate Limit is.

This information will help a client to estimate how often it can make requests to the Endpoint.

With a public GraphQL Endpoint, by contrast, it's pretty unpredictable what the rate limit for a Query is. You first have to send it to the server and have it executed, only to find out that you've exceeded the complexity limit.

Once we have our Endpoint back, we're able to call it using the variables. We don't need a GraphQL Client to do this.

On the server-side, the registration process of GraphQL Documents can be very efficient. Requests can be cached so that you don't have to parse the same GraphQL Document over and over again.

Imagine how much CPU time could be saved if every GraphQL Operation was only parsed once...
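
The registration step could be sketched in TypeScript like this. Everything here is an assumption for illustration: `parseDocument` is a stand-in that only counts how often it runs, the hash is a simple non-cryptographic one, and the `/operations/` path layout is made up.

```typescript
// Counts how often the (stand-in) parser actually runs.
let parseCount = 0;

// Stand-in for a real GraphQL parser; it does no real parsing.
function parseDocument(document: string): { source: string } {
  parseCount++;
  return { source: document };
}

// Simple non-cryptographic string hash (djb2 variant) used as the Endpoint ID.
// A real implementation would likely use something like SHA-256 instead.
function hashDocument(document: string): string {
  let hash = 5381;
  for (let i = 0; i < document.length; i++) {
    hash = ((hash * 33) ^ document.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Parsed Documents, keyed by their hash.
const registered: Record<string, { source: string }> = {};

// "Register" a GraphQL Document and return the path of its Endpoint.
function registerDocument(document: string): string {
  const id = hashDocument(document);
  if (!Object.prototype.hasOwnProperty.call(registered, id)) {
    registered[id] = parseDocument(document); // parsed once, reused afterwards
  }
  return "/operations/" + id;
}
```

Registering the same Document twice returns the same Endpoint and leaves `parseCount` at 1. Every later call to that Endpoint only needs the variables.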

WunderGraph: A stupid simple approach to GraphQL and REST

As you can see, the developer experience will not really change when using GraphQL and REST together.

However, setting up everything to turn this idea into a great Developer Experience is a lot of work. You could just use 10 different npm packages and implement it yourself, but it's easy to get lost in the details and find yourself in a rabbit hole of edge cases.

Luckily, you don't have to start from scratch. We've already implemented the approach described above and are about to open source it very soon!

We're combining the flexibility of GraphQL with the power of REST.

We make use of GraphQL in the areas where it shines, giving us a flexible way to talk to APIs, and leverage the power of REST in the areas where GraphQL lacks compatibility with the web.

The result is a more scalable, flexible and powerful use of GraphQL than ever before.

You can try out WunderGraph today, and we're soon going to open source it.

If you're interested in joining our thriving community, jump on our Discord and say hi!

Closing thoughts

You would probably not expose your SQL database to a browser-based client. (Some people might do, but I hope they know what they're doing.)

Why are we making a difference here for GraphQL? Why disallow a Query language for tables while allowing a Query language for APIs?

The OpenAPI Specification (OAS) is full of terms related to HTTP. The GraphQL Specification doesn't mention HTTP a single time. SQL is also not about building HTTP-based APIs but rather about talking to your database, and everybody accepts this.

Why are we so keen on using GraphQL in a way that requires us to rewrite the entire architecture of the Internet?

Why not just use GraphQL like SQL, on the server, behind a REST API?

© 2022 WunderGraph, Inc. All rights reserved.