
Exploring 2.5 Reasons People Embrace GraphQL in 2024, and the Caveats Behind Its Non-Adoption

Jens Neuse

In my day-to-day work, I interact with medium-sized companies like SoundTrackYourBrand (Spotify for Business), TailorTech (Headless ERP), or TravelPass (Travel Management Platform), as well as larger companies like Uber, eBay, Equinix, Crypto.com, and many others. At the same time, I'm also interacting with a lot of people on Twitter, Reddit, and other platforms.

What I've noticed is that there seems to be a fundamental split in how people perceive GraphQL. Medium-sized companies and larger companies are embracing GraphQL, Tech Twitter seems to have moved on to the next shiny thing, while the Reddit community sees both sides of the coin.

In this post, I'd like to give you a new perspective on why people are embracing GraphQL in 2024, and why a lot of people moved on.

From what I've seen, there's not one single "GraphQL" anymore. There's the "old" GraphQL that solves over-fetching, under-fetching, and other REST-related problems. This is the GraphQL that a lot of people moved on from. GraphQL is great for solving these problems, but alternatives like tRPC and React Server Components emerged that solve these problems in different ways. The old GraphQL can be understood as an alternative to the Backend for Frontend (BFF) pattern.

Then there's the "new" GraphQL. The new GraphQL is all about Federation. Being able to serve different clients and solving problems like the above is still relevant, but the main focus of the new GraphQL is to solve the problem of building APIs across different teams and services.

GraphQL Federation solves fundamentally different problems than what monolithic GraphQL solves. This is why there's such a split in the community. Once you look at the reasons why people moved on from GraphQL, you'll see that they moved on to alternatives to the old GraphQL, while the new GraphQL is only just starting to be adopted.

Let's take a look at the reasons why people left the old GraphQL behind and prove the above statement with data.

Why developers moved on from GraphQL to different solutions in 2024

GraphQL seems to be too complex for small projects

My impression is that there's the general sentiment that GraphQL is too complex for small projects. Indeed, GraphQL adds complexity to your project, but this complexity also comes with some benefits. GraphQL forces you to have a Schema, either by defining it upfront (schema-first) or by deriving it from your code (code-first). But a Schema also gives you a lot of benefits, like being able to generate types for your frontend, and being able to generate documentation.
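
To make that benefit concrete, here's a toy sketch of what "generating types from a Schema" means. Real tools like GraphQL Code Generator do this at scale from a full SDL document; this minimal version (all names invented for illustration) only handles a flat object type with scalar fields:

```typescript
// Illustrative only: a toy "codegen" that turns a flat GraphQL object type
// into a TypeScript type declaration, mimicking what real codegen tools do.
const scalarMap: Record<string, string> = {
  ID: "string",
  String: "string",
  Int: "number",
  Float: "number",
  Boolean: "boolean",
};

function generateType(name: string, fields: Record<string, string>): string {
  const lines = Object.entries(fields).map(([field, gqlType]) => {
    const required = gqlType.endsWith("!"); // GraphQL "!" means non-nullable
    const base = scalarMap[gqlType.replace("!", "")] ?? "unknown";
    return `  ${field}${required ? "" : "?"}: ${base};`;
  });
  return `export type ${name} = {\n${lines.join("\n")}\n};`;
}

// Example input: type User { id: ID! name: String email: String }
console.log(generateType("User", { id: "ID!", name: "String", email: "String" }));
```

Once types like these are generated for both server and client, the Schema stops being overhead and starts being the single source of truth both sides check against.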

It's worth noting that there's a lot of ongoing investment in making it easier to build GraphQL APIs. There's, for example, Grats, a GraphQL server framework that leverages TypeScript to generate the Schema for you.

The GraphQL ecosystem has come a long way since its inception, and it's only getting better.

Rate limiting GraphQL APIs seems to be hard

Is it really that hard to rate limit a GraphQL API? If you say "no", it's probably because you haven't dug deep enough into the problem.

Let's take a look at how you'd rate limit a REST API. You define a rate limit for a specific endpoint. When the user exceeds the rate limit, you return a 429 status code. Add some documentation on how often users can call each endpoint, and you're done.

Now let's take a look at some considerations you have to make when rate limiting a GraphQL API.

  • You can rate limit based on the complexity of the query (depth, number of fields, etc.)
  • You can rate limit based on the number of Subgraph requests
  • You can rate limit based on the actual load on your origin server (but how do you measure that?)
  • How will a user know how much they can request? How can they estimate the cost of their query?
  • How can a client automatically back off when they're rate limited?

It's not as simple as adding a middleware to a URL. That being said, there are solutions to these problems; for example, Cosmo Router is an open-source GraphQL API Gateway that comes with support for rate limiting.
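
As a rough illustration of the first bullet above, here's a minimal sketch of complexity-based rate limiting: estimate a query's depth and field count from the raw query text and reject anything over budget. A real gateway would compute this on the parsed AST, not on raw lines; everything here, including the limits, is simplified for illustration:

```typescript
// Toy cost estimator: counts selection depth and field count by scanning
// lines of a formatted GraphQL query. Real implementations walk the AST.
function estimateCost(query: string): { depth: number; fields: number } {
  let depth = 0, maxDepth = 0, fields = 0;
  for (const line of query.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.endsWith("{")) { depth++; maxDepth = Math.max(maxDepth, depth); }
    else if (trimmed === "}") depth--;
    else if (trimmed.length > 0) fields++; // a leaf field selection
  }
  return { depth: maxDepth, fields };
}

// Returns an error message when the query exceeds the budget, null otherwise.
function checkBudget(query: string, maxDepth = 5, maxFields = 50): string | null {
  const cost = estimateCost(query);
  if (cost.depth > maxDepth) return `query depth ${cost.depth} exceeds limit ${maxDepth}`;
  if (cost.fields > maxFields) return `field count ${cost.fields} exceeds limit ${maxFields}`;
  return null;
}

const query = `
query {
  user {
    friends {
      friends {
        name
      }
    }
  }
}`;
console.log(checkBudget(query, 3)); // "query depth 4 exceeds limit 3"
```

Even this toy version shows the difference to REST: the limit is a property of the query, not of a URL, so clients need a way to predict cost before sending the request.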

Small teams don't benefit from the upsides of GraphQL

Here are two tweets that illustrate that the "old" GraphQL doesn't bring enough benefits to small teams, when compared to other solutions like REST or RPC.

I think what the person is trying to say is that they are more familiar with REST. They are not making any particular point about REST being better than GraphQL, it's just that they seem to be more comfortable with REST, which is a very good reason to stick with a technology.

The second tweet is from myself. We're a single team, working on a single product, with one single monorepo. Our backend is a monolith, and we're using buf connect (gRPC over HTTP) to communicate between the monolith, the frontend, and a command-line tool.

We're using simple RPC over HTTP, and it's working great for us.

If we were using GraphQL, our workflow would look like this:

  1. Update the Schema
  2. Generate types for the Server
  3. Update Resolvers
  4. Generate types for the Client
  5. Update the Client

With RPC, our workflow looks like this:

  1. Update the proto file
  2. Generate types for client and server
  3. Update the server implementation

With RPC, both the Schema and the Query definitions are the same thing. If we were to add more and more services with completely different requirements, we'd probably move to GraphQL, but for now, RPC allows us to move fast and focus on the product.

REST gets the job done

I believe we're at the point where REST no longer gets the job done faster than GraphQL, provided you have a good understanding of the query language and use a modern GraphQL framework. However, many people have a lot of experience building RESTful APIs, so it's only natural that they stick with what they know.

But that's not the only factor at play. It's important to take into consideration who you're building the API for. Is your audience more familiar with REST or GraphQL? REST is usually a safe bet when you're building an API for people outside your team or company. GraphQL is always a bit of a bet, because you're raising the bar for people who want to use your API.

GraphQL doesn't play nice with the Web

There are a lot of valid points in this tweet. REST leverages the Web by building on top of HTTP, leveraging URLs, status codes, and headers.

GraphQL, on the other hand, uses HTTP POST requests as a tunnel, so you're losing a lot of the benefits of the Web, like built-in caching in the browser.

There are many solutions to add e.g. Caching to GraphQL, but it's not as straightforward as with REST.
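
One common workaround is persisted queries: the client registers a query once, then requests it by hash via GET, which makes the response cacheable by browsers and CDNs like any other GET URL. Here's a minimal sketch of the client side; the endpoint and parameter names are assumptions for illustration, not any specific server's API:

```typescript
import { createHash } from "node:crypto";

// Builds a cacheable GET URL for a pre-registered ("persisted") query.
// The server is assumed to look the query text up by its SHA-256 hash.
function persistedQueryUrl(endpoint: string, query: string, variables: object): string {
  const hash = createHash("sha256").update(query).digest("hex");
  const params = new URLSearchParams({
    queryId: hash, // hypothetical parameter name for this sketch
    variables: JSON.stringify(variables),
  });
  return `${endpoint}?${params}`;
}

const url = persistedQueryUrl(
  "https://api.example.com/graphql",
  "query User($id: ID!) { user(id: $id) { name } }",
  { id: "1" },
);
console.log(url);
```

Because the query text never varies for a given operation, the resulting URL is stable, and standard HTTP caching (Cache-Control, ETags, CDN edge caches) applies again.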

GraphQL creates a lot of security concerns

GraphQL does indeed require you to think about security in a different way than REST. GraphQL inherits all security concerns from HTTP-based APIs like REST, and then adds some more on top of that because of the flexibility of the query language.

As with rate limiting and caching, the GraphQL ecosystem is very mature and has solutions for security as well, but you can see that there's a theme here: A query-style API is fundamentally different from a URL-based API. The "Q" in GraphQL gives you a lot of power, and that comes at a cost.

You can just build a custom Endpoint for that

There are two ways to build a great API that performs well for every use case.

  1. You build a super flexible API (GraphQL) and use analytics, metrics, and tracing information to optimize the performance.
  2. You build very specific Endpoints that are optimized for each particular use case.

Which one is better? It doesn't just depend; maintaining the two kinds of APIs is simply a different experience.

With a very flexible API, your users can construct the queries they need. These might not always be the most efficient queries, but at least they can work independently from the team who's implementing the API.

With very specific Endpoints, you can build every endpoint in the most optimal way and won't have to worry about users constructing inefficient queries. However, you'll have to maintain all these custom Endpoints. In addition, whenever a new use case comes up, you might have to build a new Endpoint or modify an existing one. But if you extend an existing Endpoint, you're introducing the over-fetching problem that GraphQL solves.

There's no clear winner here, which also makes it clear that it's more complex than just saying "I can build a custom Endpoint for that".

You don't need GraphQL when all use cases are known

This is a wunderful tweet, because it's actually making a good point on what GraphQL is good for.

If all use cases are known, we can build the perfect REST API for that and be done with it.

Let's reverse that.

If not all use cases are known, we can build a flexible API that allows users to construct queries for use cases that we haven't thought of yet.

Using GraphQL leads to unnecessary complexity

I have empathy for this tweet. You use a technology that adds complexity to your project, and over time, it turns into a complete mess. We've all been there.

What I've learned is that it's usually not the technology that's at fault, but the way we use it. You can mess up the architecture of a REST API just as much as you can mess up the architecture of a GraphQL API. When a team is incapable of building a good REST API and then migrates to GraphQL, it would surprise me if they suddenly built a good GraphQL API.

Server Actions, TypeScript and a Cache are all you need

Especially for simpler use cases, new technologies like tRPC and React Server Components are a great alternative to GraphQL. They are much more lightweight and don't require you to build a Schema. They leverage the TypeScript compiler and ecosystem, which can lead to a superior developer experience.

GraphQL makes projects slower, more complex, and more likely to fail

This tweet carries a slightly negative sentiment, but we can take a look at the points made and see if they are valid.

First, I think it's generally wrong to say that GraphQL makes projects slower, more complex, and more likely to fail. This is a very broad statement that's lacking specifics and is hard to prove.

It goes on to say that 90% of GraphQL use cases can be handled by using a simple REST API. Why only 90%? There's almost no use case that you cannot model with REST.

Lastly, it says that GraphQL is overused by engineers who prioritize their well-being over user value and time to market. I don't actually see this as criticism of GraphQL, but rather as criticism of engineers who prioritize their personal goals, e.g. using a technology they like, over the goals of the company.

The last point is valid for any technology, not just GraphQL. We write software to solve problems for our users/customers (hopefully), not just to use the latest and greatest technology. That said, Engineers are also humans and want to enjoy their work. I think it's fine to use a technology that you enjoy as long as it doesn't hurt the company.

That migration to the latest version of NextJS with Server Actions, the rewrite using Rust, or the use of MongoDB instead of PostgreSQL, was it really necessary?

Luckily, we can stop these discussions very soon, when AI writes all the code for us.

Summary of why people moved on from GraphQL

Let's summarize the main reasons why people moved on from GraphQL:

  • GraphQL seems to be too complex for small projects
  • Rate limiting GraphQL APIs seems to be hard
  • Small teams don't benefit from the upsides of GraphQL
  • REST gets the job done
  • GraphQL doesn't play nice with the Web
  • GraphQL creates a lot of security concerns
  • You can just build a custom Endpoint for that
  • You don't need GraphQL when all use cases are known
  • Using GraphQL leads to unnecessary complexity
  • Server Actions, TypeScript and a Cache are all you need
  • GraphQL makes projects slower, more complex, and more likely to fail

We can condense these reasons into one sentence:

GraphQL is too much overhead, and there are simpler alternatives like REST and RPC.

Was that the nail in the coffin for GraphQL? Absolutely not! There are a lot of valid reasons why people moved on from GraphQL. I could go on forever, but you'll only get more of the same or similar reasons.

But there's a problem with the data we've looked at. These people are all talking about the "old" GraphQL, the GraphQL that's an alternative to the BFF pattern, the GraphQL that solves over-fetching, under-fetching, and other REST-related problems.

But what about this "new" GraphQL? Why are more and more companies embracing it? What are the problems it solves? What are the benefits? Let's take a look!

The 2.5 reasons why people embrace GraphQL in 2024

We'll soon lift the secret of the 0.5 reason, but first, let's take a look at the two main reasons why people embrace GraphQL in 2024.

I've interviewed hundreds of engineers, architects, engineering managers and CTOs over the last two years. When it comes to APIs, here's what I heard the most:

  • We have organically grown a lot of services and APIs
  • It's hard to discover what services and APIs are available
  • We probably have a lot of duplicate functionality, but we're not sure
  • We're using a lot of different technologies, protocols, and data formats and it's hard to keep up
  • We have integrated a lot of external services and APIs and it's getting out of hand
  • We have acquired companies and inherited their services and APIs

There's an overarching term for these problems: API sprawl.

Solving API sprawl: Decentralized API platforms with centralized governance

When we talk about API sprawl, the first thing that comes to mind is that we should invest in service discovery, documentation, API catalogs, API management, API style guides, and API governance.

So the intuitive solution to counter API sprawl is to put your existing REST, gRPC, SOAP, GraphQL, AsyncAPI, and other APIs into a catalog, properly document them, so that your organization can discover them easily, and then enforce some rules on how to build new APIs.

But have you thought about a completely different approach? What if we could unify all these APIs into one single unified API with a single Schema?

Imagine if the users of your API only had to use one single API, without worrying about what technology, protocol, or data format the underlying services use. Instead of dealing with a lot of different APIs and protocols, they only have to deal with one single API and one single protocol.

Suddenly, the "Q" in GraphQL becomes a lot more powerful again. If we're unifying all our APIs into one single Schema, we need a query language to select a subset of the Schema, and that's exactly what GraphQL is good at.

But how do we unify all our APIs into one single Schema, you ask? We've seen a lot of attempts to do this in the past, like the Enterprise Service Bus (ESB), API Gateways and API Management Platforms, Backend for Frontend (BFF) patterns, and more recently, GraphQL Schema Stitching.

What all of these solutions have in common is that they require a centralized component that's responsible for unifying, aggregating, and transforming the APIs into a single Schema. This centralized component quickly becomes a bottleneck, and history has shown many times that it's hard to scale and maintain.

Companies need a solution that's decentralized, where each team can work independently, but we're still able to enforce rules and standards across the organization.

What we need is a decentralized API platform with centralized governance.

Distributed GraphQL: Building APIs across different teams and services

There's a new pattern that's emerging in the GraphQL community over the last two years, and it's called Distributed GraphQL.

Distributed GraphQL has 2 main components:

  1. A composition algorithm that defines and implements rules on how to merge multiple GraphQL Schemas into a unified one
  2. A runtime that's responsible for executing the queries against the unified Schema

It's important to keep these two components separate. With previous attempts to unify APIs, composition and execution happened in a single step.

The problem with this approach is that you'll only know if your Graph composes correctly when you actually run it. This entails that you have to run the entirety of the Graph to know if it's correct, which doesn't work well once you have more than a few services.

With Distributed GraphQL, you can compose the Graph at build time. This means that you can change one of the Subgraphs in isolation and verify that it still composes correctly with the rest of the Graph.
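
To make build-time composition tangible, here's a toy sketch that merges two Subgraph schemas and reports conflicting field definitions before anything is deployed. Real composition specs (Apollo Federation, Open Federation) handle far more, like entity keys, shared types, and directives; this only shows why the check can happen at build time, with all names invented for illustration:

```typescript
// A Subgraph schema modeled as: type name -> field name -> field type.
type Subgraph = Record<string, Record<string, string>>;

// Merges Subgraphs into one schema, collecting errors for any field
// that two Subgraphs declare with different types.
function compose(subgraphs: Subgraph[]): { schema: Subgraph; errors: string[] } {
  const schema: Subgraph = {};
  const errors: string[] = [];
  for (const sg of subgraphs) {
    for (const [type, fields] of Object.entries(sg)) {
      schema[type] ??= {};
      for (const [field, fieldType] of Object.entries(fields)) {
        const existing = schema[type][field];
        if (existing && existing !== fieldType) {
          errors.push(`${type}.${field}: ${existing} conflicts with ${fieldType}`);
        } else {
          schema[type][field] = fieldType;
        }
      }
    }
  }
  return { schema, errors };
}

const users: Subgraph = { User: { id: "ID!", name: "String" } };
const reviews: Subgraph = { User: { id: "ID!", reviews: "[Review]" }, Review: { body: "String" } };
const { schema, errors } = compose([users, reviews]);
console.log(errors); // [] - composes cleanly; User now has id, name, and reviews
```

The point is that this check needs nothing but the Subgraph schemas as input: no running services, no traffic, just a CI step that fails fast when a change would break the unified Graph.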

But is it actually enough to test if the Graph composes correctly? Short answer: No.

We should also check if the resulting Graph has breaking changes, and if so, are clients affected by these changes?
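
A minimal sketch of that check could look like this: diff the old and new schema for removed fields, then cross-reference against client usage metrics to see whether any client is actually affected. The data shapes here are invented for illustration; real registries track arguments, type changes, and deprecations as well:

```typescript
// A schema snapshot modeled as: type name -> list of field names.
type Shape = Record<string, string[]>;

// Fields present in the old schema but missing in the new one.
function removedFields(oldSchema: Shape, newSchema: Shape): string[] {
  const removed: string[] = [];
  for (const [type, fields] of Object.entries(oldSchema)) {
    for (const field of fields) {
      if (!(newSchema[type] ?? []).includes(field)) removed.push(`${type}.${field}`);
    }
  }
  return removed;
}

// usage maps "Type.field" -> clients that requested it recently,
// e.g. collected from router-side schema usage metrics.
function affectedClients(removed: string[], usage: Record<string, string[]>): string[] {
  return [...new Set(removed.flatMap((f) => usage[f] ?? []))];
}

const oldSchema: Shape = { User: ["id", "name", "email"] };
const newSchema: Shape = { User: ["id", "name"] }; // email was removed
const removed = removedFields(oldSchema, newSchema);
console.log(removed); // ["User.email"]
console.log(affectedClients(removed, { "User.email": ["ios-app"] })); // ["ios-app"]
```

With real usage data behind it, this turns "is this change breaking?" into "is this change breaking for anyone who actually uses the field?", which is a much more useful question.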

In addition, we'd want to enforce linting rules on the changes we're making. This lets us govern the Graph and make sure that it's consistent and follows the rules we've set.

This might sound like a complex workflow, but is it really? Imagine we were using gRPC, REST, SOAP, and other APIs in a wild mix. How would we implement the same workflow?

The answer is that it's almost impossible and I've never seen it done in practice.

Distributed GraphQL is a game changer as it enables us to create a whole new workflow for building APIs across different teams and services.

But how do we actually implement Distributed GraphQL? Let's take a look at the other 1.5 reasons why people embrace GraphQL in 2024 and then look at how we can implement it.

GraphQL as a way to serve different clients

In large organizations, there's not just one client or application that's consuming an API. From the web, to mobile and IoT devices, to internal services and third party integrations, there can be many different clients that are consuming the same API.

The standard way to serve different clients is to build dedicated Endpoints for each client, which is nothing other than the Backend for Frontend (BFF) pattern: creating a dedicated backend that encapsulates the needs of a specific client.

If we try to draw a parallel to a GraphQL API, each BFF or Endpoint can be understood as a GraphQL Query of your unified Schema. Let's compare the two approaches to see the differences:

In the BFF approach, you have to build a dedicated backend for each client. When a new client requirement comes up, we have to change the code of the BFF and deploy it. If the BFF is not able to fulfill the requirement, we have to delegate the request to the owners of the underlying services. Once the services are updated with the new requirement, we have to update the BFF to include the new data in the response. Once the BFF is updated, we can update the client to consume the new data.

In the distributed GraphQL approach, we have a single unified Schema. When a new client requirement comes up, the client owner can modify an existing Query or create a new Query that fulfills the requirement. If the unified Graph is not able to fulfill the requirement, the client owner can ask the affected Subgraph owners to add a new field. The Subgraph gets updated, and the new field is published into the unified Graph. The client can now consume the new field.

In this workflow, there are a few differences to point out.

First, the client owner can modify the Query themselves. They don't have to ask the BFF owner to do it for them. This gives the client owner a lot of flexibility and independence. And even if the client owner and the BFF owner are the same person or team, it would still be a lot easier to modify the Query than to modify the BFF.

Second, as we described earlier, the composition step of the unified Graph allows us to verify changes to a service against all clients. With the BFF approach, you have to implement breaking change detection, client traffic analysis, and linting yourself.

In my opinion, the distributed GraphQL approach comes with 3 main benefits:

  1. Both client and service owners can work more independently
  2. The composition step allows us to verify changes of one service against all other services and clients
  3. While teams can work independently, we can still employ centralized governance, e.g. through linting rules

A side effect of this approach is that we only need one layer of Routers, which is the runtime that's responsible for executing the queries against the unified Schema. Compare this to the BFF approach, where we have to run, maintain, and operate one BFF per client.

Fragments-based GraphQL Clients: Giving your frontend developers superpowers to build maintainable user interfaces

I still believe that Relay gives frontend developers a far superior developer experience compared to other API styles. It's not just about the Fragments, but also about the way Relay handles caching, pagination, and other common frontend problems.

As Relay is already a well-known technology, I'd like to point you towards new clients that are emerging in the GraphQL community, for example Isograph.

Isograph is an opinionated framework for building interactive, data-driven apps. It makes heavy use of its compiler and of generated code to enable developers to quickly and confidently build stable and performant apps, while providing an amazing developer experience.

Isograph by Robert Balicki takes Fragments one step further. You can define "data-components", which are re-usable data fetching requirements to avoid code duplication.


Similar to Relay, Isograph comes with a compiler that generates types and code for you by analyzing your UI components and the iso tags.

The second honorable mention is gql.tada. This library takes a different approach compared to Isograph. Instead of using a compiler to generate types and code for you, gql.tada is essentially a TypeScript language service plugin that gives you autocompletion and type checking for your GraphQL queries in your favorite editor, without any code generation or compilation step.

The developer experience of gql.tada is phenomenal. You can use Fragments, infer the value of __typename, and take advantage of many more useful features.

So, why is this just a 0.5 reason? Let me explain.

In the past, the primary reason to use GraphQL was to solve over-fetching, under-fetching, and other REST-related problems. We've discussed this in the beginning of the post. There are now many alternatives like tRPC, RSC, and others that solve these problems in different ways. This is why I don't believe that clients like Isograph and gql.tada will change the game, but they play a crucial role in the GraphQL ecosystem.

As more and more companies adopt and expand their usage of distributed GraphQL, the need for a great developer experience for frontend developers will increase.

Distributed GraphQL will become the Enterprise standard for building APIs across different teams and services, and clients like Isograph and gql.tada show that the GraphQL ecosystem is ready to support this shift.

How to implement Distributed GraphQL

We've discussed the 2.5 reasons why people embrace GraphQL in 2024, and now it's time to take a look at how we can implement Distributed GraphQL.

The first thing you need to do is to define the composition algorithm. There are source-available solutions like Apollo Federation, and Open Source alternatives like Open Federation. We're developing Open Federation, and our goal is to keep the ecosystem open and interoperable. Everybody should be able to implement their own composition tooling and runtimes, and we should be able to use different solutions together.

You can find our implementation of the composition algorithm on GitHub; it's available in JavaScript/TypeScript and Go.

Once you have a composition workflow defined, you need a place to store the unified Schema and provide it to the runtime. This component is usually called a Schema Registry. An example can be found in the Cosmo GitHub repository as well; we call it the Control Plane.

Once you have your unified Schema published into the Schema Registry, you need a runtime that's responsible for executing queries against the unified Schema. This component is usually called a Router or Gateway. In our case, we call it the Cosmo Router.

Now we can start serving traffic to the unified Schema, but we're not done yet. It's like we're sitting in the cockpit of a plane without instruments to tell us if we're flying in the right direction. So we need a dashboard that gives us insights into the Graph, like breaking changes, client traffic, analytics, metrics, tracing, etc. This dashboard is usually called Studio, and you can find our implementation here.

But wait, we've forgotten something. Our Studio is only showing us the Graph, but we have no data to show. We need two more things to complete the picture. We need a GraphQL Metrics Collector, which is responsible for collecting Schema usage metrics from the Router. This allows us to compute whether a change to a Subgraph affects a client or not.

And lastly, we need an OTEL Collector. This component is responsible for collecting traces and metrics from the Router, your Subgraphs, clients, and other components in your infrastructure. The Collector aggregates all of this data and sends it to the Cosmo Control Plane and your observability platform of choice. This gives us the full picture of what's happening in our distributed GraphQL infrastructure.

If you put all of these components together, you have a complete solution for building APIs across different teams and services. We call this solution Cosmo, and it's available as open source on GitHub.

Conclusion

In my opinion, the main reason for adopting GraphQL is not to solve over-fetching, under-fetching, and other REST-related problems anymore. It's about solving API sprawl and building APIs across different teams and services. The GraphQL ecosystem is still thriving and has a lot of innovation to offer, and I'm excited to see where it's heading in the next few years.

What I'm most excited about is that once we have a unified Schema and a runtime to serve it, we can start building tooling around unified Graphs, which will further propel the growth of the ecosystem.

We're talking with many hyperscalers, and what we've found is that a lot of them want to run their distributed Graph platform fully on-premises. We believe that Open Source is the best approach to unite companies and fuel collaboration to solve similar problems.

Those were my 2.5 reasons why I think people embrace GraphQL in 2024. I hope you've enjoyed the read and learned something new.