Namespacing for GraphQL: Conflict-Free merging of any number of APIs
Namespacing is an essential concept in programming, allowing us to group things and prevent naming collisions. This post shows you how we apply the concept to APIs to make composition and integration of different services easier.
We'll show you how to integrate 8 services (the SpaceX GraphQL API, 4 GraphQL services using Apollo Federation, a REST API described by an OpenAPI Specification, a PostgreSQL-based API and a Planetscale-Vitess-based (MySQL) API) with just a couple of lines of code, fully automatically and without any conflicts.
When you install an npm package, it lives within its own namespace. One such package is axios, a very popular client for making HTTP requests.
To install axios, you run yarn add axios. This installs the axios dependency into your node_modules folder and adds it to your package.json file.
From now on, you can import and use the code provided by the axios package: import the dependency, give it a name (in this case just axios, as in import axios from 'axios'), then use it. We could just as well have renamed axios to bxios. Being able to rename an import is essential to dependency management, because it lets us avoid collisions.
One essential rule is that no two imports may share the same name. Otherwise you have a naming collision, and it's unclear how the program should be executed.
Should we run axios or bxios?
Alright, enough intro. You're probably familiar with all this already, so what does it have to do with APIs?
A lot! At least I think so. This whole workflow is amazing!
You can write code, package it up as an npm package, publish it, and others can import and use it very easily. It's such a nice way to collaborate using code.
What does this look like for APIs? Well, it's not such a well-oiled machine. With APIs, we're still in the stone age when it comes to this workflow.
Some companies offer an SDK which you can download and integrate. Others just publish a REST or GraphQL API. Some of them have an OpenAPI Specification, others just offer their own custom API documentation.
Imagine you had to integrate 8 services to get data from them. Why can't you just run something similar to yarn add axios and get the job done? Why is it so complicated to combine services?
The Problem - How to merge APIs conflict free
To get there, we have to solve a number of problems.
- We need to settle on a common language, a universal language to unify all our APIs
- We need to figure out a way to "namespace" our APIs to resolve conflicts
- We need a runtime to execute the "namespaced" Operations
Let's drill down into the problems one by one.
GraphQL: The universal API integration language
The first problem to solve is that we need a common language to base our implementation approach on. Without going off on a tangent, let me explain why GraphQL is a great fit for this purpose.
GraphQL comes with two very powerful features that are essential for our use case. On the one hand, it allows us to query exactly the data we need. This is very important when we're using a lot of data sources as we can easily drill down into the fields we are interested in.
On the other hand, GraphQL lets us easily build and follow links between types. E.g. you could have two REST Endpoints, one with Posts, another with Comments. With a GraphQL API in front of them, you can build a link between the two Objects and allow your users to get Posts and Comments with a single Query.
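To make the linking idea a bit more concrete, here's a minimal sketch in TypeScript. The two in-memory arrays stand in for the responses of the two REST Endpoints, and all names (Post, PostComment, postsWithComments) are assumptions made for illustration, not part of any real API.

```typescript
// Hypothetical shapes returned by two separate REST endpoints.
interface Post { id: number; title: string }
interface PostComment { id: number; postId: number; text: string }

// In-memory stand-ins for the Posts and Comments endpoints (illustrative data).
const posts: Post[] = [
  { id: 1, title: "Hello GraphQL" },
  { id: 2, title: "Namespacing APIs" },
];
const comments: PostComment[] = [
  { id: 10, postId: 1, text: "Nice intro" },
  { id: 11, postId: 2, text: "Very useful" },
  { id: 12, postId: 1, text: "Thanks!" },
];

// The result type a client would see: a Post with its Comments attached.
interface PostWithComments { id: number; title: string; comments: PostComment[] }

// The "link": for each Post, resolve the Comments whose postId matches.
// A GraphQL layer would do this when a client selects the comments of a post.
function postsWithComments(): PostWithComments[] {
  return posts.map((post) => ({
    id: post.id,
    title: post.title,
    comments: comments.filter((c) => c.postId === post.id),
  }));
}
```

The client asks once and gets both Objects back, while the join between the two endpoints stays hidden behind the API.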
On top of that, GraphQL has a thriving community, lots of conferences and people actively engaging, building tools around the Query language and more.
GraphQL and Microservices: Schema Stitching vs. Federation
That said, GraphQL also has a weakness when it comes to API integration. It doesn't have a concept of namespaces, making it a bit complex to use it for API integration, until now!
When it comes to service integration, there are so far two major approaches to solve the problem. For one, there is Schema Stitching and then there's also Federation.
With Schema Stitching, you can combine GraphQL services that are not aware of the stitching. Merging the APIs happens in a centralized place, a GraphQL API gateway, without the services being aware of this.
Federation, specified by Apollo, on the other hand proposes a different approach. Instead of centralizing the stitching logic and rules, federation distributes it across all GraphQL Microservices, also known as Subgraphs. Each Subgraph defines how it contributes to the overall schema, fully aware that other Subgraphs exist.
There's not really a "better" solution here. Both are good approaches to Microservices. They are just different. One favours centralized logic while the other proposes a decentralized approach. Both come with their own challenges.
That being said, the problem of service integration goes way beyond federation and schema stitching.
One Graph to rule them all, or not!
The number one pattern of Principled GraphQL is about integrity and states:
Your company should have one unified graph, instead of multiple graphs created by each team. By having one graph, you maximize the value of GraphQL:
- More data and services can be accessed from a single query
- Code, queries, skills, and experience are portable across teams
- One central catalog of all available data that all graph users can look to
- Implementation cost is minimized, because graph implementation work isn't duplicated
- Central management of the graph – for example, unified access control policies – becomes possible
When teams create their own individual graphs without coordinating their work, it is all but inevitable that their graphs will begin to overlap, adding the same data to the graph in incompatible ways. At best, this is costly to rework; at worst, it creates chaos. This principle should be followed as early in a company's graph adoption journey as possible.
Let's compare this principle to what we've learned about code above, you know, the example with axios and bxios.
More data and services can be accessed from a single query
Imagine there was one giant npm package per company with all the dependencies. If you wanted to add axios to your npm package, you'd have to manually copy all the code into your own library and make it "your own" package. This wouldn't be maintainable.
One single graph sounds great when you are in total isolation. In reality however, it means that you have to add all external APIs, all the "packages" that you don't control, to your one graph. This integration must be maintained by yourself.
Code, queries, skills, and experience are portable across teams
That's right. With just one graph, we can easily share Queries across teams. But is that really a feature? If we split our code into packages and publish them separately, it's easy for others to pick exactly what they need.
Imagine a single graph with millions of fields. Is that really a scalable solution? How about just selecting the sub-parts of a giant GraphQL schema that are really relevant to you?
One central catalog of all available data that all graph users can look to
With just one schema, we can have a centralized catalog, true. But keep in mind that this catalog can only represent our own API. What about all the other APIs in the world?
Also, why can't we have a catalog of multiple APIs? Just like npm packages which you can search and browse.
Implementation cost is minimized, because graph implementation work isn't duplicated
I'd argue that the opposite is true. Especially with Federation, the proposed solution by Apollo to implement a Graph, it becomes a lot more complex to maintain your Graph. If you want to deprecate type definitions across multiple Subgraphs, you have to carefully orchestrate the change across all of them.
Microservices are not really micro if there are dependencies between them. This pattern is better described as a distributed monolith.
Central management of the graph – for example, unified access control policies – becomes possible
It's interesting what should be possible but isn't reality. We have yet to see a centralized access control policy system that adds role-based access controls for federated graphs. Oh, this is actually one of our features, but let's not talk about security today.
Why the One Graph principle doesn't make sense
Building one single Graph sounds like a great idea when you're isolated on a tiny island with no internet. You're probably not going to consume and integrate any third-party APIs.
Anybody else who is connected to the internet will probably want to integrate external APIs. Want to check sales using the Stripe API? Send emails via Mailchimp or Sendgrid? Do you really want to add these external services manually to your "One Graph"?
The One Graph principle fails the reality check. Instead, we need a simple way to compose multiple Graphs!
The world is a diverse place. There are many great companies offering really nice products via APIs. Let's make it easy to build integrations without having to manually add them to our "One Graph".
GraphQL Namespacing: Conflict-free merging of any number of APIs
That leads us to our second problem, naming conflicts.
Imagine that both stripe and mailchimp define the type Customer, but both of them have a different understanding of the Customer, with different fields and types.
How could both Customer types co-exist within the same GraphQL Schema? As proposed above, we steal a concept from programming languages: namespaces!
How to accomplish this? Let's break down this problem a bit more. As GraphQL has no out-of-the-box namespacing feature, we have to be a bit creative.
First, we have to remove any naming collisions for the types. This can be done by suffixing each "Customer" type with the namespace. So, we'd have "Customer_stripe" and "Customer_mailchimp". First problem solved!
Another issue we could run into is field naming collisions on the root operation types, that is, on the Query, Mutation and Subscription type. We can solve this problem by prefixing all fields, e.g. "stripe_customer(by: ID!)" and "mailchimp_customer(by: ID!)".
Finally, we have to be careful about another feature of GraphQL, often ignored by other approaches to this problem, Directives!
What happens if two Schemas each define a directive called @formatDateString, but with a different meaning? Wouldn't that lead to unpredictable execution paths? Yes, probably. Let's also fix that.
We can rename the directive to @stripe_formatDateString and @mailchimp_formatDateString respectively. This way, we can easily distinguish between the two.
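The three renaming rules can be sketched in a few lines of TypeScript. This is not how WunderGraph implements it. It's a deliberately tiny model that only tracks the names that can collide, and SchemaNames, applyNamespace and collisions are hypothetical helpers made up for this illustration.

```typescript
// A deliberately tiny schema model: only the names that can collide.
interface SchemaNames {
  types: string[];      // type names, e.g. "Customer"
  rootFields: string[]; // fields on Query, Mutation and Subscription
  directives: string[]; // directive names without the leading "@"
}

// Apply the three rules: suffix types, prefix root fields, prefix directives.
function applyNamespace(ns: string, schema: SchemaNames): SchemaNames {
  return {
    types: schema.types.map((t) => `${t}_${ns}`),
    rootFields: schema.rootFields.map((f) => `${ns}_${f}`),
    directives: schema.directives.map((d) => `${ns}_${d}`),
  };
}

// Return every name that appears in both schemas within the same category.
function collisions(a: SchemaNames, b: SchemaNames): string[] {
  const overlap = (x: string[], y: string[]) => x.filter((n) => y.indexOf(n) !== -1);
  return overlap(a.types, b.types)
    .concat(overlap(a.rootFields, b.rootFields))
    .concat(overlap(a.directives, b.directives));
}

// Both APIs define the same names with different meanings.
const stripe: SchemaNames = {
  types: ["Customer"],
  rootFields: ["customer"],
  directives: ["formatDateString"],
};
const mailchimp: SchemaNames = {
  types: ["Customer"],
  rootFields: ["customer"],
  directives: ["formatDateString"],
};

// Without namespaces: ["Customer", "customer", "formatDateString"]
const rawCollisions = collisions(stripe, mailchimp);

// With namespaces: []
const namespacedCollisions = collisions(
  applyNamespace("stripe", stripe),
  applyNamespace("mailchimp", mailchimp),
);
```

Because every renamed identifier carries its namespace, two schemas can only collide if they were given the same namespace.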
With that, all naming collisions should be solved. Are we done yet? Actually not. Unfortunately, with our solution we've created a lot of new problems!
WunderGraph: A runtime to facilitate namespaced GraphQL
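By renaming all types and fields, we've actually caused a lot of trouble. Let's have a look at a Query that uses the root field "mailchimp_customer", the directive "@mailchimp_formatDateString" and the type "PaidCustomer_mailchimp".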
What are the problems here?
The field "mailchimp_customer" doesn't exist on the Mailchimp schema, we have to rename it to "customer".
The directive "mailchimp_formatDateString" also doesn't exist on the Mailchimp Schema. We have to rename it to "formatDateString" before sending it to the upstream. But be careful about this! Make sure this directive actually exists on the origin. We're automatically checking if this is the case as you might accidentally use the wrong directive on the wrong field.
Lastly, the type definition "PaidCustomer_mailchimp" also doesn't exist on the origin schema. We have to rename it to "PaidCustomer", otherwise the origin wouldn't understand it.
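Here's a rough sketch of these three rewrites in TypeScript. The Query string is a hypothetical one that merely matches the description above: the selected fields, the argument value and the format argument are made up. The function toUpstreamQuery is an illustrative helper, not the WunderGraph implementation, and it works on plain strings, whereas a real gateway would rewrite the parsed Query instead.

```typescript
// Naive, string-based sketch. Assumes the namespace is purely alphanumeric,
// so it can be placed into a regular expression without escaping.
function toUpstreamQuery(ns: string, query: string, upstreamDirectives: string[]): string {
  // 1. Make sure every namespaced directive actually exists on the origin.
  const directivePattern = new RegExp("@" + ns + "_(\\w+)", "g");
  let match: RegExpExecArray | null;
  while ((match = directivePattern.exec(query)) !== null) {
    if (upstreamDirectives.indexOf(match[1]) === -1) {
      throw new Error(`directive @${match[1]} is not defined by upstream "${ns}"`);
    }
  }
  // 2. Strip the prefix from root fields and directives:
  //    mailchimp_customer -> customer, @mailchimp_formatDateString -> @formatDateString
  const withoutPrefix = query.replace(new RegExp("\\b" + ns + "_(\\w+)", "g"), "$1");
  // 3. Strip the suffix from type names: PaidCustomer_mailchimp -> PaidCustomer
  return withoutPrefix.replace(new RegExp("(\\w+)_" + ns + "\\b", "g"), "$1");
}

// Hypothetical namespaced Query, as a client of the merged schema would write it.
const namespacedQuery = `{
  mailchimp_customer(by: "1") {
    id
    createdAt @mailchimp_formatDateString(format: "2006-01-02")
    ... on PaidCustomer_mailchimp { pricePlan }
  }
}`;

// The Query as the Mailchimp origin would understand it.
const upstreamQuery = toUpstreamQuery("mailchimp", namespacedQuery, ["formatDateString"]);
```

The directive check runs before any renaming, so a directive used on the wrong API is rejected instead of being silently forwarded to an origin that doesn't know it.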
Sounds like a lot of work? Well, it's already done and you can use this right away. Just type yarn global add @wundergraph/wunderctl into your terminal, and you're ready to try it out!
It's also going to be open source very soon. Make sure to sign up and get notified when we're ready!
With that, we're ready for the implementation phase.
Importing the API Dependencies
In the first step, we've got to "import" our API Dependencies. We can do so by using the WunderGraph SDK. Simply "introspect" all the different services and combine them into an "application".
If you look at the code, you'll probably notice the keyword "apiNamespace" a few times. The "apiNamespace" puts each API into its own boundary. This way, naming collisions are automatically avoided.
Once you've introspected all dependencies, we're ready to write a query that spans all 8 services.
We want to get the users from the SpaceX API, users from the JSON Placeholder API, more users from our PostgreSQL Database, yet more users from the Planetscale Database, and finally, a single user with reviews and products from a federated Graph.
All this is possible through our rich set of DataSources.
Notice how it makes use of the prefixed/namespaced root fields. This Query gives us data from all 8 services at once, crazy!
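To illustrate why the prefixes are enough for a runtime to know where each root field belongs, here's a small sketch in TypeScript. The namespace names and root fields are assumptions made for this example (the post doesn't spell out the exact values), and routeRootField and planOperation are hypothetical helpers, not part of the WunderGraph SDK.

```typescript
// Hypothetical namespaces, one per imported API.
const knownNamespaces = ["spacex", "jsp", "postgres", "planetscale", "federated"];

interface RoutedField { namespace: string; upstreamField: string }

// Find the namespace a root field belongs to and recover its original name.
function routeRootField(field: string, namespaces: string[]): RoutedField {
  for (const ns of namespaces) {
    const prefix = ns + "_";
    if (field.indexOf(prefix) === 0 && field.length > prefix.length) {
      return { namespace: ns, upstreamField: field.slice(prefix.length) };
    }
  }
  throw new Error(`root field "${field}" does not belong to a known namespace`);
}

// Group the root fields of one operation by the API that has to resolve them.
function planOperation(rootFields: string[], namespaces: string[]): Record<string, string[]> {
  const plan: Record<string, string[]> = {};
  for (const field of rootFields) {
    const routed = routeRootField(field, namespaces);
    const fields = plan[routed.namespace] || [];
    fields.push(routed.upstreamField);
    plan[routed.namespace] = fields;
  }
  return plan;
}

// Three illustrative root fields from one Query:
// { spacex: ["users"], jsp: ["users"], federated: ["me"] }
const examplePlan = planOperation(["spacex_users", "jsp_users", "federated_me"], knownNamespaces);
```

Two APIs can both expose a field called users, and the plan still sends each one to the right origin under its original name.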
Now run wunderctl up to get the whole thing running in a couple of seconds. This works on your local machine without calling any cloud services.
Summary
We've started this post talking about how namespacing makes writing and sharing code so easy. Then we've explored the differences between the "code approach" and having to deal with API integrations.
We explored Schema Stitching as well as Federation and learned that both are good approaches but not enough. We've looked into the "One Graph" principle and realised that it has its shortcomings.
Finally, we've introduced the concept of namespaced GraphQL APIs, making it possible to combine GraphQL, Federation, REST, PostgreSQL and Planetscale APIs in just a couple of lines of code.
If you're interested in seeing all this in action, here's a video of me going through the whole flow:
Our vision for WunderGraph is to become the "Package Manager for APIs". We're not quite there yet, but you'll eventually be able to run wunderctl integrate stripe/stripe, then write a Query or Mutation and the integration is done.
If you want to follow along, sign up for our newsletter. See you next time!