What happens if we treat GraphQL Queries as the API definition?

Jens Neuse

You've probably read the title twice, and it still doesn't make sense. One very powerful feature of GraphQL is that we have a schema. Why would you want to give this up and how should we treat Queries like Schemas? Let me explain...

A while ago, I was working on a library called graphql-go-tools which implements the GraphQL specification in Go, so that one can build tools like GraphQL proxies, caches, WunderGraphs etc. on top of it. I was investigating how persisted Queries could help to make GraphQL APIs more secure and performant when I came across a video of a very smart person, Sean Grove.

Sean was presenting how he used an extended version of GraphiQL. First, he wrote a number of Queries in the browser. Then he clicked a button and generated a fully functional React application. With the help of CodeSandbox, you were immediately able to run the app from within GraphiQL.

There were a few things I didn't agree with. My thoughts were that this shouldn't be embedded into the browser. It needs to work in a local dev environment. It should be easy to embed this into the development workflow.

One thing stood out for me:

An application built on top of GraphQL actually has two schemas

One is obvious: it's the schema returned by an introspection Query. It tells you about all the Queries, Mutations and Subscriptions that are available on a GraphQL server.

The second schema is kind of implicit. If you take all the GraphQL Operations that are being used in an application, you're able to extract a description of all possible inputs and outputs of all of them.

If we assume that all GraphQL Operations are persisted by default, we're able to generate an OpenAPI specification for the resulting API.

Don't worry if it doesn't immediately click. Let's look at an example to illustrate the idea.

Let's assume we're building a Hacker News clone and want to build the landing page where we have to fetch a feed of the newest stories. Our Query (simplified) could look like this:
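
Here's a minimal sketch of what such a Query might look like, held in a TypeScript constant the way a tool could store a persisted Operation. Only the operation name `Landing` and the `page` variable come from the text; the `feed` field and its `id`, `title` and `url` sub-fields are assumptions made for illustration.

```typescript
// Hypothetical Landing Query for a Hacker News clone.
// The field names (feed, id, title, url) are illustrative assumptions.
const landingQuery = `
query Landing($page: Int!) {
  feed(page: $page) {
    id
    title
    url
  }
}
`;

// The only input the Operation accepts: a required page number.
interface LandingVariables {
  page: number;
}

const firstPage: LandingVariables = { page: 1 };
```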

Persisting a Query means that the Query is stored on the server under a name, e.g. a hash. A client can only call this Query by sending the name (hash) and the variables.

If we persist the Landing Query, the resulting (REST-ish) endpoint could look like this:

An alternative way of representing this endpoint could look like this:
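
Both representations can be sketched with two small helpers. The `/api/operations` path prefix and the `variables` parameter name are assumptions for this example, not a documented route layout.

```typescript
// Hypothetical path prefix; the real route layout is an assumption.
const OPERATIONS_PATH = "/api/operations";

type Variables = Record<string, string | number | boolean>;

// Form 1: every variable becomes a plain query parameter.
function plainParamsUrl(operation: string, variables: Variables): string {
  const query = Object.keys(variables)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(variables[key]))}`)
    .join("&");
  return `${OPERATIONS_PATH}/${operation}?${query}`;
}

// Form 2: all variables travel as one JSON-encoded object.
function jsonVariablesUrl(operation: string, variables: Variables): string {
  const encoded = encodeURIComponent(JSON.stringify(variables));
  return `${OPERATIONS_PATH}/${operation}?variables=${encoded}`;
}

const plain = plainParamsUrl("Landing", { page: 1 });
// "/api/operations/Landing?page=1"
const json = jsonVariablesUrl("Landing", { page: 1 });
// "/api/operations/Landing?variables=%7B%22page%22%3A1%7D"
```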

You'll soon understand why it's an advantage to use a JSON encoded object in the URL instead of plain query parameters.

Now that the concept of persisted Queries is clear, let's build on top of that knowledge.

We have a REST endpoint. We know the exact input variables. From the underlying GraphQL schema, we're also able to extract the exact structure of the response. Both the input variables and the response can be represented as JSON.

Luckily, there's a very good vocabulary to define the structure of JSON documents: JSON Schema!

JSON Schema has a lot of benefits. One of them is that implementations exist in many languages to validate a JSON document against a JSON Schema.

Let's look at an example. Here's the JSON schema for the Landing Query REST Endpoint:

A JSON Object is expected. It should have the property page which is of type number. The property page is required. Other properties are disallowed.
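
The rules just described map onto a schema along these lines. The small validator next to it is a hand-rolled sketch that only understands the keywords used here; in practice you'd reach for an off-the-shelf JSON Schema library instead. The names `landingInputSchema` and `validateLandingInput` are made up for this example.

```typescript
// JSON Schema for the Landing variables, following the description above.
const landingInputSchema = {
  type: "object",
  properties: {
    page: { type: "number" },
  },
  required: ["page"],
  additionalProperties: false,
};

// Hand-rolled check covering only the keywords used in this schema.
function validateLandingInput(input: unknown): string[] {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["input must be a JSON object"];
  }
  const record = input as Record<string, unknown>;
  const errors: string[] = [];
  // required: every listed property has to be present
  for (const key of landingInputSchema.required) {
    if (!(key in record)) errors.push(`missing required property "${key}"`);
  }
  // additionalProperties: false, so unknown keys are rejected
  for (const key of Object.keys(record)) {
    if (!(key in landingInputSchema.properties)) {
      errors.push(`unexpected property "${key}"`);
    }
  }
  // type: "number" for the page property
  if ("page" in record && typeof record["page"] !== "number") {
    errors.push('property "page" must be a number');
  }
  return errors;
}

const ok = validateLandingInput({ page: 1 });               // no errors
const missing = validateLandingInput({});                   // page is required
const extra = validateLandingInput({ page: 1, limit: 10 }); // limit is not allowed
```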

If you know JSON Schema a bit better, you'd know that it allows you to define a lot more than just the structure of the JSON.

Let's say we'd like to build an API that only accepts positive integers. We could change the schema slightly to make this work.
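
One way to express this, sketched under the assumption that we switch the type to `integer` and add a `minimum` of 1. The `isValidPage` helper only mirrors those two keywords by hand and is not a general validator.

```typescript
// Same shape as the schema described above, but page must now be
// an integer of at least 1.
const positivePageSchema = {
  type: "object",
  properties: {
    page: { type: "integer", minimum: 1 },
  },
  required: ["page"],
  additionalProperties: false,
};

// Minimal check for the two keywords added to the page property.
function isValidPage(page: unknown): boolean {
  const minimum = positivePageSchema.properties.page.minimum;
  return (
    typeof page === "number" &&
    isFinite(page) &&
    Math.floor(page) === page && // "integer"
    page >= minimum              // "minimum"
  );
}
```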

GraphQL has no such rules. You could extend your GraphQL server by using custom directives. How about relying on an existing standard?

JSON Schema

In WunderGraph, we generate a JSON Schema automatically from the Queries you write. This way, you could easily modify the resulting schema to add rules like defining a minimum.

Alright, we've covered the input variables. Let's have a look at the JSON Schema for the response object:
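
As an illustration only, here's what such a response schema could look like if the Landing Query selected `id`, `title` and `url` on a `feed` field. Those field names, and which of them are required, are assumptions. The interface underneath shows the TypeScript shape a code generator could derive from it.

```typescript
// Hypothetical response schema for the Landing Query.
const landingResponseSchema = {
  type: "object",
  properties: {
    data: {
      type: "object",
      properties: {
        feed: {
          type: "array",
          items: {
            type: "object",
            properties: {
              id: { type: "string" },
              title: { type: "string" },
              url: { type: "string" },
            },
            required: ["id", "title"],
          },
        },
      },
    },
  },
};

// The TypeScript shape a code generator could derive from that schema.
interface LandingResponse {
  data?: {
    feed?: Array<{ id: string; title: string; url?: string }>;
  };
}

const sample: LandingResponse = {
  data: { feed: [{ id: "1", title: "Show HN: Persisted Queries" }] },
};
```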

It looks pretty basic, yet it's super powerful to have it. GraphQL is "quite new" and niche compared to JSON Schema. There are a lot of tools that can pick up JSON Schemas and do useful things with them.

You might think that by turning GraphQL into REST, you're losing on type safety or ease of use.

From my point of view, the opposite is the case.

We can take the JSON Schemas for all our Operations and feed them into code-generators. This way, we're able to generate a fully type-safe client for all our Operations.

Write a Query, extract the JSON Schema by visiting all nodes of the GraphQL Query AST. Take the JSON Schema and generate the client in any possible language.
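
The extraction step can be sketched as a recursive walk over the selection set. The `FieldNode` and `TypeMap` structures below are simplified stand-ins I made up for this example; a real tool would work on the AST of a proper GraphQL parser together with the introspected schema, and would also handle nullability, arguments and fragments.

```typescript
// Simplified stand-ins for a GraphQL AST node and a schema lookup table.
interface FieldNode {
  name: string;
  selections?: FieldNode[];
}

interface FieldType {
  type: string;   // named GraphQL type, e.g. "String" or "Story"
  list?: boolean; // true for list types such as [Story]
}

type TypeMap = Record<string, Record<string, FieldType>>;

type JsonSchema = {
  type: string;
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
};

// Built-in GraphQL scalars and their JSON Schema counterparts.
const scalars: Record<string, string> = {
  ID: "string", String: "string", Int: "integer", Float: "number", Boolean: "boolean",
};

// Visit every selected field and emit the JSON Schema for its value.
function schemaForSelections(parentType: string, selections: FieldNode[], types: TypeMap): JsonSchema {
  const properties: Record<string, JsonSchema> = {};
  for (const field of selections) {
    const fieldType = types[parentType][field.name];
    const inner: JsonSchema = field.selections
      ? schemaForSelections(fieldType.type, field.selections, types)
      : { type: scalars[fieldType.type] };
    properties[field.name] = fieldType.list ? { type: "array", items: inner } : inner;
  }
  return { type: "object", properties };
}

// Hypothetical schema and selection set for a Landing Query.
const types: TypeMap = {
  Query: { feed: { type: "Story", list: true } },
  Story: { id: { type: "ID" }, title: { type: "String" }, url: { type: "String" } },
};
const landingSelections: FieldNode[] = [
  { name: "feed", selections: [{ name: "id" }, { name: "title" }, { name: "url" }] },
];

const responseSchema = schemaForSelections("Query", landingSelections, types);
```

From here, `responseSchema` is plain JSON Schema, so any JSON Schema based code generator can take over.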

This is exactly the flow we've implemented in WunderGraph. The end result is a developer experience that feels like you're using GraphQL but under the hood, it's a lot more secure and lightweight.

The generated client only needs to know the endpoint to call. The type definitions, e.g. when using TypeScript, are stripped at compile time and never reach the browser. This is also the reason why the WunderGraph client is just 2kb of JavaScript without giving up on features or developer experience.

Alright, that's already a lot of value we're getting from this pattern. What else is possible? I'd like to give you some inspiration and ideas to think about.

Better input validation for GraphQL

Concepts like persisted Queries are widely known within the GraphQL community. We've already introduced them at the beginning of this post.

If you add JSON Schemas for your inputs, you're able to use a library off-the-shelf to validate the variables for all Operations. If you hook this up into a middleware, you can make your GraphQL server a lot more secure. Expecting a URL as input and not just a plain String? Define the format using JSON Schema, and you're done.
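
A rough sketch of such a middleware, assuming a validator is simply a function that returns a list of errors. The `requireHttpUrl` check stands in for a schema like `{ type: "string", format: "uri" }`, and its regular expression is a deliberately loose approximation rather than a full URI check. All names here are hypothetical.

```typescript
// A validator returns a list of human-readable errors; empty means valid.
type Validator = (variables: Record<string, unknown>) => string[];
type Handler<T> = (variables: Record<string, unknown>) => T;

interface Result<T> {
  status: number;
  body: T | { errors: string[] };
}

// Wrap an Operation handler so invalid variables never reach the resolvers.
function withValidation<T>(validate: Validator, handler: Handler<T>): Handler<Result<T>> {
  return (variables) => {
    const errors = validate(variables);
    if (errors.length > 0) {
      return { status: 400, body: { errors } };
    }
    return { status: 200, body: handler(variables) };
  };
}

// Stand-in for a schema such as { type: "string", format: "uri" }.
const requireHttpUrl: Validator = (variables) => {
  const url = variables["url"];
  return typeof url === "string" && /^https?:\/\/\S+$/.test(url)
    ? []
    : ['"url" must be an http(s) URL'];
};

const submitStory = withValidation(requireHttpUrl, (variables) => ({ saved: variables["url"] }));
const accepted = submitStory({ url: "https://example.com/post" }); // status 200
const rejected = submitStory({ url: "not a url" });                // status 400
```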

Generate Forms

There's a library called react-jsonschema-form. It takes a JSON Schema and generates forms for your React application.

As we recall, the JSON Schema can be generated just by writing GraphQL Queries. This means we're able to build ready-to-use UI components by writing Queries. That's super powerful if you ask me!

You could turn any GraphQL API into an admin dashboard by writing Queries.

Guess what we're working on at WunderGraph? We allow you to connect any API, Service, Database and mesh them all together to a single GraphQL API. We add a layer of authentication and let you generate fully functional dashboards by writing Queries.

Advanced Authentication & Authorization patterns

Speaking of authentication, if we add an authentication middleware in front of the REST-GraphQL-API, we get another super powerful feature, almost for free.

A very simple and broadly used pattern for authentication is to delegate the heavy lifting to another party that is focused on logging users in. Such services are called identity providers. So if we want to authenticate a user, we redirect them to the identity provider. If all goes well, the identity provider returns the user back to our service with some information on them. This information is called claims. Keep in mind that the description of the login flow is overly simplified.

Claims are name-value pairs of information about the user, e.g. their name, email or role.

What can we do with the claims?

Here's an example use-case from a NextJS PostgreSQL Demo.
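
A sketch of what a mutation of this kind might look like, stored as a TypeScript constant. Only the `@fromClaim` annotations on `$email` and `$name` are taken from the text; the directive arguments and the field names (`createOneMessage`, `connectOrCreate` and so on) are assumptions.

```typescript
// Hypothetical mutation; field and argument names are assumptions.
const addMessageMutation = `
mutation AddMessage(
  $email: String! @fromClaim(name: EMAIL)
  $name: String! @fromClaim(name: NAME)
  $message: String!
) {
  createOneMessage(
    data: {
      message: $message
      user: {
        connectOrCreate: {
          where: { email: $email }
          create: { email: $email, name: $name }
        }
      }
    }
  ) {
    id
    message
  }
}
`;

// What the client is allowed to send: everything else comes from claims.
interface AddMessageInput {
  message: string;
}

const clientInput: AddMessageInput = { message: "Hello!" };
```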

This mutation creates a message on behalf of the user. If there's already a user object in the database, it's looked up via their email. If they don't yet exist in the database, an entry is created and connected.

You can see that both $email and $name are annotated using a custom @fromClaim directive. This directive is a custom implementation of WunderGraph. You could easily implement a similar functionality with any GraphQL implementation/framework.

What happens at runtime? First, we can declare this endpoint to require the user to be authenticated. That is, if they are not logged in, we'd return a 401 Unauthorized. Next, before actually calling the resolvers, we take the values of the email and name claims and inject them into the variables.

We can save a lot of custom business logic by using this pattern. It's a declarative approach that makes it very easy to understand what values will be injected into a Query. Otherwise, this logic would be hidden somewhere in the code base, where it's hard to find.

If your API, like most, is a private API, you're able to design the GraphQL API a lot more flexibly.
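
The runtime steps can be sketched like this. It's not WunderGraph's actual implementation; the `claimVariables` mapping and the `injectClaims` function are names I made up to show the idea.

```typescript
type Claims = { email?: string; name?: string };
type ClaimName = keyof Claims;

// Which variables are filled from which claim, mirroring the directives on
// $email and $name. The mapping format is an assumption for this sketch.
const claimVariables: Record<string, ClaimName> = { email: "email", name: "name" };

interface Prepared {
  status: number;
  variables?: Record<string, unknown>;
}

// Runs before the resolvers: reject anonymous callers, then overwrite any
// client-supplied values with the ones from the user's claims.
function injectClaims(user: Claims | null, clientVariables: Record<string, unknown>): Prepared {
  if (user === null) {
    return { status: 401 };
  }
  const variables: Record<string, unknown> = { ...clientVariables };
  for (const variable of Object.keys(claimVariables)) {
    variables[variable] = user[claimVariables[variable]];
  }
  return { status: 200, variables };
}

const anonymous = injectClaims(null, { message: "hi" });
const loggedIn = injectClaims(
  { email: "jens@example.com", name: "Jens" },
  { message: "hi", email: "someone-else@example.com" },
);
```

Note that the email the client tried to send is overwritten, so a user can't post a message under someone else's identity.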

This leads to another advantage.

Another layer of abstraction

I've previously spoken about why I think GraphQL is not meant to be exposed over the internet. There are a lot of problems to solve when you're about to bring your GraphQL API into production. On the other hand, if you're not directly exposing GraphQL, you have far fewer problems to solve.

In another post, I've argued that generated GraphQL APIs result in tight coupling between the services that inform a generated API and the clients using the API.

If you look at the example above, the one where we create a message on behalf of a user, it's easy to see that we cannot expose this API directly.

There needs to be a layer of abstraction on top of it to make it useful. Otherwise, users could create messages on behalf of other users, no login required.

No Backend Required

Keeping Queries entirely on the backend, adding authentication and authorization as well as the ability to inject claims is not a replacement for implementing custom business logic in your backend.

However, as you saw in some examples above, you're able to solve a broad set of use cases without any additional business logic.

For situations where @fromClaim directives are not enough, there's always an easy escape route by adding a REST or GraphQL API.

Your own Backend as a Service

Using the approach described in this post, we're able to leverage a whole new set of tools to its full capacity.

Services like Hasura, FaunaDB, Dgraph and many others directly expose GraphQL or REST APIs on top of a database. With the help of WunderGraph, you're able to turn any such service into your own Backend as a Service (BaaS).

You'll get a generated type-safe client with authentication and authorization on top. This gives you the convenience of using a BaaS, combined with the flexibility and power of owning your own data(-base).

Summary

I hope you're able to take some inspiration on what's possible when we treat GraphQL Queries as the API definition.

I don't think this will replace the way we currently use GraphQL APIs though. For public APIs such as GitHub's or Shopify's, it doesn't even make sense. However, as most of us use GraphQL as a private API, this pattern can help us to build better apps faster and make them more secure.

It can be especially useful when combined with generated APIs or APIs directly exposed from a database.

If you have other great ideas for how to use this pattern, please reach out via Twitter. I'd be very interested to hear your thoughts.

Try it out yourself!

If you're keen to try out the described concepts yourself, have a look at this repository. It demonstrates how to build a real-time chat application with NextJS as the frontend and a PostgreSQL database as the backend.