Solving the double (quintuple) declaration Problem in GraphQL Applications: How to not repeat yourself!
In this post, we'll look into the double declaration problem of (GraphQL) APIs in Web Applications. First, I'll define the problem and show that it should actually be called the quintuple declaration problem. Once the problem is clear, I'll propose an optimal solution.
Who is suffering from the double declaration problem? Essentially, everybody who is writing Web Applications has to deal with the problem. So, if you're a web developer, or you manage web developers, this post is for you!
There's also a video accompanying this blog post if you're interested in seeing everything in action: https://youtu.be/ftu8MM5BmGE
Why the double declaration problem should be called the quintuple declaration problem instead
Alright, what's the double declaration problem, why is it important, and why is the name actually misleading?
If you ask Google about the GraphQL double declaration problem, you'll get a few good results which explain the issue well. In a nutshell, it describes the problem that as a developer, you have to define the GraphQL Operation (e.g. Query or Mutation) as well as the type definitions for the inputs and the response. This is a problem because you have to keep the two in sync. If you add a field to your Query, you also have to add the field to the response structure, e.g. a TypeScript interface. This is not only boring extra work but also error-prone, because every non-automated task is a good candidate for errors.
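To make the problem tangible, here's a minimal sketch of the two declarations living side by side. The Query, its fields and the sample data are made up for illustration; they're not taken from a real schema.

```typescript
// Declaration 1: the GraphQL Operation (field names are made up for this example).
const postsQuery = `
  query Posts {
    posts {
      id
      message
    }
  }
`;

// Declaration 2: the hand-written response type that has to mirror the Query.
interface PostsResponse {
  posts: { id: number; message: string }[];
}

// Nothing connects the two declarations. The cast only claims that the
// JSON has this shape, it doesn't check it.
function parsePostsResponse(json: string): PostsResponse {
  return JSON.parse(json) as PostsResponse;
}

// Imagine "author" was added to the Query but not to the interface:
// the value arrives at runtime, yet the compiler doesn't know about it.
const response = parsePostsResponse(
  '{"posts":[{"id":1,"message":"hello","author":"jens"}]}'
);
const messages = response.posts.map((post) => post.message);
console.log(messages);
```

The compiler is happy either way, which is exactly why the two declarations drift apart so easily.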
Now that we understand the problem, why is "double declaration" misleading?
So far, we've talked about two declarations, the GraphQL Query as well as the type definition of the response. What did we miss? Starting from the backend, we have to define the schema of the database (3). Next, we have to define the schema of the GraphQL API (4). Finally, to connect the user interface, we have to build another schema for the UI components / forms (5). At this point, we've reached a quintuple. We could easily add more layers, e.g. input validation at the backend but this should be enough for now.
So, to recap the complete flow:
- Define the database schema
- Define the API schema
- Write a GraphQL Operation
- Define the response type definition
- Define the UI components / build the forms
Most Web Applications are just forms that talk to a database.
So what we really want is a simple way to build forms that talk to our DB / API / Service.
Is this just a GraphQL problem? Not really. Replace GraphQL with REST or any other API style, and you end up more or less with the same problems. You might not have to define the GraphQL Operation but all other tasks still apply.
How do existing solutions deal with the problem?
The most common approach is to generate the response type definitions from the GraphQL Operations. It's a smart solution because the types no longer have to be written and updated by hand.
First, declare your GraphQL Operation with a "tag" annotation. Then a program will continuously look through all of your code and look for these tags. It'll parse all GraphQL Operations and generate type definitions for them.
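Here's a rough sketch of the "tag and scan" idea. The gql tag name and the regular expression are assumptions for illustration. Real code generators parse the source files properly instead of using a regex, and they go on to generate types from what they find.

```typescript
// A no-op "tag": it returns the Operation text unchanged and only
// exists so that a scanner can find Operations in the source code.
function gql(strings: TemplateStringsArray, ...values: unknown[]): string {
  let text = "";
  strings.forEach((chunk, i) => {
    text += chunk + (i < values.length ? String(values[i]) : "");
  });
  return text;
}

// Scan the text of a source file for gql`...` blocks.
function extractTaggedOperations(source: string): string[] {
  const pattern = /gql`([^`]*)`/g;
  const operations: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    operations.push(match[1].trim());
  }
  return operations;
}

// How a developer would declare an Operation in application code.
const pingQuery = gql`query Ping { ping }`;

// What the scanner sees: plain source text containing tagged Operations.
const exampleSource = [
  "const q = gql`query Posts { posts { id message } }`;",
  "const m = gql`mutation NewPost($message: String!) { newPost(message: $message) { id } }`;",
].join("\n");

const found = extractTaggedOperations(exampleSource);
console.log(pingQuery, found);
```

The tag itself does nothing at runtime. Its only job is to mark the spot for the tooling.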
What are the problems with this approach?
You still have to match the correct Operation with the generated type definition manually. It's a small source of errors, but it's still there. All the other declarations are still present, including the database schema, the API schema and the UI components / form declarations.
To sum it up, existing solutions turn the problem into a quadruple declaration problem, at least...
Defining an ideal solution
Now that we understand the problem and have explored existing solutions, let's talk about an ideal scenario.
- Define the database schema
- generate the API
- Write the GraphQL Operation
- generate the response type definitions
- generate the forms
In an ideal scenario, we're using code generation to solve three of the five problems. You'll shortly understand how this is possible, but first let's explain the remaining steps.
Obviously, we can't avoid defining a database schema. We also can't avoid defining the API requests. However, all other steps can be handled by code generation. We can automatically turn a database into an API. It's also possible to generate response type definitions from a Query. Going beyond existing solutions, we will also generate forms.
In our ideal scenario, the double declaration problem is fully gone. However, it turns out that just reducing the problems from 5 to 2 is not enough.
We forgot about a few more problems we have to deal with.
Generating the type definitions in the client is a good start, but we can do better than that. What if we also generate a fully typesafe client? What if we also generate hooks (in case of React) to make our client easy to use? Wouldn't that solve even more problems?
What about the backend? What about authentication & authorization? Wouldn't it make sense to also generate the required middlewares to handle these as well?
This is exactly what we're going to do. We generate a client, we generate the backend, the middlewares. Essentially, we generate everything we can, leaving only those parts to the developer that are important:
- Defining the Database schema
- Defining the GraphQL Operations
- Tweaking the UI
Everything else should be automatically generated for us. A developer's dream come true, isn't it? Hyper productivity, but how can we achieve it?
Implementing the solution to our double (quintuple) declaration problem
Alright, let's roll up the sleeves and go!
If you'd like to follow along, you're invited to use our template. This template helps you initialize a NextJS project with Postgres as the database.
Step 1: Define the database schema
We're using the excellent Prisma CLI for database migrations.
Obviously you're free to use MySQL instead, or you could even use a dedicated GraphQL or REST API. WunderGraph is not forcing you to generate an API from your database. That said, it can definitely be a very convenient tool.
Anyways, let's start by defining our database schema:
This schema defines a user as well as posts and some relations. Make sure to only use lowercase names for models, otherwise you run into weird issues.
Before we're able to run the migration, we have to start the database. Run yarn database from a terminal and wait until the DB is ready. If we now run yarn migrate init, the schema is applied to our database.
Next, we want to generate our API. From the root of your project, run yarn wundergraph. As everything is already configured properly, this will automatically introspect the database and generate an API from it.
Great, we've already solved two problems!
Step 2: Define the GraphQL Operation
Now that the API is ready, let's define a Mutation so that users can create posts. Create a new file at the path ./.wundergraph/operations/NewPost.graphql with the following content.
You have to follow a few conventions here. The name of the file is important: it defines the name of our Operation. It's also important that the file sits under .wundergraph/operations and has the file extension .graphql. You don't have to name your Operation, only the file name is relevant.
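The naming convention is simple enough to sketch in a few lines. The helper below is hypothetical and not part of WunderGraph. It also assumes that Operation files sit directly in the operations directory, without subfolders.

```typescript
// Returns the Operation name for a file, or null if the file doesn't
// follow the convention (wrong directory or wrong extension).
function operationNameFromPath(path: string): string | null {
  const match = /(?:^|\/)\.wundergraph\/operations\/([^\/]+)\.graphql$/.exec(path);
  return match ? match[1] : null;
}

console.log(operationNameFromPath("./.wundergraph/operations/NewPost.graphql"));
console.log(operationNameFromPath("./src/NewPost.graphql"));
```

The first call yields the Operation name NewPost, while the second file is ignored because it lives outside of .wundergraph/operations.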
Once the file is created, restart the terminal that runs yarn wundergraph. This picks up the new Operation and generates a lot of code.
Before we dive into all the generated code, let's first talk a bit about the Operation itself. It's a mutation that takes three variables: name, email and message. Of the three variables, two are annotated with a directive provided by WunderGraph.
If you're familiar with OpenID Connect (OIDC), you should immediately understand this and might skip this section. Claims are name-value pairs of information about an authenticated user. Using the fromClaim directive means two things. First, it forces the Operation to require a user to be authenticated. If they are not authenticated, they cannot use the Operation. Second, we're injecting the claims "email" and "name" at runtime into the Mutation. The user is not allowed to provide values for them. If they try, it'll fail with a bad request.
The only variable the user is allowed to enter is the "message".
In a nutshell, writing this operation in this particular way does not just define an API endpoint. It also enables authentication and authorization as the user needs to be logged in, and we're associating the new post with the user.
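To illustrate these semantics, here's a small sketch of what happens to the variables of a request. It's a simplified model of the behaviour described above, not WunderGraph's actual implementation. The function name, the result shape and the 401 status for unauthenticated users are my assumptions; only the "bad request" for user-supplied claim values comes from the description.

```typescript
// Claims of an authenticated user (name-value pairs).
interface Claims {
  name: string;
  email: string;
}

type Resolved =
  | { status: 200; variables: Record<string, unknown> }
  | { status: 400 | 401; error: string };

// claims is null when the request carries no authenticated user.
function resolveVariables(
  claims: Claims | null,
  userInput: Record<string, unknown>
): Resolved {
  // 1. Using fromClaim makes authentication mandatory.
  if (claims === null) {
    return { status: 401, error: "authentication required" };
  }
  // 2. Variables backed by a claim must not be supplied by the user.
  if ("name" in userInput || "email" in userInput) {
    return { status: 400, error: "name and email cannot be set by the user" };
  }
  // 3. Inject the claims next to the user-supplied variables.
  return {
    status: 200,
    variables: { ...userInput, name: claims.name, email: claims.email },
  };
}

const jens: Claims = { name: "Jens", email: "jens@example.com" };
console.log(resolveVariables(jens, { message: "Hello!" }));
console.log(resolveVariables(null, { message: "Hello!" }));
console.log(resolveVariables(jens, { message: "Hello!", email: "eve@example.com" }));
```

Only the first call succeeds. The second one lacks an authenticated user, and the third one tries to override a claim.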
Now that the content of the Operation is clear, let's talk about all the code we generate. All the generated code can be found in the directory .wundergraph/generated.
Obviously, we generate the models for inputs as well as responses:
Nothing spectacular here, just some models.
Next, we generate a fully typesafe client as well as React Hooks to make the client easy to use. I don't want to paste all the code here as it's a bit verbose. If you're curious, check .wundergraph/generated.
The generated hook can be used like this:
The useMutation hook wrapper is generated in .wundergraph/generated/hooks.ts.
Next up, we're getting to the most exciting part, JSON Schema and forms!
One of the many capabilities of WunderGraph is to parse all GraphQL Operations and turn them into JSON Schemas. JSON Schema is an amazingly helpful tool because it allows us to integrate with a lot of existing solutions.
If you look at .wundergraph/generated/jsonschema.ts, you'll find all JSON Schema definitions for all your declared Operations.
In case of the "NewPost" mutation, the JSON schema for the input will look like this:
If you go back to the declaration of the Mutation, you'll remember that it only allowed the user to define the message. The field message was also a required field. Both name and email should not be allowed to be defined by the user. Looking at the JSON Schema definition, you'll see that it exactly reflects these requirements.
How does this work? We'll walk through all files in the .wundergraph/operations directory and look for the .graphql file extension. Then we parse all GraphQL Operations into an AST. Finally, we'll walk through the AST and extract the JSON schema definition from it.
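The following sketch shows the idea of the extraction step in a heavily simplified form. Instead of a real GraphQL parser and an AST, it uses a regular expression and only understands variables with built-in scalar types. The Operation text is my reconstruction of a NewPost-like mutation: the createPost field and the arguments of fromClaim are assumptions.

```typescript
interface JsonSchema {
  type: "object";
  properties: Record<string, { type: string }>;
  additionalProperties: false;
  required: string[];
}

// Maps the built-in GraphQL scalars to JSON Schema types.
const scalarTypes: Record<string, string> = {
  String: "string",
  ID: "string",
  Int: "integer",
  Float: "number",
  Boolean: "boolean",
};

function inputSchemaFromOperation(operation: string): JsonSchema {
  const schema: JsonSchema = {
    type: "object",
    properties: {},
    additionalProperties: false,
    required: [],
  };
  // Matches "$name: Type", an optional "!" and an optional "@fromClaim".
  const variable = /\$(\w+)\s*:\s*(\w+)(!?)(\s*@fromClaim)?/g;
  let match: RegExpExecArray | null;
  while ((match = variable.exec(operation)) !== null) {
    const name = match[1];
    const graphqlType = match[2];
    // Claim-backed variables are injected at runtime, so they are not user input.
    if (match[4]) continue;
    const jsonType = scalarTypes[graphqlType];
    if (jsonType === undefined) throw new Error("unsupported type: " + graphqlType);
    schema.properties[name] = { type: jsonType };
    if (match[3] === "!") schema.required.push(name);
  }
  return schema;
}

// Hypothetical Operation, modelled after the NewPost mutation described above.
const newPost = `
  mutation (
    $name: String! @fromClaim(name: NAME)
    $email: String! @fromClaim(name: EMAIL)
    $message: String!
  ) {
    createPost(name: $name, email: $email, message: $message) {
      id
    }
  }
`;

const newPostInputSchema = inputSchemaFromOperation(newPost);
console.log(JSON.stringify(newPostInputSchema, null, 2));
```

In this sketch, the resulting schema contains only the required message property and forbids additional properties, so name and email can't be sent by the user.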
Ok, now we have a JSON Schema. Why is this the most exciting part?
It is for two reasons. First, there are existing tools that allow us to implement a JSON Schema validation at the API layer.
If you're familiar with WunderGraph, you already know that we don't expose the GraphQL API directly. Instead, we're persisting all GraphQL Operations on the Server (WunderNode) and turn them into JSON RPC. Now that we also have a JSON Schema for the inputs of all Operations, we're able to easily validate them.
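As a sketch of what such a validation does, here's a tiny hand-rolled validator for the subset of JSON Schema we need. It's only meant to illustrate the principle: in a real setup, you'd rely on an existing JSON Schema validation library. The schema literal is my assumption of what the "NewPost" input schema boils down to.

```typescript
interface InputSchema {
  properties: Record<string, { type: "string" | "number" | "boolean" }>;
  required: string[];
  additionalProperties: boolean;
}

// Assumed shape of the NewPost input: only "message" is allowed and required.
const newPostInput: InputSchema = {
  properties: { message: { type: "string" } },
  required: ["message"],
  additionalProperties: false,
};

// Returns a list of validation errors, an empty list means the input is valid.
function validateInput(schema: InputSchema, input: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!(key in input)) errors.push("missing required property: " + key);
  }
  for (const key of Object.keys(input)) {
    const known = Object.prototype.hasOwnProperty.call(schema.properties, key);
    if (!known) {
      if (!schema.additionalProperties) errors.push("unexpected property: " + key);
    } else if (typeof input[key] !== schema.properties[key].type) {
      errors.push("wrong type for property: " + key);
    }
  }
  return errors;
}

console.log(validateInput(newPostInput, { message: "Hello!" }));
console.log(validateInput(newPostInput, { message: "Hello!", email: "eve@example.com" }));
console.log(validateInput(newPostInput, {}));
```

Because the Operations are persisted on the server, a check like this can run before the Operation is executed at all.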
Great! Input validation for free! What's the second reason?
If you recall, we also wanted to generate forms for our API. There's an amazing library that allows us to generate forms from a JSON Schema. This library takes a JSON Schema and generates a form with input validation and everything else we need. We only have to hook it up with the generated hooks.
Luckily, that's already done by the WunderGraph code generator. The result looks like this:
It's a fully functional React Component. Drop it into your NextJS Page and you're done. It's cheating, I know.
Step 3: Tweaking the UI
You might be thinking that this works great for prototyping only. In reality, use cases are more complex, and you want customization. Have a look at the docs of react-jsonschema-form. You can customize everything and write your own theme if none of these are good enough for you: Bootstrap 3 / 4, Material UI, Fluent UI, antd, Semantic UI.
I'd suggest you spend your time customizing these forms rather than trying to build your own forms from scratch.
Summary
Alright, that was a lot of content, let's summarize.
Our goal was to solve the double declaration problem. We then figured out that it's not really a double but a quintuple declaration problem: defining the database schema, defining the API schema, writing the GraphQL Query, writing the type definitions and defining the UI components / forms. On top of that, there's more work to be done: we also need to write an API client and handle authentication and authorization.
We then moved on to defining an ideal solution, in which we're able to generate almost the entire application.
Finally, we've explained how to implement the solution.
- Define your Database schema
- Write a GraphQL Operation
- Add the generated Form Component to a Page
One could say that by using this approach you're able to build entire applications just by writing GraphQL Operations.
What do you think? How much time could this save you? Will you stick to your quintuple declarations (what a mouthful) or will you join the team of lazy developers? I'm pretty sure you can use the time saved and spend it in areas that really create value for your users. No developer should have to define schemas in five different places.
Once again, if you'd like to see all this in action, here's the link to the video: https://youtu.be/ftu8MM5BmGE
If you have questions, you'll find us on Twitter or Discord.