What do Apollo GraphOS and WunderGraph Cosmo have in common?
Both Apollo GraphOS and WunderGraph Cosmo build on the same idea: GraphQL Servers are implemented in such a way that they can be composed into a federated GraphQL API.
GraphQL Servers that implement the Federation Subgraph Specification are compatible with both Apollo GraphOS and WunderGraph Cosmo, as the two products use the same directives, like @key, @external, @requires, @override, and more to define the schema and the relationships between the subgraphs.
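To make the shared directives concrete, here is a minimal sketch in Go that embeds a small subgraph schema and lists the directives it uses. The `Product` type, its fields, and the `inventory` subgraph name are hypothetical; only the directive names come from the Federation specification. A real Federation v2 schema would also carry a schema-level `@link` import, which is left out here for brevity.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// productsSDL is a hypothetical subgraph schema. The type and field names are
// illustrative; the directives are the ones named in the text above.
const productsSDL = `
type Product @key(fields: "id") {
  id: ID!
  weight: Float @external
  shippingEstimate: Float @requires(fields: "weight")
  inStock: Boolean @override(from: "inventory")
}
`

// directiveRe is a deliberately naive matcher: it finds "@name" tokens and
// does not parse GraphQL, so it would also match an "@" inside a string.
var directiveRe = regexp.MustCompile(`@([A-Za-z_][A-Za-z0-9_]*)`)

// usedDirectives returns the sorted, de-duplicated directive names in sdl.
func usedDirectives(sdl string) []string {
	seen := map[string]bool{}
	for _, m := range directiveRe.FindAllStringSubmatch(sdl, -1) {
		seen[m[1]] = true
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(usedDirectives(productsSDL)) // [external key override requires]
}
```

Because the schema text is the whole contract, the same subgraph can be registered with either product without changes to the service itself.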
In addition to the compatibility with the Federation Subgraph Specification, both solutions provide a similar developer experience and workflow to build, deploy, operate and evolve a federated GraphQL API.
What this means at a high level is that you can take your existing Subgraphs and use them with either Apollo GraphOS or WunderGraph Cosmo. You get the same benefits of a federated GraphQL API without having to change your existing Services.
What are the differences between Apollo GraphOS and WunderGraph Cosmo?
The main difference between Apollo GraphOS and WunderGraph Cosmo is the architecture and the way the two products are built. Cosmo is 100% Open Source (Apache 2.0 License) and can be self-hosted, e.g. on Kubernetes, while Apollo GraphOS is a Closed Source SaaS offering.
At a high level, any GraphQL Federation solution consists of four main components:
- Schema Registry
- Studio
- Router / Gateway
- CLI
When we dive a bit deeper into the problems that we'd like to solve with a GraphQL Federation solution, we can see that there are a few more important components that are required to build a complete solution.
We need a CDN to serve the Router configuration across the globe, highly available and fast. We've implemented this in a generic way, so you can use any CDN and S3 provider that you like. Our hosted solution uses Cloudflare and R2, but you could use AWS CloudFront and S3, or any other provider.
We need a solution to federate authentication and authorization for the Studio, also known as Single Sign-On (SSO). Cosmo uses Keycloak for this purpose, which is an Open Source Identity and Access Management solution. We've implemented OpenID Connect and SCIM, so you can integrate with any Identity Provider that supports these standards, like Okta, Auth0, or Azure AD.
In addition, we need to solve the problem of monitoring and observability. Users of a system like Cosmo need to know what's going on in the system, how many requests are being served, which client is using which part of the GraphQL Schema, and how the system is performing. Our goal is to provide a solution that works out of the box with minimal configuration, but also allows more advanced users to integrate with their existing monitoring and observability solutions, like Prometheus, Grafana, Datadog, New Relic, and others. It was also important for us to implement a solution that is Open Source and can be self-hosted, so we've implemented monitoring and observability on top of OpenTelemetry (OTEL) and Prometheus, with ClickHouse as the default storage backend for our hosted solution. This gives you the flexibility to bring your own monitoring and observability solution, or use the one that we provide out of the box.
Apollo GraphOS on the other hand is a Closed Source SaaS offering, with the Router not being Open Source (Elastic License), and only the CLI (Rover) being MIT licensed. We'll dive deeper into the differences between Cosmo Router and Apollo Gateway / Apollo Router in a separate section.
Advantages of Open Source and Self-Hosting
Being able to self-host Cosmo gives you a lot of advantages. You can run Cosmo in your own infrastructure, on your own Kubernetes cluster, in your own VPC, and you have full control over the data that flows through the system. You have complete control over upgrades, scaling, and monitoring. If you're operating in a regulated environment, you're able to meet compliance requirements by running Cosmo in your own infrastructure.
Another advantage of Cosmo is that it's a single monorepo, which means that you can modify multiple parts of the system in a single codebase. This makes it easier to understand, modify, and test the solution. The growth of GitHub stars, forks, outside contributions, and the activity in the Discord community show that Cosmo is by far the most popular and fastest-growing Open Source solution in the GraphQL Federation space.
We found that a lot of developers like to understand how things work under the hood. To address this, we provide a Docker Compose setup that allows you to run Cosmo on your local machine in a few minutes. If you'd like to test Cosmo with multiple team members, you can use the Helm Chart to deploy Cosmo on your Kubernetes cluster. Both options allow you to get started with Cosmo in a matter of minutes.
Is there a difference in how Apollo and WunderGraph handle GraphQL Federation?
Both Apollo and WunderGraph use and implement the same Federation Subgraph Specification. This means that you can use the same Subgraphs with both solutions. There are some differences in how the two implementations work under the hood, e.g. how the Router plans and executes the queries, but from a user perspective, the two solutions handle GraphQL Federation in a similar way.
How does Cosmo Router compare to Apollo Router & Apollo Gateway?
Apollo Gateway, Apollo Router, and Cosmo Router are all components that are responsible for routing and executing queries in a federated GraphQL API. Apollo Gateway is implemented in Node.js and supports the Federation Subgraph Specification v1. Apollo Router is a new implementation in Rust, which is source-available under the Elastic License. Cosmo Router is implemented in Go and is Open Source under the Apache 2.0 License. While Apollo Gateway suffers from performance issues, both Apollo Router and Cosmo Router are very capable and support the latest features of the Federation Subgraph Specification (v2).
When it comes to performance, it's important to understand what factors influence the performance of a federated GraphQL API Router.
- HTTP Server
- Query Parsing, Normalization, Validation & Planning
- The shape of the Query Execution Plan
- Query Execution Plan Caching
- Query Execution
- Subgraph Response Merging
- Response Serialization
- Overhead (Telemetry, Logging, Tracing, etc.)
As the Cosmo Router is implemented in Go, it's able to take advantage of the performance and concurrency features of the Go programming language. In addition, we're able to leverage the mature Go ecosystem, like the HTTP server in the net/http package and the OpenTelemetry (OTEL) and Prometheus libraries, to provide monitoring and observability out of the box.
If you're looking for a head-to-head performance comparison between Apollo Router and Cosmo Router, it's important to understand how to properly benchmark a federated GraphQL API Router. Most importantly, you need to benchmark the Router under realistic conditions, and not just in a lab environment with a single query that continuously hits the Query Execution Plan Cache.
Our honest opinion is that both Apollo Router and Cosmo Router are very mature and capable solutions, and that the choice between the two should be based on the overall architecture and the requirements of your system, and not just on slight differences in performance. In a big picture view, we believe that the overhead of the Router is negligible compared to the latency of the Subgraphs, which will be the main bottleneck in most systems, especially if the Subgraphs store and retrieve data from a database. It's very unlikely that the Router will be the bottleneck.
That said, we'd like to give you some insights into the performance characteristics of the Cosmo Router, so you can make an informed decision on what overhead the Router introduces to your system. We're leveraging ART (Advanced Request Tracing), which is a feature of the Cosmo Router that allows you to debug and trace the execution of a GraphQL Request in the Playground.
This diagram shows the execution of a relatively complex GraphQL Query that spans multiple Subgraphs, and leverages complex Federation v2 features. If we subtract the execution time of the Subgraphs from the total execution time of the Router, we can see that the overhead of the Router to parse, normalize, validate, and plan the Operation is ~2.6ms. In a real-world scenario, this overhead would be reduced even further by the Query Execution Plan Cache to a sub-millisecond value.
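The effect of the Query Execution Plan Cache can be sketched in Go as follows. This is an illustration under stated assumptions, not the Cosmo Router's implementation: `Plan`, `PlanCache`, and the whitespace-based `normalize` are hypothetical stand-ins, and a real router would normalize the parsed operation rather than its text.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Plan stands in for a query execution plan; real plans are far richer.
type Plan struct{ Steps []string }

// PlanCache memoizes plans by normalized operation text and counts lookups.
type PlanCache struct {
	mu     sync.Mutex
	plans  map[string]*Plan
	Hits   int
	Misses int
}

func NewPlanCache() *PlanCache { return &PlanCache{plans: map[string]*Plan{}} }

// normalize collapses whitespace so trivially different spellings share a key.
func normalize(op string) string { return strings.Join(strings.Fields(op), " ") }

// Get returns the cached plan, or runs the (expensive) planner once and stores it.
func (c *PlanCache) Get(op string, planner func(string) *Plan) *Plan {
	key := normalize(op)
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.plans[key]; ok {
		c.Hits++
		return p
	}
	c.Misses++
	p := planner(key)
	c.plans[key] = p
	return p
}

func main() {
	cache := NewPlanCache()
	planner := func(op string) *Plan { return &Plan{Steps: []string{"fetch for: " + op}} }
	// Replaying one operation exercises the planner exactly once.
	for i := 0; i < 1000; i++ {
		cache.Get("query { me { id } }", planner)
	}
	fmt.Println(cache.Hits, cache.Misses) // 999 1
}
```

The loop in `main` also shows why a single-query benchmark says little about planning cost: 999 of the 1000 requests never reach the planner.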
In terms of execution overhead, the Cosmo Router comes with two powerful features that ensure the performance and scalability of the system:
- Dataloader 3.0 - A breadth-first execution strategy to batch requests efficiently
- AST-JSON - An efficient way to merge and serialize the responses of the Subgraphs
The Dataloader 3.0 implementation in the Cosmo Router ensures that at every level of the Query Execution Plan, the Router efficiently batches requests to the Subgraphs and executes them with the minimal amount of concurrency required. Some batching algorithms heavily rely on concurrency to achieve high throughput, but this can lead to high memory usage and contention in the system. With the breadth-first execution strategy of Dataloader 3.0, the Cosmo Router is capable of keeping the concurrency at an optimal minimum, while still achieving high throughput and low latency.
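The general idea of breadth-first batching can be sketched in Go like this. It is not the Dataloader 3.0 implementation: `BatchFetch`, `resolveLevels`, and the in-memory user, order, and product data are hypothetical, and the sketch only shows that collecting keys per level yields one request per level instead of one per parent.

```go
package main

import "fmt"

// BatchFetch is a hypothetical subgraph call: many parent keys in, children per key out.
type BatchFetch func(keys []string) map[string][]string

// dedupe keeps the first occurrence of each key and preserves order.
func dedupe(in []string) []string {
	seen := make(map[string]bool, len(in))
	out := make([]string, 0, len(in))
	for _, k := range in {
		if !seen[k] {
			seen[k] = true
			out = append(out, k)
		}
	}
	return out
}

// pick simulates a subgraph answering one batched request from in-memory data.
func pick(data map[string][]string, keys []string) map[string][]string {
	out := make(map[string][]string, len(keys))
	for _, k := range keys {
		out[k] = data[k]
	}
	return out
}

// resolveLevels walks the plan level by level. For each level it gathers the
// distinct keys of the whole frontier and issues a single batched fetch.
func resolveLevels(roots []string, levels []BatchFetch) (requests int, frontier []string) {
	frontier = roots
	for _, fetch := range levels {
		keys := dedupe(frontier)
		if len(keys) == 0 {
			break
		}
		children := fetch(keys)
		requests++
		var next []string
		for _, k := range keys {
			next = append(next, children[k]...)
		}
		frontier = next
	}
	return requests, frontier
}

func main() {
	orders := map[string][]string{"u1": {"o1", "o2"}, "u2": {"o3"}}
	products := map[string][]string{"o1": {"p1"}, "o2": {"p1", "p2"}, "o3": {"p2"}}
	levels := []BatchFetch{
		func(keys []string) map[string][]string { return pick(orders, keys) },
		func(keys []string) map[string][]string { return pick(products, keys) },
	}
	requests, found := resolveLevels([]string{"u1", "u2"}, levels)
	fmt.Println(requests, dedupe(found)) // 2 [p1 p2]
}
```

With this data, fetching per parent would take five calls (one per user, then one per order), while the level-wise walk needs two and does not have to run them concurrently.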
With the introduction of AST-JSON, we've implemented a highly efficient way to merge and serialize the responses of the Subgraphs. When resolving a GraphQL Query, the Router needs to merge a lot of small nested JSON objects into a single JSON response. This can turn into a performance bottleneck, especially if response sizes grow or a lot of nested Subgraph requests are made. With AST-JSON, we're able to merge and serialize the responses with the least amount of memory allocations and CPU cycles required, which helps to keep the latency of the Router low and the throughput high.
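The merge problem itself can be illustrated with a naive Go sketch built on the standard `encoding/json` package. This is not how AST-JSON works internally; `merge`, `mergeResponses`, and the sample payloads are hypothetical. Decoding every subgraph response into generic maps, as done here, allocates a map per JSON object, which is the kind of cost a purpose-built approach tries to avoid.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// merge copies src into dst, recursing when both sides hold JSON objects.
// Scalars and arrays from src overwrite whatever dst held under the same key.
func merge(dst, src map[string]any) {
	for k, v := range src {
		if sv, ok := v.(map[string]any); ok {
			if dv, ok := dst[k].(map[string]any); ok {
				merge(dv, sv)
				continue
			}
		}
		dst[k] = v
	}
}

// mergeResponses decodes each subgraph payload and folds it into one document.
func mergeResponses(payloads ...string) (string, error) {
	result := map[string]any{}
	for _, p := range payloads {
		var part map[string]any
		if err := json.Unmarshal([]byte(p), &part); err != nil {
			return "", err
		}
		merge(result, part)
	}
	out, err := json.Marshal(result)
	return string(out), err
}

func main() {
	out, err := mergeResponses(
		`{"product":{"id":"1","name":"Chair"}}`, // from a hypothetical products subgraph
		`{"product":{"price":99}}`,              // from a hypothetical pricing subgraph
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"product":{"id":"1","name":"Chair","price":99}}
}
```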
How does Cosmo Studio compare to Apollo Studio?
Cosmo Studio and Apollo Studio are the frontend components of the two solutions, which are responsible for providing a user interface to manage and operate a federated GraphQL API. They both provide a similar set of features, like Schema Management, GraphQL Playground, Query Plan Visualization, Metrics, and Monitoring.
Cosmo Studio is implemented in React (Next.js) and TypeScript, and is Open Source under the Apache 2.0 License like the rest of the Cosmo solution. Apollo Studio on the other hand is a Closed Source SaaS offering which cannot be self-hosted.
We've chosen Next.js and React with TypeScript for Cosmo Studio because there's a large community of developers that are familiar with these technologies, as well as a lot of existing components and libraries that we can leverage. This allows us to build a feature-rich and user-friendly Studio that is easy to use and understand.
We found that having an open-source Studio is a huge advantage, as we can easily collaborate with the community to improve it, or work together with partners to add new features and integrations.
Some of the notable highlights of Cosmo Studio are Advanced Request Tracing, Namespacing, Subgraph Ownership (RBAC), the SSO & SCIM integration, as well as the OTEL-based metrics and distributed tracing capabilities that are built into the Studio.
Advanced Request Tracing gives you a detailed real-time view of how the Cosmo Router plans and executes a GraphQL Operation.
Namespacing is an essential concept in Cosmo that we've borrowed from Kubernetes, enabling you to group resources and isolate environments.
With Graph Access Control, you can implement Graph Ownership, ensuring that only those with the right permissions are allowed to publish a Subgraph Schema.
Cosmo supports SSO with OIDC (OpenID Connect) and SCIM (System for Cross-domain Identity Management), so you can manage access to Cosmo Studio for your users through your existing Identity System.
Apollo GraphOS Enterprise vs WunderGraph Cosmo Open Source vs WunderGraph Cosmo Enterprise
At WunderGraph, we don't think that the distinction between Regular customers and Enterprise customers comes down to features. Instead, we believe that Enterprise Customers have different requirements in terms of the shape of the contract, Service Level Agreements, and having access to exclusive resources like a Solution Architect or dedicated Support. As such, you can see a clear difference in how we shape and offer our solution.
Feature | Apollo GraphOS Enterprise | WunderGraph Cosmo Open Source | WunderGraph Cosmo Enterprise |
---|---|---|---|
Self-hosted Router | ✓ | ✓ | ✓ |
GraphQL Subscriptions | ✓ | ✓ | ✓ |
Authentication (JWT) | ✓ | ✓ | ✓ |
Authorization Directives | ✓ | ✓ | ✓ |
Progressive Override | ✓ | ✓ | ✓ |
Query Plan Caching | ✓ | ✓ | ✓ |
External Coprocessing | ✓ | ✓ | ✓ |
Operation Limits & Safelisting | ✓ | ✓ | ✓ |
Custom telemetry attributes & spans | ✓ | ✓ | ✓ |
Offline "Enterprise" license | ✓ | ✓ | ✓ |
Schema Filtering (contracts) | ✓ | ✓ | ✓ |
SSO (Okta, Azure AD) | ✓ | ✓ | ✓ |
Datadog Integration | ✓ | ✓ | ✓ |
Build status notifications | ✓ | ✓ | ✓ |
OpenTelemetry (OTEL) | ✓ | ✓ | ✓ |
SCIM | ❌ | ✓ | ✓ |
Subgraph Ownership (RBAC) | ❌ | ✓ | ✓ |
Event-Driven Federated Subscriptions (EDFS) | ❌ | ✓ | ✓ |
Why is there no Enterprise Version of Cosmo?
We've decided against having an Enterprise Version of Cosmo for multiple reasons:
- Open Source Community: We believe in the power of Open Source and the community that comes with it
- Transparency: Being Open Source makes the solution more transparent, trustworthy, and secure
- Enterprise Features: Drawing a line between Open Source and Enterprise features will always lead to problems
- Velocity: We can move faster and innovate more by having a single codebase
- Testing: A single codebase allows us to end-to-end test changes very effectively
- Integrations: Open Source makes it easier to integrate Cosmo into an existing infrastructure or toolchain
We believe that in the long run, the Open Source nature of Cosmo will lead to a better product, more thorough testing, and more innovation, as we can leverage the power of the community to improve the solution. We already see a lot of contributions from the community, with more and more large companies adopting Cosmo and contributing back to the project.
We also believe that being Open Source is a big advantage when it comes to security, as the code is open for everyone to see and audit. Security through obscurity has proven to not work in the past, and we believe that being Open Source is the best way to ensure the security of the solution.
When you have an Enterprise Version of a product, there's always the problem that you need to draw a line between what features are Open Source and what features are Enterprise. The community might want to implement a feature that you believe should be an Enterprise feature. This can lead to conflicts and problems, which can ultimately result in a "true" Open Source fork of the project. We've structured our portfolio of products in a way that doesn't require us to have an Enterprise Version of Cosmo.
The digital world is moving fast, and we want to be able to offer you the most advanced and innovative solution that we can. As such, velocity and speed of innovation are very important to us. Having a single monorepo allows us to change and test multiple components of the Cosmo stack in a single pull request. Not having to synchronize changes between multiple public and private repositories is a huge advantage for us, and ultimately for you, because we can deliver new capabilities faster and at a lower cost.
Another benefit of having a single codebase is that we can end-to-end test changes very effectively. We've built an extensive integration test suite that helps us to test multiple components of the Cosmo stack in a single test run. We're able to provide pre-release Docker images for every pull request, so you can test changes early and give us feedback on new features and bug fixes.
Speaking of collaboration, we're also big proponents of open RFCs (Request for Comments) and open design discussions. We're regularly discussing new designs and features publicly, and we're always open to feedback and contributions from the community. A benefit of being Open Source is that we don't have to sign NDAs or other legal agreements to collaborate with you.
Lastly, we believe that being Open Source makes it easier to integrate Cosmo into an existing infrastructure or toolchain. We've seen companies that have built their own tooling around Cosmo, modified the source code to implement a custom authentication mechanism, or integrated Cosmo with an existing monitoring and observability solution.
When should you choose Apollo GraphOS vs WunderGraph Cosmo?
Apollo GraphOS Enterprise is a complete solution for building, deploying, operating, and evolving a federated GraphQL API. You get a hosted Schema Registry and Studio as well as a hosted Router.
WunderGraph Cosmo on the other hand doesn't offer a hosted Router. Our understanding of the market is that most companies want to self-host the Router in their own infrastructure, so we've decided to focus on providing a solution that integrates well with existing infrastructure. Self-hosting the Router gives you more control over the data that flows through the system, and allows you to meet the most demanding compliance requirements if you're operating in a regulated environment. In addition, self-hosting the Router gives you more control over upgrades, scaling, and monitoring. Furthermore, running your own Router also means that you can deploy it as close to your Subgraphs as possible, reducing latency and increasing throughput.
On the business side, we don't believe that it's a great business model to charge for the Router. We'd buy hosting for the Router from a cloud provider like AWS, GCP, or Azure, and then charge you a markup on top of that. You'd be paying for the hosted Router service, but you still need an infrastructure or operations team to manage your Subgraphs. If you're already running your own infrastructure, you might as well run the Router on your own infrastructure as well and save the markup. This allows us to focus our resources on maintaining and improving Cosmo, as well as providing you with the best possible support and service.
What are the ideal use cases for choosing WunderGraph Cosmo?
We have identified 3 main use cases where WunderGraph Cosmo is the ideal solution:
- Startups and SMBs: Startups and SMBs that want to run their own Router in their own infrastructure, but still want to benefit from a hosted Schema Registry and Studio
- Enterprise - Hybrid Cloud: Enterprises that are looking for a managed solution with SLAs, dedicated support, and SOC2 compliance
- Enterprise - Self-Hosted: Large Enterprises that want to run the entire solution in their own infrastructure, e.g. on Kubernetes, in their own VPC
If you're a Startup or SMB that wants to adopt GraphQL Federation, our self-service offerings are the perfect fit for you. With no feature limitations, not even on SSO or SCIM, you get access to the full power of Cosmo just by swiping your credit card. No complex sales or procurement process, no long-term contracts, just a monthly subscription that you can cancel at any time.
If you're an Enterprise that is looking for a managed solution with SLAs, dedicated support, and SOC2 compliance, our Enterprise offering is the perfect fit for you. We can provide you with a custom contract that meets your requirements, and we can offer you a dedicated Solution Architect or Support Engineer to help you with your implementation.
At the top end of the spectrum, we also support large Enterprises that want to run the entire solution in their own infrastructure. We provide dedicated support, training, and consulting to help your company roll out GraphQL Federation at scale. One of our strengths is that we're able to provide OSS Development and Consulting Services, so we can integrate Cosmo into your existing infrastructure and workflows, and help you with the migration of your existing Services to a federated GraphQL API.
Next Steps - Try WunderGraph Cosmo now or book a meeting
If you're interested in trying out WunderGraph Cosmo, you can sign up for a free account on Cosmo Cloud, or clone the Cosmo Repository and follow the instructions in the README to run the whole stack on your local machine.
If you've got more advanced requirements and are ready to talk to us, please book a meeting.