Cosmo Router

The door to data in your federated graph

A single open-source binary for federated GraphQL. Retries, circuit breakers, and traffic management built in. Built to stay up even when the rest of your platform doesn't.

Apache 2.0. One Go binary. No runtime dependency on the control plane.

Overview

What the Cosmo Router is

The Cosmo Router is a GraphQL federation router. It runs your federated GraphQL graph: it accepts a single client query, works out which subgraphs need to answer each field, sends the right sub-queries in the right order, and stitches the results back together.
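A minimal sketch of the "which subgraph answers each field" step, under stated assumptions. The ownership table, field names, and subgraph names are hypothetical, and this is not the router's planner: it ignores execution order and entity resolution and only shows the grouping idea.

```go
package main

import (
	"fmt"
	"sort"
)

// ownership maps "Type.field" to the subgraph that resolves it.
// Field names and subgraph names are hypothetical.
var ownership = map[string]string{
	"Query.product":   "catalog",
	"Product.name":    "catalog",
	"Product.price":   "pricing",
	"Product.reviews": "reviews",
}

// planFetches groups the requested fields by owning subgraph, so each
// subgraph receives one sub-query containing only the fields it owns.
func planFetches(fields []string) (map[string][]string, error) {
	plan := make(map[string][]string)
	for _, f := range fields {
		sg, ok := ownership[f]
		if !ok {
			return nil, fmt.Errorf("no subgraph owns field %q", f)
		}
		plan[sg] = append(plan[sg], f)
	}
	return plan, nil
}

func main() {
	plan, err := planFetches([]string{"Query.product", "Product.name", "Product.price", "Product.reviews"})
	if err != nil {
		panic(err)
	}
	// Print in a stable order; Go map iteration order is not defined.
	names := make([]string, 0, len(plan))
	for sg := range plan {
		names = append(names, sg)
	}
	sort.Strings(names)
	for _, sg := range names {
		fmt.Printf("%s: %v\n", sg, plan[sg])
	}
}
```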

It is written in Go. It is Apache 2.0 licensed. It ships as a single binary that runs on Linux, macOS, Windows, or any container platform. It does not need the Cosmo Control Plane to serve traffic; the control plane only ships configs and collects telemetry. If the control plane goes away, the router keeps serving.

Why federation routers exist

Why teams choose a dedicated GraphQL federation router

Most API gateways move HTTP. They cannot read a GraphQL document, break it into subgraph-specific operations, or plan execution order across services that share types. A stock GraphQL API gateway is not a federation router, and a service mesh is not a query planner.

Teams running a federated graph through a general-purpose stack tend to hit the same four walls.

Query planning does not exist.

Generic gateways forward requests. They do not decide which subgraph owns which field.

Resilience sits outside the request path.

Retries, timeouts, and circuit breakers get reimplemented in each subgraph or bolted on as a separate system. More latency, more drift, more surface area.

Headers become a patchwork.

Passing auth, tenant context, or correlation IDs to subgraphs without leaking sensitive headers needs custom code or permissive pass-through rules.

Extending behavior is expensive.

Custom auth, caching, or policy usually means a separate proxy, a fork, or a scripting language your team doesn't use.

The Cosmo Router handles all of this natively. One binary, one config, one process.

Cosmo Router capabilities

GraphQL Federation Router

The flagship. A high-performance, Go-based router that routes requests across subgraphs, aggregates responses, and scales independently of the control plane.

Free

Query Planning

Query plan generation that breaks down each operation into the optimal set of subgraph calls. Visualize and inspect plans in the Studio Playground.

Free

Query Batching

Run multiple GraphQL operations in a single HTTP request with configurable concurrency and size limits.

Free
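The batching limits described above can be sketched as a size check plus a counting semaphore. This is an illustration only. The function name, parameter names, and the `exec` callback are hypothetical and do not reflect the router's internals or config keys.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// runBatch executes every operation in the batch, at most maxConcurrency at a
// time, and rejects batches larger than maxSize. Results keep request order.
func runBatch(ops []string, maxSize, maxConcurrency int, exec func(string) string) ([]string, error) {
	if len(ops) > maxSize {
		return nil, errors.New("batch exceeds size limit")
	}
	if maxConcurrency < 1 {
		return nil, errors.New("maxConcurrency must be at least 1")
	}
	results := make([]string, len(ops))
	sem := make(chan struct{}, maxConcurrency) // counting semaphore
	var wg sync.WaitGroup
	for i, op := range ops {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrency operations are in flight
		go func(i int, op string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = exec(op) // each goroutine writes its own slot
		}(i, op)
	}
	wg.Wait()
	return results, nil
}

func main() {
	exec := func(op string) string { return "result of " + op }
	out, err := runBatch([]string{"{ a }", "{ b }", "{ c }"}, 10, 2, exec)
	fmt.Println(out, err)
}
```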

File Upload

Handle single and multi-file uploads through the router using the standard GraphQL multipart request spec.

Free

Which GraphQL Router capability do you need?

| If you are… | Start here |
| New to Cosmo and evaluating the runtime | GraphQL Federation Router |
| Standing up router-to-subgraph resilience from scratch | Traffic Shaping |
| Seeing cascading failures from flaky subgraphs | Circuit Breaker |
| Recovering from transient failures on query operations | Retry Mechanism |
| Fighting hung connections or slow subgraphs | Timeout Configuration |
| Figuring out why a query is slow | Query Planning |
| Passing JWTs or tenant IDs to subgraphs | Request Header Operations |
| Controlling which subgraph headers reach the client | Response Header Operations |
| Sending subscription init data through the router | Forward Client Extensions |
| Deploying across environments with different subgraph URLs | Override Subgraph Config |
| Rolling out config changes without downtime | Config Hot Reload |
| Adding custom auth, caching, or policy logic | Custom Modules |
| Required to keep config artifacts in your own infrastructure | Storage Providers |
| Handling file uploads over GraphQL | File Upload |
| Getting set up for local development | Development Mode |

How the Cosmo Router compares

| | Cosmo Router | Apollo Router | Hive Router | Service mesh + stock GraphQL server |
| License | Apache 2.0 | Source-available (ELv2) | MIT | Varies |
| Language | Go | Rust | Rust | Varies |
| Federation query planning | Yes | Yes | Yes | No |
| Traffic shaping in the request path | In the router | Partial | Partial | In the mesh (extra hop) |
| Native extensibility | Pure Go modules | Rhai scripts / coprocessors | Plugins | Out of scope |
| Self-hosting without license fees | Yes | Restricted by ELv2 | Yes | Yes |

Use cases

GraphQL Router use cases

Real failure and rollout patterns, and how the router responds in one process, without extra hops.

Resilience

A payments subgraph starts degrading at peak

Scenario

The payments service hits a database hotspot during a Friday-night traffic spike. Latency climbs. Error rates pass 50 percent.

How the router handles it

The circuit breaker opens. Requests to payments are rejected immediately instead of piling up. Retries stop. Queries that don't touch payments keep running. After the sleep window, the breaker moves to a half-open state and tests recovery before restoring traffic.

Outcome

Checkout, catalog, and auth stay up. Payments recovers on its own. No 2 a.m. page.

Migration

Migrating off Apollo Gateway

Scenario

An existing client fleet batches multiple operations per request and relies on specific header propagation rules. Client-side changes can't ship in the migration window.

How the router handles it

Query batching preserves the existing client pattern. Header operations reproduce the old propagation rules. Override subgraph config maps existing URLs into the router without touching the control plane.

Outcome

Migration behind a feature flag. No client changes. No coordinated freeze. Gradual cutover over weeks, not a big-bang release.

Rollout

Rolling out a new schema version to a single tenant

Scenario

A platform team needs to preview a new subgraph implementation with one internal customer before a broader rollout.

How the router handles it

Override subgraph config routes that tenant's traffic to a staging subgraph based on a request header. Hot reload applies the change without a restart. Query plan visualization shows how the router will execute against the new subgraph.

Outcome

Targeted rollout. No redeploy. Validate before it touches anyone else.
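One common way to apply a config change without a restart is to validate the candidate first and then swap a pointer atomically. The sketch below shows that pattern in general terms. It is an assumption about the technique, not a description of how the router implements hot reload, and `routerConfig` is a hypothetical type. It needs Go 1.19 or later for `atomic.Pointer`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// routerConfig is a hypothetical stand-in for a router configuration.
type routerConfig struct {
	subgraphURLs map[string]string
}

// current holds the active config; request handlers read it without locks.
var current atomic.Pointer[routerConfig]

// reload validates the candidate and swaps it in only if it is usable,
// so a bad config never replaces a good one.
func reload(next *routerConfig) error {
	if next == nil {
		return fmt.Errorf("config is nil")
	}
	for name, url := range next.subgraphURLs {
		if url == "" {
			return fmt.Errorf("subgraph %q has an empty URL", name)
		}
	}
	current.Store(next)
	return nil
}

func main() {
	good := &routerConfig{subgraphURLs: map[string]string{"payments": "http://payments.example.internal/graphql"}}
	if err := reload(good); err != nil {
		panic(err)
	}
	// A broken candidate is rejected and the previous config stays active.
	err := reload(&routerConfig{subgraphURLs: map[string]string{"payments": ""}})
	fmt.Println("rejected:", err != nil)
	fmt.Println("still serving:", current.Load().subgraphURLs["payments"])
}
```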

Isolation

Enforcing tenant isolation at the gateway

Scenario

A multi-tenant SaaS needs to prevent one tenant's query from reaching another tenant's data.

How the router handles it

A custom Go module reads the JWT, extracts the tenant ID, and rejects requests that don't match. Header operations forward the tenant ID to every subgraph for defense in depth.

Outcome

Tenant isolation in one place, in native Go. One audit point. One place to fix if requirements change.

Why teams run the Cosmo Router

  • Traffic shaping, retries, and circuit breakers are in the router. One less system to operate. One less hop in the request path.
  • A typo in your config doesn't take down production. JSON schema validation catches it before deploy. Hot reload rolls out changes without dropping connections.
  • Your team writes router extensions in Go. Not Lua, not JavaScript, not a DSL. Full type safety, full Go ecosystem, native performance.
  • Apache 2.0. Run it, fork it, self-host it. No contract required to put the router in production.

Questions & Answers

Frequently Asked Questions

Licensing, control plane dependency, migration, extensions, and what the router leaves to the platform.

Yes. The router is licensed under Apache 2.0. You can run it, modify it, and ship it in your own product. The Cosmo Control Plane has separate licensing; see the pricing page for details.

Get started

Run federated GraphQL in production with the Cosmo Router