Our solution
Transient failures retry automatically without duplicating mutations
Mutations are never retried.
Cosmo Router automatically retries failed GraphQL queries using exponential backoff with jitter, configured once at the router, with expression-based conditions for when retries fire.
When a subgraph request fails with a retryable condition, the router waits using the AWS-recommended Backoff and Jitter pattern, then tries again. Limits cap attempts, intervals, and total retry duration.
Mutations are never retried. An expression evaluates each failure; defaults cover connection errors and HTTP 502/503/504. Extend with `statusCode == 429` or boolean logic as needed.
One policy for every subgraph, tuned with expressions when you need nuance.

