Traffic management · Timeout configuration

Give each subgraph the timeout budget it actually needs

Set time limits for each stage of a subgraph request, with overrides for services that need different timeout behavior.

Global defaults for every subgraph. Overrides for the services that need different treatment.

Subgraph request lifecycle

Each stage has its own budget. Tune connection setup, response headers, and full request completion separately.

[Diagram: Dial (dial_timeout) → TLS (tls_handshake_timeout) → Request to subgraph → Headers (response_header_timeout) → Body, total (request_timeout)]

Available on: Free, Pro, Enterprise

The problem

Why one timeout rarely fits every subgraph

Your subgraphs are not all the same speed. One global timeout is either too tight for your slow services or too loose for everything else.

A strict global timeout kills your slowest (but totally valid) services

Your product catalog responds in milliseconds. Your inventory service talks to a legacy backend and takes much longer. Set one tight limit and you are constantly failing requests that should succeed.

A generous global timeout lets bad requests linger forever

Give every service a long timeout and a single hung subgraph can hold router resources and keep clients waiting long after the request should have been dropped.

A connection that never opens and a response that is just slow look identical

If a connection cannot establish, you want to catch that in seconds. If data is streaming in slowly, you want to wait it out. With one timeout covering everything, you cannot tell the difference.

Our solution

Fine-grained timeout control for fast and slow subgraphs

Cosmo Router timeout configuration sets limits per phase of a subgraph call. You keep defaults sensible for most traffic and adjust only the outliers.

How configuration tiers fit together

  1. Stage-by-stage budgets cover dial, TLS handshake, response headers, and full request completion.

  2. Per-subgraph overrides change limits for named services only, leaving your global defaults intact.

  3. Set defaults under `traffic_shaping.all` and service-specific values under `traffic_shaping.subgraphs`.

Timeouts live in the same router config file as your other traffic shaping settings, so they stay next to related controls.

Configuration

Example YAML

You need a deployed Cosmo Router and access to its router configuration file.

Global defaults live under `traffic_shaping.all`. Service-specific timeout overrides live under `traffic_shaping.subgraphs`.
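A minimal sketch of that layout, assuming the key names listed on this page. The durations and the subgraph name `inventory` are illustrative assumptions, not recommended values or documented defaults.

```yaml
traffic_shaping:
  all:                            # defaults for every subgraph
    request_timeout: 30s          # illustrative value
    dial_timeout: 5s              # illustrative value
    tls_handshake_timeout: 5s     # illustrative value
    response_header_timeout: 10s  # illustrative value
  subgraphs:
    inventory:                    # hypothetical subgraph name
      request_timeout: 90s        # override for this service only
```

Fields not overridden for `inventory` would keep the values set under `all`.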

Tradeoffs

Before & After

| Before Cosmo | With Cosmo |
| --- | --- |
| One global timeout for every subgraph | Separate timeouts per request lifecycle stage |
| Legacy slow service forces loose timeouts for everyone | Per-subgraph override: slow service gets its own budget |
| Idle connections accumulate and hold resources | Keep-alive idle timeout reclaims idle connections automatically |
| Cannot tell a stuck dial from a slow response | Distinct timeouts for dial, TLS, response header, and full request |

Reference

Timeout fields

Quick lookup for each key: details and defaults live in the docs.

Cosmo supports seven timeout-related controls: `request_timeout`, `dial_timeout`, `tls_handshake_timeout`, `response_header_timeout`, `expect_continue_timeout`, `keep_alive_idle_timeout`, and `keep_alive_probe_interval`.

| Timeout | Controls |
| --- | --- |
| `request_timeout` | Maximum total time for the full request lifecycle |
| `dial_timeout` | Maximum time to establish a connection |
| `tls_handshake_timeout` | Maximum time for TLS negotiation |
| `response_header_timeout` | Maximum time to receive response headers |
| `expect_continue_timeout` | Time to wait for a 100-continue response |
| `keep_alive_idle_timeout` | Time before closing idle connections |
| `keep_alive_probe_interval` | Interval between keep-alive probes |
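
As a rough sketch, all seven keys can sit side by side under `traffic_shaping.all`. Every duration below is a placeholder chosen for illustration and should not be read as a default or a recommendation.

```yaml
traffic_shaping:
  all:
    request_timeout: 30s            # total time for the full request lifecycle
    dial_timeout: 5s                # time to establish a connection
    tls_handshake_timeout: 5s       # time for TLS negotiation
    response_header_timeout: 10s    # time to receive response headers
    expect_continue_timeout: 1s     # wait for a 100-continue response
    keep_alive_idle_timeout: 90s    # idle time before a connection is closed
    keep_alive_probe_interval: 30s  # interval between keep-alive probes
```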

How timeout configuration works

01
Defaults propagate until you override.

Set global defaults in `all`

Under `traffic_shaping.all`, set `request_timeout`, `dial_timeout`, `tls_handshake_timeout`, and keep-alive settings that apply to every subgraph.

02
One slow outlier should not set the rules for everyone.

Override per subgraph

Under `traffic_shaping.subgraphs.<name>`, override timeout fields for a specific service that needs different treatment.

03
Timeouts can interact with retries and circuit breakers.

Enforce at runtime

The router enforces configured timeout limits at runtime. Requests that exceed their configured limits fail, and those failures can interact with retry or circuit breaker behavior when those controls are configured.

04
Tune based on observed timeout behavior.

Observe

Use observability data to identify timeout patterns, then tune timeout values for the affected subgraphs.

Use cases

Patterns teams ship first

Tune timeout behavior by lifecycle stage: connection setup, TLS negotiation, response headers, full request completion, and idle connection management.

Legacy backend integration

Per subgraph

Set `request_timeout: 10s` globally and override to `request_timeout: 60s` for just the legacy subgraph. Fast services keep the tight budget; the legacy one gets what it needs.
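
Written out as config, that pattern might look like the following. The two durations come from the text above; the subgraph name `legacy-inventory` is hypothetical and is assumed to match the name the subgraph is registered under.

```yaml
traffic_shaping:
  all:
    request_timeout: 10s      # tight budget for every subgraph
  subgraphs:
    legacy-inventory:         # hypothetical subgraph name
      request_timeout: 60s    # only this service gets the longer budget
```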

Cross-region subgraph connectivity

Dial · TLS

When subgraphs live in a different region, connection establishment occasionally takes longer. Increase `dial_timeout` for cross-region setup while keeping `tls_handshake_timeout` controlled separately.
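
One way this could be expressed, assuming `dial_timeout` and `tls_handshake_timeout` can be overridden per subgraph like the other timeout fields. The subgraph name `reviews-eu` and all durations are made up for illustration.

```yaml
traffic_shaping:
  all:
    dial_timeout: 5s               # illustrative default for same-region subgraphs
    tls_handshake_timeout: 5s      # illustrative default
  subgraphs:
    reviews-eu:                    # hypothetical cross-region subgraph
      dial_timeout: 15s            # more room for connection setup
      tls_handshake_timeout: 5s    # TLS budget stated separately, left unchanged
```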

Connection pool management

Keep-alive

During low-traffic windows, idle connections accumulate. Configure `keep_alive_idle_timeout` so the router closes connections that have been idle too long and reclaims resources automatically.
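
A small sketch of the keep-alive settings at the global level. Both durations are illustrative assumptions; the right values depend on your traffic pattern and on any idle limits enforced between the router and the subgraph.

```yaml
traffic_shaping:
  all:
    keep_alive_idle_timeout: 90s    # close connections idle longer than this (illustrative)
    keep_alive_probe_interval: 30s  # interval between keep-alive probes (illustrative)
```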

Slow response headers

Headers

`response_header_timeout` detects when a subgraph accepts the request but does not start responding in time. The router can fail that request earlier instead of waiting for the full request timeout.
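
For example, pairing a short header budget with a longer total budget might look like this. Both durations are illustrative assumptions.

```yaml
traffic_shaping:
  all:
    request_timeout: 30s          # total budget for the full request (illustrative)
    response_header_timeout: 5s   # budget for the first response headers (illustrative)
```

With values like these, a subgraph that accepts a request but sends no headers would be failed after roughly 5 seconds rather than 30, while a subgraph that starts responding promptly still has the full request budget to finish.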

Interaction with retries and circuit breakers

Traffic shaping

Timeouts are part of Cosmo Router traffic management. Timeout failures can interact with retry and circuit breaker behavior, so teams should tune these controls together when they use them together.

Tune timeout behavior from one config

Configure stage-level timeout budgets from one YAML file in the Cosmo Router.

FAQ

GraphQL router timeout configuration

Full reference in the timeout documentation.