Rate limits
Scout enforces rate limits at three layers: per-account credits, a
shared concurrency pool, and upstream provider limits we have to live
within. Knowing which is which makes it easier to size a workload and
recover from a 429 or 503.
Account-level limits
The credit counter resets on the first of each month in UTC. The Settings → Usage page shows live remaining balance.
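For planning a batch near the end of a month, it can help to compute when the counter next resets. The sketch below is only an illustration: it assumes the reset happens at 00:00 UTC on the first, which this page does not state, and `next_credit_reset` is a hypothetical helper name, not part of any Scout client.

```python
from datetime import datetime, timezone

def next_credit_reset(now=None):
    """Return the next first-of-month instant in UTC.

    Assumption: the reset happens at 00:00 UTC. Pass a timezone-aware
    datetime; naive values would be read as local time by astimezone().
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1   # roll over into January
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)
```

Subtracting the current UTC time from the returned value gives the time left in the current credit window.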
Concurrency
Scout runs every account on a shared pool. Today we cap simultaneous in-flight work at the service level, not per key. The practical limits to expect:
- Synchronous endpoints (/v1/search, /v1/extract, /v1/chat/completions): designed to return in under 30 seconds. Requests that would push the browser pool past its capacity queue for up to 10 seconds before returning 503 Service Unavailable. A retry with a small backoff is usually enough.
- Asynchronous endpoints (/v1/task, /v1/search?depth=deep, /v1/findall): the API responds with a task_id or search_id in under a second, and the heavy work runs in the background. You will not see a queue delay on creation, but the run itself may take longer to reach completed if the pool is busy.
- Webhook deliveries: retried up to 3 times with exponential backoff on any non-2xx response. See Webhooks.
Per-key concurrency caps are on the roadmap; until then a runaway client can affect their own account’s tail latency. Use a polite parallelism cap (start at 5–10 concurrent requests per key) and let the async endpoints handle anything fan-out-shaped.
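One way to enforce a polite parallelism cap on the client side is a semaphore around each request. This is a minimal sketch, not an official client: `run_capped` is a hypothetical helper, and each job is assumed to be a zero-argument coroutine function that performs one Scout request with whatever HTTP library you use.

```python
import asyncio

async def run_capped(jobs, limit=5):
    """Run coroutine factories with at most `limit` of them in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await job()

    # gather() returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Starting with `limit=5` and raising it only while tail latency stays flat keeps the cap inside the suggested 5–10 range.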
When you see a 429 or 503
The exact reason for a 503 is in the detail field of the JSON body
when it is safe to surface. See Errors & status codes.
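Because the detail field is only present when it is safe to surface, client code should treat it as optional. A small sketch of defensive parsing follows; `failure_detail` is a hypothetical helper, and it assumes you already have the status code and the raw body text from your HTTP library.

```python
import json

def failure_detail(status, body_text):
    """Return the `detail` string from a 503 body, or None if unavailable."""
    if status != 503:
        return None
    try:
        payload = json.loads(body_text)
    except (ValueError, TypeError):
        # The body was missing or not valid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("detail")
```

Logging the returned value alongside the status code makes it easier to tell capacity-related failures from upstream ones later.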
Upstream limits we live within
Scout’s web fetch path goes through a residential and datacenter proxy pool, then to the open web. Two upstreams set ceilings we cannot exceed:
- Google SERP rate limits. Google throttles aggressively on
repeated identical queries from the same exit IP. Scout caches every
SERP for a short window and rotates IPs across the pool, but bursts
of the same query within seconds still risk a throttle and a 503.
- Per-page anti-bot challenges. Some sites serve a challenge page
to non-human traffic. Scout retries with a fresh browser fingerprint
and IP up to three times before returning the result with an
errors[] entry.
These show up to you as occasional 503s on heavy or repetitive
workloads. Diversifying queries and spacing them out by even a few
hundred milliseconds is usually enough to stay under the limit.
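Spacing requests out can be done with a small pacing generator in front of your request loop. The sketch below is one possible approach under stated assumptions: `paced` is a hypothetical helper, the 0.3 second gap is an arbitrary starting value, and skipping exact duplicates is a client-side choice rather than something Scout requires.

```python
import time

def paced(queries, min_gap=0.3, sleep=time.sleep, clock=time.monotonic):
    """Yield each distinct query, at least `min_gap` seconds apart."""
    seen = set()
    last = None
    for query in queries:
        if query in seen:
            continue  # a repeated identical query only adds throttle risk
        seen.add(query)
        if last is not None:
            wait = min_gap - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield query
```

Iterating with `for query in paced(all_queries): ...` then issues one request per distinct query with the gap applied between them.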
Recommended client behavior
- Set a small connection pool. 5–10 concurrent requests is a good starting point for the synchronous endpoints. Async endpoints can fan out further because they return immediately.
- Honor Retry-After. When we send one, retrying before that time will not succeed any faster.
- Use exponential backoff. Start at 1 second, double each attempt, cap at 30 seconds. Five tries is enough; past that, surface the error to your caller.
- Cache idempotent responses on your side. Scout already caches the SERP layer, but if the same (query, country, language) will be consumed many times by your app, cache the result in your own store to keep both your latency and your credit spend low.
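The backoff and Retry-After guidance above can be sketched as a small retry loop. This is an illustration under assumptions, not a reference client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, headers are assumed to be a plain dict, and Retry-After is assumed to carry a number of seconds (an HTTP-date value is ignored here).

```python
import time

RETRYABLE = {429, 503}

def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait after failed attempt number `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))  # 1, 2, 4, 8, 16, then 30
    try:
        if retry_after is not None:
            # Never retry sooner than the server asked for.
            delay = max(delay, float(retry_after))
    except ValueError:
        pass  # non-numeric Retry-After: fall back to plain backoff
    return delay

def call_with_retries(send, max_tries=5, sleep=time.sleep):
    """Call `send` until it succeeds or `max_tries` attempts are used."""
    for attempt in range(max_tries):
        status, headers, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_tries - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    # Out of tries: hand the last failure back to the caller.
    return status, body
```

Adding a little random jitter to each delay is a common refinement when many workers retry at once; it is left out here to keep the schedule easy to read.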