Rate Limits

Documentation on API rate limits, response header interpretation, and best practices for building resilient clients.

To ensure stability, prevent abuse, and provide fair usage for all customers, rate limits are applied to certain API groups. When a limit is exceeded, the API responds with an HTTP 429 Too Many Requests error.

This guide explains how rate limits work, how to interpret the response headers, and best practices for handling retries.

Rate limits are applied at different scopes (e.g., per organisation, per user, per IP address). The "Scope" column in the table below is the definitive source for which entity a specific limit applies to.

For example, a limit with a scope of "Per organisation" means that all requests from the entire organisation (tenant) share a single request pool for that API group.

Note: In the context of "Per organisation" limits, "organisation" and "tenant" are used interchangeably.


Current Rate Limits

If an API group is not listed here, it is not currently rate-limited. The "Scope" column defines what the limit is applied to.

API group                 Rate Limit        Scope
User Admin API            600 req/minute    Per organisation
Legacy Public Post API    300 req/minute    Per organisation

Rate Limit Headers

When making requests to a rate-limited API group, standard HTTP headers are returned indicating the current limit status. The implementation relies on the RateLimit header fields defined in the IETF "RateLimit header fields for HTTP" draft. When a 429 Too Many Requests error is received, these headers indicate when to retry.

Which Limit is Active?

These headers define the current quota and remaining balance.

  • RateLimit-Limit: The quota(s) for the current time window(s).
    • Example: RateLimit-Limit: 10;w=1, 300;w=60 means the limit is 10 requests per second and 300 requests per minute.
  • RateLimit-Remaining: The number of requests remaining in the window that is closest to being exhausted.
  • RateLimit-Reset: The number of seconds until the window resets.
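As an illustration, a client can parse these fields into usable numbers before deciding whether to send the next request. The sketch below is a minimal example, not part of the API; the header names match those listed above, but the dictionary-based input and the helper's name are assumptions.

    # Minimal sketch: parse RateLimit-* response headers into numbers.
    # Assumes the value formats shown above (e.g. "10;w=1, 300;w=60").
    def parse_rate_limit_headers(headers):
        limits = []
        for item in headers.get("RateLimit-Limit", "").split(","):
            item = item.strip()
            if not item:
                continue
            parts = item.split(";")
            quota = int(parts[0])
            window = 1
            for p in parts[1:]:
                key, _, value = p.strip().partition("=")
                if key == "w":
                    window = int(value)
            limits.append({"quota": quota, "window_seconds": window})
        return {
            "limits": limits,
            "remaining": int(headers.get("RateLimit-Remaining", "0")),
            "reset_seconds": int(headers.get("RateLimit-Reset", "0")),
        }

    # Example (illustrative values):
    # parse_rate_limit_headers({"RateLimit-Limit": "10;w=1, 300;w=60",
    #                           "RateLimit-Remaining": "7",
    #                           "RateLimit-Reset": "20"})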

Determining Retry Timing

When rate-limited, the headers indicate when the next request can be sent.

Golden Rule: The RateLimit-Remaining header should be checked. If it is 0, the number of seconds specified in RateLimit-Reset must elapse before making a new request.
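In code, that rule is a simple gate before the next request. This is an illustrative helper, not a prescribed client API:

    # Illustrative only: apply the golden rule before sending another request.
    def seconds_to_wait(remaining, reset_seconds):
        # If the quota is exhausted, wait out the current window; otherwise send immediately.
        return reset_seconds if remaining == 0 else 0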

Header Examples

Example: A successful 200 OK response

This example shows 7 requests remaining in the current 60-second window.
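The exact header values below are illustrative (the 300 req/minute quota is taken from the table above; the reset value is arbitrary):

    HTTP/1.1 200 OK
    RateLimit-Limit: 300;w=60
    RateLimit-Remaining: 7
    RateLimit-Reset: 42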

Example: A 429 Too Many Requests response

This example shows 0 requests remaining. The client must wait 20 seconds (from RateLimit-Reset) before retrying.
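Again, the values are illustrative; only the remaining count of 0 and the 20-second reset come from the description above:

    HTTP/1.1 429 Too Many Requests
    RateLimit-Limit: 300;w=60
    RateLimit-Remaining: 0
    RateLimit-Reset: 20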


Best Practices

Proper handling of 429 errors is critical for client stability. Retrying immediately will lead to additional errors.

Simply waiting for RateLimit-Reset is insufficient. If every client retries at the exact same moment the window resets, they will all be rate-limited again (this is known as a "thundering herd").

The recommended solution is exponential backoff (waiting progressively longer after each failure) combined with jitter (adding a small, random delay to break up synchronized retries).

Client-side retry logic should implement the following steps (a sketch combining them follows this list):

  1. Check Headers: When a 429 is received, the RateLimit-Reset header should be read. This represents the minimum server-required wait time (ServerWaitSeconds).

  2. Calculate Client Backoff: On each consecutive failure, the client's own wait time should be doubled (e.g., 1s, 2s, 4s, 8s...). This is the "exponential backoff."

  3. Add Jitter: A small, random amount of time (e.g., +/- 20%) should be added to this backoff period. This prevents all clients from retrying at the same microsecond.

  4. Wait and Retry: The client should wait for the longer of ServerWaitSeconds and (ClientBackoff + Jitter).

  5. Cap Retries: Retries should stop after a maximum number of attempts (e.g., 10) to avoid infinite loops.

  6. Reset: After a successful request, the retry attempt counter should be reset back to zero.
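Putting these steps together, a retry loop might look like the sketch below. It is a minimal example under stated assumptions: send_request is a stand-in for whatever HTTP client the application already uses and must return a status code plus a header dictionary; the attempt cap and backoff constants are illustrative choices, not requirements of the API.

    import random
    import time

    MAX_ATTEMPTS = 10         # step 5: cap retries to avoid infinite loops
    BASE_BACKOFF_SECONDS = 1  # starting point for exponential backoff

    def request_with_retries(send_request):
        # send_request() is assumed to return (status_code, headers_dict).
        attempt = 0
        while True:
            status, headers = send_request()
            if status != 429:
                # Step 6: a successful call ends the loop; the counter starts
                # from zero again on the next call.
                return status, headers
            attempt += 1
            if attempt >= MAX_ATTEMPTS:
                raise RuntimeError(f"Rate-limited: giving up after {attempt} attempts")

            # Step 1: the server's minimum wait, from RateLimit-Reset.
            server_wait = float(headers.get("RateLimit-Reset", 0))

            # Step 2: exponential client backoff (1s, 2s, 4s, 8s, ...).
            client_backoff = BASE_BACKOFF_SECONDS * (2 ** (attempt - 1))

            # Step 3: jitter of +/- 20% to break up synchronized retries.
            jitter = client_backoff * random.uniform(-0.2, 0.2)

            # Step 4: wait for the longer of the server-required and client-side delays.
            time.sleep(max(server_wait, client_backoff + jitter))

The same structure applies regardless of HTTP library; only the send_request stand-in changes.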

Be Proactive, Not Just Reactive: Clients should not only react to 429s. The RateLimit-Remaining header should be read on successful 200 OK responses. If the remaining count is low (e.g., < 10%), the request rate should be proactively slowed down.
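A sketch of that proactive check is shown below. The 10% threshold and the idea of spreading the remaining budget over the rest of the window are illustrative choices, not requirements:

    # Illustrative proactive throttle: slow down before hitting the limit.
    def proactive_delay_seconds(remaining, limit_quota, reset_seconds, threshold=0.10):
        if limit_quota <= 0:
            return 0.0
        if remaining / limit_quota < threshold:
            # Spread the remaining request budget over the rest of the window.
            return reset_seconds / max(remaining, 1)
        return 0.0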