Rate Limiting

Rate limiting policies, limits per endpoint group, handling 429 errors, and best practices for building resilient API clients.


The AuthOS API implements rate limiting to ensure fair usage and protect against abuse. Rate limits are applied based on IP address and vary by endpoint group.

Overview

Rate limiting is enforced using a token bucket algorithm with configurable burst size and replenishment rate. When you exceed the rate limit, the API returns a 429 Too Many Requests response with retry information.

Key Characteristics

  • IP-Based: Limits apply per client IP address
  • Endpoint Groups: Different limits for different endpoint categories
  • Token Bucket Algorithm: Allows bursts while maintaining average rate
  • Automatic Reset: Limits replenish automatically over time
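
To illustrate the model, here is a minimal token bucket sketch in JavaScript. It is illustrative only; the class name, parameter names, and logic are assumptions for explanation, not the server's actual implementation.

// Illustrative token bucket: tokens replenish at refillRatePerSec,
// up to burstSize tokens stored. Each request consumes one token.
class TokenBucket {
  constructor(refillRatePerSec, burstSize) {
    this.refillRatePerSec = refillRatePerSec;
    this.burstSize = burstSize;
    this.tokens = burstSize;       // Start with a full bucket
    this.lastRefill = Date.now();
  }

  tryConsume() {
    // Replenish tokens based on elapsed time, capped at the burst size
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.burstSize, this.tokens + elapsedSec * this.refillRatePerSec);
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // Request allowed
    }
    return false;   // Over the limit: the API would respond with 429
  }
}

// e.g. Authentication Routes: 1 request per second, burst of 20
const authBucket = new TokenBucket(1, 20);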

Rate Limit Groups

The API applies different rate limits to different categories of endpoints based on their security sensitivity and resource requirements.

1. Authentication Routes

Endpoints:

  • /auth/* (OAuth initiation and callbacks)
  • /api/auth/* (Registration, login, password reset)

Limits:

  • Rate: 1 request per second
  • Burst Size: 20 requests

Rationale: Authentication endpoints are rate-limited to prevent credential stuffing and brute force attacks while allowing legitimate authentication flows.
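
For example, with a replenishment rate of 1 request per second and a burst size of 20, a client can send up to 20 requests back-to-back; a fully drained bucket then takes roughly 20 seconds to refill completely.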

2. Device Flow Routes

Endpoints:

  • /auth/device/code (Request device codes)
  • /auth/device/verify (Verify user code)
  • /auth/token (Token exchange)

Limits:

  • Rate: 1 request every 5 seconds (0.2 requests per second)
  • Burst Size: 10 requests

Rationale: Stricter burst limit prevents abuse of the device flow while accommodating polling behavior.

3. MFA Setup Routes

Endpoints:

  • /api/user/mfa/enable (Enable TOTP)
  • /api/user/mfa/disable (Disable MFA)
  • /api/user/mfa/backup-codes (View backup codes)
  • /api/user/mfa/backup-codes/regenerate (Regenerate codes)

Limits:

  • Rate: 1 request every 5 minutes (300 seconds)
  • Burst Size: 5 requests

Rationale: These are infrequent administrative actions, so the replenishment rate is deliberately low; the small burst size still allows a short sequence of setup operations while preventing rapid-fire abuse.

4. MFA Verification Routes

Endpoints:

  • /api/auth/mfa/verify (Verify MFA code during login)
  • /api/user/mfa/verify (Test MFA code)

Limits:

  • Rate: 1 request per second
  • Burst Size: 3 requests

Rationale: Strict burst limit protects against brute force attacks on MFA codes while allowing legitimate retry attempts.


Rate Limit Response

When you exceed the rate limit, the API returns a 429 Too Many Requests response.

Response Format

Status: 429 Too Many Requests

{
  "error": "Rate limit exceeded",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "timestamp": "2025-01-15T10:30:00.123Z"
}

Response Headers

The API includes the Retry-After header indicating how many seconds to wait before retrying:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json

Example

curl -X POST https://sso.example.com/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "secret"}'

Response (after exceeding rate limit):

HTTP/1.1 429 Too Many Requests
Retry-After: 42

{
  "error": "Rate limit exceeded",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "timestamp": "2025-01-15T10:30:00.123Z"
}

Handling Rate Limits

1. Check for 429 Status

Always check for 429 status codes in your API client:

const response = await fetch(url, options);

if (response.status === 429) {
  const retryAfter = parseInt(response.headers.get('Retry-After') || '60');
  console.log(`Rate limited. Retry after ${retryAfter} seconds`);
  // Handle rate limit
}

2. Implement Exponential Backoff

Use exponential backoff to automatically retry rate-limited requests:

async function fetchWithBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60');
      // Wait at least as long as Retry-After, growing exponentially across attempts
      const backoff = Math.max(retryAfter * 1000, 2 ** attempt * 1000);

      console.log(`Rate limited. Retrying after ${backoff}ms (attempt ${attempt + 1}/${maxRetries})`);
      await new Promise(resolve => setTimeout(resolve, backoff));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

3. Respect Retry-After Header

Always honor the Retry-After header value:

async function fetchRespectingRetryAfter(url, options) {
  const response = await fetch(url, options);

  if (response.status === 429) {
    const retryAfter = parseInt(response.headers.get('Retry-After') || '60');

    // Wait for the specified duration
    await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));

    // Retry with the original URL and options (a fetch Response does not
    // carry the original request, so pass them in explicitly)
    return fetch(url, options);
  }

  return response;
}

4. Implement Request Queuing

For high-volume applications, implement a request queue to stay within rate limits:

class RateLimitedQueue {
  constructor(requestsPerSecond = 50) {
    this.queue = [];
    this.processing = false;
    this.interval = 1000 / requestsPerSecond; // Time between requests
  }

  async enqueue(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;

    while (this.queue.length > 0) {
      const { requestFn, resolve, reject } = this.queue.shift();

      try {
        const result = await requestFn();
        resolve(result);
      } catch (error) {
        reject(error);
      }

      // Wait before processing next request
      if (this.queue.length > 0) {
        await new Promise(resolve => setTimeout(resolve, this.interval));
      }
    }

    this.processing = false;
  }
}

// Usage
const queue = new RateLimitedQueue(50); // 50 requests per second

async function makeRequest(url, options) {
  return queue.enqueue(() => fetch(url, options));
}

Best Practices

1. Cache Responses

Reduce API calls by caching responses:

const cache = new Map();

async function getCachedUser(userId) {
  const cacheKey = `user_${userId}`;

  if (cache.has(cacheKey)) {
    const { data, timestamp } = cache.get(cacheKey);

    // Use cached data for 5 minutes
    if (Date.now() - timestamp < 5 * 60 * 1000) {
      return data;
    }
  }

  const response = await fetch(`/api/service/users/${userId}`);
  const data = await response.json();

  cache.set(cacheKey, { data, timestamp: Date.now() });
  return data;
}

2. Batch Requests

Combine multiple operations into single API calls when possible:

// Instead of multiple individual requests
const users = await Promise.all(
  userIds.map(id => fetchUser(id)) // Multiple API calls
);

// Use batch endpoint if available
const users = await fetch('/api/service/users', {
  method: 'POST',
  body: JSON.stringify({ user_ids: userIds })
}).then(r => r.json());

3. Use Webhooks

For real-time updates, use webhooks instead of polling:

// Bad: Polling for changes
setInterval(async () => {
  const response = await fetch('/api/organizations/acme/audit-log');
  const events = await response.json();
  // Process events
}, 5000); // Wastes rate limit on polling

// Good: Use webhooks
app.post('/webhook', (req, res) => {
  const event = req.body;
  // Process event in real-time
  res.sendStatus(200);
});

4. Implement Circuit Breaker

Prevent cascading failures when rate limits are hit:

class CircuitBreaker {
  constructor(threshold = 5, timeout = 60000) {
    this.failureCount = 0;
    this.threshold = threshold;
    this.timeout = timeout;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.nextAttempt = Date.now();
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker is OPEN - too many rate limit errors');
      }
      this.state = 'HALF_OPEN';
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      if (error.status === 429) {
        this.onFailure();
      }
      throw error;
    }
  }

  onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failureCount++;
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
      this.nextAttempt = Date.now() + this.timeout;
    }
  }
}
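
As a usage sketch (guardedFetch is a hypothetical helper, not part of the API): fetch does not throw on a 429 response, so the wrapper below converts 429s into errors carrying the status property that the breaker's failure check expects.

const breaker = new CircuitBreaker(5, 60000);

async function guardedFetch(url, options) {
  return breaker.execute(async () => {
    const response = await fetch(url, options);
    if (response.status === 429) {
      // Convert 429 responses into errors so execute() counts them as failures
      const error = new Error('Rate limit exceeded');
      error.status = 429;
      throw error;
    }
    return response;
  });
}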

5. Monitor Rate Limit Usage

Track your rate limit usage to prevent hitting limits:

class RateLimitMonitor {
  constructor() {
    this.requests = [];
    this.window = 1000; // 1 second window
  }

  recordRequest() {
    const now = Date.now();
    this.requests.push(now);

    // Remove old requests outside the window
    this.requests = this.requests.filter(time => now - time < this.window);
  }

  getCurrentRate() {
    return this.requests.length;
  }

  async shouldThrottle(limit = 50) {
    if (this.getCurrentRate() >= limit) {
      const oldestRequest = this.requests[0];
      const waitTime = this.window - (Date.now() - oldestRequest);

      if (waitTime > 0) {
        await new Promise(resolve => setTimeout(resolve, waitTime));
      }
    }

    this.recordRequest();
  }
}

const monitor = new RateLimitMonitor();

async function makeRequest(url, options) {
  await monitor.shouldThrottle(50);
  return fetch(url, options);
}

Rate Limit Considerations

Device Flow Polling

The Device Authorization Flow (RFC 8628) requires polling the /auth/token endpoint. Follow these guidelines:

  • Polling Interval: Poll every 5 seconds (not faster)
  • Timeout: Stop polling after 5 minutes
  • Error Handling: Implement exponential backoff on errors

async function pollForToken(deviceCode) {
  const pollInterval = 5000; // 5 seconds
  const timeout = 300000; // 5 minutes
  const startTime = Date.now();

  while (Date.now() - startTime < timeout) {
    try {
      const response = await fetch('/auth/token', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ device_code: deviceCode })
      });

      if (response.status === 200) {
        return response.json(); // Success
      }

      if (response.status === 429) {
        const retryAfter = parseInt(response.headers.get('Retry-After') || '10');
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
        continue;
      }

      if (response.status === 400) {
        const error = await response.json();
        if (error.error_code === 'DEVICE_CODE_PENDING') {
          // User hasn't authorized yet, continue polling
          await new Promise(resolve => setTimeout(resolve, pollInterval));
          continue;
        }
        throw new Error(error.error);
      }

      // Any other status (e.g. 5xx): wait before polling again to avoid a tight loop
      await new Promise(resolve => setTimeout(resolve, pollInterval));
    } catch (error) {
      console.error('Polling error:', error);
      await new Promise(resolve => setTimeout(resolve, pollInterval));
    }
  }

  throw new Error('Device authorization timed out');
}

MFA Verification

MFA verification has a strict burst limit (3 requests) to prevent brute force attacks:

  • Maximum Attempts: 3 rapid attempts
  • Rate Reset: Wait 1 second between failed attempts
  • Account Lockout: Implementation-specific

async function verifyMFA(code, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch('/api/auth/mfa/verify', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ code })
      });

      if (response.status === 200) {
        return response.json();
      }

      if (response.status === 429) {
        throw new Error('Rate limit exceeded. Please wait before trying again.');
      }

      if (response.status === 400) {
        const error = await response.json();
        console.error(`MFA verification failed (attempt ${attempt}/${maxAttempts})`);

        if (attempt < maxAttempts) {
          await new Promise(resolve => setTimeout(resolve, 1000));
          continue;
        }
      }
    } catch (error) {
      console.error('MFA verification error:', error);
      throw error;
    }
  }

  throw new Error('MFA verification failed after maximum attempts');
}

Summary

Key Takeaways

  • IP-Based Limits: All rate limits apply per client IP address
  • Different Groups: Authentication, device flow, and MFA have different limits
  • Respect Retry-After: Always honor the Retry-After header
  • Implement Backoff: Use exponential backoff for automatic retry
  • Cache Aggressively: Reduce API calls through caching
  • Monitor Usage: Track your rate limit consumption

Quick Reference

Endpoint Group             Rate                                   Burst Size
Authentication Routes      1 request per second                   20
Device Flow Routes         1 request every 5 seconds (0.2/sec)    10
MFA Setup Routes           1 request every 5 minutes (300 sec)    5
MFA Verification Routes    1 request per second                   3