Amazon's "Design Rate Limiter" interview question focuses on creating a scalable system to control API request rates, preventing abuse in distributed environments.[1][3]
Design a distributed rate limiter that enforces configurable limits on HTTP requests from clients identified by user ID, IP address, or API key. It must support rules like "100 requests per minute per user" or endpoint-specific caps, rejecting excess requests with HTTP 429 status, remaining quota headers (X-RateLimit-Remaining), reset timestamps (X-RateLimit-Reset), and Retry-After.[3][1]
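The required headers can be mapped from a limiter decision in a few lines. A minimal sketch, assuming a hypothetical `RateLimitResult` shape mirroring the interface below (these names are illustrative, not from any specific framework):

```python
import time
from dataclasses import dataclass

@dataclass
class RateLimitResult:
    passes: bool
    remaining: int
    reset_time: int  # epoch seconds when the current window resets

def build_headers(result: RateLimitResult) -> dict:
    """Translate a limiter decision into the rate-limit response headers."""
    headers = {
        "X-RateLimit-Remaining": str(result.remaining),
        "X-RateLimit-Reset": str(result.reset_time),
    }
    if not result.passes:
        # Retry-After: seconds until the window resets (sent on the 429 path)
        headers["Retry-After"] = str(max(0, result.reset_time - int(time.time())))
    return headers
```

On a rejection, the middleware would pair these headers with HTTP 429 and a JSON error body.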
A typical check interface: `isRequestAllowed(clientId, ruleId) -> { passes: boolean, remaining: number, resetTime: timestamp }`. No formal input/output test cases exist, as this is a system design question, but key scenarios include:
| Scenario | Input | Expected Output | Notes |
|----------|--------|-----------------|-------|
| Allowed request | clientId="user123", ruleId="100/min", 50th request in window | {passes: true, remaining: 50, resetTime: 1640995200} HTTP 200 | Within window.[1] |
| Exceeded limit | clientId="user123", ruleId="100/min", 101st request in window | HTTP 429 with X-RateLimit-Remaining: 0, X-RateLimit-Reset: 1640995200, Retry-After: 60, body: {"error": "Rate limit exceeded"} | Fixed window or token bucket algo.[1] |
| Scale test | 1M req/sec across shards | Sub-10ms p99 latency | Redis clustering, sharding by clientId.[1] |
Common algorithms: Token Bucket (a bucket of tokens refilled at a steady rate; each request spends one), Leaky Bucket (requests drain from a queue at a constant rate), Fixed Window (Redis INCR counters with EXPIRE), and Sliding Window (Redis sorted sets pruned by timestamp, or a weighted combination of adjacent fixed-window counters).[7][1]