Design a throttling system for Databricks’ serving infrastructure that protects both incoming traffic (clients → API gateway) and outgoing calls (API servers → downstream dependencies). The system must:

- handle sudden bursts and prevent cascading failures;
- enforce rate limits across a multi-instance, multi-region deployment;
- support multiple limit dimensions (global, per-service, per-user, per-workspace);
- make limits adaptive: when a dependency slows down, automatically reduce the rate of outgoing requests;
- impose minimal latency overhead (<1 ms on the critical path);
- scale to millions of decisions per second;
- tolerate machine failures without losing accuracy.

You may assume a micro-service architecture behind an Envoy-based ingress gateway, with services written in Scala/Java, and a control plane that can push configuration changes within seconds.
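As a starting point, a candidate might sketch the core admission primitive behind such a system. The snippet below is a minimal, illustrative token bucket in Java (all names, such as `TokenBucket` and `setRate`, are hypothetical, not Databricks' actual implementation): it admits bursts up to a fixed capacity while enforcing a steady-state rate, and it exposes a rate-adjustment hook that an adaptive controller could call when a downstream dependency degrades. Timestamps are passed in explicitly so the logic is deterministic and testable.

```java
// Illustrative sketch only: a per-instance token bucket. A real deployment
// would layer this under a distributed quota scheme (e.g. shares pushed by
// the control plane) to enforce global limits across instances and regions.
final class TokenBucket {
    private final double capacity;        // max burst size, in tokens
    private volatile double ratePerSec;   // refill rate; adjustable at runtime
    private double tokens;                // current token balance
    private long lastRefillNanos;         // timestamp of the last refill

    TokenBucket(double ratePerSec, double capacity, long nowNanos) {
        this.ratePerSec = ratePerSec;
        this.capacity = capacity;
        this.tokens = capacity;           // start full, so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if one request is admitted. O(1) with a brief lock,
    // keeping per-decision overhead well under the 1 ms budget.
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSec = (nowNanos - lastRefillNanos) / 1e9;
        tokens = Math.min(capacity, tokens + elapsedSec * ratePerSec);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Adaptive hook (hypothetical): a health monitor could halve the rate
    // when a dependency's p99 latency crosses a threshold, then restore it.
    synchronized void setRate(double newRatePerSec) {
        this.ratePerSec = newRatePerSec;
    }
}

public class Demo {
    public static void main(String[] args) {
        long t0 = 0L;
        TokenBucket bucket = new TokenBucket(10.0, 5.0, t0); // 10 req/s, burst of 5
        int admitted = 0;
        for (int i = 0; i < 8; i++) {
            if (bucket.tryAcquire(t0)) admitted++;
        }
        System.out.println("burst admitted: " + admitted);   // 5: burst capacity

        long t1 = 300_000_000L;                              // 300 ms later
        int later = 0;
        for (int i = 0; i < 8; i++) {
            if (bucket.tryAcquire(t1)) later++;
        }
        System.out.println("after 300ms: " + later);         // 3: 0.3 s * 10 req/s
    }
}
```

A full answer would then discuss how to distribute this primitive: for example, giving each instance a local share of the global limit and reconciling shares asynchronously, so the hot path never blocks on a remote quota store.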