Design an Ad Server
Problem Statement
Design an ad server system that selects and delivers targeted advertisements to users in real time whenever a web page or application requests an ad slot. Think of platforms like Google Ads or the ad breaks between songs on Spotify's free tier -- advertisers define campaigns with budgets and targeting criteria, publishers provide ad placements, and the ad server decides which ad to show to which user in milliseconds.
The fundamental challenge is making a high-quality ad selection decision under extreme time pressure. Each ad request triggers a multi-stage pipeline: retrieve the user's profile and context, filter thousands of eligible campaigns by targeting rules, run an auction to determine the winning ad, enforce budget and frequency caps, and return a creative -- all within 50-100ms. At scale, the system must handle hundreds of thousands of requests per second while maintaining accurate budget accounting and preventing overspend.
Interviewers use this problem to evaluate your ability to design latency-critical read paths, manage distributed counters under contention, and orchestrate multi-step decision flows where each stage has different consistency and performance requirements.
Key Requirements
Functional
- Campaign management -- advertisers create campaigns with targeting rules (audience segments, geography, device type, time of day), bid amounts, daily and total budgets, and creative assets
- Ad selection and auction -- given an ad request with user context and placement metadata, the system filters eligible campaigns, ranks them by bid and relevance score, and returns the winning creative
- Budget and frequency enforcement -- enforce per-campaign daily and total budget caps and per-user frequency caps so that no campaign overspends and no user sees the same ad excessively
- Event tracking -- record impressions, clicks, and conversions with deduplication, then aggregate metrics for advertiser dashboards and billing
Non-Functional
- Scalability -- handle 500,000 ad requests per second at peak, with 10 million active campaigns and 500 million user profiles
- Latency -- return an ad decision within 100ms at p99, including targeting lookup, auction, and cap checks
- Reliability -- 99.95% uptime for the ad serving path; degrade gracefully to backfill or house ads if personalization components fail
- Consistency -- budget counters must be accurate within a small margin (less than 1% overspend); impression logs must be durable and deduplicated for correct billing
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Low-Latency Ad Selection Pipeline
The core serving path must complete multiple steps (user lookup, targeting filter, auction, cap check, creative fetch) within a strict latency budget. Interviewers want to see how you parallelize work and keep each stage fast.
Hints to consider:
- Precompute and cache targeting indices (inverted indexes mapping segment to campaign IDs) in Redis or an in-memory store so that filtering eligible campaigns takes under 10ms
- Run user profile lookup and targeting index lookup in parallel since they are independent
- Use a two-phase approach: a fast pre-filter reduces thousands of campaigns to dozens of candidates, then a more expensive ranking model scores the short list
- Keep creative assets on a CDN and return only a creative URL in the ad response to minimize payload size and serving latency
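The hints above can be sketched as a single selection function. This is a minimal illustration, not a production design: the in-process dictionaries (`USER_PROFILES`, `SEGMENT_INDEX`, `BIDS`) are hypothetical stand-ins for a user profile service and a Redis-backed inverted index, and the "expensive" phase-2 ranking is reduced to picking the highest bid.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real stores: a user profile service and a
# precomputed inverted index mapping audience segment -> campaign IDs.
USER_PROFILES = {"u1": {"segments": ["sports", "us"]}}
SEGMENT_INDEX = {"sports": {101, 102}, "us": {101, 103}}
BIDS = {101: 2.50, 102: 1.75, 103: 3.00}

def fetch_user_profile(user_id):
    return USER_PROFILES.get(user_id, {"segments": []})

def fetch_segment_index():
    return SEGMENT_INDEX

def select_ad(user_id):
    # The profile lookup and index lookup are independent, so issue them
    # in parallel rather than sequentially.
    with ThreadPoolExecutor(max_workers=2) as pool:
        profile_future = pool.submit(fetch_user_profile, user_id)
        index_future = pool.submit(fetch_segment_index)
        profile, index = profile_future.result(), index_future.result()

    # Phase 1: cheap pre-filter -- union the index postings for the user's
    # segments, shrinking thousands of campaigns to a short candidate list.
    candidates = set()
    for segment in profile["segments"]:
        candidates |= index.get(segment, set())

    # Phase 2: expensive ranking on the short list only. Here it is just
    # the bid; in practice a relevance/CTR model would score each candidate.
    return max(candidates, key=BIDS.__getitem__, default=None)
```

The response would carry only the winning campaign's creative URL, with the asset itself served from the CDN.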
2. Distributed Budget and Frequency Counters
Budget caps and frequency caps create hot counters that every ad decision must read and write. Without careful design, these become bottlenecks that cause either overspend or excessive latency.
Hints to consider:
- Shard budget counters by campaign ID across multiple Redis nodes; each serving instance maintains a local counter and periodically syncs to the central store
- Use a "budget reservation" pattern: each serving node reserves a small chunk of remaining budget (e.g., 1% of daily budget) and decrements locally, reducing the frequency of distributed writes
- For frequency caps, store per-user impression counts in Redis with TTLs matching the cap window (e.g., 24 hours); use approximate checks on the fast path and reconcile asynchronously
- Accept that a small amount of overspend (under 1%) is tolerable in exchange for not adding a synchronous distributed lock to every ad decision
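The budget reservation pattern can be shown in a few lines. This sketch assumes a central counter with an atomic reserve operation (a stand-in for a sharded Redis `DECRBY`); each serving node pulls a small chunk into a local counter and spends against it, so most ad decisions touch no remote store. The class names are illustrative, not from any real library.

```python
class CentralBudget:
    """Stand-in for a sharded Redis counter with an atomic decrement."""

    def __init__(self, daily_budget):
        self.remaining = daily_budget

    def reserve(self, chunk):
        # Grant up to `chunk`, never more than what is left.
        granted = min(chunk, self.remaining)
        self.remaining -= granted
        return granted

class LocalBudget:
    """Per-serving-node view: reserve ~1% of the daily budget at a time
    and decrement locally, so distributed writes are rare."""

    def __init__(self, central, chunk):
        self.central = central
        self.chunk = chunk
        self.local = 0.0

    def try_spend(self, amount):
        # Refill from the central store only when the local chunk runs dry.
        if self.local < amount:
            self.local += self.central.reserve(self.chunk)
        if self.local >= amount:
            self.local -= amount
            return True
        return False  # campaign is (approximately) out of budget
```

Budget already reserved by a node that crashes is lost until reconciliation, which is one source of the small overspend/underspend margin the requirements tolerate.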
3. Auction Mechanics and Pacing
Selecting the highest bidder is not sufficient -- the system must also pace delivery evenly across the day and optimize for long-term revenue rather than immediately exhausting high-bid campaigns.
Hints to consider:
- Implement a second-price auction where the winner pays one cent above the second-highest bid, incentivizing truthful bidding
- Add a pacing multiplier to each campaign's effective bid based on how much of the daily budget has been spent relative to the time elapsed in the day
- Combine bid price with a predicted click-through rate to compute an expected revenue score (eCPM), ensuring the system optimizes for revenue rather than raw bid amount
- Reserve a percentage of inventory for backfill and house ads to maintain fill rate even when premium demand is low
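Putting the auction hints together: rank candidates by expected revenue (bid x predicted CTR x pacing multiplier), then charge the winner the minimum bid that would still have beaten the runner-up, plus one cent. This is a deliberately simplified generalized second-price sketch; the linear pacing heuristic and the candidate tuple shape are assumptions for illustration.

```python
def pacing_multiplier(spent, daily_budget, hour_of_day):
    """Throttle campaigns spending ahead of the clock so the daily
    budget lasts all day (a simple linear pacing heuristic)."""
    expected = daily_budget * hour_of_day / 24.0
    if spent <= expected:
        return 1.0
    return max(0.1, expected / spent)

def run_auction(candidates, hour_of_day):
    """candidates: (campaign_id, bid, predicted_ctr, spent, daily_budget).
    Returns (winner_id, price) under second-price rules on the eCPM-style
    score bid * pCTR * pacing."""
    scored = []
    for cid, bid, pctr, spent, budget in candidates:
        pace = pacing_multiplier(spent, budget, hour_of_day)
        # Keep the "quality" part (pCTR * pacing) so the second price can
        # be converted back into a bid for the winner.
        scored.append((bid * pctr * pace, pctr * pace, cid))
    scored.sort(reverse=True)
    if not scored:
        return None, 0.0
    win_score, win_quality, winner = scored[0]
    runner_up_score = scored[1][0] if len(scored) > 1 else 0.0
    price = round(runner_up_score / win_quality + 0.01, 2)
    return winner, price
```

Note how a lower bid with a higher predicted CTR can win: the auction optimizes expected revenue, not raw bid amount.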
4. Event Logging, Deduplication, and Attribution
Impressions and clicks drive billing, so the logging pipeline must be durable and deduplicated. Interviewers probe how you handle duplicate events, late-arriving data, and conversion attribution windows.
Hints to consider:
- Write impression and click events to Kafka with an idempotency key (request ID plus event type) to enable exactly-once processing downstream
- Use a streaming processor (Flink or Kafka Streams) to deduplicate events within a time window before writing to the billing aggregation store
- Store raw events in a data lake (S3 plus Parquet) for audit and reconciliation; maintain real-time aggregates in a fast OLAP store (ClickHouse or Druid) for advertiser dashboards
- Attribution windows for conversions (e.g., 7-day click-through, 1-day view-through) require joining conversion events against a lookback of impressions and clicks per user
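The windowed deduplication step can be sketched as follows. The class below is a toy stand-in for the keyed state a Flink or Kafka Streams job would hold: it drops any event whose idempotency key (request ID plus event type) was already seen within the window, and assumes events arrive in roughly increasing time order.

```python
from collections import OrderedDict

class WindowedDeduper:
    """Drops duplicate events within a time window, keyed by
    (request_id, event_type) -- the idempotency key from the hints."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.seen = OrderedDict()  # key -> timestamp, in arrival order

    def accept(self, request_id, event_type, now):
        # Evict keys older than the window. Insertion order tracks time
        # order here because `now` is assumed to be non-decreasing.
        while self.seen and next(iter(self.seen.values())) < now - self.window:
            self.seen.popitem(last=False)
        key = (request_id, event_type)
        if key in self.seen:
            return False  # duplicate: suppress before billing aggregation
        self.seen[key] = now
        return True
```

A duplicate arriving after the window closes would slip through here; the offline reconciliation pass against the raw events in the data lake is what catches those stragglers for billing.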