Design a Notification System for Marketing Campaigns
System Design | Optional
Problem Statement
Design a large-scale event ticketing platform similar to Ticketmaster or Eventbrite that handles ticket sales for concerts, sports games, and live entertainment worldwide. The system must support event organizers creating and managing events, handle high-traffic sales windows when popular events go on sale, prevent overselling through accurate inventory management, and process millions of transactions daily across different currencies and regions.
The platform needs to manage the entire lifecycle: event creation and pricing tiers, coordinated sale launches with waitlists and presales, real-time inventory tracking as tickets are held and purchased, payment processing with multiple providers, ticket delivery via email and mobile apps, and secondary market resales with authenticity verification. The biggest challenges are handling traffic spikes during major sales (100x normal load in seconds), maintaining inventory accuracy under concurrent purchases, preventing scalping and fraud, and ensuring fairness in high-demand scenarios where tickets sell out in minutes.
Key Requirements
Functional
- Event Management -- Organizers create events with multiple ticket tiers, pricing rules, seating charts, and sales schedules including presales and general availability
- Coordinated Sale Launch -- Support scheduled releases where thousands of buyers compete for limited inventory at exact timestamps
- Inventory Reservation -- Hold tickets during checkout with time-limited reservations that automatically release if payment fails or times out
- Payment Processing -- Accept payments through multiple providers and currencies with fraud detection and PCI compliance
- Ticket Delivery and Transfer -- Generate secure digital tickets with QR codes, support mobile wallets, and allow authorized transfers between users
- Secondary Market -- Enable verified resales with price controls and authenticity guarantees to combat scalping
Non-Functional
- Scalability -- Handle 500,000 concurrent users during major sales, process 10,000 checkout attempts per second, and support a catalog of 100,000+ active events
- Reliability -- 99.99% uptime during sale windows, zero double-booking or overselling, durable payment records with audit trails
- Latency -- Sub-500ms response for inventory checks and seat selection, checkout completion within 3 seconds, real-time inventory updates across all users
- Consistency -- Strong consistency for inventory to prevent overselling; eventual consistency is acceptable for analytics and recommendations
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Inventory Consistency Under Concurrent Load
The core challenge is preventing overselling when thousands of buyers simultaneously attempt to purchase the last available tickets. Interviewers want to see if you understand distributed locking, optimistic vs pessimistic concurrency control, and reservation state machines.
Hints to consider:
- Discuss the difference between "checking availability" (fast read) versus "reserving a ticket" (write with lock)
- Consider using database-level atomic operations (SELECT ... FOR UPDATE row locks, or a compare-and-swap style conditional UPDATE) or distributed locks (for example, in Redis) for reservation
- Design a reservation timeout mechanism that automatically releases unpaid holds after 10-15 minutes to keep inventory flowing
- Explain how to handle race conditions when multiple users click "buy" on the same seat within milliseconds
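The reservation hints above can be illustrated with a minimal sketch, assuming a single relational store. It uses Python's built-in sqlite3 module as a stand-in for the primary database. The table layout, the function names (make_db, try_reserve, confirm_purchase), and the 10-minute HOLD_SECONDS value are hypothetical choices for illustration. The key idea is that the availability check and the claim happen in one conditional UPDATE, so only one of several concurrent buyers can win a given seat.

```python
import sqlite3

HOLD_SECONDS = 600  # assumed 10-minute hold, the low end of the 10-15 minute range


def make_db():
    """Create an in-memory seats table (hypothetical schema for this sketch)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE seats ("
        " seat_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'available',"
        " holder TEXT,"
        " hold_expires_at REAL)"
    )
    return db


def try_reserve(db, seat_id, user_id, now):
    """Claim a seat only if it is free or its previous hold has expired.

    The WHERE clause is the compare-and-swap: the row changes only when it
    is still in the expected state, so a losing buyer updates zero rows.
    """
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "UPDATE seats SET status='held', holder=?, hold_expires_at=? "
            "WHERE seat_id=? AND (status='available' "
            "OR (status='held' AND hold_expires_at <= ?))",
            (user_id, now + HOLD_SECONDS, seat_id, now),
        )
    return cur.rowcount == 1


def confirm_purchase(db, seat_id, user_id, now):
    """Mark a seat sold only if this user still holds an unexpired reservation."""
    with db:
        cur = db.execute(
            "UPDATE seats SET status='sold', hold_expires_at=NULL "
            "WHERE seat_id=? AND status='held' AND holder=? "
            "AND hold_expires_at > ?",
            (seat_id, user_id, now),
        )
    return cur.rowcount == 1
```

In this sketch an expired hold is reclaimed lazily by the next buyer's UPDATE, so no background sweeper is required for correctness. A real system would likely still run one so that availability counts shown to browsing users stay fresh.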
2. Handling Traffic Spikes at Sale Launch
When a popular concert goes on sale at exactly 10:00 AM, traffic can spike from 1,000 to 100,000 concurrent users in seconds. This tests your understanding of autoscaling, queue-based load leveling, and graceful degradation.
Hints to consider:
- Implement a virtual waiting room that queues users before they hit the purchase flow, releasing them gradually
- Use a CDN and edge caching for static content (event details, images), but ensure inventory reads that drive a purchase decision always hit the authoritative source
- Consider read replicas for browse/search traffic, but route all writes through the primary database with connection pooling
- Design rate limiting per user session to prevent bots from hammering the system with automated requests
- Discuss pre-warming infrastructure and database connections 30 minutes before scheduled major sales
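The virtual waiting room idea from the hints above could look roughly like the following in-memory sketch. The WaitingRoom class, its method names, and the max_active cap are hypothetical. A production version would more plausibly keep the queue in a shared store and sign the admission tokens, neither of which is shown here.

```python
from collections import deque


class WaitingRoom:
    """Queue users before the purchase flow and admit them gradually.

    At most max_active users may be inside the purchase flow at once;
    everyone else waits in arrival order.
    """

    def __init__(self, max_active):
        self.max_active = max_active
        self.queue = deque()   # waiting users, oldest first
        self.active = set()    # users currently allowed to buy

    def join(self, user_id):
        """Return the 1-based queue position, or 0 if already admitted."""
        if user_id in self.active:
            return 0
        if user_id not in self.queue:  # O(n) scan; acceptable for a sketch
            self.queue.append(user_id)
        return self.queue.index(user_id) + 1

    def leave(self, user_id):
        """Free a slot when a user finishes or abandons checkout."""
        self.active.discard(user_id)

    def admit(self):
        """Move waiting users into the purchase flow while slots are free."""
        admitted = []
        while self.queue and len(self.active) < self.max_active:
            user_id = self.queue.popleft()
            self.active.add(user_id)
            admitted.append(user_id)
        return admitted
```

Calling admit() on a timer (or whenever leave() frees a slot) caps the load that reaches the checkout and inventory services regardless of how many users are waiting.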
3. Payment Processing and Transaction Safety
The system must handle payment failures, network timeouts, and user abandonment without losing money or double-charging. Interviewers look for understanding of idempotency, two-phase commits, and reconciliation.
Hints to consider:
- Use an outbox pattern: durably record the intent to charge in the same database transaction as the order state change, and only then call the payment provider's API
- Generate idempotency keys for payment requests so retries don't create duplicate charges
- Implement a state machine (Reserved → PaymentPending → PaymentComplete → TicketIssued) with explicit timeout and retry policies
- Design webhook handlers for asynchronous payment confirmations that are idempotent and can handle out-of-order delivery
- Discuss daily reconciliation jobs that compare internal records against payment provider settlements
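As a rough illustration of the state machine, idempotency key, and idempotent webhook hints above, here is a minimal in-memory sketch. The four main states come from the hint; the Expired and PaymentFailed states, the Order class, and the key derivation are assumptions added for illustration, and no real payment provider API is modeled.

```python
import hashlib

# Allowed transitions. "Expired" and "PaymentFailed" are assumed failure
# states added for this sketch; the other four come from the hint above.
TRANSITIONS = {
    "Reserved": {"PaymentPending", "Expired"},
    "PaymentPending": {"PaymentComplete", "PaymentFailed", "Expired"},
    "PaymentComplete": {"TicketIssued"},
    "TicketIssued": set(),
    "PaymentFailed": set(),
    "Expired": set(),
}


def idempotency_key(order_id, scope="charge"):
    """Derive a stable key so a retried charge for the same order is deduplicated."""
    return hashlib.sha256(f"{order_id}:{scope}".encode()).hexdigest()


class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self.state = "Reserved"
        self.applied_events = set()  # webhook event IDs already applied

    def transition(self, new_state):
        """Move to new_state, rejecting anything the state machine forbids."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

    def handle_webhook(self, event_id, new_state):
        """Apply a provider notification at most once.

        Returns True if the order changed state. Duplicate events and
        events that no longer fit the current state (stale or out of
        order) are ignored rather than raising.
        """
        if event_id in self.applied_events:
            return False
        if new_state not in TRANSITIONS[self.state]:
            return False
        self.state = new_state
        self.applied_events.add(event_id)
        return True
```

Ignoring an event that does not fit the current state is a simplification. A fuller design might park such events and retry them, and would rely on the daily reconciliation job to catch anything that was dropped.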
4. Fraud Prevention and Fair Access
Scalpers use bots to purchase thousands of tickets instantly for resale. The system needs bot detection, rate limiting, and fairness mechanisms without harming legitimate users.
Hints to consider:
- Implement CAPTCHA challenges triggered by suspicious behavior (too many requests, unusual patterns)
- Use device fingerprinting and behavioral analysis to detect automated scripts versus human users
- Enforce purchase limits per user/credit card/address for high-demand events
- Consider a lottery or queue randomization instead of pure first-come-first-served to reduce the advantage of bots
- Design an API rate limiter with token buckets per user session, IP address, and account
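The token-bucket limiter mentioned in the last hint could be sketched as follows. The TokenBucket and MultiKeyLimiter names, the key format ("session:...", "ip:...", "account:..."), and the capacity and refill values are all hypothetical. Time is passed in explicitly so the behavior is easy to test; a real service would read a monotonic clock and keep buckets in a shared store.

```python
class TokenBucket:
    """Bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)  # start full
        self.last = now

    def refill(self, now):
        """Add tokens for the time elapsed since the last refill."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.last = max(self.last, now)


class MultiKeyLimiter:
    """Allow a request only if every key it maps to still has a token."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.buckets = {}

    def allow(self, keys, now):
        buckets = []
        for key in keys:
            if key not in self.buckets:
                self.buckets[key] = TokenBucket(
                    self.capacity, self.refill_per_second, now)
            buckets.append(self.buckets[key])
        for bucket in buckets:
            bucket.refill(now)
        # Check all buckets before spending, so a request rejected by one
        # key does not consume tokens from the others.
        if all(bucket.tokens >= 1.0 for bucket in buckets):
            for bucket in buckets:
                bucket.tokens -= 1.0
            return True
        return False
```

Checking session, IP address, and account together means a bot that rotates sessions is still throttled by its shared IP or account bucket, while a legitimate user behind a busy shared IP is limited only when that IP as a whole is over budget.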