[ INFO ]category: System Design difficulty: unknown freq: first seen: 2026-04-04
System Design - Design an Ad Event Aggregator
Netflix is looking for a solution to aggregate high-volume ad events such as impressions, clicks, quartile completions, and conversions. The system should enrich these events with campaign and device metadata and serve both real-time and batch processing needs.
Problem Statement
Design an Ad Event Aggregator system that:
Collects high-volume ad events such as impressions, clicks, quartile completions, and conversions.
Enriches these events with campaign and device metadata.
Serves both real-time and batch processing needs.
Examples
An ad is displayed to a user, generating an impression event.
The user clicks on the ad, generating a click event.
The user watches 25%, 50%, 75%, or 100% of the ad, generating quartile completion events.
The user completes a conversion action, such as signing up for a service or making a purchase, generating a conversion event.
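The four event kinds above could share a single record shape. The following is a minimal sketch, not a schema given in the problem: the field names (event_id, ad_id, campaign_id, device_id, timestamp_ms, quartile) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    IMPRESSION = "impression"
    CLICK = "click"
    QUARTILE = "quartile"
    CONVERSION = "conversion"


@dataclass(frozen=True)
class AdEvent:
    # All field names are hypothetical; the problem does not define a schema.
    event_id: str        # unique per event, useful later for deduplication
    event_type: EventType
    ad_id: str
    campaign_id: str
    device_id: str
    timestamp_ms: int    # event time in epoch milliseconds
    quartile: Optional[int] = None  # 25, 50, 75 or 100; QUARTILE events only

    def __post_init__(self):
        is_quartile = self.event_type is EventType.QUARTILE
        if is_quartile and self.quartile not in (25, 50, 75, 100):
            raise ValueError("quartile events need quartile in {25, 50, 75, 100}")
        if not is_quartile and self.quartile is not None:
            raise ValueError("quartile is only valid for QUARTILE events")
```

Validating at construction time keeps malformed events out of the pipeline early, before they reach enrichment or aggregation.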
Constraints
High volume of ad events: The system should handle a large number of events per second.
Low latency: Real-time processing should have minimal latency.
Scalability: The system should be scalable to handle increasing event volumes.
Data consistency: The system should maintain data consistency across different components.
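The problem gives no target numbers, so a back-of-envelope estimate has to start from assumed ones. The sketch below shows the arithmetic only; the inputs in the usage lines (1,000,000 events per second, 500 bytes per event, 10 MB/s per partition) are hypothetical figures, not requirements from the problem.

```python
import math


def estimate(events_per_sec, bytes_per_event, partition_mb_per_sec):
    """Return (ingest MB/s, partitions needed, raw TB/day) for the given assumptions."""
    ingest_mb_per_sec = events_per_sec * bytes_per_event / 1e6
    partitions = math.ceil(ingest_mb_per_sec / partition_mb_per_sec)
    raw_tb_per_day = events_per_sec * bytes_per_event * 86_400 / 1e12
    return ingest_mb_per_sec, partitions, raw_tb_per_day


# Hypothetical inputs, for illustration only.
ingest, partitions, tb_per_day = estimate(1_000_000, 500, 10)
print(ingest, partitions, tb_per_day)
```

In an interview, stating the assumed inputs first and deriving throughput, partition count, and storage from them makes the later component choices easier to justify.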
Hints
Use a message queue (e.g., Kafka) to handle high-volume event ingestion.
Implement a stream processing system (e.g., Apache Flink or Spark Streaming) for real-time event processing.
Use a distributed database (e.g., Cassandra or HBase) for storing enriched event data.
Consider using a batch processing system (e.g., Apache Hadoop or Spark) for offline analytics and reporting.
Solution
Event Ingestion:
Use a message queue like Kafka to handle high-volume event ingestion. Kafka provides high-throughput, low-latency messaging and retains events on disk, so consumers can replay them after a failure.
Producers publish ad events to Kafka topics, and consumers subscribe to these topics for real-time and batch processing.
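How producers choose a record key decides which partition an event lands in. The sketch below illustrates key-based partitioning with a stable hash. It is only an illustration: Kafka clients use their own hash function, and NUM_PARTITIONS and the choice of campaign ID as the key are assumptions.

```python
import hashlib

NUM_PARTITIONS = 12  # hypothetical partition count for the events topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (not Kafka's actual hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Events that share a key always map to the same partition.
print(partition_for("campaign-42"), partition_for("campaign-42"))
```

Keying by campaign keeps one campaign's events together, which simplifies per-campaign aggregation, but a very large campaign can overload a single partition. A finer-grained key such as the ad ID spreads load at the cost of a later merge step.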
Real-time Processing:
Use a stream processing system like Apache Flink or Spark Streaming to process events in real time.
Enrich events with campaign and device metadata using lookup tables or external APIs.
Aggregate events (e.g., count impressions, clicks, and conversions) and update metrics in real-time.
Publish aggregated results to a message queue or database for further processing or storage.
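The enrich-then-aggregate steps above can be sketched in plain Python. This is a simplified stand-in for a Flink or Spark job, not their APIs: the CAMPAIGNS and DEVICES tables, the field names, and the one-minute tumbling window are all assumptions for illustration.

```python
from collections import Counter

WINDOW_MS = 60_000  # hypothetical one-minute tumbling window

# Hypothetical in-memory lookup tables standing in for a metadata store.
CAMPAIGNS = {"c1": {"advertiser": "acme"}}
DEVICES = {"d1": {"device_type": "tv"}}


def enrich(event: dict) -> dict:
    """Return a copy of the event with campaign and device metadata attached."""
    enriched = dict(event)
    campaign = CAMPAIGNS.get(event["campaign_id"], {})
    device = DEVICES.get(event["device_id"], {})
    enriched["advertiser"] = campaign.get("advertiser", "unknown")
    enriched["device_type"] = device.get("device_type", "unknown")
    return enriched


def aggregate(events) -> Counter:
    """Count events per (window start, campaign, device type, event type)."""
    counts = Counter()
    for event in events:
        e = enrich(event)
        # Align the event time down to the start of its window.
        window_start = e["timestamp_ms"] - e["timestamp_ms"] % WINDOW_MS
        key = (window_start, e["campaign_id"], e["device_type"], e["event_type"])
        counts[key] += 1
    return counts
```

A real stream processor also has to handle late and out-of-order events, typically with watermarks, which this sketch leaves out. Unknown IDs fall back to "unknown" rather than dropping the event, so counts stay complete even when metadata is missing.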
Batch Processing:
Use a distributed database like Cassandra or HBase to store enriched event data.
Periodically run batch processing jobs using Apache Hadoop or Spark to analyze historical data and generate reports.
Store aggregated results in a data warehouse (e.g., Amazon Redshift or Google BigQuery) for ad-hoc querying and analysis.
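As a rough illustration of what a periodic batch job might compute, the sketch below rolls stored events up into a per-day, per-campaign report with a click-through rate. It is plain Python rather than a Hadoop or Spark job, and the field names and report layout are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_report(events):
    """Roll up events into per-day, per-campaign impressions, clicks and CTR."""
    totals = defaultdict(lambda: {"impression": 0, "click": 0})
    for e in events:
        if e["event_type"] not in ("impression", "click"):
            continue  # other event types are ignored in this report
        ts = datetime.fromtimestamp(e["timestamp_ms"] / 1000, tz=timezone.utc)
        day = ts.date().isoformat()  # bucket by UTC calendar day
        totals[(day, e["campaign_id"])][e["event_type"]] += 1

    report = {}
    for key, t in totals.items():
        # Guard against division by zero when a campaign has clicks but no impressions.
        ctr = t["click"] / t["impression"] if t["impression"] else 0.0
        report[key] = {"impressions": t["impression"], "clicks": t["click"], "ctr": ctr}
    return report
```

Because the batch job reads the full stored history, it can also recompute figures that the real-time path produced earlier, which is one way to check the two paths against each other.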
Scalability and Fault Tolerance:
Scale Kafka, stream processing, and batch processing components horizontally to handle increasing event volumes.
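Partition Kafka topics and database tables to spread load, and replicate them so that the loss of a single node does not lose data. Replication alone does not guarantee consistent counts: make writes idempotent or deduplicate events by ID so that retries and replays are not counted twice.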
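One common threat to consistent counts is redelivery: when a consumer retries after a failure, the same event can arrive twice. A minimal sketch of deduplicating by event ID follows. The IdempotentCounter class and the event_id field are hypothetical, and the in-memory set stands in for what would need to be a bounded or persistent store.

```python
class IdempotentCounter:
    """Counts each event_id at most once, so redelivered events are not double counted."""

    def __init__(self):
        # Unbounded here for simplicity; a real system would expire old IDs
        # or keep them in a fault-tolerant state store.
        self._seen = set()
        self.counts = {}

    def apply(self, event_id: str, key: str) -> bool:
        """Increment the count for key unless event_id was already applied."""
        if event_id in self._seen:
            return False  # duplicate delivery, ignored
        self._seen.add(event_id)
        self.counts[key] = self.counts.get(key, 0) + 1
        return True


counter = IdempotentCounter()
counter.apply("e1", "c1:impression")
counter.apply("e1", "c1:impression")  # redelivered, has no effect
print(counter.counts)
```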
Monitoring and Alerting:
Monitor system performance and event metrics using tools like Prometheus and Grafana.
Set up alerting mechanisms to notify engineers of system issues or anomalies in event processing.
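An alert on event-processing anomalies could be as simple as comparing the latest event rate with a recent average. The sketch below is plain Python rather than a Prometheus alert rule, and the RateDropAlert class, window size, and 50% threshold are all assumptions.

```python
from collections import deque


class RateDropAlert:
    """Flags a drop when the latest rate falls below a fraction of the recent average."""

    def __init__(self, window: int = 5, min_ratio: float = 0.5):
        self.history = deque(maxlen=window)  # most recent rate samples
        self.min_ratio = min_ratio

    def observe(self, events_per_sec: float) -> bool:
        """Record a sample and return True if it looks like a sudden drop."""
        alert = False
        # Only judge once the window is full, to avoid alerts during warm-up.
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            alert = events_per_sec < baseline * self.min_ratio
        self.history.append(events_per_sec)
        return alert
```

A sudden drop in impressions often points to an upstream problem such as a stalled producer or a lagging consumer, so it is worth alerting on separately from host-level metrics.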
Together, these components form a scalable, fault-tolerant Ad Event Aggregator that meets Netflix's requirements for real-time and batch processing of high-volume ad events.