ML System Design - Netflix Sentiment Tracking

Problem Statement

Design a system that tracks overall public sentiment toward Netflix on social media over time. The system should ingest social media posts, classify sentiment, aggregate results, and surface trends — enabling stakeholders to monitor how public perception shifts in response to content releases, pricing changes, or other events.

This is an ML system design problem that combines data engineering, natural language processing, and analytics infrastructure. You should be prepared to discuss data ingestion pipelines, sentiment classification models, aggregation strategies, and how to visualize trends for business stakeholders.

Key Requirements

Functional

Social media data ingestion -- collect posts from Twitter, Reddit, Instagram, and other platforms that mention Netflix or related keywords
Sentiment classification -- apply an ML model to classify each post as positive, negative, or neutral
Aggregation and trending -- aggregate sentiment scores over time windows (hourly, daily, weekly) to detect shifts
Event correlation -- enable stakeholders to correlate sentiment changes with specific events (new show releases, pricing announcements)
Dashboard and alerting -- surface aggregated sentiment metrics and alert when sentiment drops significantly

Non-Functional

High throughput -- process millions of social media posts per day across multiple platforms
Near real-time processing -- sentiment scores should be updated frequently enough to catch rapid shifts
Scalability -- the ingestion and processing pipeline should scale horizontally as data volume grows
Model accuracy -- sentiment classification should be accurate enough to provide actionable insights
Data retention -- store historical sentiment data for trend analysis over months or years

Based on real interview experiences, these are the areas interviewers probe most deeply:

Interviewers want to see how you collect data from multiple social media platforms and handle the volume and variety of posts.

Use streaming APIs (Twitter API, Reddit API) and batch ingestion for platforms without streaming support
Use a message queue (Kafka) to buffer incoming posts before processing
Handle rate limits and API quotas from social media platforms
Filter posts to those relevant to Netflix using keyword matching or entity recognition
Store raw posts for reprocessing if the sentiment model is updated
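
The filtering step above can be sketched as a small relevance filter with exact-duplicate removal. This is a minimal illustration, not a production design: the keyword list, the `text` field name, and the helper names (`is_relevant`, `post_fingerprint`, `filter_posts`) are all assumptions, and a real system would likely add entity recognition and keep the dedup set in an external store rather than in memory.

```python
import hashlib
import re

# Hypothetical keyword list; a real system would derive it from a title catalog.
KEYWORDS = ["netflix", "stranger things", "squid game"]
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_relevant(text: str) -> bool:
    """Return True if the post mentions any tracked keyword as a whole word."""
    return _PATTERN.search(text) is not None


def post_fingerprint(text: str) -> str:
    """Hash of lowercased, whitespace-normalized text, used to drop exact reposts."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def filter_posts(posts):
    """Yield relevant posts, skipping duplicates seen earlier in the stream."""
    seen = set()
    for post in posts:
        if not is_relevant(post["text"]):
            continue
        fp = post_fingerprint(post["text"])
        if fp in seen:
            continue
        seen.add(fp)
        yield post
```

Whole-word matching keeps "netflixing" out while still catching "#Netflix". Hashing normalized text only catches exact reposts; near-duplicates would need something like MinHash.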

Interviewers probe on your ML model choice and how you train and deploy it.

Pre-trained models (BERT, RoBERTa) fine-tuned on social media sentiment datasets
Simpler baseline models (logistic regression on TF-IDF features) for comparison
Handling emoji, slang, and informal language common in social media
Multi-class classification (positive, negative, neutral) or regression (sentiment score)
Consider domain-specific vocabulary (show names, character names, Netflix-specific terms)
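
One way to handle emoji and informal language before either a baseline or a transformer model sees the text is a light normalization pass. The sketch below is only an assumption about what such a step might look like: the emoji map is a tiny illustrative sample, and the placeholder tokens (`<url>`, `<user>`, `emoji_love`, and so on) are made-up names.

```python
import re

# Illustrative emoji-to-token map (an assumption); real systems use a fuller lexicon.
EMOJI_TOKENS = {
    "😍": " emoji_love ",
    "😡": " emoji_angry ",
    "😂": " emoji_laugh ",
    "👎": " emoji_thumbs_down ",
}

_URL = re.compile(r"https?://\S+")
_MENTION = re.compile(r"@\w+")
_ELONGATED = re.compile(r"(.)\1{2,}")  # a character repeated three or more times


def normalize_post(text: str) -> str:
    """Replace URLs, mentions, and known emoji with tokens; squash elongations."""
    text = _URL.sub(" <url> ", text)          # URLs first, so '@' inside them is untouched
    text = _MENTION.sub(" <user> ", text)
    for emoji, token in EMOJI_TOKENS.items():
        text = text.replace(emoji, token)
    text = _ELONGATED.sub(r"\1\1", text)      # "soooo" -> "soo", keeps some emphasis
    return " ".join(text.lower().split())


print(normalize_post("@bob this show is SOOOO good 😍 https://t.co/abc"))
# prints: <user> this show is soo good emoji_love <url>
```

Keeping emoji as tokens rather than stripping them preserves a strong sentiment signal; whether that helps a given model is something to verify on held-out data.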

Interviewers want to see how you aggregate individual sentiment scores into meaningful trends.

Interviewers expect you to discuss how the sentiment model is trained and retrained over time.

Labeled training data from public sentiment datasets or manual annotation
Periodic retraining to adapt to evolving language and new shows/events
Evaluation metrics (accuracy, F1 score per class) with special attention to class imbalance
A/B testing framework to compare model versions before full deployment
Batch pipeline for training data preparation (for example, Spark for processing with Airflow for orchestration)
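
To make the class-imbalance point concrete, here is a small sketch of per-class and macro-averaged F1 in plain Python. The label set follows the three classes named above; the function names are illustrative. In practice a library routine would normally be used instead.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def per_class_f1(y_true, y_pred, labels=LABELS):
    """Return {label: F1} computed from true and predicted label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p wrongly
            fn[t] += 1  # missed true label t
    scores = {}
    for label in labels:
        pred_total = tp[label] + fp[label]
        true_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / true_total if true_total else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(scores.values()) / len(scores)


# A model that always predicts "neutral" on a skewed sample:
y_true = ["neutral"] * 8 + ["negative"] * 2
y_pred = ["neutral"] * 10
scores = per_class_f1(y_true, y_pred)
print(scores["negative"], round(macro_f1(scores), 3))
```

In this toy sample accuracy is 80%, yet the negative class has an F1 of zero, which is exactly the failure mode that matters when the goal is to catch drops in sentiment.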

Interviewers want to see how you surface insights to stakeholders.

Dashboard showing sentiment trend lines over time with drill-down by platform or keyword
Heatmaps or geographic breakdowns if location data is available
Alerting system that triggers when sentiment drops below a threshold or changes rapidly
Correlation view that overlays sentiment trends with event timelines (show releases, announcements)
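
The alerting rule described above (an absolute threshold plus a rapid-change check) might be sketched as follows. The score range of [-1, 1], the `floor` and `max_drop` defaults, and the function name are all assumptions that would need tuning against real data.

```python
from statistics import mean


def should_alert(history, current, floor=-0.2, max_drop=0.15):
    """Return the list of alert reasons for the latest bucket (empty means no alert).

    history:  recent bucket means, oldest first, each assumed to lie in [-1, 1]
    current:  mean sentiment of the latest bucket
    floor:    hypothetical absolute threshold
    max_drop: hypothetical allowed drop relative to the trailing mean
    """
    reasons = []
    if current < floor:
        reasons.append("below_floor")
    if history and mean(history) - current > max_drop:
        reasons.append("rapid_drop")
    return reasons


print(should_alert([0.30, 0.35, 0.25], 0.05))
# prints: ['rapid_drop']
```

Returning reasons rather than a boolean lets the dashboard show why an alert fired. A real deployment would probably also require a minimum post count per bucket so that a handful of posts cannot trigger a page.
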

System Components

Data Ingestion Layer -- API connectors for social media platforms feeding into a message queue (Kafka)
Stream Processing -- filters and preprocesses posts (deduplication, keyword filtering)
Sentiment Classification Service -- stateless service that loads the ML model and returns sentiment scores
Aggregation Layer -- stream processor (Flink or Spark Streaming) that computes time-bucketed aggregates
Storage -- raw post storage (S3 or data lake), aggregated metrics in time-series DB
Offline Training Pipeline -- batch pipeline for model training and evaluation
Dashboard and Alerting -- visualization layer (Grafana or custom web app) with alerting rules

Discuss your model choice (transformer-based vs simpler models), training data sources, evaluation metrics, and how you handle social media-specific challenges (sarcasm, emojis, slang).

Explain how you compute aggregated sentiment over time windows, how you weight posts, and how you detect significant sentiment shifts.
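
A minimal sketch of one possible answer: bucket posts by hour, weight each post by a dampened engagement count, and flag a shift when the latest bucket sits far from a trailing baseline. The field names (`ts`, `score`, `engagement`), the log-based weighting, and the z-score test are assumptions chosen for illustration; a stream processor would compute the same thing incrementally.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone


def hourly_weighted_sentiment(posts):
    """Return {hour_start: weighted mean score}, ordered by time.

    Each post is a dict with 'ts' (datetime), 'score' in [-1, 1], and
    'engagement' (for example likes + shares; a hypothetical field).
    Weight = 1 + log1p(engagement), so viral posts count more without dominating.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for p in posts:
        bucket = p["ts"].replace(minute=0, second=0, microsecond=0)
        w = 1.0 + math.log1p(p["engagement"])
        num[bucket] += w * p["score"]
        den[bucket] += w
    return {b: num[b] / den[b] for b in sorted(num)}


def z_score(baseline, value):
    """Standard deviations between value and the mean of the baseline buckets."""
    if not baseline:
        return 0.0
    mu = sum(baseline) / len(baseline)
    std = math.sqrt(sum((x - mu) ** 2 for x in baseline) / len(baseline))
    return 0.0 if std == 0 else (value - mu) / std


posts = [
    {"ts": datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), "score": 1.0, "engagement": 0},
    {"ts": datetime(2024, 1, 1, 10, 40, tzinfo=timezone.utc), "score": -1.0, "engagement": 0},
    {"ts": datetime(2024, 1, 1, 11, 10, tzinfo=timezone.utc), "score": 0.5, "engagement": 3},
]
print(hourly_weighted_sentiment(posts))
```

A large negative z-score for the newest hour relative to, say, the same hours over the previous week would count as a significant shift; comparing against the same hour of day helps avoid alerting on ordinary daily cycles.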

Discuss how you monitor model accuracy in production, collect feedback for retraining, and iterate on the system as new platforms or data sources are added.
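
Ground-truth labels are usually scarce in production, so one common proxy for monitoring is to compare the distribution of predicted labels in a live window against a reference window. The sketch below uses the population stability index (PSI) for that comparison; choosing PSI, the smoothing constant, and the function names are assumptions for illustration, and any alert threshold would need tuning.

```python
import math
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def label_distribution(predictions, labels=LABELS, smoothing=1e-6):
    """Smoothed share of each label, so an unseen label never yields log(0)."""
    counts = Counter(predictions)
    total = len(predictions) + smoothing * len(labels)
    return {label: (counts[label] + smoothing) / total for label in labels}


def population_stability_index(reference, live, labels=LABELS):
    """PSI = sum over labels of (live - ref) * ln(live / ref); 0 means identical."""
    ref = label_distribution(reference, labels)
    cur = label_distribution(live, labels)
    return sum((cur[l] - ref[l]) * math.log(cur[l] / ref[l]) for l in labels)


reference = ["positive"] * 50 + ["negative"] * 30 + ["neutral"] * 20
live = ["positive"] * 20 + ["negative"] * 60 + ["neutral"] * 20
print(round(population_stability_index(reference, live), 3))
```

A rising PSI does not say whether the model got worse or public opinion genuinely moved, so it works best as a trigger for sampling posts for manual labeling, which in turn feeds the retraining set.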

"Design a system that can track social media sentiment about Netflix over time. I structured my answer around three pillars: data ingestion from social media sources, aggregation of sentiment signals, and offline ML training for the sentiment classification model."