Design a photo-sharing social media platform where users can upload photos, follow other users, and view a personalized feed of posts from people they follow. Users expect fast uploads, instantly visible posts, and a smooth, duplicate-free infinite scroll experience.
Instagram is a media-sharing social network where people post photos and videos and scroll through a personalized feed of posts from accounts they follow. The system must handle two fundamentally different challenges simultaneously: reliably uploading large media files and serving low-latency, high-scale feeds. Feed generation for celebrity accounts with millions of followers introduces fan-out complexity, while maintaining scroll consistency requires stable cursor-based pagination.
Interviewers at Uber ask this to test whether you can design for high-scale media uploads alongside low-latency feed delivery, balancing write amplification from fan-out with read performance. They expect clear API contracts, pragmatic caching strategies, a hybrid feed approach for high-follower accounts, robust pagination, and efficient media storage using direct-to-object-storage patterns.
Based on real interview experiences at Uber, these are the areas interviewers probe most deeply:
Handling large file uploads reliably determines system cost and user experience. Interviewers want to see you avoid routing binary data through application servers and instead use direct-to-storage patterns.
Hints to consider:
- Return a pre-signed URL from the API so the client uploads bytes directly to object storage, keeping binaries off application servers.
- Split the flow: write post metadata to the database, upload media to storage, and let a storage event trigger processing workers.
- Route processing failures (thumbnailing, transcoding) to a dead letter queue rather than blocking the upload.
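To illustrate the pre-signed URL idea, here is a minimal stdlib sketch of time-limited, HMAC-signed upload URLs. A real system would use the object store's own pre-signing API (e.g., S3 pre-signed PUT URLs); the bucket host, key layout, and `SECRET_KEY` here are hypothetical:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET_KEY = b"demo-signing-key"  # hypothetical; a real object store manages its own credentials

def presign_upload_url(bucket: str, key: str, expires_in: int = 3600) -> str:
    """Return a time-limited URL the client can PUT the photo bytes to directly."""
    expires_at = int(time.time()) + expires_in
    payload = f"PUT\n{bucket}\n{key}\n{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"https://{bucket}.storage.example.com/{key}?{query}"

def verify_upload_url(bucket: str, key: str, expires_at: int, signature: str) -> bool:
    """Storage-side check: the signature must match and the URL must not be expired."""
    if int(time.time()) > expires_at:
        return False
    payload = f"PUT\n{bucket}\n{key}\n{expires_at}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The point to land in the interview is that the application server only signs and records metadata; the heavy bytes flow client-to-storage directly.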
The feed is the highest-traffic read path and the most complex write path. Interviewers evaluate whether you understand the tradeoffs between fan-out-on-write and fan-out-on-read, especially for celebrity accounts.
Hints to consider:
- Fan-out-on-write pushes each new post into followers' precomputed feeds (Redis sorted sets), making reads cheap at the cost of write amplification.
- Fan-out-on-read skips the push for accounts above a follower threshold and merges their recent posts in at read time.
- A hybrid of the two bounds write amplification for celebrity posts while keeping typical feed reads fast.
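The hybrid fan-out decision can be sketched with in-memory stand-ins for the Redis sorted sets and the celebrity post index; the `CELEBRITY_THRESHOLD` value is an assumption you would tune from the real follower distribution:

```python
import time
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # assumed cutoff between push and pull paths

feeds = defaultdict(list)            # follower_id -> [(ts, post_id)], stand-in for Redis ZADD
celebrity_posts = defaultdict(list)  # poster_id -> recent [(ts, post_id)]

def fan_out(poster, post_id, followers, ts=None):
    """Push to every follower's feed, or defer to read time for celebrity accounts."""
    ts = time.time() if ts is None else ts
    if len(followers) >= CELEBRITY_THRESHOLD:
        celebrity_posts[poster].append((ts, post_id))  # fan-out-on-read path
        return "deferred"
    for follower in followers:  # fan-out-on-write: one sorted-set insert per follower
        feeds[follower].append((ts, post_id))
    return "pushed"

def read_feed(user, followed_celebrities, limit=20):
    """Merge the precomputed feed with recent celebrity posts, newest first, deduplicated."""
    merged = list(feeds[user])
    for celeb in followed_celebrities:
        merged.extend(celebrity_posts[celeb])
    merged.sort(key=lambda entry: entry[0], reverse=True)
    seen, out = set(), []
    for _, post_id in merged:
        if post_id not in seen:  # a post can arrive via both paths
            seen.add(post_id)
            out.append(post_id)
        if len(out) == limit:
            break
    return out
```

The design point: the write path degrades gracefully for celebrities (one index write instead of millions of pushes), and the read path pays a small merge cost only for users who follow them.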
Users expect a smooth scrolling experience without seeing duplicates or missing posts. Interviewers probe how you maintain consistent pagination under concurrent writes.
Hints to consider:
- Prefer cursor-based pagination keyed on (timestamp, post_id) over offsets, which shift when new posts land mid-scroll.
- Return an opaque cursor with each page so the client resumes from a stable position.
- Deduplicate at merge time, since a post can reach the feed via both the precomputed set and the celebrity path.
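A minimal sketch of cursor-based pagination over a newest-first (timestamp, post_id) feed, assuming an opaque base64-encoded cursor format; note how a post inserted while the user scrolls does not shift or duplicate the next page:

```python
import base64
import json

def encode_cursor(ts, post_id):
    """Opaque continuation token: clients echo it back, never parse it."""
    return base64.urlsafe_b64encode(json.dumps([ts, post_id]).encode()).decode()

def decode_cursor(cursor):
    ts, post_id = json.loads(base64.urlsafe_b64decode(cursor))
    return ts, post_id

def page(feed, cursor=None, limit=20):
    """feed holds (ts, post_id) pairs; return items strictly older than the cursor."""
    items = sorted(feed, key=lambda e: (e[0], e[1]), reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        items = [e for e in items if (e[0], e[1]) < boundary]
    out = items[:limit]
    next_cursor = encode_cursor(*out[-1]) if out else None
    return [post_id for _, post_id in out], next_cursor
```

Because the cursor is a position in the sort order rather than an offset, concurrent writes above the boundary cannot cause skips or duplicates further down.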
Serving media at scale requires a content delivery strategy that minimizes latency and bandwidth costs. Interviewers look for CDN integration and multi-resolution support.
Hints to consider:
- Serve media through a CDN in front of object storage so repeated reads never hit origin.
- Pre-generate multiple resolutions (thumbnail, feed-size, full) asynchronously at upload time.
- Use storage lifecycle policies to move cold originals to cheaper tiers.
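A small sketch of deterministic multi-resolution object keys and CDN URLs; the variant ladder, host, and key layout are illustrative assumptions, not a prescribed scheme:

```python
# Assumed variant ladder; real widths would come from client device profiling.
VARIANTS = {"thumb": 150, "medium": 640, "full": 1080}

def variant_keys(post_id, ext="jpg"):
    """Deterministic object keys so workers and clients agree without a lookup table."""
    return {name: f"media/{post_id}/{name}_{width}.{ext}"
            for name, width in VARIANTS.items()}

def cdn_url(key, cdn_host="cdn.example.com"):
    """Clients fetch variants through the CDN edge, never from origin storage."""
    return f"https://{cdn_host}/{key}"
```

Deterministic keys mean the feed response only needs to carry the post ID; clients can construct the right variant URL for their screen size locally.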
Confirm the scope of media types (photos only or also videos), expected scale (DAU, posts per second, feed reads per second), and whether features like likes, comments, or stories are in scope. Clarify the consistency requirements for feed visibility and the maximum acceptable delay from posting to appearing in followers' feeds. Ask about the follow graph scale (average followers, maximum followers) since this drives the fan-out strategy.
Sketch the core components: API gateway, post service (handles creation and metadata), media storage (object storage + CDN), feed service (precomputed feeds in Redis), fan-out workers consuming from a message queue, user/follow graph service, and a relational database for metadata. Show two key flows: the upload flow (client → pre-signed URL → object storage → storage event → processing workers) and the feed flow (client → feed service → merge precomputed feed with celebrity posts → paginated response).
Walk through what happens when a user creates a post. The post service stores the metadata and emits an event to Kafka, partitioned by poster ID. Fan-out workers consume events, look up the poster's follower list, and for each follower push the (timestamp, post_id) pair into that follower's Redis sorted set (ZADD). For celebrity accounts above the follower threshold, skip fan-out and instead flag the post for fan-out-on-read. When a user loads their feed, the feed service reads their precomputed sorted set, merges in recent posts from followed celebrities (fetched from a celebrity post index), deduplicates, and returns a paginated response with a cursor. Discuss how you handle new follows (backfill recent posts) and unfollows (lazy cleanup).
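The new-follow backfill and lazy unfollow cleanup can be sketched as follows, again with in-memory stand-ins for the Redis structures; `RECENT_BACKFILL` and the helper names are hypothetical:

```python
from collections import defaultdict

RECENT_BACKFILL = 3  # assumed: how many of the author's recent posts to copy on follow

feeds = defaultdict(list)       # user_id -> [(ts, post_id)], stand-in for a Redis sorted set
user_posts = defaultdict(list)  # author_id -> [(ts, post_id)], newest appended last
following = defaultdict(set)    # user_id -> set of followed author_ids

def follow(follower, author):
    """On follow, backfill the author's recent posts so the feed isn't stale or empty."""
    following[follower].add(author)
    for ts, post_id in user_posts[author][-RECENT_BACKFILL:]:
        feeds[follower].append((ts, post_id))

def unfollow(follower, author):
    """Lazy cleanup: drop the edge now, filter stale entries out at read time."""
    following[follower].discard(author)

def load_feed(user, limit=20):
    """Read the precomputed feed, skipping posts from authors no longer followed."""
    authors_of = {pid: a for a, posts in user_posts.items() for _, pid in posts}
    visible = [(ts, pid) for ts, pid in feeds[user]
               if authors_of.get(pid) in following[user]]
    visible.sort(reverse=True)
    return [pid for _, pid in visible[:limit]]
```

Lazy cleanup trades a small read-time filter for avoiding an expensive scan-and-delete across the unfollower's sorted set on every unfollow.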
Cover media processing: asynchronous thumbnail generation and transcoding via worker pools with dead letter queues for failures. Discuss storage lifecycle policies (move originals to cold storage after 90 days). Address feed cache invalidation and TTL strategies. Explain monitoring for upload success rates, fan-out lag, and feed latency percentiles. Mention horizontal scaling: shard Redis by user ID, partition Kafka by poster ID, and use read replicas for the metadata database.
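The sharding and partitioning scheme can be sketched as stable hashing on the user or poster ID; the shard and partition counts below are placeholder assumptions (growing them later requires resharding or consistent hashing):

```python
import hashlib

NUM_REDIS_SHARDS = 16      # assumed; feed sorted sets are spread across these
NUM_KAFKA_PARTITIONS = 32  # assumed; fan-out events are spread across these

def redis_shard(user_id):
    """Feed reads and writes for one user always land on the same Redis shard."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_REDIS_SHARDS

def kafka_partition(poster_id):
    """Partitioning by poster ID keeps one account's post events in order."""
    digest = hashlib.md5(poster_id.encode()).hexdigest()
    return int(digest, 16) % NUM_KAFKA_PARTITIONS
```

Hashing the ID (rather than using it raw) avoids hot shards from skewed ID ranges, and per-poster partitioning gives the ordering guarantee fan-out workers rely on.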