Design Paste Bin with Monitoring
Problem Statement
Design a paste bin service similar to Pastebin that allows users to create, share, and access text snippets with optional expiration dates and monitoring capabilities. The system should handle approximately 4 writes per second, support unlimited content size, and include monitoring features to track usage and performance.
Users create a paste, receive a short link, and share it so others can view the content. Pastes can optionally expire, and the service must support arbitrarily large content sizes and provide usage and performance monitoring. This is a read-heavy system where pastes may be viewed thousands of times but created infrequently, making retrieval latency optimization and viral traffic handling the critical concerns.
Interviewers ask this to see if you can keep a simple product simple while making sound choices for blob storage, read scaling, TTL cleanup, and observability. It tests whether you separate metadata from large payloads, design for read-heavy workloads, and instrument the system with meaningful SLOs and alerts without over-engineering for a modest write rate.
Key Requirements
Functional
- Paste creation -- users submit text content of unlimited size and receive a unique, shortened URL for sharing
- Paste retrieval -- anyone with the URL can view the content with low latency, regardless of content size
- Expiration management -- users can set optional expiration times and the system automatically purges expired content
- Access control and analytics -- users can mark pastes as public or unlisted, view basic metrics (view count, last access time), and operators can monitor system health with detailed telemetry
Non-Functional
- Scalability -- handle 4 writes/second and 1000+ reads/second with ability to absorb traffic spikes 10x normal load
- Reliability -- 99.9% uptime with redundancy for both metadata and content storage
- Latency -- P99 retrieval latency under 200ms for small pastes (under 1MB) and efficient streaming for large content
- Consistency -- strong consistency for paste creation and eventual consistency for view counts; newly created pastes must be immediately visible
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Handling Unlimited Content Size
The unlimited size requirement fundamentally shapes the architecture. Storing large blobs directly in a traditional database is a non-starter, so you need a clear separation between metadata and content storage.
Hints to consider:
- Store metadata (URL, expiration, visibility, owner) separately from the actual paste content
- Use object storage (S3, GCS) for the paste content itself and store only a reference pointer in the metadata database
- For large files, implement streaming reads and writes rather than loading entire content into memory
- Consider presigned URLs or CDN integration to serve content directly from object storage, bypassing application servers
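The metadata/content split above can be sketched in a few lines. This is a minimal illustration using in-memory dicts as stand-ins for the real stores (e.g. DynamoDB for metadata, S3 for blobs); all names here are illustrative, not a definitive implementation:

```python
import hashlib
import time

# Hypothetical in-memory stand-ins for the real stores: a metadata table
# (e.g. DynamoDB) and an object store (e.g. S3).
metadata_db = {}   # paste_id -> small metadata record
blob_store = {}    # blob_key -> raw content bytes

def create_paste(paste_id, content, expires_at=None):
    """Store content in the blob store; keep only a pointer in the metadata DB."""
    blob_key = "pastes/" + hashlib.sha256(content).hexdigest()
    blob_store[blob_key] = content          # large payload goes to object storage
    record = {
        "paste_id": paste_id,
        "blob_key": blob_key,               # pointer, not the content itself
        "size_bytes": len(content),
        "created_at": time.time(),
        "expires_at": expires_at,
    }
    metadata_db[paste_id] = record          # small row in the metadata DB
    return record

def get_paste(paste_id):
    record = metadata_db.get(paste_id)
    if record is None:
        return None
    # In production you would stream the blob or return a presigned URL / CDN
    # link instead of loading it into application memory.
    return blob_store[record["blob_key"]]
```

The key design point is that the metadata row stays tiny and indexable regardless of paste size, so the database never sees the payload.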
2. Read Optimization and Caching Strategy
With a roughly 250:1 read-to-write ratio, the system must be heavily optimized for retrieval. Interviewers probe your caching strategy at multiple layers and how you handle cache invalidation.
Hints to consider:
- Implement a multi-tier caching approach: CDN for content, application cache (Redis) for metadata, and browser caching headers
- Cache hot pastes aggressively and implement negative caching for non-existent URLs to prevent database hammering
- Design cache keys to include version information for easy invalidation when pastes are updated or deleted
- Consider read replicas for the metadata database to distribute read load across geographic regions
3. Expiration and Cleanup Pipeline
The expiration feature requires a reliable background process that cleans up both metadata and actual content. Interviewers want to see a concrete design for purging expired content without impacting live traffic.
Hints to consider:
- Use database-native TTL features (like DynamoDB TTL) to automatically expire metadata records, triggering cleanup workflows
- Implement an asynchronous cleanup worker that listens for expiration events and deletes both cache entries and object storage blobs
- Design the cleanup process to be idempotent so retries do not cause issues if the worker fails partway through
- Schedule periodic sweeper jobs as a safety net to catch any content not cleaned up by the event-driven pipeline
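An idempotent cleanup handler for a TTL-expiration event might look like the sketch below. Each delete tolerates "already gone", and metadata is removed last so a retry after a partial failure can still find the blob pointer; the stores are illustrative in-memory dicts:

```python
# Stand-ins for the real stores; a TTL event delivers the expired paste_id.
metadata_db = {"p1": {"blob_key": "pastes/k1"}}
blob_store = {"pastes/k1": b"expired content"}
cache = {"p1": {"blob_key": "pastes/k1"}}

def handle_expiration(paste_id):
    """Idempotent cleanup: safe to retry if the worker fails partway through."""
    record = metadata_db.get(paste_id)
    if record is None:
        return False                              # already cleaned up: safe no-op
    cache.pop(paste_id, None)                     # 1. evict cached metadata
    blob_store.pop(record["blob_key"], None)      # 2. delete blob; tolerate absence
    metadata_db.pop(paste_id, None)               # 3. remove metadata last, so a
    return True                                   #    mid-failure retry can finish
```

Anything this handler still misses (e.g. an event lost before delivery) is caught by the periodic sweeper job, which scans for expired metadata and orphaned blobs.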
4. Monitoring and Observability
The monitoring requirement goes beyond basic metrics. Interviewers expect concrete SLOs, clear measurement strategies, and actionable alerts rather than hand-wavy dashboards.
Hints to consider:
- Define clear SLOs: paste creation success rate (99.9%), retrieval latency (P99 under 200ms), and availability (99.9% uptime)
- Emit structured events for key operations (create, read, delete, expire) to a streaming platform like Kafka for analysis
- Aggregate view counts and analytics asynchronously to avoid impacting read path performance
- Implement distributed tracing to identify bottlenecks across the request path from load balancer through app server, cache, database, and object storage
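The asynchronous analytics path can be sketched as follows: the read path emits a structured event and returns immediately, and a separate consumer aggregates view counts and latency samples off the hot path. A `queue.Queue` stands in for Kafka here, and the event schema is an illustrative assumption:

```python
import json
import queue
from collections import Counter

event_queue = queue.Queue()   # stand-in for a Kafka topic

def emit_event(event_type, paste_id, latency_ms):
    """Fire-and-forget structured event; the read path never waits on analytics."""
    event_queue.put(json.dumps({
        "type": event_type,         # create | read | delete | expire
        "paste_id": paste_id,
        "latency_ms": latency_ms,
    }))

view_counts = Counter()   # aggregated asynchronously by the consumer
latencies = []            # samples for P99 / SLO dashboards

def drain_events():
    """Consumer loop body: aggregate whatever events have arrived so far."""
    while not event_queue.empty():
        event = json.loads(event_queue.get())
        if event["type"] == "read":
            view_counts[event["paste_id"]] += 1
        latencies.append(event["latency_ms"])
```

Because counts are aggregated downstream, view counters are eventually consistent, which matches the consistency requirement stated above, and the latency samples feed directly into the P99 SLO measurement.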
5. URL Generation and Collision Handling
Creating short, unique URLs requires a thoughtful ID generation strategy. Interviewers will ask about collision probability, URL length, and uniqueness guarantees.
Hints to consider:
- Use base62 encoding for compact URLs from numeric or hash-based IDs
- Consider a hash-based approach (hash content plus timestamp) with collision detection, or a counter-based approach with distributed ID generation
- Decide URL length based on expected total pastes (7 characters give 62^7, about 3.5 trillion possibilities)
- Implement retry logic with a different salt or increment if a collision is detected during creation
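The base62 encoding and collision-retry loop can be sketched as below, with a set standing in for the metadata database's uniqueness check; the alphabet ordering and attempt limit are illustrative choices:

```python
import secrets
import string

# 0-9, A-Z, a-z: 62 symbols, so 7 characters give 62**7 ≈ 3.5 trillion IDs.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def base62_encode(n, length=7):
    """Encode a non-negative integer as a fixed-length base62 string."""
    chars = []
    for _ in range(length):
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

existing_ids = set()   # stand-in for a uniqueness check against the metadata DB

def generate_paste_id(max_attempts=5):
    """Draw a random 7-character ID and retry on the (rare) collision."""
    for _ in range(max_attempts):
        candidate = base62_encode(secrets.randbelow(62 ** 7))
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate
    raise RuntimeError("could not generate a unique paste ID")
```

With trillions of possible IDs and single-digit writes per second, collisions stay vanishingly rare for years, so a simple bounded retry is sufficient; a counter-based scheme with distributed ID generation avoids collisions entirely at the cost of coordination.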