Based on interview reports and technical design patterns frequently associated with Rippling, the "Latency Tracker" problem is a core Low-Level Design (LLD) or System Design question that focuses on efficiently calculating percentiles (P50, P90, P99) from a stream of data.
Problem Statement Overview
You are tasked with designing a Latency Tracker system that can ingest high volumes of request latency data and provide real-time (or near real-time) percentile metrics.
Core Requirements:
Data Ingestion: A method to record a new latency value (e.g., recordLatency(long ms)).
Percentile Query: A method to retrieve a specific percentile (e.g., getPercentile(double p)), where P99 means 99% of recorded latencies are less than or equal to the returned value.
Efficiency: The solution must handle a high volume of writes (ingestion) and should ideally answer queries in constant or logarithmic time.
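The requirements above can be captured in a minimal exact baseline. This is a sketch, not the official solution: the class name, the nearest-rank percentile convention, and the empty-tracker behavior are my assumptions; only the two method signatures come from the problem statement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Exact but naive: store every latency and sort at query time.
// O(1) ingest, O(n log n) query — a reasonable first pass in an interview.
class NaiveLatencyTracker {
    private final List<Long> latencies = new ArrayList<>();

    public void recordLatency(long ms) {
        latencies.add(ms);
    }

    // p in (0, 1]; nearest-rank method: returns the smallest recorded value v
    // such that at least p * n of the latencies are <= v.
    public long getPercentile(double p) {
        if (latencies.isEmpty()) throw new IllegalStateException("no data");
        List<Long> sorted = new ArrayList<>(latencies);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size()); // 1-based rank
        return sorted.get(rank - 1);
    }
}
```

The sort-on-query design keeps ingestion cheap, which matches the write-heavy requirement, but it is the query path that the follow-ups below push you to optimize.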
Common Constraints & Follow-ups
Interviewer expectations often evolve during the 60-minute session:
Sliding Window: Track percentiles only for the last N minutes or K requests rather than for all-time data.
Memory Constraints: If the volume of data is too large to store every individual latency, how do you approximate percentiles (e.g., using T-Digest, HdrHistogram, or fixed-size buckets)?
Concurrency: The tracker must be thread-safe as multiple services may report latencies simultaneously.
Distributed Scale: How would you extend this to work across multiple servers (e.g., merging histograms from different nodes)?
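The "last K requests" sliding-window follow-up can be sketched with a bounded deque (class and method names here are illustrative, and sorting a snapshot per query is one possible trade-off, not the only one):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Window of the last K latencies: evict the oldest entry on overflow,
// sort a snapshot at query time. O(1) ingest, O(K log K) query.
class WindowedLatencyTracker {
    private final int capacity;
    private final Deque<Long> window = new ArrayDeque<>();

    public WindowedLatencyTracker(int capacity) {
        this.capacity = capacity;
    }

    public void recordLatency(long ms) {
        if (window.size() == capacity) window.pollFirst(); // drop oldest
        window.addLast(ms);
    }

    public long getPercentile(double p) {
        long[] snapshot = window.stream().mapToLong(Long::longValue).toArray();
        if (snapshot.length == 0) throw new IllegalStateException("no data");
        Arrays.sort(snapshot);
        int rank = (int) Math.ceil(p * snapshot.length); // nearest-rank
        return snapshot[rank - 1];
    }
}
```

A time-based ("last N minutes") window would store (timestamp, latency) pairs and evict by age instead of by count; the concurrency follow-up can be addressed in a first pass by synchronizing both methods.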
Technical Strategy
For Small/Exact Data: Use a sorted list or two heaps (the running-median pattern, related to the Median of Two Sorted Arrays problem often asked at Rippling) for O(1) or O(log N) retrieval.
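The two-heap idea generalizes from the median to any single fixed percentile. A sketch under my own assumptions (one tracker per percentile; names are illustrative): a max-heap `low` holds the smallest ceil(p * n) values, so its top is the answer.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Two heaps for one fixed percentile p: O(log n) ingest, O(1) query.
// Invariant: every element of `low` <= every element of `high`,
// and low.size() == ceil(p * n), so low.peek() is the percentile.
class TwoHeapPercentile {
    private final double p;
    private final PriorityQueue<Long> low =
        new PriorityQueue<>(Collections.reverseOrder()); // max-heap
    private final PriorityQueue<Long> high = new PriorityQueue<>(); // min-heap

    public TwoHeapPercentile(double p) {
        this.p = p;
    }

    public void recordLatency(long ms) {
        if (low.isEmpty() || ms <= low.peek()) low.offer(ms);
        else high.offer(ms);
        // Rebalance so low holds exactly ceil(p * n) elements.
        int target = (int) Math.ceil(p * (low.size() + high.size()));
        while (low.size() > target) high.offer(low.poll());
        while (low.size() < target) low.offer(high.poll());
    }

    public long getPercentile() {
        return low.peek();
    }
}
```

The cost of this approach is that each tracked percentile (P50, P90, P99) needs its own pair of heaps, which is one reason interviews then steer toward histograms.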
For Large-Scale Streams: Use Bucketization (counting occurrences in predefined latency ranges like 0–10 ms, 10–20 ms, etc.) to allow fast approximate calculations without storing every data point.
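The bucketization strategy can be sketched as below. The 10 ms bucket width, the overflow clamping, and the convention of answering with a bucket's upper edge are all illustrative choices, not part of the reported problem; real systems often use exponentially sized buckets or a library like HdrHistogram instead.

```java
// Fixed-width histogram: memory is O(#buckets) regardless of volume,
// and answers are approximate to within one bucket width.
class BucketedLatencyTracker {
    private static final int BUCKET_WIDTH_MS = 10;
    private final long[] counts; // counts[i] covers [i*10, (i+1)*10) ms
    private long total = 0;

    public BucketedLatencyTracker(int maxLatencyMs) {
        counts = new long[maxLatencyMs / BUCKET_WIDTH_MS + 1];
    }

    public void recordLatency(long ms) {
        // Clamp outliers into the last bucket.
        int idx = (int) Math.min(ms / BUCKET_WIDTH_MS, counts.length - 1);
        counts[idx]++;
        total++;
    }

    // Cumulative scan to the bucket containing the p-th value; returns
    // that bucket's upper bound, so the error is at most one bucket width.
    public long getPercentile(double p) {
        if (total == 0) throw new IllegalStateException("no data");
        long rank = (long) Math.ceil(p * total);
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= rank) return (long) (i + 1) * BUCKET_WIDTH_MS;
        }
        return (long) counts.length * BUCKET_WIDTH_MS;
    }
}
```

This representation also answers the distributed follow-up cheaply: two nodes with identically configured buckets merge by element-wise addition of their `counts` arrays.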