Based on interview reports and technical design patterns frequently associated with Rippling, the "Latency Tracker" problem is a core Low-Level Design (LLD) or System Design question that focuses on efficiently calculating percentiles (P50, P90, P99) from a stream of data.
Problem Statement Overview
You are tasked with designing a Latency Tracker system that can ingest high volumes of request latency data and provide real-time (or near real-time) percentile metrics.
Core Requirements:
Data Ingestion: A method to record a new latency value (e.g., recordLatency(long ms)).
Percentile Query: A method to retrieve a specific percentile (e.g., getPercentile(double p)), where P99 means 99% of recorded latencies are less than or equal to the returned value.
Efficiency: The solution must handle a high volume of writes (ingestion) and should ideally answer queries in constant or logarithmic time.
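The requirements above can be captured in a minimal exact baseline. This is a sketch, not the official solution: the class name, the nearest-rank percentile convention, and the empty-tracker behavior are my assumptions; only the two method signatures come from the problem statement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Exact but naive: store every latency and sort at query time.
// O(1) ingest, O(n log n) query — a reasonable first pass in an interview.
class NaiveLatencyTracker {
    private final List<Long> latencies = new ArrayList<>();

    public void recordLatency(long ms) {
        latencies.add(ms);
    }

    // p in (0, 1]; nearest-rank method: returns the smallest recorded value v
    // such that at least p * n of the latencies are <= v.
    public long getPercentile(double p) {
        if (latencies.isEmpty()) throw new IllegalStateException("no data");
        List<Long> sorted = new ArrayList<>(latencies);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size()); // 1-based rank
        return sorted.get(rank - 1);
    }
}
```

The sort-on-query design keeps ingestion cheap, which matches the write-heavy requirement, but it is the query path that the follow-ups below push you to optimize.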
Common Constraints & Follow-ups
Interviewer expectations often evolve during the 60-minute session:
Sliding Window: Track percentiles only for the last N minutes or K requests rather than for all-time data.
Memory Constraints: If the volume of data is too large to store every individual latency, how do you approximate percentiles (e.g., using T-Digest, HdrHistogram, or fixed-size buckets)?
Concurrency: The tracker must be thread-safe as multiple services may report latencies simultaneously.
Distributed Scale: How would you extend this to work across multiple servers (e.g., merging histograms from different nodes)?
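The "last K requests" sliding-window follow-up can be sketched with a bounded deque (class and method names here are illustrative, and sorting a snapshot per query is one possible trade-off, not the only one):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Window of the last K latencies: evict the oldest entry on overflow,
// sort a snapshot at query time. O(1) ingest, O(K log K) query.
class WindowedLatencyTracker {
    private final int capacity;
    private final Deque<Long> window = new ArrayDeque<>();

    public WindowedLatencyTracker(int capacity) {
        this.capacity = capacity;
    }

    public void recordLatency(long ms) {
        if (window.size() == capacity) window.pollFirst(); // drop oldest
        window.addLast(ms);
    }

    public long getPercentile(double p) {
        long[] snapshot = window.stream().mapToLong(Long::longValue).toArray();
        if (snapshot.length == 0) throw new IllegalStateException("no data");
        Arrays.sort(snapshot);
        int rank = (int) Math.ceil(p * snapshot.length); // nearest-rank
        return snapshot[rank - 1];
    }
}
```

A time-based ("last N minutes") window would store (timestamp, latency) pairs and evict by age instead of by count; the concurrency follow-up can be addressed in a first pass by synchronizing both methods.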
Technical Strategy
For Small/Exact Data: Use a sorted list or two heaps (the running-median pattern, related to the Median of Two Sorted Arrays problem often asked at Rippling) for O(1) or O(log N) retrieval.
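The two-heap idea generalizes from the median to any single fixed percentile. A sketch under my own assumptions (one tracker per percentile; names are illustrative): a max-heap `low` holds the smallest ceil(p * n) values, so its top is the answer.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Two heaps for one fixed percentile p: O(log n) ingest, O(1) query.
// Invariant: every element of `low` <= every element of `high`,
// and low.size() == ceil(p * n), so low.peek() is the percentile.
class TwoHeapPercentile {
    private final double p;
    private final PriorityQueue<Long> low =
        new PriorityQueue<>(Collections.reverseOrder()); // max-heap
    private final PriorityQueue<Long> high = new PriorityQueue<>(); // min-heap

    public TwoHeapPercentile(double p) {
        this.p = p;
    }

    public void recordLatency(long ms) {
        if (low.isEmpty() || ms <= low.peek()) low.offer(ms);
        else high.offer(ms);
        // Rebalance so low holds exactly ceil(p * n) elements.
        int target = (int) Math.ceil(p * (low.size() + high.size()));
        while (low.size() > target) high.offer(low.poll());
        while (low.size() < target) low.offer(high.poll());
    }

    public long getPercentile() {
        return low.peek();
    }
}
```

The cost of this approach is that each tracked percentile (P50, P90, P99) needs its own pair of heaps, which is one reason interviews then steer toward histograms.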
For Large-Scale Streams: Use Bucketization (counting occurrences in predefined latency ranges like 0–10 ms, 10–20 ms, etc.) to allow fast approximate calculations without storing every data point.
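The bucketization strategy can be sketched as below. The 10 ms bucket width, the overflow clamping, and the convention of answering with a bucket's upper edge are all illustrative choices, not part of the reported problem; real systems often use exponentially sized buckets or a library like HdrHistogram instead.

```java
// Fixed-width histogram: memory is O(#buckets) regardless of volume,
// and answers are approximate to within one bucket width.
class BucketedLatencyTracker {
    private static final int BUCKET_WIDTH_MS = 10;
    private final long[] counts; // counts[i] covers [i*10, (i+1)*10) ms
    private long total = 0;

    public BucketedLatencyTracker(int maxLatencyMs) {
        counts = new long[maxLatencyMs / BUCKET_WIDTH_MS + 1];
    }

    public void recordLatency(long ms) {
        // Clamp outliers into the last bucket.
        int idx = (int) Math.min(ms / BUCKET_WIDTH_MS, counts.length - 1);
        counts[idx]++;
        total++;
    }

    // Cumulative scan to the bucket containing the p-th value; returns
    // that bucket's upper bound, so the error is at most one bucket width.
    public long getPercentile(double p) {
        if (total == 0) throw new IllegalStateException("no data");
        long rank = (long) Math.ceil(p * total);
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= rank) return (long) (i + 1) * BUCKET_WIDTH_MS;
        }
        return (long) counts.length * BUCKET_WIDTH_MS;
    }
}
```

This representation also answers the distributed follow-up cheaply: two nodes with identically configured buckets merge by element-wise addition of their `counts` arrays.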