Design a Malicious IP Detection System
Problem Statement
Design a system that can detect and block malicious IP addresses for an enterprise, including geo-distributed blacklists, edge processing capabilities, and efficient filtering mechanisms like Bloom filters to prevent attacks at scale. The system must separate a control plane (detection, list management, distribution) from a data plane (fast, safe enforcement), while handling global updates, failure modes, and adversarial traffic.
The system must trade off accuracy against latency (tolerating some false positives and false negatives), reason about consistency and propagation across regions, and design safe rollouts and observability for security changes at scale.
Key Requirements
Functional
- Policy management -- users define, publish, and manage IP reputation policies and lists (block, allow, score) with TTLs and scopes (global, region, application)
- Edge enforcement -- real-time IP blocking at the edge with sub-millisecond checks, continuing operation during control-plane outages
- Near-real-time distribution -- edge blacklists receive updates across regions with versioning, rollback, and partial rollouts
- Audit and analysis -- searchable logs with reason codes to investigate and tune blocking policies
Non-Functional
- Scalability -- handle millions of requests per second at the edge with IP reputation checks on every inbound request
- Reliability -- edge enforcement continues during control-plane outages using locally cached blacklists
- Latency -- sub-millisecond IP checks at the edge; blacklist updates propagated globally within seconds
- Consistency -- eventual consistency for blacklist propagation; accept brief windows where new threats are not yet blocked at all edges
Interview Reports from Hello Interview
12 reports from candidates. Most recently asked at LinkedIn in early January 2026.
This question is primarily asked at LinkedIn (all 12 reports are from LinkedIn interviews).
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Edge-Local Detection and Blocking
Every inbound request must be checked against the blacklist with minimal latency. Interviewers want to see how you avoid synchronous remote lookups on the hot path.
Hints to consider:
- Use in-process Bloom filters at the edge for ultra-fast negative lookups (sub-microsecond for IPs not in the blacklist)
- Maintain an L2 exact set (Redis or local hash set) to confirm Bloom filter hits and eliminate false positives
- Download versioned blacklist snapshots periodically and apply delta updates between snapshots
- Design for graceful degradation: if the control plane is unreachable, continue blocking with the last known blacklist
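The two-tier lookup above can be sketched as follows. This is an illustrative, minimal Bloom filter plus an exact-set confirmation step, not a production implementation; the class and helper names are assumptions:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions derived from SHA-256 over a bit array."""
    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive k independent positions by salting the hash input with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not present"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def is_blocked(ip: str, bloom: BloomFilter, exact_set: set) -> bool:
    """Two-tier check: fast Bloom negative lookup, then L2 exact-set confirmation."""
    if not bloom.might_contain(ip):
        return False          # fast path: definitely not blacklisted
    return ip in exact_set    # confirm the hit to eliminate false positives
```

The key property is that the common case (a benign IP not on any list) is answered by the Bloom filter alone, and the exact set is only consulted on (rare) Bloom hits.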
2. Bloom Filter Management and False Positive Control
Bloom filters are critical for performance but introduce false positives that can block legitimate traffic. Interviewers expect a concrete mitigation plan.
Hints to consider:
- Configure target false positive rates (e.g., 0.1%) based on blacklist size and available memory
- Use versioned Bloom filter artifacts with atomic swap on update to ensure consistency during refreshes
- Implement an allowlist mechanism that overrides Bloom filter matches for known-good IPs
- Monitor false positive rates in production and auto-tune filter parameters
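As a rough sizing check, the standard Bloom filter formulas relate blacklist size and target false positive rate to the bit-array size m and hash count k. A minimal calculator (the function name is illustrative):

```python
import math

def bloom_params(n_items: int, target_fp_rate: float) -> tuple[int, int]:
    """Compute bit-array size m and hash count k for a target false positive rate.

    Standard formulas: m = -n * ln(p) / (ln 2)^2,  k = (m / n) * ln 2
    """
    m = math.ceil(-n_items * math.log(target_fp_rate) / (math.log(2) ** 2))
    k = max(1, round((m / n_items) * math.log(2)))
    return m, k

# A blacklist of 10M IPs at a 0.1% target FP rate needs roughly 18 MB of bits
# and about 10 hash functions.
m, k = bloom_params(10_000_000, 0.001)
```

This is useful in the interview to justify that the whole filter comfortably fits in edge-node memory, which is what makes the in-process sub-microsecond lookup realistic.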
3. Control Plane: Detection Pipeline and List Management
Interviewers want to see how malicious IPs are identified, scored, and promoted to the blacklist.
Hints to consider:
- Use Flink to aggregate security signals (failed auth attempts, request patterns, geographic anomalies) and compute per-IP reputation scores
- Promote IPs exceeding a threshold to the blacklist with configurable TTLs (auto-generated blocks expire; manual blocks persist)
- Support canary deployments for new blocking rules: apply to a subset of edge nodes, monitor impact, then roll out globally
- Implement instant rollback capability with signed, versioned blacklist artifacts
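The promotion step can be illustrated outside Flink with plain Python. The signal weights, threshold, and TTL below are assumptions made for the sketch, not values from the source:

```python
import time

PROMOTION_THRESHOLD = 80        # assumed score cutoff on a 0-100 scale
AUTO_BLOCK_TTL_SECONDS = 3600   # assumed TTL for auto-generated blocks

def score_ip(signals: dict) -> int:
    """Toy reputation score from aggregated signals (weights are assumptions)."""
    score = 0
    score += min(signals.get("failed_auths", 0) * 5, 50)
    score += min(signals.get("req_rate_per_sec", 0) // 10, 30)
    score += 20 if signals.get("geo_anomaly") else 0
    return min(score, 100)

def maybe_promote(ip: str, signals: dict, blacklist: dict, now=None) -> bool:
    """Promote an IP to the blacklist with an expiry; auto blocks carry a TTL."""
    if now is None:
        now = time.time()
    if score_ip(signals) >= PROMOTION_THRESHOLD:
        blacklist[ip] = {"expires_at": now + AUTO_BLOCK_TTL_SECONDS,
                         "source": "auto"}
        return True
    return False
```

In the real pipeline the aggregation would run as windowed operators in Flink and the blacklist entry would be written into a versioned artifact; a manual block would simply omit `expires_at`.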
4. Safe Rollout and Observability
Pushing untested rules globally can lock out legitimate users. Interviewers expect production safety mechanisms.
Hints to consider:
- Log every block decision with reason codes (which rule, which blacklist version) for post-incident analysis
- Implement an emergency allowlist that can be activated instantly to unblock false positives
- Monitor block rates per edge node, per region, and per rule; alert on unexpected spikes that might indicate a bad rule
- Support A/B testing of new detection algorithms by applying them in shadow mode before enforcement
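Shadow-mode evaluation with reason-coded logging can be sketched as below. The log schema, rule representation, and version constant are assumptions for illustration:

```python
import json
import time

BLACKLIST_VERSION = "v42"  # assumed identifier of the active blacklist artifact

def evaluate_request(ip: str, rules: list, shadow_rules: list, log) -> bool:
    """Apply enforcing rules, then run shadow rules for comparison only.

    `rules` and `shadow_rules` are (rule_id, predicate) pairs; `log` is any
    callable that accepts a JSON string (e.g. list.append in tests).
    """
    blocked = False
    for rule_id, predicate in rules:
        if predicate(ip):
            log(json.dumps({"ip": ip, "rule": rule_id, "mode": "enforce",
                            "action": "block", "list_version": BLACKLIST_VERSION,
                            "ts": time.time()}))
            blocked = True
            break
    for rule_id, predicate in shadow_rules:
        if predicate(ip):
            # Shadow hit: logged for analysis, but the request is NOT blocked
            # by this rule, so a bad candidate rule cannot lock out users.
            log(json.dumps({"ip": ip, "rule": rule_id, "mode": "shadow",
                            "action": "would_block",
                            "list_version": BLACKLIST_VERSION,
                            "ts": time.time()}))
    return blocked
```

Comparing `would_block` rates against enforced block rates per rule gives exactly the signal needed before promoting a shadow rule to enforcement, and the reason codes make every block traceable to a rule and a blacklist version.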