Category: System Design · Difficulty: unknown · Frequency: high · First seen: 2026-02-18
The Crypto Exchange Order Flow problem is a common system design and data modeling interview question at Snowflake. It typically tests your ability to design a scalable, real-time data pipeline that handles high-velocity transactional data while ensuring correctness and low-latency analytics.
Problem Statement Overview
You are tasked with designing a data warehouse solution for a high-frequency cryptocurrency exchange. The system must ingest, store, and analyze the Order Flow, which consists of every action taken on the exchange's order book (e.g., new orders, cancellations, and trades).
Key Requirements & Constraints
High Velocity: The system must handle millions of events per second during peak trading periods.
Data Types: You must manage structured data (trade amounts, prices) and semi-structured data (JSON payloads containing order metadata).
Latency: Data "freshness" is critical; analytics should ideally reflect the state of the exchange within seconds to a few minutes of an event occurring.
Transactional Integrity: The design must ensure events are neither lost nor duplicated (effectively exactly-once processing), maintaining ACID guarantees for financial accuracy.
Core Design Components
To address this problem, candidates typically use the following Snowflake-specific features:
Use Snowpipe for automated, micro-batch loading from external cloud storage (like AWS S3) as soon as order events are generated.
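The ingestion step above can be sketched in Snowflake SQL. All object names (raw_order_events, order_events_stage, order_events_pipe) and the S3 path are hypothetical; a real deployment also needs a notification integration or SQS event configuration so S3 can signal Snowpipe.

```sql
-- Landing table: structured columns plus a VARIANT column for the JSON payload.
CREATE TABLE raw_order_events (
    event_id   STRING,
    event_type STRING,          -- e.g. NEW, CANCEL, TRADE
    event_ts   TIMESTAMP_NTZ,
    payload    VARIANT          -- semi-structured order metadata
);

-- External stage pointing at the bucket the matching engine writes to (hypothetical path).
CREATE STAGE order_events_stage
    URL = 's3://example-bucket/order-events/'
    FILE_FORMAT = (TYPE = JSON);

-- Snowpipe loads each file as soon as cloud storage notifies Snowflake it has arrived.
CREATE PIPE order_events_pipe
    AUTO_INGEST = TRUE
AS
    COPY INTO raw_order_events
    FROM (
        SELECT $1:event_id::STRING,
               $1:event_type::STRING,
               $1:event_ts::TIMESTAMP_NTZ,
               $1:payload
        FROM @order_events_stage
    );
```

Because the COPY uses a transformation SELECT, structured fields are typed at load time while the raw payload is preserved for later schema-on-read queries.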
Implement Snowflake Streams to track new records in the raw landing table.
Use Tasks to trigger transformation jobs (e.g., calculating "Market Depth" or "Volume Weighted Average Price") only when new data is present.
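A minimal Stream-plus-Task sketch of this pattern follows. It assumes a hypothetical raw landing table raw_order_events(event_type, event_ts, payload VARIANT) and a warehouse named transform_wh; the per-minute VWAP is sum(price × qty) / sum(qty) over trade events.

```sql
-- Stream tracks rows added to the landing table since the last consumption.
CREATE STREAM order_events_stream ON TABLE raw_order_events;

-- Target table for per-minute VWAP per currency pair.
CREATE TABLE vwap_1m (
    pair      STRING,
    minute_ts TIMESTAMP_NTZ,
    vwap      NUMBER(38, 10)
);

-- Task wakes every minute but only runs when the stream actually has new rows.
CREATE TASK compute_vwap
    WAREHOUSE = transform_wh
    SCHEDULE  = '1 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('order_events_stream')
AS
    INSERT INTO vwap_1m
    SELECT payload:pair::STRING                            AS pair,
           DATE_TRUNC('minute', event_ts)                  AS minute_ts,
           SUM(payload:price::NUMBER * payload:qty::NUMBER)
             / NULLIF(SUM(payload:qty::NUMBER), 0)         AS vwap
    FROM order_events_stream
    WHERE event_type = 'TRADE'
    GROUP BY 1, 2;

ALTER TASK compute_vwap RESUME;  -- tasks are created suspended
```

Consuming the stream inside the INSERT advances its offset, so each batch of events is processed exactly once.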
Design a Snowflake Schema (normalized dimension tables) to reduce redundancy for complex order types and user metadata.
Separate Fact tables for trades and Dimension tables for currency pairs and users.
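One possible shape for that schema, with hypothetical table and column names:

```sql
-- Normalized dimension tables reduce redundancy for pairs and users.
CREATE TABLE dim_currency_pair (
    pair_id        INT IDENTITY,
    base_currency  STRING,
    quote_currency STRING
);

CREATE TABLE dim_user (
    user_id    INT IDENTITY,
    account_id STRING,
    country    STRING
);

-- Fact table holds one row per executed trade, keyed to the dimensions.
CREATE TABLE fact_trades (
    trade_id  STRING,
    pair_id   INT,             -- references dim_currency_pair
    buyer_id  INT,             -- references dim_user
    seller_id INT,             -- references dim_user
    price     NUMBER(38, 10),
    quantity  NUMBER(38, 10),
    trade_ts  TIMESTAMP_NTZ
);
```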
Utilize Clustering Keys on timestamps or currency pairs to optimize query performance for time-series analysis.
Consider Materialized Views for frequently accessed aggregations like daily high/low prices.
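Both performance features can be sketched against a hypothetical fact table fact_trades(pair_id, price, trade_ts, ...); note that materialized views require Snowflake's Enterprise edition.

```sql
-- Cluster the fact table so pruning works for time-range scans per pair.
ALTER TABLE fact_trades CLUSTER BY (pair_id, TO_DATE(trade_ts));

-- Materialized view maintains daily high/low incrementally as trades land.
CREATE MATERIALIZED VIEW daily_price_range AS
SELECT pair_id,
       TO_DATE(trade_ts) AS trade_date,
       MIN(price)        AS daily_low,
       MAX(price)        AS daily_high
FROM fact_trades
GROUP BY pair_id, TO_DATE(trade_ts);
```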
Common Follow-up Questions
Handling Data Spikes: How do you use Multi-cluster Warehouses to scale compute power horizontally when trading volume surges?
Historical Analysis: How would you use Time Travel to audit a specific trade or recover from a botched data transformation?
Security: How would you implement Row-Level Security to ensure traders only see their own private order history while still allowing exchange-wide analytics?
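Each follow-up maps to a concrete Snowflake feature; the sketch below uses hypothetical names (analytics_wh, fact_trades, a trader_account column, an EXCHANGE_ANALYST role).

```sql
-- Data spikes: a multi-cluster warehouse adds clusters automatically under load.
CREATE WAREHOUSE analytics_wh
    WAREHOUSE_SIZE    = 'LARGE'
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 6
    SCALING_POLICY    = 'STANDARD';

-- Historical analysis: Time Travel queries the table as it was one hour ago,
-- e.g. to audit a trade or diff state before and after a bad transformation.
SELECT * FROM fact_trades AT (OFFSET => -3600);

-- Security: a row access policy lets traders see only their own rows while
-- an analyst role retains exchange-wide visibility.
CREATE ROW ACCESS POLICY trader_rows AS (trader_account STRING)
RETURNS BOOLEAN ->
    CURRENT_ROLE() = 'EXCHANGE_ANALYST'
    OR trader_account = CURRENT_USER();

ALTER TABLE fact_trades ADD ROW ACCESS POLICY trader_rows ON (trader_account);
```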