Data Engineer · Full Journey · Multiple Types — Meta

Meta — Data Engineer ❌ Failed

Level: Senior-Level

Round: Full Journey · Type: Multiple Types · Difficulty: 7/10 · Duration: 225 min · Interviewer: Friendly

Topics: SQL, Python, ETL, Data Warehousing, Scalability, System Design, Database Design, Leadership, Influence Without Authority, Conflict Resolution

Location: Remote

Interview date: 2022-11-01

Got offer: False

Summary

1. Leadership & Ownership (Behavioral)

The judging rubric focuses on three signals: Ownership, Scale, and Influence.

General Philosophy

What do you think makes a good data engineer?
How would people characterize your leadership?

Situational (STARR Method)

Tell me about a time when you had to convince someone of your point of view.
Tell me about a time when you brought about change in your project.
Tell me about a time you disagreed with a coworker/manager.
What accomplishment are you most proud of?

2. Technical Screen: Coding & ETL

SQL Challenges

The "Social User" Problem (Hard):
- Task: Given a FB-like schema, calculate the top 10 institutions that produce "social users."
- Logic: Define "social users" (e.g., users with > average friends).
- Skills: AVG(), HAVING, GROUP BY, subqueries, joins.
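
A minimal sketch of one way to approach this, run through Python's built-in sqlite3 module so it is self-contained. The real schema was not given, so the tables users(user_id, institution) and friendships(user_id, friend_id), with one row per direction of each friendship, are assumptions. So is the definition of a "social user" as someone with more friends than the average user.

```python
import sqlite3

# Hypothetical schema and sample data: the interview's actual tables are not known.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, institution TEXT);
CREATE TABLE friendships (user_id INTEGER, friend_id INTEGER);
INSERT INTO users VALUES
    (1, 'MIT'), (2, 'MIT'), (3, 'Stanford'), (4, 'Stanford'), (5, 'CMU');
-- one row per direction of each friendship
INSERT INTO friendships VALUES
    (1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (4, 1), (3, 4), (4, 3);
""")

TOP_SOCIAL_SQL = """
WITH friend_counts AS (
    -- LEFT JOIN keeps users with zero friends so they count toward the average
    SELECT u.user_id, u.institution, COUNT(f.friend_id) AS n_friends
    FROM users u
    LEFT JOIN friendships f ON f.user_id = u.user_id
    GROUP BY u.user_id, u.institution
)
SELECT institution, COUNT(*) AS social_users
FROM friend_counts
WHERE n_friends > (SELECT AVG(n_friends) FROM friend_counts)
GROUP BY institution
ORDER BY social_users DESC, institution
LIMIT 10
"""

top_institutions = conn.execute(TOP_SOCIAL_SQL).fetchall()
print(top_institutions)
```

In the sample data the average is 1.6 friends, so users 1, 3, and 4 qualify. The same filter could also be written as a HAVING clause on the per-user aggregate instead of a CTE plus WHERE.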
DAU Segmentation:
- Schema: userId, first_login, last_seen, previous_last_seen, todays_date.
- Task: Count each segment (Active, Returning, New, Churned) and compute (Active + Returning + New - Churned) / Total.
- Efficiency: Use CASE WHEN to calculate all metrics in a single table scan.
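
The single-scan idea can be sketched with conditional aggregation, again using sqlite3 so it runs as-is. The table name user_activity and the segment definitions are assumptions, since the exact rules were not specified: new means first_login is today, active means seen today and yesterday, returning means seen today after a gap, and churned means not seen for 7 or more days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE user_activity (
    userId INTEGER PRIMARY KEY,
    first_login TEXT, last_seen TEXT, previous_last_seen TEXT, todays_date TEXT
);
INSERT INTO user_activity VALUES
    (1, '2022-11-01', '2022-11-01', NULL,         '2022-11-01'),  -- new
    (2, '2022-01-01', '2022-11-01', '2022-10-31', '2022-11-01'),  -- active
    (3, '2022-01-01', '2022-11-01', '2022-10-01', '2022-11-01'),  -- returning
    (4, '2022-01-01', '2022-10-01', '2022-09-30', '2022-11-01'),  -- churned
    (5, '2022-01-01', '2022-10-30', '2022-10-29', '2022-11-01');  -- idle, not yet churned
""")

# Every metric is a CASE WHEN inside an aggregate, so the table is read once.
SEGMENT_SQL = """
SELECT
    SUM(CASE WHEN first_login = todays_date THEN 1 ELSE 0 END) AS new_users,
    SUM(CASE WHEN first_login < todays_date AND last_seen = todays_date
              AND previous_last_seen = date(todays_date, '-1 day')
             THEN 1 ELSE 0 END) AS active_users,
    SUM(CASE WHEN first_login < todays_date AND last_seen = todays_date
              AND previous_last_seen < date(todays_date, '-1 day')
             THEN 1 ELSE 0 END) AS returning_users,
    SUM(CASE WHEN last_seen <= date(todays_date, '-7 days')
             THEN 1 ELSE 0 END) AS churned_users,
    COUNT(*) AS total_users
FROM user_activity
"""

segments = dict(conn.execute(SEGMENT_SQL).fetchone())
net_ratio = (
    segments["active_users"] + segments["returning_users"]
    + segments["new_users"] - segments["churned_users"]
) / segments["total_users"]
print(segments, net_ratio)
```

Dates are stored as ISO-8601 strings here, which is why plain string comparison orders them correctly.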
Daily Load:
- Task: Update a table for the next day given transactional login events (userId, login_timestamp).
- Skills: COALESCE, full outer joins, date calculations.
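
A rough sketch of the daily load, assuming the target table is the per-user state from the DAU problem (called user_state here) and the events land in a table called logins; both names are hypothetical. Older SQLite builds lack FULL OUTER JOIN, so this sketch emulates it with a LEFT JOIN plus a UNION ALL anti-join. In a warehouse engine the two branches would collapse into a single full outer join with COALESCE on each column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_state (
    userId INTEGER PRIMARY KEY,
    first_login TEXT, last_seen TEXT, previous_last_seen TEXT
);
CREATE TABLE logins (userId INTEGER, login_timestamp TEXT);
INSERT INTO user_state VALUES
    (1, '2022-10-01', '2022-11-01', '2022-10-31'),
    (2, '2022-10-05', '2022-10-20', '2022-10-10');
INSERT INTO logins VALUES
    (1, '2022-11-02 08:15:00'),
    (1, '2022-11-02 21:00:00'),  -- duplicate login on the same day
    (3, '2022-11-02 09:30:00'),  -- user not yet in user_state
    (2, '2022-11-01 10:00:00');  -- older event, outside the load date
""")

NEXT_STATE_SQL = """
WITH today AS (
    -- collapse the event stream to one row per user for the load date
    SELECT userId, MIN(date(login_timestamp)) AS login_date
    FROM logins
    WHERE date(login_timestamp) = :today
    GROUP BY userId
)
-- existing users, whether or not they logged in today
SELECT s.userId,
       s.first_login,
       COALESCE(t.login_date, s.last_seen) AS last_seen,
       CASE WHEN t.userId IS NOT NULL THEN s.last_seen
            ELSE s.previous_last_seen END AS previous_last_seen
FROM user_state s
LEFT JOIN today t ON t.userId = s.userId
UNION ALL
-- users seen for the first time today
SELECT t.userId, t.login_date, t.login_date, NULL
FROM today t
LEFT JOIN user_state s ON s.userId = t.userId
WHERE s.userId IS NULL
"""

rows = conn.execute(NEXT_STATE_SQL, {"today": "2022-11-02"}).fetchall()
next_state = sorted(rows, key=lambda r: r[0])
for row in next_state:
    print(row)
```

The result is the full next-day snapshot, so it can be written to a new partition rather than updating rows in place.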

Python Challenges

Friendship Graph (Easy):
- Task: Count friends for each user given a list of edges.
- Input: [[A,B],[C,D],[B,D],[E]]
- Output: {A: 1, B: 2, C: 1, D: 2, E: 0}
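
One possible solution, as a sketch. It assumes a single-element entry such as [E] means a user with no friends, and that a repeated edge should count only once; neither rule was stated in the prompt.

```python
from collections import defaultdict


def count_friends(edges):
    """Return {user: number of distinct friends} for an undirected edge list."""
    friends = defaultdict(set)
    for edge in edges:
        if len(edge) == 1:
            # Isolated user: touching the key registers them with an empty set.
            friends[edge[0]]
        elif len(edge) == 2:
            a, b = edge
            friends[a].add(b)
            friends[b].add(a)
    return {user: len(others) for user, others in friends.items()}


counts = count_friends([["A", "B"], ["C", "D"], ["B", "D"], ["E"]])
print(counts)
```

Using sets instead of plain counters is what makes duplicate edges harmless, at the cost of a little extra memory.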

3. Product Sense & Data Modeling

Case Study: Ride-Sharing (Uber/Lyft)

Product Sense: How would you deploy in a new region? What metrics matter for existing users?
Modeling: Design a schema for a ride-sharing app.
Analytics:
- Calculate average wait time.
- Find users who use the app for airport services only.
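
Both analytics questions can be sketched against a single hypothetical rides table. The columns requested_at, pickup_at, and is_airport (1 when the trip starts or ends at an airport) are assumptions for illustration, not the schema from the interview. Wait time is taken to mean the gap between request and pickup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides (
    ride_id INTEGER PRIMARY KEY,
    rider_id TEXT,
    requested_at TEXT,
    pickup_at TEXT,
    is_airport INTEGER
);
INSERT INTO rides VALUES
    (1, 'u1', '2022-11-01 08:00:00', '2022-11-01 08:05:00', 1),
    (2, 'u1', '2022-11-02 18:00:00', '2022-11-02 18:10:00', 1),
    (3, 'u2', '2022-11-01 09:00:00', '2022-11-01 09:02:00', 0),
    (4, 'u2', '2022-11-03 07:00:00', '2022-11-03 07:04:00', 1);
""")

# Average wait: pickup time minus request time, in seconds.
AVG_WAIT_SQL = """
SELECT AVG(CAST(strftime('%s', pickup_at) AS INTEGER)
         - CAST(strftime('%s', requested_at) AS INTEGER)) AS avg_wait_seconds
FROM rides
WHERE pickup_at IS NOT NULL
"""

# Airport-only riders: every one of their rides has is_airport = 1,
# which is the same as saying the minimum of the flag is 1.
AIRPORT_ONLY_SQL = """
SELECT rider_id
FROM rides
GROUP BY rider_id
HAVING MIN(is_airport) = 1
ORDER BY rider_id
"""

avg_wait_seconds = conn.execute(AVG_WAIT_SQL).fetchone()[0]
airport_only_riders = [row[0] for row in conn.execute(AIRPORT_ONLY_SQL)]
print(avg_wait_seconds, airport_only_riders)
```

The HAVING form avoids a self-join or NOT EXISTS subquery, which matters when the rides table is large.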

Case Study: Professional Network (LinkedIn)

Product Sense: What metrics are important?
Modeling: Design a data model for a specific feature (e.g., Marketplace, Feed).
Scale: How do you handle scale, partitioning, and denormalization?

Case Study: File Sharing (Dropbox)

Product Sense: Track file uploads, storage growth, and DAU.
ETL Anomaly: If a graph is "spiky" or Metric A increases by 10% but Metric B only by 5%, how do you diagnose the root cause?

4. System Design & Architecture

Scenario: Monthly reports for an Advertising Department.
Components:
- Batch vs. Streaming data processing.
- Source vs. Target data modeling.
- Handling FB-scale (Petabytes).
Data Quality: How to handle the four pillars:
1. Accuracy
2. Consistency
3. Validity
4. Completeness
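
As a rough illustration of how the four pillars might turn into automated checks on a loaded batch, here is a small pure-Python sketch. The row shape (ad_id, month, spend), the function name quality_report, and the specific rule chosen for each pillar are all assumptions; a real pipeline would define these per dataset.

```python
import re

MONTH_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])$")
REQUIRED = ("ad_id", "month", "spend")


def quality_report(source_rows, target_rows, reference_spend, tolerance=0.01):
    """Return one score or flag per pillar for a loaded batch (illustrative rules)."""
    n = len(target_rows)
    # Completeness: all required fields are present and non-null.
    complete = [r for r in target_rows
                if all(r.get(k) is not None for k in REQUIRED)]
    # Validity: values fall inside the allowed domain.
    valid = [r for r in complete
             if MONTH_RE.match(r["month"]) and r["spend"] >= 0]
    target_spend = sum(r["spend"] for r in valid)
    return {
        "completeness": len(complete) / n if n else 0.0,
        "validity": len(valid) / n if n else 0.0,
        # Consistency: the load neither dropped nor duplicated rows.
        "consistency": len(source_rows) == n,
        # Accuracy: the total agrees with a trusted reference within tolerance.
        "accuracy": abs(target_spend - reference_spend)
                    <= tolerance * abs(reference_spend),
    }


target = [
    {"ad_id": 1, "month": "2022-10", "spend": 60.0},
    {"ad_id": 2, "month": "2022-10", "spend": 40.0},
    {"ad_id": 3, "month": "2022-13", "spend": 5.0},   # invalid month
    {"ad_id": 4, "month": "2022-10", "spend": None},  # missing spend
]
source = list(target)  # stand-in for the upstream extract
report = quality_report(source, target, reference_spend=100.0)
print(report)
```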

Details

5. Summary of Key Resources

Books: The Data Warehouse Toolkit (Kimball), Cracking the PM Interview.
Practice: LeetCode (Easy/Medium), StrataScratch (Medium/Hard), DataLemur.
Tips:
- Speed: 90% completion is often a fail. You must finish.
- Environment: Coderpad (no auto-complete).
- Communication: Speak out loud. If you get a hint, use it immediately.