Data Engineer · Full Journey · Multiple Types — Meta

Meta — Data Engineer ✅ Passed

Level: Senior-Level

Round: Full Journey · Type: Multiple Types · Difficulty: 7/10 · Duration: 240 min · Interviewer: Unfriendly

Topics: SQL, Python, ETL, Data Modeling, Product Sense, Behavioral, Metrics, Data Visualization, Data Structures, Algorithms

Location: San Francisco Bay Area

Interview date: 2020-02-15

Summary

Interview Rounds Overview

Round 1: HR Interview
Round 2: Phone Screen
Round 3: Onsite (ETL, Data Modeling, ETL, Behavioral)
Round 4: Data Modeling (Follow-up)

Details

I interviewed for a Data Engineer position. Here's my experience:

Round 1: HR Interview (2019/12). This round consisted of a background introduction and quick-fire questions on SQL knowledge, all of the commonly asked kind.

Round 2: Phone Screen (2020/01). The interview was split into SQL and Python, with 30 minutes for each. I had to run test cases against my answer to each question.

SQL: Focused on four tables (SALES, PRODUCT, STORE, ORDER) and various nested SQL queries. Practicing frequently asked interview questions is helpful.
Python: Mostly standard interview questions. Other programming languages were optional.

Phone Screen Experience Notes:

Completing four questions in each of the SQL and Python sections is a good benchmark.
Communication with the interviewer is crucial. I don't use SQL often, so I was a bit nervous and slow. It's important to verbalize your thought process. Even if you get stuck, the interviewer may provide hints to help you through.

Round 3: Onsite (2020/01). Although I applied for the Seattle office, the interview was arranged in Menlo Park due to staffing issues. The format was two rounds of ETL, one round of data modeling, and one behavioral round. The HR representative walked me through the interview format and question types beforehand, which was very helpful. The main areas of focus were product sense, SQL, and Python coding, with the first two weighted more heavily.

First Hour: ETL
- Given a Meta product, I was asked to list metrics to evaluate its success.
- What could explain a sudden one-day drop in monthly active users?
- How would you define a new user, a churned user, and a returned user?
- How would you design a table to calculate the ratios of these three types of users?
  - Table columns: UID, first_active_date, last_active_date, previous_active_date
  - SQL: How to count each of the three user types, and how to update this table from daily partition data.
  - Rewrite the above SQL using Python.
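The user-type table and its daily update can be sketched in Python roughly as follows. This is a minimal illustration, not the answer I gave in the interview. The 30-day churn window, the meaning of previous_active_date (the active date just before last_active_date), the extra "retained" and "inactive" labels, and all function names are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, Optional

# Assumed inactivity threshold; the interview did not fix a number.
CHURN_WINDOW = timedelta(days=30)


@dataclass
class UserActivity:
    uid: int
    first_active_date: date
    last_active_date: date
    # Assumed meaning: the active date immediately before last_active_date.
    previous_active_date: Optional[date] = None


def apply_daily_partition(
    table: Dict[int, UserActivity], active_uids: Iterable[int], day: date
) -> None:
    """Merge one day's active UIDs into the cumulative table, in place."""
    for uid in set(active_uids):
        row = table.get(uid)
        if row is None:
            # First time we see this user.
            table[uid] = UserActivity(uid, day, day)
        elif row.last_active_date < day:
            # Shift the old last-active date before overwriting it.
            row.previous_active_date = row.last_active_date
            row.last_active_date = day


def classify(row: UserActivity, day: date) -> str:
    """Label a user as of `day` under the assumed definitions."""
    if row.first_active_date == day:
        return "new"
    if row.last_active_date == day:
        prev = row.previous_active_date
        if prev is not None and day - prev > CHURN_WINDOW:
            return "returned"  # active today after a gap longer than the window
        return "retained"
    if day - row.last_active_date > CHURN_WINDOW:
        return "churned"
    return "inactive"  # quiet, but not yet past the churn window


def user_type_ratios(table: Dict[int, UserActivity], day: date) -> Dict[str, float]:
    """Share of all known users that are new, churned, or returned on `day`."""
    counts = Counter(classify(row, day) for row in table.values())
    total = len(table)
    return {
        label: (counts[label] / total if total else 0.0)
        for label in ("new", "churned", "returned")
    }
```

Calling apply_daily_partition once per daily partition, in date order, keeps the table current; user_type_ratios can then be evaluated for the latest day. The same update maps naturally onto a SQL full outer join between yesterday's table and today's partition.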
Second Hour: Data Modeling
- Design data tables for a class of Uber products. The goal was to list as many dimensions as possible and expand the table relationships.
- Write various SQL queries based on the designed tables. I was also asked the standard question of how to find drivers who only handle airport pickups.
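One common way to approach the airport-pickup question is a GROUP BY with a HAVING filter that rejects any driver with at least one non-airport trip. The sketch below runs that query through Python's built-in sqlite3 module. The trips table, its columns, and the 'airport' label are hypothetical and stand in for whatever schema comes out of the design discussion.

```python
import sqlite3
from typing import Iterable, List, Tuple

# Hypothetical schema: one row per trip, with a coarse pickup category.
CREATE_TRIPS = """
CREATE TABLE trips (
    trip_id     INTEGER PRIMARY KEY,
    driver_id   INTEGER NOT NULL,
    pickup_type TEXT    NOT NULL
)
"""

# Keep drivers whose count of non-airport pickups is zero.
AIRPORT_ONLY_QUERY = """
SELECT driver_id
FROM trips
GROUP BY driver_id
HAVING SUM(CASE WHEN pickup_type <> 'airport' THEN 1 ELSE 0 END) = 0
ORDER BY driver_id
"""


def airport_only_drivers(trips: Iterable[Tuple[int, int, str]]) -> List[int]:
    """Return IDs of drivers whose every trip started at an airport."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(CREATE_TRIPS)
        conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", trips)
        rows = conn.execute(AIRPORT_ONLY_QUERY).fetchall()
    finally:
        conn.close()
    return [driver_id for (driver_id,) in rows]


sample_trips = [
    (1, 10, "airport"),
    (2, 10, "airport"),
    (3, 11, "airport"),
    (4, 11, "downtown"),
    (5, 12, "downtown"),
]
print(airport_only_drivers(sample_trips))  # [10]
```

Drivers with no trips at all never appear in the result, because the query starts from the trips table. Whether that is the desired behavior is worth confirming with the interviewer.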
Lunch Break
Third Hour: ETL
- Product Sense: For a photo upload product, how would you design metrics to determine if the product is successful?
- Visualization: How would you display these metrics on a dashboard (which graphs to draw and how)?
- I was then tested on SQL and asked to write stream data processing in Python, but I don't remember the specifics.
Last Half Hour: Behavioral
- I had a video chat with a DE Manager. The questions were frequently asked interview questions.

One week later, the HR representative gave feedback that my product sense was good, but my data modeling and SQL skills were not strong, leading to an additional round.

Additional Round: Data Modeling (2020/02). The format was the same as before: product discussion, table design, and SQL writing.

Discussed a cloud product (similar to Dropbox): What are its features? How do you define its success? How do you find new growth opportunities?

Draw a data model, including the dimensions. Based on the tables, I answered several SQL questions.