Level: Senior-Level
Round: Phone Screen · Type: Coding · Difficulty: 6/10 · Duration: 60 min · Interviewer: Neutral
Topics: Machine Learning, Position Embedding, KV Cache, KNN, FFN, L2 Distance
Location: San Francisco Bay Area
Interview date: 2026-01-25
Question: Debugging a Transformer model (position embedding initialization, mask setting, missing loss.backward(), projection layer dimensions). Follow-up involved KV cache implementation.
Question: Implementing One-NN (basic KNN) and then implementing it using a basic feed-forward network (FFN) and activation layer. The key is to convert L2 distance into a linear transformation (Y = WX + b) and then use softmax activation.
The first round involved debugging a Transformer model with the following issues: a wrongly initialized position embedding, an incorrect attention mask, a missing loss.backward() call, and mismatched projection-layer dimensions.
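One of the planted bugs was the attention mask. As my own illustration (not the interview code), a minimal NumPy sketch of an additive causal mask, which sets future positions to -inf before the softmax:

```python
import numpy as np

def causal_mask(t):
    """Additive mask: 0 on/below the diagonal, -inf above,
    so position i cannot attend to positions j > i."""
    upper = np.triu(np.ones((t, t)), k=1)
    return np.where(upper == 1, -np.inf, 0.0)

scores = np.zeros((3, 3)) + causal_mask(3)
# softmax over the last axis: masked entries get zero weight
probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
```

A common variant of the bug is masking with 0/1 multiplication after the softmax instead of adding -inf before it, which leaves the attention weights unnormalized.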
The follow-up involved the KV cache: during attention computation I had to insert the new keys and values into the cache, adjust the position embedding for the current decoding step, and make sure the parameters were passed through correctly.
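A minimal single-head, unbatched sketch of the idea (my own reconstruction; the function and cache names are assumptions): each decoding step appends its key/value to the cache and attends over everything cached so far, and the cached length doubles as the position index for the embedding.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_step(q, k_new, v_new, cache):
    """One decoding step: append the new key/value, attend over all cached steps."""
    # current position index = number of tokens already cached
    pos = cache["K"].shape[0]  # would select the position embedding row
    cache["K"] = np.concatenate([cache["K"], k_new[None]], axis=0)  # (t+1, d)
    cache["V"] = np.concatenate([cache["V"], v_new[None]], axis=0)  # (t+1, d)
    scores = cache["K"] @ q / np.sqrt(q.shape[-1])                  # (t+1,)
    return softmax(scores) @ cache["V"]                             # (d,)

d = 4
cache = {"K": np.zeros((0, d)), "V": np.zeros((0, d))}
rng = np.random.default_rng(0)
for _ in range(3):
    q, k, v = rng.normal(size=(3, d))
    out = attend_step(q, k, v, cache)
```

The design point the interviewer was probing: with a cache, each step only computes the new query's attention over stored keys/values instead of recomputing the whole sequence.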
For the ML puzzle, I had to express the nearest-neighbor search as a linear layer: since ||q - x_i||^2 = ||q||^2 - 2 x_i·q + ||x_i||^2 and ||q||^2 is the same for every i, the argmin over distances equals the argmax over the scores 2 x_i·q - ||x_i||^2, which is exactly Y = WX + b:

```python
W1 = 2.0 * X.T               # columns are 2*x_i
b1 = -np.sum(X * X, axis=1)  # -||x_i||^2
```
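Putting it together, a self-contained sketch of 1-NN as a linear layer plus softmax (a low softmax temperature approximates the hard argmax; the function names are my own):

```python
import numpy as np

def one_nn_as_ffn(X):
    """Build weights so that argmax(q @ W + b) equals the 1-NN index of q.
    Works because argmin_i ||q - x_i||^2 == argmax_i (2 x_i.q - ||x_i||^2):
    the ||q||^2 term is constant across i and drops out."""
    W = 2.0 * X.T                 # (d, n): columns are 2*x_i
    b = -np.sum(X * X, axis=1)    # (n,): -||x_i||^2
    return W, b

def predict(q, W, b, temp=1e-3):
    """Scores -> low-temperature softmax, approximating a one-hot argmax."""
    z = (q @ W + b) / temp
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
W, b = one_nn_as_ffn(X)
q = np.array([0.9, 1.2])
probs = predict(q, W, b)
nearest = int(np.argmax(probs))  # index of the nearest stored point
```

The result agrees with brute-force 1-NN: here q = (0.9, 1.2) is closest to (1, 1), index 1.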