Level: Intern
Round: Full Journey · Type: Multiple Types · Difficulty: 4/10 · Duration: 180 min · Interviewer: Unfriendly
Topics: Machine Learning, NLP, Reinforcement Learning, Precision/Recall, ROC/AUC, Regularization, Classification Metrics, RAG System, Pseudo Code
Location: Seattle, WA, US
Interview date: 2026-01-15
Question: Initial 30-minute chat about my past projects and some details.
Question: An applied scientist questioned me on the technical details of past projects and the design of ablation studies.
Question: A senior architect asked about current NLP technologies like RL and agents, ML concepts such as precision/recall, ROC/AUC, L1/L2 regularization, classification metrics, and a simple coding question.
Question: A software engineer gave me a pseudo code question involving precision and recall for dog images from files returned by an API, along with follow-up questions about handling abnormal API responses and designing a RAG system.
I had a 30-minute chat where I discussed my past projects, and the recruiter asked a few clarifying questions.
An applied scientist asked very detailed technical questions about my previous projects, including the design of ablation studies.
A senior architect questioned me about current NLP technologies, such as RL and agents. I was also asked machine learning questions covering precision/recall, the purpose of ROC/AUC, the differences between L1 and L2 regularization and when to use each, and classification metrics. With about 10 minutes left, I received a coding question, which the interviewer said I didn't necessarily need to finish. It was relatively easy. The prompt was:
Input: a list [-1, -2, -4, -5, 7, 10]. Find the minimum difference between any two numbers and output the corresponding number pairs: [[-5, -4], [-2, -1]]
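A minimal sketch of one way to solve it (the function name and approach are mine, not from the interview): after sorting, the minimum difference must occur between adjacent elements, so a single pass over the sorted list suffices.

```python
def min_diff_pairs(nums):
    """Return all adjacent pairs achieving the smallest difference."""
    s = sorted(nums)
    # After sorting, the minimum gap is between some pair of neighbors.
    diffs = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    m = min(diffs)
    return [[s[i], s[i + 1]] for i, d in enumerate(diffs) if d == m]

print(min_diff_pairs([-1, -2, -4, -5, 7, 10]))  # → [[-5, -4], [-2, -1]]
```

Sorting makes this O(n log n) overall, versus O(n²) for comparing every pair.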
A software engineer asked me a coding question (but not an algorithm question):
Assume you have 10 files, each containing a picture of an animal. You can call an API that returns k files. Write pseudo code to calculate the precision and recall of dog images in the returned results. Follow-up questions included: what if the API returns an abnormal result (e.g., None, or not exactly k files)? How would you handle that and still calculate precision/recall? I was also asked how to design a RAG system. Since I don't specialize in RAG, my answer was only average; I drew on agentic tool-use concepts to round it out.
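A hedged sketch of the precision/recall question, including the abnormal-response follow-up. The ground-truth dog set, the file names, and the idea of scoring whatever was actually returned (rather than assuming exactly k files) are my assumptions, not the interviewer's reference answer:

```python
def precision_recall(returned_files, dog_files, k):
    """Precision/recall for dog retrieval, tolerating abnormal API output."""
    # Abnormal response: None or empty → precision is undefined, recall is 0.
    if not returned_files:
        return None, 0.0
    # If the API returns more or fewer than k files, score what actually came
    # back (deduplicated) instead of assuming exactly k items.
    returned = set(returned_files)
    true_positives = len(returned & set(dog_files))
    precision = true_positives / len(returned)
    recall = true_positives / len(dog_files) if dog_files else 0.0
    return precision, recall

# Hypothetical example: 10 files, 4 contain dogs, the API returns 5 files.
dogs = {"f1", "f3", "f5", "f8"}
returned = ["f1", "f2", "f3", "f9", "f10"]
print(precision_recall(returned, dogs, k=5))  # → (0.4, 0.5)
```

The key design choice is deciding what a malformed response means for each metric: recall can still be computed (nothing relevant was retrieved), while precision over zero returned items has no denominator, so returning None there is more honest than reporting 0.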