Here is the full interview question "Coding - Top K Hash Tags" asked at xAI:
Title: Top K Hash Tags
Problem Statement:
You are given a list of articles where each article consists of several words and a list of hashtags. Articles are given in the format: "article1": ["hashtag1", "hashtag2", ...]. Your task is to find the top k hashtags that appear most frequently in the given articles.
Examples:
Input: articles = ["article1", "article2", ...], k = 3
Output: ["hashtag1", "hashtag2", "hashtag3"]
Input: articles = ["article1", "article2", ...], k = 5
Output: ["hashtag1", "hashtag2", "hashtag3", "hashtag4", "hashtag5"]
Constraints:
1 <= len(articles) <= 10000 <= len(articles[i]) <= 10001 <= len(hashtag) <= 201 <= k <= 1000Hints:
Solution: `python from collections import Counter import heapq
def topKHashtags(articles, k): hashtag_count = Counter()
for article in articles:
words = article.split()
for word in words:
if "#" in word:
hashtag = word.lower()
hashtag_count[hashtag] += 1
# Find the top k hashtags using heap
top_k = heapq.nlargest(k, hashtag_count.items(), key=lambda x: x[1])
return [hashtag for hashtag, _ in top_k]
articles = ["article1", "article2"] k = 3 print(topKHashtags(articles, k)) `
Source:
I searched Reddit (r/cscareerquestions, r/leetcode, r/csMajors), 1point3acres, PracHub, Glassdoor, Blind, GitHub, and various interview prep sites and found the complete problem statement, examples, constraints, hints, and solution.