For a full example answer with detailed architecture diagrams and deep dives, see our Design a Distributed Cache guide. The cache guide covers multi-tier caching, invalidation strategies, and read-heavy scaling patterns that are central to serving translations at global scale.
Also review the Caching, Message Queues, and Databases building blocks for background on edge caching, event-driven invalidation, and durable storage for translation data.
Design an internationalization (i18n) system that enables a social media platform to efficiently support hundreds of languages for text content and assets across the entire site. Engineers reference stable message keys in code (e.g., t("auth.login.title")), and the system resolves each key into the localized string or asset for the user's locale, handling pluralization, gender, date and number formatting, and right-to-left layout hints.
The system is extremely read-heavy: every page render for every user requires resolving dozens of translation keys. Human translators (and optionally machine translation) produce and review translations through a workflow pipeline. The core challenges are delivering translations with sub-millisecond latency at global scale through aggressive caching, maintaining a safe rollout mechanism with versioning and fallbacks so deployments never break strings, and orchestrating a human-in-the-loop translation pipeline that provides context, quality review, and approval before publishing. Interviewers want to see how you separate the source of truth from caches, design safe invalidation, and handle the developer experience for translation key management.
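To make key resolution and fallbacks concrete, here is a minimal sketch of how a lookup might walk a locale fallback chain. The catalog contents, the `DEFAULT_LOCALE` value, and the `fallback_chain`, `t`, and `t_plural` helpers are all hypothetical. The plural rule shown is an English-style one/other simplification; a real system would likely use the CLDR plural categories (zero, one, two, few, many, other) per locale.

```python
from typing import Dict, List

# Hypothetical in-memory catalogs: locale -> {message key: template}.
CATALOGS: Dict[str, Dict[str, str]] = {
    "en": {
        "auth.login.title": "Log in",
        "cart.items.one": "{count} item",
        "cart.items.other": "{count} items",
    },
    "pt": {"auth.login.title": "Entrar"},
    "pt-BR": {},  # regional catalog with no overrides yet
}

DEFAULT_LOCALE = "en"  # assumed source language


def fallback_chain(locale: str) -> List[str]:
    """Most specific locale first, then parents, then the default locale."""
    parts = locale.split("-")
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if DEFAULT_LOCALE not in chain:
        chain.append(DEFAULT_LOCALE)
    return chain


def t(key: str, locale: str, **params: object) -> str:
    """Resolve a key, falling back through the chain; never return blank text."""
    for candidate in fallback_chain(locale):
        template = CATALOGS.get(candidate, {}).get(key)
        if template is not None:
            return template.format(**params)
    return key  # last resort: show the key itself rather than an empty string


def t_plural(base_key: str, count: int, locale: str) -> str:
    """Simplified plural selection (assumption: English-style one/other only)."""
    category = "one" if count == 1 else "other"
    return t(f"{base_key}.{category}", locale, count=count)
```

With these catalogs, a `pt-BR` user gets the `pt` login title, while the untranslated cart strings fall through to `en` instead of rendering blank.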
Based on real interview experiences, these are the areas interviewers probe most deeply:
i18n is a read-dominated workload where every request resolves many strings. Interviewers want to see an aggressive caching strategy that keeps latency low without serving stale translations after updates.
Changing or deleting translation keys without versioning breaks deployed clients and causes blank UI text. Interviewers probe whether you have a safe rollout and rollback mechanism.
Quality translations require context, review, and approval. Interviewers want to see a durable workflow that moves translations through extraction, optional machine translation, human review, QA, and publish stages.
The system must be easy for engineers to use without creating orphaned or conflicting keys. Interviewers assess how you integrate with the development workflow.
Hints to consider:
Use namespaced keys (e.g., checkout.summary.total) to organize translations logically and allow per-namespace cache bundles.
Start by confirming scope and priorities. Ask how many locales the platform supports and whether every locale requires complete coverage or partial coverage with fallbacks is acceptable. Clarify the expected read scale (requests per second) and whether translations change frequently (daily) or infrequently (per release). Verify whether machine translation is in scope or all translations are human-authored. Establish latency targets for translation resolution and the acceptable cache staleness after a publish event.
Sketch the core components: a Translation Management Service (TMS) that stores the source of truth for keys and translations in DynamoDB or PostgreSQL, a Workflow Engine that orchestrates the translation pipeline stages, a Bundle Publisher that compiles locale-namespace bundles and pushes them to a CDN, a Redis Cache Layer for application-level resolution, and a Client SDK that fetches and caches bundles locally. Show two data flows: the write path (engineer adds key, translator provides translation, reviewer approves, publisher compiles and pushes bundle) and the read path (client requests versioned bundle URL from CDN, falls back to Redis, falls back to origin database).
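The layered read path can be sketched as a read-through lookup over ordered tiers. This is an illustration under assumptions, not a prescribed implementation: the `Tier` class and `read_bundle` function are hypothetical, plain dictionaries stand in for the CDN and Redis, and a real CDN would populate itself by pulling from origin rather than being written to explicitly.

```python
from typing import Callable, Dict, List, Optional, Tuple

Bundle = Dict[str, str]


class Tier:
    """Stand-in for one cache layer (e.g. CDN edge or regional Redis)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, Bundle] = {}

    def get(self, bundle_id: str) -> Optional[Bundle]:
        return self.store.get(bundle_id)

    def put(self, bundle_id: str, bundle: Bundle) -> None:
        self.store[bundle_id] = bundle


def read_bundle(
    bundle_id: str,
    tiers: List[Tier],
    load_from_origin: Callable[[str], Bundle],
) -> Tuple[Bundle, str]:
    """Return (bundle, name of the layer that served it)."""
    missed: List[Tier] = []
    for tier in tiers:
        bundle = tier.get(bundle_id)
        if bundle is not None:
            # Backfill the faster tiers that missed on the way down.
            for earlier in missed:
                earlier.put(bundle_id, bundle)
            return bundle, tier.name
        missed.append(tier)
    # Every cache missed: compile from the source of truth, then fill caches.
    bundle = load_from_origin(bundle_id)
    for tier in missed:
        tier.put(bundle_id, bundle)
    return bundle, "origin"
```

Called with `[cdn, redis]` and a database loader, the first request for a bundle is served from origin and later requests from the CDN tier, which is the behavior worth tracing on the whiteboard.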
Walk through the read path in detail. When a user loads a page, the client SDK checks its local bundle cache for the current version. If missing, it requests the bundle from the CDN using a versioned URL (e.g., /bundles/en-US/checkout.v3a7f2.json). On a CDN miss, the request reaches the regional Redis cache, which holds pre-compiled bundles. On a Redis miss, the application compiles the bundle from the database, writes it to Redis, and returns it. When a translator publishes an update, the Bundle Publisher compiles a new bundle with a new content hash, writes it to Redis and the CDN, and sends a notification to clients to refresh. Discuss how content-hash URLs avoid cache purge complexity: old versions remain cached and valid, while new deploys reference the new hash.
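One way to derive content-hash bundle URLs is sketched below. The path layout mirrors the example URL above; the SHA-256 choice, the 8-character digest prefix, and the `bundle_path` and `publish` helpers with their in-memory manifest are assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Tuple

# Hypothetical manifest: (locale, namespace) -> currently published bundle path.
Manifest = Dict[Tuple[str, str], str]


def bundle_path(locale: str, namespace: str, messages: Dict[str, str]) -> str:
    """Build a versioned path whose version is a hash of the bundle content."""
    # Canonical serialization so key order does not change the hash.
    canonical = json.dumps(
        messages, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    ).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()[:8]  # prefix length is arbitrary
    return f"/bundles/{locale}/{namespace}.v{digest}.json"


def publish(
    manifest: Manifest, locale: str, namespace: str, messages: Dict[str, str]
) -> str:
    """Point the manifest at the new bundle and return the path it replaced.

    The previous path is returned so a rollback can simply repoint to it.
    """
    previous = manifest.get((locale, namespace), "")
    manifest[(locale, namespace)] = bundle_path(locale, namespace, messages)
    return previous
```

Because identical content always maps to the same path and any change produces a new one, bundles can be cached indefinitely and never purged; only the small manifest needs a short cache lifetime.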
Cover the translation workflow: new keys enter a "needs translation" queue, optionally seeded with machine translation, assigned to human translators with context and screenshots, reviewed by a second translator, and published on approval. Discuss monitoring: track cache hit rates per locale, bundle size growth, translation coverage percentage per locale, and pipeline throughput (keys translated per day). Address disaster recovery: replicate the translation database across regions and maintain pre-compiled bundle snapshots in object storage as a fallback if Redis and the CDN both fail. Mention security: restrict publish permissions to approved translators and reviewers to prevent unauthorized content changes.
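The workflow stages can be modeled as a small state machine. The sketch below is one possible shape: the stage names, the `ALLOWED` transition table, and the `TranslationTask` class are hypothetical, and a production workflow engine would persist each transition durably rather than hold it in memory.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple


class Stage(Enum):
    NEEDS_TRANSLATION = "needs_translation"
    MACHINE_DRAFT = "machine_draft"  # optional machine-translation seed
    IN_TRANSLATION = "in_translation"
    IN_REVIEW = "in_review"
    PUBLISHED = "published"


# Assumed transition table; a rejected review goes back to the translator.
ALLOWED: Dict[Stage, Set[Stage]] = {
    Stage.NEEDS_TRANSLATION: {Stage.MACHINE_DRAFT, Stage.IN_TRANSLATION},
    Stage.MACHINE_DRAFT: {Stage.IN_TRANSLATION},
    Stage.IN_TRANSLATION: {Stage.IN_REVIEW},
    Stage.IN_REVIEW: {Stage.PUBLISHED, Stage.IN_TRANSLATION},
    Stage.PUBLISHED: set(),
}


@dataclass
class TranslationTask:
    key: str
    locale: str
    stage: Stage = Stage.NEEDS_TRANSLATION
    translator: Optional[str] = None
    history: List[Tuple[Stage, Stage, str]] = field(default_factory=list)

    def advance(self, to: Stage, actor: str) -> None:
        if to not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.name} -> {to.name} is not allowed")
        # Review must be done by a second person, not the original translator.
        if self.stage is Stage.IN_REVIEW and actor == self.translator:
            raise PermissionError("reviewer must differ from translator")
        # Whoever submits the work for review is recorded as the translator.
        if self.stage is Stage.IN_TRANSLATION and to is Stage.IN_REVIEW:
            self.translator = actor
        self.history.append((self.stage, to, actor))
        self.stage = to
```

Keeping the transition table as data makes it easy to add a QA stage or a re-translation loop later, and the history list doubles as an audit trail for who approved each publish.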
"Design a system for building a translator service where it translates the web content (just static content). Core entities: users, human translators, engineering devs."