Design Internal App Store
Problem Statement
Design a centralized software distribution platform for a large enterprise with 50,000+ employees across multiple geographic regions. The system should enable employees to discover, install, and update internal and approved third-party applications on their work devices. Think of this as a private version of an app marketplace, tailored for corporate governance and compliance needs.
The platform must handle diverse application types (desktop applications, browser extensions, mobile apps), enforce role-based access controls, support phased rollouts of new versions, and provide IT administrators with visibility into installation status, usage metrics, and security compliance. The system should scale to support thousands of concurrent installations during peak hours while maintaining low latency for catalog browsing and search across global offices.
Key Requirements
Functional
- Application Catalog -- employees browse and search available software by category, department, or keyword with personalized recommendations based on role and team
- Installation Management -- users initiate installs on registered devices with progress tracking, automatic retry on failure, and clear error messages for troubleshooting
- Update Orchestration -- system delivers automatic and manual updates with configurable policies (optional vs. mandatory), supports delta updates to minimize bandwidth, and provides rollback capability
- Administrative Controls -- IT teams publish new applications, define access policies by role or department, configure staged rollouts with health monitoring, and audit all distribution activities
Non-Functional
- Scalability -- support 10,000+ concurrent downloads during business hours, 50TB+ of monthly bandwidth across a global CDN, and a catalog of 5,000+ applications
- Reliability -- 99.9% uptime for catalog and metadata services, resilient download resumption after network failures, and automated failover for critical infrastructure
- Latency -- catalog search results within 200ms at p95, download initiation within 500ms, and real-time progress updates with sub-second refresh
- Consistency -- strong consistency for entitlements and approval workflows, eventual consistency acceptable for telemetry and analytics, and version conflicts resolved through administrative override
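A quick back-of-the-envelope check helps put the scalability numbers in perspective. The sketch below converts the stated 50TB monthly volume into an average rate and contrasts it with a peak estimate for 10,000 concurrent downloads. The 30-day month, decimal terabytes, and the 10 Mbps per-download rate are assumptions chosen for illustration, not figures from the problem statement.

```python
def average_mbps(monthly_terabytes: float, days: int = 30) -> float:
    """Average throughput in megabits per second for a monthly transfer volume."""
    bits = monthly_terabytes * 1e12 * 8   # decimal terabytes -> bits
    seconds = days * 24 * 3600            # assumed 30-day month
    return bits / seconds / 1e6


def peak_gbps(concurrent_downloads: int, per_download_mbps: float) -> float:
    """Aggregate peak throughput in gigabits per second."""
    return concurrent_downloads * per_download_mbps / 1000


avg = average_mbps(50)
peak = peak_gbps(10_000, 10)  # 10 Mbps per download is a hypothetical rate
print(f"average: {avg:.0f} Mbps, assumed peak: {peak:.0f} Gbps")
```

Under these assumptions the average is roughly 154 Mbps while the peak is on the order of 100 Gbps, which suggests the delivery tier should be sized for bursts rather than for the monthly average.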
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Binary Distribution Architecture
Interviewers want to see how you separate metadata management from large binary delivery while ensuring security and efficiency. Simply storing installers in a database or serving them from application servers demonstrates a fundamental misunderstanding of blob storage patterns.
Hints to consider:
- Design separate paths for metadata (version info, entitlements, checksums) versus binary payloads with different caching strategies for each
- Leverage CDN or regional storage caches to minimize latency and bandwidth costs for global offices, with signed URLs for secure time-limited access
- Implement chunked downloads with resume capability and integrity verification using checksums to handle network interruptions gracefully
- Consider delta patching for updates to reduce bandwidth consumption when transitioning between versions
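The signed-URL and resumable-download hints above can be sketched as follows. This is a minimal illustration using HMAC-SHA256 and per-chunk SHA-256 checksums; the function names, the URL layout, and the in-code `SECRET` are hypothetical, and a real deployment would more likely rely on the CDN's own signing scheme and a managed secret.

```python
import hashlib
import hmac
from collections.abc import Callable

SECRET = b"demo-signing-key"  # assumption: would come from a secrets manager


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Return a time-limited URL whose signature covers the path and expiry."""
    sig = hmac.new(secret, f"{path}:{expires_at}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?expires={expires_at}&sig={sig}"


def verify_url(path: str, expires_at: int, sig: str, now: int,
               secret: bytes = SECRET) -> bool:
    """Reject expired links and links whose path or expiry was altered."""
    if now > expires_at:
        return False
    expected = hmac.new(secret, f"{path}:{expires_at}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)


def chunk_manifest(payload: bytes, chunk_size: int) -> list[str]:
    """Per-chunk SHA-256 digests, published alongside the version metadata."""
    return [hashlib.sha256(payload[i:i + chunk_size]).hexdigest()
            for i in range(0, len(payload), chunk_size)]


def download_with_resume(fetch_chunk: Callable[[int], bytes],
                         manifest: list[str],
                         have: dict[int, bytes]) -> tuple[bytes, list[int]]:
    """Fetch only the chunks missing from `have`, verifying each one.

    `have` holds chunks verified before the interruption; it is updated in place.
    Returns the reassembled payload and the indices fetched in this call.
    """
    fetched = []
    for index, expected in enumerate(manifest):
        if index in have:
            continue  # already downloaded and verified earlier
        data = fetch_chunk(index)
        if hashlib.sha256(data).hexdigest() != expected:
            raise ValueError(f"chunk {index} failed integrity check")
        have[index] = data
        fetched.append(index)
    return b"".join(have[i] for i in range(len(manifest))), fetched
```

In this sketch the manifest travels on the metadata path while the chunks travel on the binary path, so a client can resume after a network failure without re-downloading verified chunks.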
2. Entitlement and Access Control
The system must enforce who can see, request, and install which applications while maintaining performance. Naive approaches that check permissions on every search result or download chunk will not scale.
Hints to consider:
- Model entitlements at multiple levels (organization-wide, department, role, individual) with inheritance and override rules
- Cache resolved permissions in a fast datastore (Redis) with a short TTL as a safety net, plus explicit invalidation when policies change
- Separate visibility rules (what appears in search) from installation authorization (what can actually be downloaded) to prevent enumeration attacks
- Design an approval workflow for restricted applications with async notification and audit trail
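One way to picture the layered entitlement model is a resolver in which the most specific matching rule wins, fronted by a small cache. The sketch below is in-memory only; the `Access` levels, the policy key layout, and the `PermissionCache` class are hypothetical names for illustration, and the dictionary stands in for a datastore such as Redis.

```python
import time
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    HIDDEN = 0    # does not appear in search results
    VISIBLE = 1   # appears in search, install requires approval
    INSTALL = 2   # may be installed directly


@dataclass(frozen=True)
class User:
    user_id: str
    department: str
    role: str


def resolve(user: User, app_id: str, policies: dict) -> Access:
    """Walk scopes from least to most specific; later matches override earlier ones."""
    scopes = [("org", "*"), ("department", user.department),
              ("role", user.role), ("individual", user.user_id)]
    result = Access.HIDDEN  # default deny
    for scope, value in scopes:
        rule = policies.get((scope, value, app_id))
        if rule is not None:
            result = rule
    return result


class PermissionCache:
    """TTL cache with explicit invalidation when an app's policy changes."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}

    def get(self, user: User, app_id: str, policies: dict) -> Access:
        key = (user.user_id, app_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        access = resolve(user, app_id, policies)
        self._entries[key] = (access, now + self._ttl)
        return access

    def invalidate_app(self, app_id: str) -> None:
        self._entries = {k: v for k, v in self._entries.items() if k[1] != app_id}
```

Keeping `VISIBLE` and `INSTALL` as distinct levels reflects the separation between what a user can discover and what a user can actually download.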
3. Phased Rollout and Health Monitoring
Pushing application updates to tens of thousands of devices requires careful orchestration to avoid large-scale incidents. Interviewers expect you to discuss progressive delivery patterns.
Hints to consider:
- Implement canary deployments with configurable percentage-based rollout stages (1%, 5%, 25%, 100%) and automatic pause on health threshold violations
- Collect installation success/failure telemetry, crash reports, and performance metrics to compute health scores for each rollout stage
- Design a rollback mechanism that can revert devices to previous versions either automatically on health degradation or via manual administrator trigger
- Handle rollout state in a durable workflow system to survive service restarts and maintain consistency across long-running multi-day deployments
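The percentage stages and health gate described above might look like the sketch below. It assigns each device a stable bucket from a hash of the release and device identifiers, so a device admitted at 5% stays admitted at 25%. The thresholds (`min_samples`, `max_failure_rate`) and the function names are assumptions for illustration; in a real system the stage state would live in the durable workflow engine rather than in function arguments.

```python
import hashlib

STAGES = (1, 5, 25, 100)  # rollout percentages, matching the stages in the text


def bucket(device_id: str, release_id: str) -> int:
    """Stable bucket in [0, 100) derived from the release and device identifiers."""
    digest = hashlib.sha256(f"{release_id}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_eligible(device_id: str, release_id: str, stage_percent: int) -> bool:
    """A device is in the rollout once its bucket falls below the stage percentage."""
    return bucket(device_id, release_id) < stage_percent


def next_action(stage_index: int, successes: int, failures: int,
                min_samples: int = 50, max_failure_rate: float = 0.02) -> str:
    """Decide what the orchestrator should do with the current stage.

    Returns "hold" (not enough telemetry yet), "pause" (health threshold
    violated), "advance" (move to the next stage), or "complete".
    """
    total = successes + failures
    if total < min_samples:
        return "hold"
    if failures / total > max_failure_rate:
        return "pause"
    if stage_index + 1 < len(STAGES):
        return "advance"
    return "complete"
```

Including the release identifier in the hash means a different set of devices acts as the canary group for each release, which avoids always exposing the same users to early versions.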
4. Search and Discovery
With thousands of applications, employees need fast, relevant search with filtering and personalized results. Plain SQL pattern-matching queries are unlikely to deliver ranked, faceted results within the 200ms p95 budget at scale.
Hints to consider:
- Use a dedicated search engine (Elasticsearch or similar) with denormalized documents containing app metadata, tags, categories, and computed relevance scores
- Implement faceted search with filters for category, platform, department access, and popularity while returning counts for each facet option
- Personalize results using collaborative filtering (users in similar roles installed these apps) and role-based boosting without exposing unauthorized applications
- Keep the search index synchronized with the authoritative datastore via event-driven updates to ensure newly published apps appear quickly
5. Observability and Compliance
Enterprise systems require extensive auditing, monitoring, and reporting for security and compliance purposes. Interviewers want to see operational maturity in your design.
Hints to consider:
- Log all access attempts, installations, updates, and administrative actions with an immutable audit trail in append-only storage
- Track device inventory with installed application versions to identify compliance violations or security vulnerabilities requiring forced updates
- Expose metrics for installation success rates, download speeds, popular applications, and capacity planning to administrators via dashboards
- Design data retention policies that balance compliance requirements with storage costs, using archival storage for older audit logs
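One possible way to make an append-only audit trail tamper-evident is to chain entries by hash, so that altering any past record breaks verification of everything after it. The sketch below is an assumption about how this could be done rather than a requirement stated in the problem; the field names are hypothetical, and timestamps are omitted to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(log: list[dict], actor: str, action: str, target: str) -> dict:
    """Append an audit entry that commits to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "actor": actor, "action": action,
            "target": target, "prev": prev}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash and link; any edit, removal, or reorder fails."""
    prev = GENESIS
    for index, entry in enumerate(log):
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["seq"] != index or entry["prev"] != prev:
            return False
        if _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

A usage example: after appending a few events, `verify_chain(log)` returns `True`; changing the `action` field of an earlier entry makes it return `False`, which is the property an auditor would check before trusting archived logs.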