Design a collaborative research document system for healthcare records
Problem Statement
Design a system that combines Google Docs-style real-time collaboration with healthcare-grade compliance requirements. Clinicians and researchers co-author study protocols, clinical notes, and reports in real time, attach clinical artifacts (labs, PDFs, images), and maintain a complete compliance audit trail. The system must support presence cursors, comments, version history, and strict access controls with encryption and immutable logs for protected health information (PHI).
Interviewers ask this because it blends two hard problems: low-latency collaborative editing and healthcare-grade compliance. You must reason about OT/CRDT concurrency control, real-time fanout, and storage while also handling RBAC/ABAC, audit trails, encryption/key management, retention/legal hold, and multi-tenant isolation.
Key Requirements
Functional
- Real-time co-authoring -- users co-author documents with presence indicators, comments, and tracked changes
- Sharing and permissions -- users share documents across organizations with fine-grained permissions, time-bound access, and purpose-of-use tagging
- Version history and audit -- users view full version history and immutable audit trails for edits, access, and sharing events
- Clinical attachments -- users attach and view clinical files (PDFs, images) and reference structured clinical data while preserving PHI protections
Non-Functional
- Scalability -- support thousands of concurrent editors spread over hundreds of active documents across multiple institutions
- Reliability -- ensure no data loss for clinical documents; maintain immutable audit logs that cannot be tampered with
- Latency -- propagate edits to collaborators within 200 ms; load a document in under 2 seconds
- Consistency -- strong consistency for document state, in the sense that all editors converge to the same content once they have applied the same set of edits; eventual consistency for audit log replication across regions
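One common way to approach the tamper-resistance requirement for audit logs is hash chaining, where each entry commits to the hash of the entry before it. The sketch below is illustrative only: the entry layout, the `GENESIS` constant, and the function names are assumptions, not part of the problem statement. Note that chaining makes tampering detectable rather than impossible, so a real deployment would likely pair it with write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    """Append an event, linking it to the current tail of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


audit_log = []
append(audit_log, {"user": "dr_lee", "action": "read", "doc": "protocol-7"})
append(audit_log, {"user": "dr_lee", "action": "edit", "doc": "protocol-7"})
print(verify(audit_log))  # chain is intact at this point
```

Changing any field of an earlier entry after the fact causes `verify` to return `False`, which is the property a reviewer or auditor would rely on.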
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Collaborative Editing and Conflict Resolution
Interviewers want to see how you handle multiple users editing the same document simultaneously without losing changes.
Hints to consider:
- Choose between Operational Transformation (OT) and Conflict-Free Replicated Data Types (CRDTs) with clear tradeoff reasoning
- OT operations are compact, but classic OT relies on a central server to order and transform them; CRDTs can merge without a central coordinator but carry extra per-element metadata, so their state grows larger
- Assign logical timestamps or vector clocks to establish causal ordering of edits
- Implement periodic snapshotting to avoid replaying the entire operation history when loading documents
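To make the OT option concrete, here is a minimal sketch of transforming two concurrent insert operations so that both application orders converge. It covers inserts only and assumes a hypothetical `site` identifier for tie-breaking; a production OT engine would also handle deletes, formatting, and server-assigned revisions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str
    site: str   # hypothetical client id, used only to break position ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    if op.pos < against.pos or (op.pos == against.pos and op.site < against.site):
        return op  # `against` landed at or after op's position; nothing shifts
    return replace(op, pos=op.pos + len(against.text))


base = "dose: mg"
a = Insert(pos=6, text="50 ", site="alice")
b = Insert(pos=8, text=" daily", site="bob")

# Each replica applies its own op first, then the transformed remote op.
via_a_first = apply(apply(base, a), transform(b, a))
via_b_first = apply(apply(base, b), transform(a, b))
print(via_a_first)  # dose: 50 mg daily
print(via_b_first)  # dose: 50 mg daily
```

The tie-break on `site` matters when two users insert at the same position: without a deterministic rule, the two replicas could order the inserted text differently and diverge.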
2. Healthcare Compliance and Access Control
Interviewers expect concrete mechanisms for HIPAA compliance, not just hand-waving about "security."
Hints to consider:
- Implement role-based and attribute-based access control (RBAC + ABAC) with purpose-of-use tagging for every access
- Use envelope encryption with per-document keys, stored in a key management service with rotation support
- Maintain immutable, append-only audit logs for every access, edit, share, and permission change event
- Support break-glass emergency access with mandatory justification logging and post-hoc review workflows
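The access-control hints above can be sketched as a single authorization function that combines role, purpose-of-use, and expiry checks, with a break-glass fallback. Everything here is an assumption made for illustration: the `Grant` and `Decision` shapes, the role names, and the choice to limit break-glass to read access. The caller would be expected to write every returned decision to the audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Grant:
    user_id: str
    doc_id: str
    role: str              # assumed roles: "viewer" or "editor"
    purposes: frozenset    # allowed purpose-of-use tags, e.g. {"research"}
    expires_at: datetime   # time-bound access


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str
    needs_review: bool = False  # set for break-glass so a reviewer follows up


ROLE_ACTIONS = {"viewer": {"read"}, "editor": {"read", "write"}}


def authorize(grants, user_id, doc_id, action, purpose, now,
              break_glass_reason: Optional[str] = None) -> Decision:
    for g in grants:
        if (g.user_id == user_id and g.doc_id == doc_id
                and action in ROLE_ACTIONS.get(g.role, set())
                and purpose in g.purposes
                and now < g.expires_at):
            return Decision(True, f"grant:{g.role}")
    # Assumption: emergency access is read-only and requires a non-empty justification.
    if action == "read" and break_glass_reason and break_glass_reason.strip():
        return Decision(True, "break_glass", needs_review=True)
    return Decision(False, "no_matching_grant")


grants = [Grant("dr_lee", "protocol-7", "editor", frozenset({"research"}),
                datetime(2030, 1, 1, tzinfo=timezone.utc))]
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(authorize(grants, "dr_lee", "protocol-7", "write", "research", now))
print(authorize(grants, "dr_kim", "protocol-7", "read", "treatment", now,
                break_glass_reason="patient in ED, attending unavailable"))
```

Returning a structured decision rather than a bare boolean keeps the reason and the review flag available for the audit trail and the post-hoc review workflow.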
3. Real-Time Communication Protocol
Collaborative editing requires efficient bidirectional communication between clients and servers.
Hints to consider:
- Use WebSocket connections for bidirectional operation streaming between editors and the collaboration server
- Implement operation batching to reduce message frequency without sacrificing perceived responsiveness
- Handle client disconnects and reconnects by replaying missed operations from the server's operation log
- Use a pub/sub system to route operations to the correct collaboration server when a document session spans multiple nodes
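The reconnect hint above can be illustrated with a small in-memory sketch: the server numbers every accepted operation, and a reconnecting client sends the last sequence number it applied to receive only what it missed. The `DocumentSession` class and its method names are hypothetical, and the WebSocket transport is left out entirely.

```python
class DocumentSession:
    """Per-document operation log with 1-based sequence numbers (in-memory sketch)."""

    def __init__(self):
        self._ops = []  # index i holds the op with sequence number i + 1

    def append(self, op: dict) -> int:
        """Accept an operation and return the sequence number assigned to it."""
        self._ops.append(op)
        return len(self._ops)

    def ops_since(self, last_seen_seq: int) -> list:
        """Return (seq, op) pairs the client has not yet applied."""
        if last_seen_seq < 0 or last_seen_seq > len(self._ops):
            raise ValueError("client sequence number is outside the server log")
        return list(enumerate(self._ops[last_seen_seq:], start=last_seen_seq + 1))


session = DocumentSession()
for text in ["a", "b", "c"]:
    session.append({"insert": text})

# A client that disconnected after applying op 1 reconnects and catches up.
missed = session.ops_since(1)
print(missed)  # [(2, {'insert': 'b'}), (3, {'insert': 'c'})]
```

In a longer-lived system the log would probably be truncated behind the most recent snapshot, in which case a client that is too far behind would reload the snapshot instead of replaying.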
4. Document Storage and Version History
Clinical documents require durable storage with complete version history and the ability to reconstruct any past state.
Hints to consider:
- Store documents using event sourcing: an append-only operation log with periodic materialized snapshots
- Keep snapshots in object storage with metadata linking to the last operation included
- Implement retention policies and legal hold capabilities that prevent deletion of documents under regulatory review
- Separate clinical attachments (stored in encrypted object storage) from document metadata and operations
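As a rough sketch of the event-sourcing idea above, the following reconstructs a document at any past sequence number by starting from the newest snapshot at or before that point and replaying only the operations after it. The tuple-based operation format and the `Snapshot` fields are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    content: str
    last_seq: int  # sequence number of the last operation folded into content


def apply_op(content: str, op: tuple) -> str:
    kind, pos, arg = op
    if kind == "insert":      # arg is the text to insert
        return content[:pos] + arg + content[pos:]
    if kind == "delete":      # arg is the number of characters to remove
        return content[:pos] + content[pos + arg:]
    raise ValueError(f"unknown operation kind: {kind}")


def load(snapshots: list, op_log: list, at_seq: int) -> str:
    """Rebuild the document as of `at_seq`; op_log[i] holds sequence number i + 1."""
    base = max((s for s in snapshots if s.last_seq <= at_seq),
               key=lambda s: s.last_seq, default=Snapshot("", 0))
    content = base.content
    for seq in range(base.last_seq + 1, at_seq + 1):
        content = apply_op(content, op_log[seq - 1])
    return content


op_log = [("insert", 0, "BP 120/80"), ("insert", 9, " mmHg"), ("delete", 0, 3)]
snapshots = [Snapshot("BP 120/80 mmHg", last_seq=2)]

print(load(snapshots, op_log, 3))  # 120/80 mmHg  (snapshot plus one replayed op)
print(load(snapshots, op_log, 1))  # BP 120/80    (no usable snapshot, replay from empty)
```

Because the log is never rewritten, any historical version remains reconstructible, and snapshot frequency becomes a tuning knob between storage cost and load time.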