Design a Donations Website
Problem Statement
Design a short-term (24-48 hour) donations website where users can contribute to pre-selected charities, with a focus on payment processing, failure handling, and backup payment methods. During the event window, thousands of users visit the site simultaneously to donate money to one of several pre-approved charitable organizations. The platform must display real-time fundraising progress, process payments securely, and ensure that no donation is lost or double-charged.
The system must gracefully handle the complete event lifecycle: before the event starts, users see a countdown page; during the event, the site processes donations and shows live progress counters; after the event closes, visitors see final totals and a thank-you message. The short, high-visibility nature of the event means there is no time for extended debugging or manual recovery, so the design must prioritize correctness in payment processing, resilience against payment provider failures, and accurate live statistics without creating database bottlenecks.
Interviewers ask this question to test your ability to design a reliable payments-facing system under a strict time window. They probe for idempotency across retries and webhooks, failure handling when payment processors degrade, hot-counter contention for live totals, and clean reconciliation at the end of the event. The best answers balance correctness, simplicity, and operational readiness.
Key Requirements
Functional
- Charity browsing -- Users can view a curated list of participating charities with descriptions, images, and current donation totals
- Secure donation flow -- Users complete a multi-step checkout to contribute a specified amount to their chosen charity, receiving a confirmation and digital receipt
- Live progress tracking -- Real-time counters show total funds raised per charity and overall, updating within seconds as donations are processed
- Event time-gating -- The site displays different content based on event state: pre-event countdown, active donation flow during the window, and post-event summary with final totals
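The time-gating requirement reduces to a pure function of the current time and the event window. The sketch below is illustrative only: `Phase`, `event_phase`, and the example dates are hypothetical names and values, and it assumes the window is stored as timezone-aware UTC timestamps and evaluated on the server rather than trusted from the client clock.

```python
from datetime import datetime, timezone
from enum import Enum


class Phase(Enum):
    PRE_EVENT = "countdown"        # show countdown page
    ACTIVE = "donations_open"      # show donation flow and live counters
    POST_EVENT = "final_totals"    # show final totals and thank-you message


def event_phase(now: datetime, start: datetime, end: datetime) -> Phase:
    """Map a timestamp onto the event lifecycle; the window is [start, end)."""
    if now < start:
        return Phase.PRE_EVENT
    if now < end:
        return Phase.ACTIVE
    return Phase.POST_EVENT


# Hypothetical 48-hour window, used only for illustration.
START = datetime(2030, 1, 1, 0, 0, tzinfo=timezone.utc)
END = datetime(2030, 1, 3, 0, 0, tzinfo=timezone.utc)
```

Keeping the check server-side means the donation endpoint can reject submissions outside the window even if a cached page still shows the form.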
Non-Functional
- Scalability -- Support 10,000 concurrent users with 500 donations per minute during peak periods
- Reliability -- Zero tolerance for lost or duplicate donations; the system must handle payment provider outages and webhook delivery failures
- Latency -- Donation submission should complete within 3 seconds; live counters should reflect a donation within 5 seconds of payment confirmation
- Consistency -- Strong consistency for payment records and donation amounts; eventual consistency acceptable for public-facing counters and leaderboards
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Payment Idempotency and State Management
Payment flows involve multiple asynchronous steps across your service and external processors. Interviewers want to see how you prevent duplicate charges when users refresh, requests time out, or webhooks arrive multiple times.
Hints to consider:
- Generate server-side idempotency keys before calling the payment provider so retries are safe by default
- Model the donation lifecycle as a finite state machine with monotonic transitions (pending, authorized, captured, completed)
- Store webhook event IDs and use database constraints to reject duplicate payment confirmations
- Consider how client retries, server retries, and webhook retries all interact with your state machine
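The hints above can be sketched as a small in-memory model. This is a minimal illustration, not a production design: `DonationStore`, `apply_webhook`, and the field names are hypothetical, the dictionary and set stand in for database tables with unique constraints on the idempotency key and webhook event ID, and it assumes any forward jump in the state order is acceptable (for example, a late "authorized" event after "captured" is simply ignored).

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    # Ordered so that "monotonic" means the integer value only increases.
    PENDING = 0
    AUTHORIZED = 1
    CAPTURED = 2
    COMPLETED = 3


@dataclass
class Donation:
    idempotency_key: str
    amount_cents: int
    state: State = State.PENDING


class DonationStore:
    """In-memory stand-in for tables with unique constraints (assumption)."""

    def __init__(self):
        self._by_key = {}          # idempotency_key -> Donation
        self._seen_events = set()  # webhook event IDs already processed

    def create(self, idempotency_key, amount_cents):
        # setdefault mimics "insert if absent": a retry with the same key
        # returns the existing donation instead of creating a second one.
        return self._by_key.setdefault(
            idempotency_key, Donation(idempotency_key, amount_cents)
        )

    def apply_webhook(self, event_id, idempotency_key, new_state):
        """Apply a provider event once; return True only if state advanced."""
        if event_id in self._seen_events:
            return False  # duplicate delivery of the same event
        self._seen_events.add(event_id)
        donation = self._by_key[idempotency_key]
        if new_state <= donation.state:
            return False  # stale or out-of-order event; never move backwards
        donation.state = new_state
        return True
```

Because client retries, server retries, and webhook retries all funnel through `create` and `apply_webhook`, a repeated request can at worst be a no-op rather than a second charge record.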
2. Payment Provider Resilience
External payment services experience downtime, slow responses, and partial outages. Interviewers expect fallback strategies that maintain availability without compromising payment integrity.
Hints to consider:
- Implement circuit breakers with appropriate thresholds to fail fast when the primary provider is degraded
- Design a graceful degradation path: queue donations for later processing or switch to a backup payment provider
- Distinguish carefully between "payment unknown" (timeout) and "payment failed" (explicit rejection) states
- Plan for manual reconciliation during the short event window when automated recovery is not sufficient
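One way to combine a circuit breaker with the "unknown" versus "failed" distinction is sketched below. Everything here is an assumption made for illustration: the exception types, the `charge` helper, and the thresholds are hypothetical, and a real provider SDK will expose its own error classes. The sketch treats a timeout as a provider-health failure with an unknown payment outcome, and an explicit decline as a healthy provider response.

```python
import time
from enum import Enum


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"    # explicit rejection: safe to offer another method
    UNKNOWN = "unknown"  # timeout: the charge may or may not have happened


class ProviderTimeout(Exception):
    """Hypothetical error: no response from the provider in time."""


class ProviderDeclined(Exception):
    """Hypothetical error: the provider explicitly rejected the payment."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow(self):
        if self._opened_at is None:
            return True
        # After the cool-down, let trial calls through (half-open behaviour).
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def charge(provider_call, breaker):
    """Return an Outcome, or None when the breaker is open (fail fast)."""
    if not breaker.allow():
        return None  # caller switches to a backup provider or queues the donation
    try:
        provider_call()
    except ProviderTimeout:
        breaker.record_failure()
        return Outcome.UNKNOWN  # must be resolved by webhook or reconciliation
    except ProviderDeclined:
        breaker.record_success()  # the provider answered; only the payment failed
        return Outcome.FAILED
    breaker.record_success()
    return Outcome.SUCCEEDED
```

The important design choice is that `UNKNOWN` donations are never blindly retried on a second provider; they wait for a webhook or a reconciliation query so the donor is not charged twice.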
3. Hot Counter Contention
When thousands of users donate simultaneously, naively updating a single database row for each charity's total creates lock contention and query timeouts.
Hints to consider:
- Use Redis atomic increment operations to absorb high-frequency counter updates with sub-millisecond latency
- Shard counters by charity ID or time bucket to distribute write load across multiple keys
- Periodically reconcile fast Redis counters with authoritative PostgreSQL records to detect and correct drift
- Consider write-behind patterns where counter updates are batched and flushed asynchronously to the database
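The sharding and reconciliation hints can be illustrated with an in-memory model. This is a sketch under stated assumptions: the dictionary stands in for Redis keys (each `+=` would be an atomic increment such as `INCRBY` in a real deployment), the `total:{charity}:{shard}` key layout and the `ShardedCounter` class are hypothetical, and the authoritative total passed to `reconcile` is assumed to come from a sum over completed donations in the database.

```python
import random
from collections import defaultdict


class ShardedCounter:
    """Spreads one charity's running total across several keys."""

    def __init__(self, num_shards=8, rng=random):
        self._num_shards = num_shards
        self._rng = rng
        self._shards = defaultdict(int)  # key -> amount in cents

    def _key(self, charity_id, shard):
        return f"total:{charity_id}:{shard}"

    def add(self, charity_id, amount_cents):
        # A random shard spreads writes; in Redis this would be one INCRBY.
        shard = self._rng.randrange(self._num_shards)
        self._shards[self._key(charity_id, shard)] += amount_cents

    def total(self, charity_id):
        # Reads sum all shards; this is cheap because the shard count is small.
        return sum(
            self._shards.get(self._key(charity_id, s), 0)
            for s in range(self._num_shards)
        )

    def reconcile(self, charity_id, authoritative_cents):
        """Correct drift against the database total; return the adjustment."""
        drift = authoritative_cents - self.total(charity_id)
        if drift:
            self._shards[self._key(charity_id, 0)] += drift
        return drift
```

A periodic job could call `reconcile` for each charity and alert when the returned drift is non-zero, since drift indicates a missed or double-counted increment.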
4. Real-Time Update Distribution
The platform must push live donation totals to thousands of connected browsers without overwhelming the backend or creating stale data scenarios.
Hints to consider:
- Use Server-Sent Events or WebSockets to push counter updates from the server to connected clients
- Leverage a pub/sub system (Redis pub/sub) to fan out counter change events from the donation pipeline to connection handlers
- Implement client-side exponential backoff with jitter on reconnect to prevent thundering-herd storms after a server restart or network blip
- Cache current totals at the edge or in a fast read layer to reduce backend queries for users joining mid-event
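The reconnection delay from the hints above can be computed as follows. This is a minimal sketch of the "full jitter" variant, shown in Python for consistency even though browser clients would implement the same logic in JavaScript; `reconnect_delay`, `base_s`, and `cap_s` are hypothetical names and defaults.

```python
import random


def reconnect_delay(attempt, base_s=1.0, cap_s=30.0, rng=random):
    """Seconds to wait before reconnect number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap_s`; the actual delay is
    drawn uniformly from [0, ceiling] so clients that dropped at the same
    moment do not all reconnect at the same moment.
    """
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

A client would reset `attempt` to zero after a successful connection and fetch the cached current totals once on reconnect, so it does not depend on having received every pushed update.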
Suggested Approach
Step 1: Clarify Requirements
Confirm the scale and boundaries of the problem. Ask about expected donation volume (concurrent users, transactions per second) and the number of participating charities. Clarify supported payment methods (credit cards, digital wallets) and minimum/maximum donation amounts. Establish whether the platform supports only one-time donations or also recurring contributions. Verify regulatory requirements like PCI DSS compliance scope and tax receipt generation. Ask about monitoring expectations for the short event window.