This is a reconstructed version of an Amazon Online Assessment question, commonly titled “Minimum Storage Cost for AWS S3 Backup Batches.” Wording and constraints may differ slightly from the original internal statement.
You are given a batch of n files, labeled from 1 to n in order. These files must be backed up into Amazon S3. Among them, K files are considered sensitive and must be encrypted. The sensitive files are specified by their indices in an array sensitiveFiles.
The batch can be recursively split into smaller batches subject to the following rules:
Consider any batch (contiguous segment of the original order) containing M files, of which X are sensitive:
X ≥ 1), then storing this batch as a whole costsM * X * encCost.X = 0), then storing this batch as a whole costsflatCost (a constant, independent of M).If M is even (i.e., the batch size is divisible by 2), then you are allowed (but not required) to split this batch into two equal-sized contiguous sub-batches.
For example, given a batch [1, 4, 2, 6], the only valid split into two equal parts is {1, 4} and {2, 6}. You cannot rearrange it into {1, 2} and {4, 6}. Further recursive splits (e.g., {1, 4} → {1} and {4}) are allowed whenever the sub-batch size is even.
Your task is to compute the minimum total storage cost achievable for the entire original batch [1, 2, ..., n] by choosing, at each batch, whether to store it as a whole or (if possible) split it into two equal halves and recursively process them.
You must return this minimum possible storage cost.
Use the following function signature (or its equivalent in your language):
text long getMinimumStorageCost(int n, int encCost, int flatCost, int[] sensitiveFiles)
n (int):
Number of files in the batch. Files are labeled from 1 to n in order, forming the initial batch [1, 2, ..., n].
encCost (int):
Encryption cost multiplier for sensitive files. Used in the formula M * X * encCost when a batch with X ≥ 1 sensitive files is stored as a whole.
flatCost (int):
Cost of storing a batch with no sensitive files (X = 0) as a whole. This cost does not depend on the batch size M.
sensitiveFiles (int[]):
An array containing the indices of the K sensitive files. Each value is in the range [1, n], and indices are unique. A file i is sensitive if and only if i appears in this array.
long:
[1, 2, ..., n] following the rules above.M * X * encCost can exceed 32-bit integer range.Input
text n = 4 encCost = 5 flatCost = 3 sensitiveFiles = []
Output
text 3
Explanation
{1, 2, 3, 4} with M = 4.X = 0.
flatCost = 3.M = 4 is divisible by 2, splitting is allowed:
{1, 2, 3, 4} into {1, 2} and {3, 4}.0 sensitive files, so each costs flatCost = 3.3 + 3 = 6.flatCost, and the total would never be less than 3.3.Input
text n = 4 encCost = 2 flatCost = 1 sensitiveFiles = [1, 3]
Output
text 6
Explanation
Sensitive files are at indices 1 and 3.
{1, 2, 3, 4}:
M = 4, X = 2 (files 1 and 3 are sensitive).4 * 2 * 2 = 16.Now consider splitting:
Split {1, 2, 3, 4} into two equal halves {1, 2} and {3, 4}.
For {1, 2}:
M = 2, X = 1 (file 1 is sensitive).2 * 1 * 2 = 4.{1} and {2}:
{1}: M = 1, X = 1 → cost 1 * 1 * 2 = 2.{2}: M = 1, X = 0 → cost flatCost = 1.2 + 1 = 3.{1, 2} is min(4, 3) = 3.For {3, 4}:
M = 2, X = 1 (file 3 is sensitive).3.Total cost when splitting the original batch once and optimizing sub-batches:
3 (for {1, 2}) + 3 (for {3, 4}) = 6.Comparing strategies:
16.6.So the minimum storage cost is 6.
Input
text n = 4 encCost = 3 flatCost = 10 sensitiveFiles = [1, 2, 3, 4]
Output
text 12
Explanation
All files are sensitive.
{1, 2, 3, 4}:
M = 4, X = 4.4 * 4 * 3 = 48.Consider splitting:
Split {1, 2, 3, 4} into {1, 2} and {3, 4}.
For {1, 2}:
M = 2, X = 2.2 * 2 * 3 = 12.{1} and {2}:
{i} has M = 1, X = 1.1 * 1 * 3 = 3.3 + 3 = 6.{1, 2} is min(12, 6) = 6.For {3, 4}:
6.Total cost with optimal splitting:
6 (for {1, 2}) + 6 (for {3, 4}) = 12.Comparing:
48.12.Hence, the minimum storage cost is 12.
The commonly available descriptions of this problem (from interview-prep and OA practice sites) do not specify explicit numeric constraints. A reasonable reconstruction, consistent with typical Amazon OA settings and the intended dynamic programming solution, is:
1 ≤ n0 ≤ K ≤ n1 ≤ encCost1 ≤ flatCost1 ≤ sensitiveFiles[i] ≤ nsensitiveFiles are distinct.To ensure the solution runs efficiently in a coding interview or OA environment, the intended algorithm should meet roughly:
Time complexity:
$$O(n)$$ or $$O(n \log n)$$, using recursion + memoization or bottom-up dynamic programming and prefix sums.
Space complexity:
$$O(n)$$ for prefix sums and memoization of batch costs.
Use long (64-bit) for cumulative costs to avoid overflow.
Observe that every valid batch is always a contiguous segment created by repeatedly splitting in half, so the problem has optimal substructure: the best cost for a batch is the minimum of (cost of storing it as a whole) and (sum of minimal costs of its two equal halves, if its length is even). Precompute a prefix sum of sensitive-file counts along [1..n], then use a recursive divide-and-conquer with memoization (or equivalent bottom-up DP on these segments) to compute the minimal cost for each reachable segment, returning the cost for the full segment [1, n].