This is a reconstructed version of the Amazon online assessment problem commonly referred to as “Genome Mutation Time”, based on multiple independent public descriptions of the same question. Wording and explicit constraints may differ slightly from the original internal statement, but the core logic, examples, and required output are consistent across sources.[1][2][3][4]
Data scientists at Amazon are working on sequencing the genome of a microorganism. A genome is the complete set of DNA instructions, where each DNA strand is composed of various nucleotides represented as characters in a string.[3]
They have discovered a particular mutated nucleotide (always the same character, e.g., 'm' or 'z') that removes nucleotides to its left over time. The behavior is as follows:[2][3]
Formally, at each time step:
i such that genome[i] == mutation:
i - 1 >= 0, then the nucleotide at index i - 1 is marked for deletion.You are given:
genome denoting the current genome of the organism, andmutation denoting the mutated nucleotide.Your task is to compute the earliest time (in integer units) after which no further removals are possible. Equivalently, this is the smallest non‑negative integer t such that after performing the above removal process for t time steps, there is no occurrence of the mutated nucleotide that has any nucleotide immediately to its left (so in the next time step, nothing would be removed).[2][3]
0.[4]The original OA is given as a function to implement. A natural abstraction consistent with public descriptions is:[3][4][2]
text int genomeMutationTime(string genome, char mutation)
Input
genome: a non-empty string representing the nucleotides in the organism’s genome.
'a'–'z'.mutation: a single lowercase English letter representing the mutated nucleotide.Output
For platforms that use standard input for custom testing (as in some descriptions):
genome = 'tamem'mutation = 'm'Input
text genome = "tamem" mutation = 'm'
Output
text 2
Explanation
Initial genome: "tamem"
Indices: 0:'t', 1:'a', 2:'m', 3:'e', 4:'m'
Time unit 1:
'm') removes the left neighbor at index 1 ('a').'m') removes the left neighbor at index 3 ('e').Both removals happen simultaneously, so after deleting indices 1 and 3, the remaining characters are:
"tmm"Now indices are re-labeled: 0:'t', 1:'m', 2:'m'.
Time unit 2:
'm') removes index 0 ('t').'m') removes index 1 ('m').Again, deletions are simultaneous; indices 0 and 1 are removed, leaving only the last 'm'. New genome: "m".
At this point, the single remaining mutated nucleotide has no left neighbor, so no further removals are possible. Thus, the earliest time after which no more removals can occur is 2.
Input
text genome = "luvzliz" mutation = 'z'
Output
text 3
Explanation
Initial genome: "luvzliz"
Indices: 0:'l', 1:'u', 2:'v', 3:'z', 4:'l', 5:'i', 6:'z'[2]
Time unit 1:
'z' at index 3 removes index 2 ('v').'z' at index 6 removes index 5 ('i').After removing these indices, the remaining characters in order are:
"luzlz"Relabel indices: 0:'l', 1:'u', 2:'z', 3:'l', 4:'z'.
Time unit 2:
'z' at index 2 removes index 1 ('u').'z' at index 4 removes index 3 ('l').After removal: indices 0:'l', 2:'z', 4:'z' → "lzz"
Relabel indices: 0:'l', 1:'z', 2:'z'.
Time unit 3:
'z' at index 1 removes index 0 ('l').'z' at index 2 removes index 1 ('z').After removal, only one 'z' remains. New genome: "z".
Now the only mutated nucleotide has no left neighbor, so no further removals are possible. The earliest time after which no removals can occur is 3.
Input
text genome = "aa" mutation = 'a'
Output
text 1
Explanation
Initial genome: "aa"
Indices: 0:'a', 1:'a' (both are the mutated nucleotide).
Time unit 1:
'a' at index 1 removes index 0 ('a').'a' at index 0 has no left neighbor, so it removes nothing.After simultaneous deletion, only one 'a' remains. New genome: "a".
Now this remaining 'a' has no left neighbor, so no more removals can happen. The earliest time after which no removals are possible is 1.[3]
From publicly available versions of this problem, only qualitative constraints are explicitly given; precise numeric limits (like maximum string length) are not specified. The following are the constraints and behavioral rules that are clear from the descriptions:[4][2][3]
genome is a non-empty string.genome consists of lowercase English letters only.mutation is a single lowercase English letter.mutation does not occur in genome, the answer is 0 (no removals ever occur).genome and constant or linear extra space is expected, rather than step-by-step simulation for every time unit.Observe that each mutated nucleotide effectively “eats” one character to its left per time unit, and can also consume earlier mutated nucleotides; the time when deletions stop is governed by how far the rightmost mutated nucleotides need to reach to clear everything to their left. You can solve this in one pass by tracking the positions of the mutation character and taking a simple greedy maximum over the distance from the start to the first mutation and the gaps between consecutive mutations, without explicitly simulating each time step.