Design a workflow execution system that accepts a list of dependency pairs and returns a list of stages such that every step in the same stage can run concurrently. Each dependency pair [u, v] means step u must finish before step v can start. Your implementation must (1) build a directed graph from the pairs, (2) detect cycles (return empty list if any), (3) compute a valid topological order, and (4) group steps into the minimum number of stages where all steps in a stage have zero in-degree at the moment that stage begins. The order of stages must respect the topological order, and within a stage the steps may appear in any order. The function must perform a deep copy of all internal data structures so that subsequent mutations of the input do not affect the engine.