You are building a recommendation system at Pinterest and need to choose between two candidate models for ranking home-feed Pins. Model A is a shallow linear model with strong regularization; Model B is a deep neural network with 50M parameters and no explicit regularization. Your task is to implement a simulation that demonstrates the bias–variance trade-off for these two architectures so the team can decide which model to ship. Specifically, write a Python function that, given a fixed data-generating process (synthetic user–item interaction data with known ground-truth labels), trains both models on 100 independent bootstrap samples of size n (n provided), records each model's predictions on a fixed test set, and returns the empirical bias², variance, and irreducible noise for each model. Use squared-error loss. The function signature should be:
def bias_variance_simulation(n: int, d: int, test_set: List[Tuple[np.ndarray, float]], num_bootstrap: int = 100) -> Dict[str, Dict[str, float]]:
where n is the training-set size, d is the feature dimension, test_set is a list of (x, y) pairs held fixed across bootstrap samples, and the returned dict maps model names ('linear', 'deep') to their estimated {'bias_squared': float, 'variance': float, 'noise': float}. Note that with a single fixed label per test point the noise term is not separately estimable from the bootstrap runs; because the data-generating process is known, take the irreducible noise directly from the variance of its label noise. You may use scikit-learn for the linear model and PyTorch for the deep network; keep the deep network architecture fixed (three hidden layers of 256 units each, ReLU activations, trained with Adam for 50 epochs, batch size 64, learning rate 1e-3).
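A minimal sketch of one possible solution follows. To keep it self-contained and fast, it makes several assumptions not fixed by the prompt: the synthetic target `ground_truth` and the noise level `NOISE_STD` are arbitrary illustrative choices, Ridge stands in for the regularized linear model, and a small scikit-learn `MLPRegressor` (Adam solver) stands in for the specified 3×256 PyTorch network. The test labels are taken to be noiseless values of the ground-truth function, so bias² is measured against them directly.

```python
import numpy as np
from typing import Dict, List, Tuple
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

NOISE_STD = 0.5  # label-noise level of the synthetic process (assumed known)

def ground_truth(X: np.ndarray) -> np.ndarray:
    # Illustrative nonlinear target, so the linear model is genuinely biased.
    return np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 0] ** 2

def bias_variance_simulation(
    n: int,
    d: int,
    test_set: List[Tuple[np.ndarray, float]],
    num_bootstrap: int = 100,
) -> Dict[str, Dict[str, float]]:
    rng = np.random.default_rng(0)
    # One base training set; each run resamples it with replacement (bootstrap).
    X_base = rng.normal(size=(n, d))
    y_base = ground_truth(X_base) + rng.normal(scale=NOISE_STD, size=n)

    X_test = np.stack([x for x, _ in test_set])
    y_test = np.asarray([y for _, y in test_set])  # assumed noiseless f(x)

    preds = {"linear": [], "deep": []}
    for b in range(num_bootstrap):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of size n
        X_tr, y_tr = X_base[idx], y_base[idx]
        linear = Ridge(alpha=10.0).fit(X_tr, y_tr)  # strong L2 regularization
        deep = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=300,
                            random_state=b).fit(X_tr, y_tr)
        preds["linear"].append(linear.predict(X_test))
        preds["deep"].append(deep.predict(X_test))

    results = {}
    for name, runs in preds.items():
        P = np.stack(runs)                 # shape (num_bootstrap, num_test)
        mean_pred = P.mean(axis=0)         # average prediction per test point
        bias_squared = float(np.mean((mean_pred - y_test) ** 2))
        variance = float(P.var(axis=0).mean())
        results[name] = {
            "bias_squared": bias_squared,
            "variance": variance,
            "noise": NOISE_STD ** 2,       # known from the generating process
        }
    return results
```

The expected qualitative outcome is the classic trade-off: the heavily regularized linear model shows higher bias² and lower variance, while the flexible network shows the reverse, with the noise floor identical for both.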