You are asked to implement the forward pass of a Transformer encoder block from scratch in NumPy (or in PyTorch without using nn.TransformerEncoderLayer). The block must include:

1) Multi-head self-attention with scaled dot-product attention,
2) a position-wise feed-forward network with GELU activation,
3) residual connections and layer normalization (pre-norm style),
4) dropout on the attention weights, the FFN hidden layer, and the residual paths.

The input is a tensor X ∈ ℝ^{b×n×d}, where b is the batch size, n the sequence length, and d the model dimension. You will also receive a mask (causal or padding) that must be applied to the attention scores before the softmax. Return a tensor of the same shape. You must not use any high-level transformer utilities; only basic linear algebra ops are allowed.
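One way the pieces above fit together is sketched below in NumPy. This is a minimal reference shape, not the required answer: the parameter names, the boolean mask convention (True = may attend), and the tanh approximation of GELU are all assumptions made for illustration.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize over the last (feature) axis, then scale and shift.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def gelu(x):
    # Common tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dropout(x, p, rng, training):
    # Inverted dropout; identity at inference time.
    if not training or p == 0.0:
        return x
    keep = rng.random(x.shape) >= p
    return x * keep / (1.0 - p)

def multi_head_attention(x, Wq, Wk, Wv, Wo, h, mask=None,
                         p=0.0, rng=None, training=False):
    b, n, d = x.shape
    dh = d // h
    # Project, then split d into h heads: (b, h, n, dh).
    q = (x @ Wq).reshape(b, n, h, dh).transpose(0, 2, 1, 3)
    k = (x @ Wk).reshape(b, n, h, dh).transpose(0, 2, 1, 3)
    v = (x @ Wv).reshape(b, n, h, dh).transpose(0, 2, 1, 3)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(dh)  # (b, h, n, n)
    if mask is not None:
        # mask broadcasts against (b, h, n, n); True means "may attend".
        scores = np.where(mask, scores, -1e9)
    attn = softmax(scores)
    attn = dropout(attn, p, rng, training)       # dropout on attention weights
    out = (attn @ v).transpose(0, 2, 1, 3).reshape(b, n, d)
    return out @ Wo

def encoder_block(x, params, h, mask=None, p=0.0, rng=None, training=False):
    # Pre-norm: x + Drop(MHA(LN(x))), then x + Drop(FFN(LN(x))).
    a = multi_head_attention(
        layer_norm(x, params["g1"], params["b1"]),
        params["Wq"], params["Wk"], params["Wv"], params["Wo"],
        h, mask, p, rng, training)
    x = x + dropout(a, p, rng, training)          # residual path 1
    hid = gelu(layer_norm(x, params["g2"], params["b2"]) @ params["W1"] + params["c1"])
    hid = dropout(hid, p, rng, training)          # dropout on FFN hidden layer
    f = hid @ params["W2"] + params["c2"]
    return x + dropout(f, p, rng, training)       # residual path 2
```

A causal mask can be built with `np.tril(np.ones((n, n), dtype=bool))`; with it applied, the output at position t depends only on inputs up to t, which is a useful property to verify when testing.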