You are given a dataset of customer reviews for a product. Each review is a plain-text string. Your task is to build a PyTorch model that, given a review, extracts the exact span of text that best expresses the customer’s overall sentiment (the “sentiment span”).
Model architecture:
Load a pre-trained BERT-base-uncased encoder from Hugging Face.
Feed the review text through BERT to obtain contextual token embeddings.
Add two independent linear classification heads on top of BERT:
Start-head: outputs one logit per token; a softmax over the sequence converts these into the probability that each token is the first token of the sentiment span.
End-head: outputs one logit per token; a softmax over the sequence converts these into the probability that each token is the last token of the sentiment span.
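The architecture above could be sketched roughly as follows, assuming the Hugging Face `transformers` package. The class name `SentimentSpanModel` and the optional `encoder` argument (which lets a caller inject a pre-built encoder instead of downloading pre-trained weights) are our own choices, not part of the spec.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class SentimentSpanModel(nn.Module):
    """BERT encoder with two independent per-token linear heads."""

    def __init__(self, model_name: str = "bert-base-uncased", encoder=None):
        super().__init__()
        # Allow injecting a pre-built encoder (e.g. for testing);
        # otherwise load the pre-trained weights from the Hub.
        self.bert = encoder if encoder is not None else BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size   # 768 for BERT-base
        self.start_head = nn.Linear(hidden, 1)  # start-of-span logit per token
        self.end_head = nn.Linear(hidden, 1)    # end-of-span logit per token

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        seq = out.last_hidden_state                # (batch, seq_len, hidden)
        start_logits = self.start_head(seq).squeeze(-1)  # (batch, seq_len)
        end_logits = self.end_head(seq).squeeze(-1)      # (batch, seq_len)
        return start_logits, end_logits
```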
During inference, select the (start, end) position pair with the highest combined softmax probability, subject to the constraint start ≤ end.
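The constrained selection step can be sketched as below: score every (start, end) pair by its joint log-probability, mask the pairs where end < start, and take the argmax. The function name `best_span` is our own.

```python
import torch

def best_span(start_logits: torch.Tensor, end_logits: torch.Tensor):
    """start_logits, end_logits: 1-D tensors over the tokens of one review.
    Returns the (start, end) pair maximizing the joint probability, start <= end."""
    start_logp = torch.log_softmax(start_logits, dim=-1)
    end_logp = torch.log_softmax(end_logits, dim=-1)
    # scores[i, j] = log P(start = i) + log P(end = j)
    scores = start_logp.unsqueeze(1) + end_logp.unsqueeze(0)
    # Invalidate pairs with end < start (strictly below the diagonal).
    invalid = torch.tril(torch.ones_like(scores, dtype=torch.bool), diagonal=-1)
    scores = scores.masked_fill(invalid, float("-inf"))
    flat = scores.argmax().item()
    n = scores.size(0)
    return flat // n, flat % n
```

Note that the argmax of the joint score is unaffected by the log-softmax normalization constants, so raw logits would give the same span.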
Training:
The loss is the average of the cross-entropy losses for the start and end positions, each computed against the gold token index.
Fine-tune all BERT parameters together with the two new heads.
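The training objective above amounts to a few lines; a minimal sketch (the name `span_loss` is our own):

```python
import torch
import torch.nn.functional as F

def span_loss(start_logits, end_logits, start_positions, end_positions):
    """Logits: (batch, seq_len). Positions: (batch,) gold token indices.
    Returns the average of the start and end cross-entropy losses."""
    start_loss = F.cross_entropy(start_logits, start_positions)
    end_loss = F.cross_entropy(end_logits, end_positions)
    return (start_loss + end_loss) / 2
```

Because all BERT parameters are fine-tuned along with the heads, a single optimizer over `model.parameters()` (e.g. AdamW with a small learning rate such as 2e-5, a common choice for BERT fine-tuning) suffices.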
Deliverables:
Complete PyTorch implementation (model, training loop, and inference function).
The script must accept a list of raw review strings and return the extracted sentiment span for each review.
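A sketch of the requested raw-string interface follows. It assumes a fast tokenizer (e.g. `BertTokenizerFast`) so that `offset_mapping` can map predicted token positions back to character offsets in the original text; for brevity it uses a greedy variant of span selection (best start, then best end at or after it) rather than the full joint argmax, and it does not mask special tokens. The name `extract_spans` and its parameters are our own.

```python
import torch

def extract_spans(reviews, tokenizer, model, max_length=128):
    """reviews: list of raw strings. model: callable mapping
    (input_ids, attention_mask) -> (start_logits, end_logits),
    each of shape (batch, seq_len). Returns one substring per review."""
    enc = tokenizer(reviews, return_tensors="pt", padding=True,
                    truncation=True, max_length=max_length,
                    return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")  # (batch, seq_len, 2) char offsets
    with torch.no_grad():
        start_logits, end_logits = model(enc["input_ids"], enc["attention_mask"])
    spans = []
    for i, text in enumerate(reviews):
        # Greedy: best start token, then best end token at or after it.
        start = start_logits[i].argmax().item()
        end = start + end_logits[i, start:].argmax().item()
        char_start = offsets[i, start, 0].item()
        char_end = offsets[i, end, 1].item()
        spans.append(text[char_start:char_end])
    return spans
```

Keeping the model as a plain callable here decouples the interface from any particular encoder, so the same function works for the fine-tuned BERT model or a stub during testing.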