Design and implement a Data Batcher for ML training that samples mini-batches from multiple data sources according to user-specified weights, supports checkpointing so training can resume exactly where it left off, and wraps around when any source is exhausted.

The batcher is initialized with a mapping from source names to (weight, iterable) pairs and a batch_size. Requirements:

- Each call to get_batch() must return a list of exactly batch_size items drawn from the sources in proportion to their weights.
- When the proportional shares are not whole numbers, use the largest-remainder method: compute each source's exact fractional count, assign every source the floor of its count, then give the remaining slots to the sources with the largest fractional parts.
- A checkpoint(path) method must atomically write the current per-source offsets to disk, and a load_checkpoint(path) method must restore those offsets so the next batch continues exactly where the previous run stopped.
- If a source has fewer items remaining than its assigned count for a batch, take all of its remaining items, reset its offset to 0 (wrap-around), and make up the deficit from the same source after wrapping.
- The API must be thread-safe for single-producer use and must not preload data into memory; it should iterate on demand.
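The largest-remainder step can be illustrated on its own. The sketch below (function name `allocate` and its signature are my own choices, not part of the spec) floors each source's exact share, then hands the leftover slots to the sources whose fractional parts are largest:

```python
from math import floor

def allocate(weights, batch_size):
    """Largest-remainder apportionment of batch_size slots.

    weights: dict mapping source name -> positive weight.
    Returns a dict of integer counts summing to batch_size.
    """
    total = sum(weights.values())
    # Exact (fractional) share of the batch for each source.
    exact = {k: batch_size * w / total for k, w in weights.items()}
    counts = {k: floor(v) for k, v in exact.items()}
    leftover = batch_size - sum(counts.values())
    # Give each remaining slot to a source with a large fractional part.
    for k in sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)[:leftover]:
        counts[k] += 1
    return counts
```

For example, weights `{"a": 2, "b": 1}` with `batch_size=10` give exact shares 6.67 and 3.33; the floors are 6 and 3, and the one leftover slot goes to `"a"` (fractional part 0.67 > 0.33), yielding `{"a": 7, "b": 3}`. Ties among fractional parts are broken by dict insertion order here; the spec does not pin down a tie-break rule.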
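A minimal end-to-end sketch is below, under two assumptions the spec leaves open: each source is supplied as a zero-argument factory returning a fresh iterator (so the batcher can restart a source on wrap-around and skip to a checkpointed offset without preloading anything), and checkpoints are JSON files written via a temp file plus `os.replace` for atomicity. The class name, the `_allocate`/`_take` helpers, and the factory convention are all illustrative choices, not the required API surface:

```python
import itertools, json, os, tempfile, threading

_SENTINEL = object()

class DataBatcher:
    def __init__(self, sources, batch_size):
        # sources: dict name -> (weight, factory), factory() -> fresh iterator.
        self.batch_size = batch_size
        self.weights = {k: w for k, (w, _) in sources.items()}
        self.factories = {k: f for k, (_, f) in sources.items()}
        self.offsets = {k: 0 for k in sources}
        self.iters = {k: f() for k, f in self.factories.items()}
        self.counts = self._allocate()        # fixed per-batch quota
        self.lock = threading.Lock()          # single-producer safety

    def _allocate(self):
        # Largest-remainder method: floor the exact shares, then give
        # leftover slots to the largest fractional parts.
        total = sum(self.weights.values())
        exact = {k: self.batch_size * w / total for k, w in self.weights.items()}
        counts = {k: int(v) for k, v in exact.items()}
        leftover = self.batch_size - sum(counts.values())
        for k in sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)[:leftover]:
            counts[k] += 1
        return counts

    def _take(self, name, n):
        # Pull n items; on exhaustion, wrap around (restart the iterator,
        # reset the offset) and make up the deficit from the same source.
        # Assumes every source is non-empty.
        out = []
        while len(out) < n:
            item = next(self.iters[name], _SENTINEL)
            if item is _SENTINEL:
                self.iters[name] = self.factories[name]()
                self.offsets[name] = 0
                continue
            out.append(item)
            self.offsets[name] += 1
        return out

    def get_batch(self):
        with self.lock:
            batch = []
            for name, n in self.counts.items():
                batch.extend(self._take(name, n))
            return batch

    def checkpoint(self, path):
        # Write offsets to a temp file in the target directory, then
        # os.replace, which is atomic on POSIX filesystems.
        with self.lock:
            d = os.path.dirname(os.path.abspath(path))
            fd, tmp = tempfile.mkstemp(dir=d)
            with os.fdopen(fd, "w") as f:
                json.dump(self.offsets, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)

    def load_checkpoint(self, path):
        with self.lock:
            with open(path) as f:
                self.offsets = json.load(f)
            # Rebuild each iterator and skip past the consumed prefix
            # lazily (the itertools "consume" recipe), without preloading.
            for name, off in self.offsets.items():
                it = self.factories[name]()
                next(itertools.islice(it, off, off), None)
                self.iters[name] = it
```

One design note: storing plain integer offsets keeps checkpoints tiny, but resuming costs a linear skip through each source; if sources support random access, `load_checkpoint` could seek instead.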