You will build a batch image processing system that applies transformations to images based on JSON configuration files. This is a practical coding challenge that tests your ability to:
The interviewer wants to see how you research and learn new APIs quickly. You may use any resources except AI-generated answers.
You are provided with four directories:
```
project/
├── small_images/        # Small test images for development
│   ├── image1.png
│   ├── image2.jpg
│   └── ...
├── large_images/        # Large images for performance testing
│   ├── photo1.png
│   ├── photo2.jpg
│   └── ...
├── transformations/     # JSON files defining transformations
│   ├── transform1.json
│   ├── transform2.json
│   └── ...
└── output/              # Directory to save processed images
```
Helper utilities are provided to:
Each JSON file in the transformations/ directory contains a list of transformations to apply sequentially. There are six types:
Transformations Without Parameters:
| Type | Description |
| --- | --- |
| `grayscale` | Convert image to grayscale |
| `flip_horizontal` | Flip image horizontally (mirror) |
| `flip_vertical` | Flip image vertically |
Transformations With Parameters:
| Type | Parameter | Description |
| --- | --- | --- |
| `scale` | `factor` (float) | Scale image by the given factor (e.g., 0.5 = half size, 2.0 = double size) |
| `blur` | `radius` (int) | Apply Gaussian blur with the specified radius |
| `rotate` | `angle` (float) | Rotate image by the specified angle in degrees |
```json
{
  "transformations": [
    { "type": "grayscale" },
    { "type": "scale", "factor": 0.5 },
    { "type": "rotate", "angle": 90 }
  ]
}
```
This configuration would:

1. Convert the image to grayscale
2. Scale it to half its original size
3. Rotate it 90 degrees
1. Choose an image processing library - Research and select a Python library capable of performing all six transformation types. Common choices:
2. Implement transformation functions - Create functions for each of the six transformation types
3. Process images with transformations:
4. Test with small images - Verify correctness using the small_images/ directory before moving to large images
After verifying correctness with small images, process the large_images/ directory. You must complete processing within a target time limit (provided during the interview).
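To see whether an implementation fits the time budget, a simple wall-clock timer is enough (a sketch; `timed` is a hypothetical helper, not one of the provided utilities):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and report the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{getattr(fn, '__name__', 'fn')}: {elapsed:.2f}s")
    return result, elapsed
```

Time the small-image run first: if the per-image cost multiplied by the number of large images already exceeds the limit, parallelize before micro-optimizing individual transformations.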
Key considerations:
Hint: Choosing a Library

For this problem, Pillow (PIL) is a good choice because:
- Simple API for common image operations
- Built-in support for all required transformations
- Well-documented and widely used
When searching documentation, look for these modules:

Pillow Module/Method Reference

| Transformation | Pillow Module/Method |
| --- | --- |
| Grayscale | `PIL.ImageOps.grayscale()` |
| Flip horizontal | `PIL.ImageOps.mirror()` |
| Flip vertical | `PIL.ImageOps.flip()` |
| Scale/Resize | `Image.resize(size, resample)` |
| Blur | `PIL.ImageFilter.GaussianBlur(radius)` |
| Rotate | `Image.rotate(angle, expand=True)` |
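Using that mapping, the dispatch for all six types can be sketched as follows (assuming Pillow is installed; the grayscale branch converts back to RGB so later steps still receive a color image):

```python
from PIL import Image, ImageFilter, ImageOps

def apply_transformation(image: Image.Image, transform: dict) -> Image.Image:
    """Dispatch one JSON transformation spec to the matching Pillow call."""
    t = transform['type']
    if t == 'grayscale':
        # ImageOps.grayscale returns mode 'L'; convert back to RGB
        return ImageOps.grayscale(image).convert('RGB')
    if t == 'flip_horizontal':
        return ImageOps.mirror(image)
    if t == 'flip_vertical':
        return ImageOps.flip(image)
    if t == 'scale':
        factor = transform['factor']
        new_size = (max(1, int(image.width * factor)),
                    max(1, int(image.height * factor)))
        return image.resize(new_size, resample=Image.LANCZOS)
    if t == 'blur':
        return image.filter(ImageFilter.GaussianBlur(radius=transform['radius']))
    if t == 'rotate':
        # expand=True grows the canvas so rotated corners are not cropped
        return image.rotate(transform['angle'], expand=True)
    raise ValueError(f"Unknown transformation type: {t}")
```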
Hint: Processing Strategy

A clear structure for the basic implementation:
```python
import json

def load_transformations(json_path: str) -> list:
    """Load transformation specifications from a JSON file."""
    with open(json_path, 'r') as f:
        data = json.load(f)
    return data.get('transformations', [])

def apply_transformation(image, transform: dict):
    """Apply a single transformation to an image."""
    transform_type = transform['type']

    if transform_type == 'grayscale':
        # Convert to grayscale, then back to RGB
        # for subsequent transformations
        pass
    elif transform_type == 'scale':
        factor = transform['factor']
        # Calculate new size and resize
        pass
    # ... handle other types
    return image  # replace with the transformed result

def apply_all_transformations(image, transformations: list):
    """Apply a sequence of transformations to an image."""
    result = image.copy()
    for transform in transformations:
        result = apply_transformation(result, transform)
    return result
```
Hint: Parallel Processing

For performance optimization, use `ProcessPoolExecutor` to parallelize the work.

Why ProcessPoolExecutor?
- Image transformations are CPU-bound operations
- Python's GIL prevents threads from running in parallel for CPU-bound tasks
- Each process has its own Python interpreter, bypassing the GIL
- Each image can be processed independently
Full Solution (Python)

Time complexity: O(N × M × T), where N = number of images, M = number of transformation configs, and T = time per transformation. With parallelization, divide by the number of CPU cores.
Space complexity: O(I) per worker, where I = size of the largest image being processed (each parallel process holds its own image in memory).
Note: On Windows, multiprocessing requires the `if __name__ == '__main__':` guard around the executor code to prevent infinite process spawning.
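A minimal sketch of that guard, with a trivial stand-in for the per-image work:

```python
from concurrent.futures import ProcessPoolExecutor

def square(x: int) -> int:
    # Stand-in for the CPU-bound per-image work
    return x * x

def main() -> list:
    with ProcessPoolExecutor() as executor:
        return list(executor.map(square, range(5)))

if __name__ == '__main__':
    # Without this guard, Windows (spawn start method) would re-execute
    # the module in every worker and spawn processes endlessly.
    print(main())
```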
Question: Why did you choose ProcessPoolExecutor instead of ThreadPoolExecutor? When would threading be preferable?
Key Points:
Threading vs Multiprocessing

| Aspect | ProcessPoolExecutor | ThreadPoolExecutor |
| --- | --- | --- |
| Best for | CPU-bound tasks | I/O-bound tasks |
| GIL impact | Bypasses GIL (separate interpreters) | Limited by GIL |
| Memory | Separate memory per process | Shared memory |
| Overhead | Higher (process creation) | Lower (thread creation) |
Why ProcessPoolExecutor for this problem:
When ThreadPoolExecutor would be better:
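The contrast can be demonstrated with toy tasks (a sketch; `cpu_task` and `io_task` are hypothetical stand-ins, not the image workload):

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_task(n: int) -> int:
    # CPU-bound: threads gain little here because of the GIL
    total = 0
    for i in range(n):
        total += i * i
    return total

def io_task(delay: float) -> float:
    # I/O-bound: the GIL is released while waiting, so threads
    # overlap the wait time almost perfectly
    time.sleep(delay)
    return delay

if __name__ == '__main__':
    with ProcessPoolExecutor() as pool:   # CPU-bound -> processes
        print(list(pool.map(cpu_task, [200_000] * 4)))
    with ThreadPoolExecutor() as pool:    # I/O-bound -> threads
        print(list(pool.map(io_task, [0.1] * 4)))
```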
Memory Management: How to handle images too large for memory?
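For the memory question, one Pillow-based option is to decode at reduced resolution before the full image ever materializes (a sketch; `draft()` is a JPEG-only fast path, and `downscale_in_place` is a hypothetical helper):

```python
from PIL import Image

def downscale_in_place(path: str, max_side: int = 2048) -> None:
    """Shrink an image file so neither side exceeds max_side pixels."""
    with Image.open(path) as img:   # lazy: header only, pixels not decoded yet
        img.draft('RGB', (max_side, max_side))  # JPEG: decode at reduced scale
        img.thumbnail((max_side, max_side))     # in-place, keeps aspect ratio
        img.save(path)
```

Other directions worth mentioning: process the image in tiles, or cap `max_workers` so the combined per-process peak memory stays within the machine's budget.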
Error Handling: How to make this production-ready?
Scaling Further: What if you need to process millions of images?
```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

from PIL import Image

# load_transformations and apply_all_transformations are as defined
# in the processing-strategy hint above.

def process_single_image(args: tuple) -> str:
    """Process a single image with a transformation configuration.

    This function runs in a separate process.
    """
    image_path, transform_path, output_path = args
    # Load transformation config
    transformations = load_transformations(transform_path)
    # Load and process image
    with Image.open(image_path) as image:
        result = apply_all_transformations(image, transformations)
        # Save to output directory
        result.save(output_path)
    return output_path

def process_images_parallel(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path,
    max_workers=None,
) -> None:
    """Process all images in parallel using multiple processes."""
    # Collect all (image, config, destination) work items
    transform_files = sorted(Path(transformation_dir).glob('*.json'))
    image_files = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    work_items = []
    for transform_file in transform_files:
        for image_file in image_files:
            output_path = get_output_path(str(image_file), str(transform_file))
            work_items.append((str(image_file), str(transform_file), output_path))
    # Process in parallel using multiple CPU cores
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        list(executor.map(process_single_image, work_items))
```
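The `get_output_path` callback is left to the caller; one hypothetical naming scheme (not specified by the problem) that keeps every image/config pair distinguishable:

```python
from pathlib import Path

def get_output_path(image_path: str, transform_path: str) -> str:
    # Hypothetical scheme: output/<image-stem>__<transform-stem>.png
    image_stem = Path(image_path).stem
    transform_stem = Path(transform_path).stem
    return str(Path('output') / f"{image_stem}__{transform_stem}.png")
```

With this in place, a run would look like `process_images_parallel('small_images', 'transformations', 'output', get_output_path)`.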