Design and implement a Kubernetes Operator (in Go) that manages the full lifecycle of a custom AI inference microservice named NIMService. Your operator must:
Define a new CRD called NIMService (group: nvidia.com, version: v1) with spec fields: modelName, gpuCount, replicas, image, and runtimeConfig (JSON blob). Status must expose readyReplicas, lastReconciledVersion, and conditions.
Provide a controller that watches NIMService objects and reconciles the desired state by creating/updating/deleting:
Implement finalizer logic so that when a NIMService is deleted, the controller gracefully drains traffic (by setting a ‘draining’ annotation on pods) and waits until all in-flight requests have finished (by polling the /inflight HTTP endpoint that each pod exposes) before removing the finalizer and allowing deletion.
Support CRD versioning: provide a v1alpha1 → v1 conversion webhook so that existing v1alpha1 objects (which lack runtimeConfig) are upgraded to v1 with a default runtimeConfig supplied by the webhook.
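The core of the conversion is a pair of pure functions, sketched below with the standard library only. The task does not say what the default runtimeConfig contains, so the empty object used here is a placeholder, and the type and function names are hypothetical. With controller-runtime, this logic would usually live in ConvertTo and ConvertFrom methods on the v1alpha1 type, with v1 marked as the hub.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpecV1alpha1 is the older spec, which lacks runtimeConfig.
type SpecV1alpha1 struct {
	ModelName string `json:"modelName"`
	GPUCount  int32  `json:"gpuCount"`
	Replicas  int32  `json:"replicas"`
	Image     string `json:"image"`
}

// SpecV1 is the current spec, which adds runtimeConfig.
type SpecV1 struct {
	ModelName     string          `json:"modelName"`
	GPUCount      int32           `json:"gpuCount"`
	Replicas      int32           `json:"replicas"`
	Image         string          `json:"image"`
	RuntimeConfig json.RawMessage `json:"runtimeConfig,omitempty"`
}

// defaultRuntimeConfig is a PLACEHOLDER default; the real default is not given in the task.
var defaultRuntimeConfig = json.RawMessage(`{}`)

// ConvertToV1 upgrades a v1alpha1 spec and fills in the default runtimeConfig.
func ConvertToV1(in SpecV1alpha1) SpecV1 {
	// Copy the default so callers cannot mutate the shared backing array.
	cfg := append(json.RawMessage(nil), defaultRuntimeConfig...)
	return SpecV1{
		ModelName:     in.ModelName,
		GPUCount:      in.GPUCount,
		Replicas:      in.Replicas,
		Image:         in.Image,
		RuntimeConfig: cfg,
	}
}

// ConvertToV1alpha1 downgrades a v1 spec. It is lossy: runtimeConfig is dropped.
func ConvertToV1alpha1(in SpecV1) SpecV1alpha1 {
	return SpecV1alpha1{
		ModelName: in.ModelName,
		GPUCount:  in.GPUCount,
		Replicas:  in.Replicas,
		Image:     in.Image,
	}
}

func main() {
	old := SpecV1alpha1{ModelName: "example-model", GPUCount: 1, Replicas: 2, Image: "example.invalid/nim:latest"}
	out, err := json.Marshal(ConvertToV1(old))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because the downgrade drops runtimeConfig, a round trip through v1alpha1 resets a customized config to the default. A common mitigation, worth raising in the discussion, is to stash the dropped field in an annotation during the downgrade and restore it on the way back up.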
Ensure the controller is highly available: use leader-election, rate-limiting, and exponential back-off on errors. All RBAC rules must be generated with Kubebuilder markers.
You have 45 minutes to whiteboard the CRD schema, reconciliation flow diagram, and key Go code snippets (Reconcile loop, finalizer handling, and conversion webhook). Be prepared to discuss how you would write unit tests for the controller using envtest and how you would observe the operator in production (metrics, logging, alerting).
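One way to present the reconciliation flow on a whiteboard is as a pure decision function that maps the observed state to the next action, which also makes the logic easy to unit-test before any envtest setup. The sketch below is one possible structure under stated assumptions: the Observed fields, the Action names, and the ordering of steps are all hypothetical.

```go
package main

import "fmt"

// Action is the next step the reconciler should take.
type Action string

const (
	AddFinalizer    Action = "add-finalizer"
	SyncChildren    Action = "sync-child-resources"
	AnnotateDrain   Action = "annotate-pods-draining"
	WaitForDrain    Action = "requeue-until-drained"
	RemoveFinalizer Action = "remove-finalizer"
	Nothing         Action = "nothing"
)

// Observed is a simplified snapshot of what one Reconcile call sees.
type Observed struct {
	Deleting      bool // metadata.deletionTimestamp is set
	HasFinalizer  bool // our finalizer is present on the object
	PodsAnnotated bool // every pod carries the draining annotation
	InflightTotal int  // sum of /inflight counts across pods
}

// Next decides the next action. Each branch corresponds to one box in the flow diagram.
func Next(o Observed) Action {
	if !o.Deleting {
		if !o.HasFinalizer {
			return AddFinalizer // ensure cleanup can run before deletion
		}
		return SyncChildren // normal path: create or update owned resources
	}
	if !o.HasFinalizer {
		return Nothing // cleanup already done; the API server will delete the object
	}
	if !o.PodsAnnotated {
		return AnnotateDrain
	}
	if o.InflightTotal > 0 {
		return WaitForDrain
	}
	return RemoveFinalizer
}

func main() {
	steps := []Observed{
		{},
		{HasFinalizer: true},
		{Deleting: true, HasFinalizer: true},
		{Deleting: true, HasFinalizer: true, PodsAnnotated: true, InflightTotal: 5},
		{Deleting: true, HasFinalizer: true, PodsAnnotated: true},
	}
	for _, s := range steps {
		fmt.Printf("%+v -> %s\n", s, Next(s))
	}
}
```

Keeping the decision separate from the API calls means a table-driven test can cover every branch, while envtest-based tests can focus on the interactions with a real API server, such as finalizer updates and status writes.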