[Proposal] Kserve <> Model Registry connection: InferenceService Controller #577

Open
Al-Pragliola opened this issue Nov 20, 2024 · 0 comments

Overview

Following the work already done in:

I propose adding a controller to the Model Registry repository. The controller will watch KServe InferenceService resources that carry specific labels and manage the lifecycle of the corresponding Model Registry ServingEnvironment and InferenceService entities. This links the two sides together and lets users see whether a model is currently deployed.

In depth

The controller watches for resources of type:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
...

The labels/annotations that mark the InferenceService as connected to the Model Registry will be:

  • oneOf:
    • [LABEL] modelregistry.kubeflow.org/registered-model-id
    • [LABEL] modelregistry.kubeflow.org/inference-service-id
  • oneOf:
    • [LABEL] modelregistry.kubeflow.org/name
    • [ANNOTATION] modelregistry.kubeflow.org/url

optional:

  • modelregistry.kubeflow.org/namespace
    Namespace of the Model Registry. If not set, a default namespace taken from the controller's params is used.
  • modelregistry.kubeflow.org/model-version-id
    ID of the model version in the Model Registry.
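
To make the label rules above concrete, here is a rough, language-agnostic sketch in Python (the actual controller would presumably be written in Go). It rests on a few assumptions that are not settled in this proposal: each "oneOf" is read as "at least one of", because the controller later writes inference-service-id back onto a resource that may already carry registered-model-id; the optional keys are treated as labels; and the names `RegistryRef` and `parse_registry_ref` are purely illustrative.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "modelregistry.kubeflow.org/"


@dataclass(frozen=True)
class RegistryRef:
    """Hypothetical container for what the controller needs from the metadata."""
    name: Optional[str]                  # Model Registry name (label)
    namespace: str                       # Model Registry namespace, defaulted if unset
    url: Optional[str]                   # Model Registry URL (annotation)
    registered_model_id: Optional[str]
    inference_service_id: Optional[str]
    model_version_id: Optional[str]


def parse_registry_ref(
    labels: Mapping[str, str],
    annotations: Mapping[str, str],
    default_namespace: str,
) -> Optional[RegistryRef]:
    """Return a RegistryRef, or None if the resource is not linked to a registry."""
    registered_model_id = labels.get(PREFIX + "registered-model-id")
    inference_service_id = labels.get(PREFIX + "inference-service-id")
    name = labels.get(PREFIX + "name")
    url = annotations.get(PREFIX + "url")

    # First group: an entity id is required (assumed "at least one").
    if not (registered_model_id or inference_service_id):
        return None
    # Second group: some way to reach the registry is required.
    if not (name or url):
        return None

    return RegistryRef(
        name=name,
        # Assumption: the optional namespace key is a label.
        namespace=labels.get(PREFIX + "namespace", default_namespace),
        url=url,
        registered_model_id=registered_model_id,
        inference_service_id=inference_service_id,
        model_version_id=labels.get(PREFIX + "model-version-id"),
    )
```

A resource that has a registered-model-id and a registry name but no namespace label would resolve to the default namespace, while a resource missing either group would simply be ignored by the controller.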

When the controller detects a new InferenceService with the required labels, it connects to the Model Registry using either its name and namespace or its URL. It then looks up a ServingEnvironment named after the namespace of the KServe InferenceService and creates it if it doesn't exist. Next, if an inference-service-id is provided, it tries to get the Model Registry InferenceService with that id; if it doesn't exist, the controller creates it, sets its DesiredState to DEPLOYED, and writes the new id back to the inference-service-id label. If the KServe InferenceService is deleted, the controller sets the DesiredState to UNDEPLOYED.
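
The flow described above could look roughly like the following sketch. It is written in Python against an in-memory stand-in, not the real Model Registry API: `FakeRegistry`, its methods, and `reconcile` are all hypothetical names. It also assumes that a registry InferenceService is created when no inference-service-id label is present at all, and that an already existing record is left untouched; neither case is spelled out in this proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional

RM_ID_LABEL = "modelregistry.kubeflow.org/registered-model-id"
IS_ID_LABEL = "modelregistry.kubeflow.org/inference-service-id"


@dataclass
class FakeRegistry:
    """In-memory stand-in for the Model Registry (hypothetical, for illustration)."""
    serving_envs: Dict[str, str] = field(default_factory=dict)         # name -> id
    inference_services: Dict[str, dict] = field(default_factory=dict)  # id -> record
    _next_id: int = 1

    def _new_id(self) -> str:
        new_id = str(self._next_id)
        self._next_id += 1
        return new_id

    def get_or_create_serving_env(self, name: str) -> str:
        if name not in self.serving_envs:
            self.serving_envs[name] = self._new_id()
        return self.serving_envs[name]

    def create_inference_service(self, env_id: str, registered_model_id: Optional[str]) -> str:
        is_id = self._new_id()
        self.inference_services[is_id] = {
            "serving_environment_id": env_id,
            "registered_model_id": registered_model_id,
            "desired_state": "DEPLOYED",
        }
        return is_id


def reconcile(
    registry: FakeRegistry,
    namespace: str,
    labels: Mapping[str, str],
    deleted: bool = False,
) -> Dict[str, str]:
    """Sync one KServe InferenceService; return the labels it should carry afterwards."""
    labels = dict(labels)
    is_id = labels.get(IS_ID_LABEL)
    record = registry.inference_services.get(is_id) if is_id else None

    if deleted:
        # On deletion, only flip the desired state of a known record.
        if record is not None:
            record["desired_state"] = "UNDEPLOYED"
        return labels

    # The ServingEnvironment is named after the KServe resource's namespace.
    env_id = registry.get_or_create_serving_env(namespace)

    if record is None:
        # Create the registry entity (DEPLOYED) and write its id back as a label.
        labels[IS_ID_LABEL] = registry.create_inference_service(env_id, labels.get(RM_ID_LABEL))
    return labels
```

Calling `reconcile` a second time with the returned labels creates nothing new, which is the idempotency a controller needs because the same resource is reconciled repeatedly.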

Final note

Since this is a simple controller, I initially considered writing it without kubebuilder scaffolding. In my opinion that's not worth it: a kubebuilder-based controller will be easier to maintain and extend in the future, and by following the kubebuilder guide it's possible to remove much of the noise and still end up with a simple controller.

Let me know if you have any questions or suggestions.
