[feature] Support Datasource Field for PVC Creation in KFP Python SDK #11420

Open
leseb opened this issue Nov 29, 2024 · 5 comments · May be fixed by #11439

Comments

@leseb
Contributor

leseb commented Nov 29, 2024

Feature Area

/area backend
/area sdk

What feature would you like to see?

Requesting support for the dataSource field when creating a PersistentVolumeClaim (PVC) with the CreatePVC container component in the KFP Python SDK. This would let users create PVCs pre-populated with data, matching the Kubernetes capability to clone an existing volume or restore a PVC from a snapshot.

What is the use case or pain point?

The dataSource field is essential for workflows that depend on pre-initialized volumes, such as restoring a snapshot for processing or cloning an existing volume for parallel workflows.

Is there a workaround currently?

Use a custom component task that calls the Kubernetes Python client library and creates a PVC whose dataSource references a VolumeSnapshot.
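
A minimal sketch of what such a workaround component might build, assuming the PVC is described as a plain manifest dictionary. The helper name `build_pvc_from_snapshot`, the PVC name `restored-data`, and the `kubeflow` namespace in the comment are illustrative assumptions, not part of the SDK:

```python
from typing import Any, Dict, Sequence


def build_pvc_from_snapshot(
    name: str,
    snapshot_name: str,
    size: str = "100Gi",
    access_modes: Sequence[str] = ("ReadWriteOnce",),
) -> Dict[str, Any]:
    """Return a PVC manifest whose dataSource points at a VolumeSnapshot.

    Hypothetical helper for illustration only.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": list(access_modes),
            "resources": {"requests": {"storage": size}},
            # dataSource tells the provisioner to pre-populate the new volume.
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
        },
    }


manifest = build_pvc_from_snapshot("restored-data", "my-snap")

# Inside the custom component, the manifest could then be submitted with the
# official client (needs the `kubernetes` package and in-cluster credentials;
# the namespace below is an assumption):
#   from kubernetes import client, config
#   config.load_incluster_config()
#   client.CoreV1Api().create_namespaced_persistent_volume_claim(
#       namespace="kubeflow", body=manifest)
```

The drawback of this approach is that the component needs RBAC permission to create PVCs and the pipeline author has to maintain the manifest by hand, which is what native support in CreatePVC would avoid.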

Proposed changes

Extend the PVC creation API in the KFP Python SDK to include the optional dataSource parameter, reflecting the Kubernetes PersistentVolumeClaimSpec.

pvc = CreatePVC(
    pvc_name_suffix="foo",
    access_modes=["ReadWriteOnce"],
    size="100Gi",
    data_source={"api_group": "snapshot.storage.k8s.io", "kind": "VolumeSnapshot", "name": "my-snap"},
)

Note: dataSourceRef could be considered in the future once it graduates to stable (currently beta).
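
To illustrate how the proposed snake_case `data_source` dictionary could map onto the Kubernetes `dataSource` fields (`apiGroup`, `kind`, `name`), here is a rough sketch. The function `to_k8s_data_source` is hypothetical and is not how the SDK or backend implements the conversion; it only shows the intended shape and the Kubernetes rule that `apiGroup` may be omitted only for core kinds such as PersistentVolumeClaim:

```python
from typing import Dict, Mapping

# Kinds in the core API group, for which apiGroup may be left out.
# Only the kind relevant to PVC cloning is listed in this sketch.
_CORE_KINDS = {"PersistentVolumeClaim"}
_ALLOWED_KEYS = {"api_group", "kind", "name"}


def to_k8s_data_source(data_source: Mapping[str, str]) -> Dict[str, str]:
    """Convert a snake_case data_source dict to Kubernetes field names.

    Hypothetical helper for illustration only.
    """
    unknown = set(data_source) - _ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown data_source keys: {sorted(unknown)}")
    missing = {"kind", "name"} - set(data_source)
    if missing:
        raise ValueError(f"missing data_source keys: {sorted(missing)}")

    result = {"kind": data_source["kind"], "name": data_source["name"]}
    api_group = data_source.get("api_group")
    if api_group:
        result["apiGroup"] = api_group
    elif result["kind"] not in _CORE_KINDS:
        raise ValueError("api_group is required for non-core kinds")
    return result


# Restore from a snapshot (the example from this issue).
snapshot_source = to_k8s_data_source(
    {"api_group": "snapshot.storage.k8s.io", "kind": "VolumeSnapshot", "name": "my-snap"}
)

# Clone an existing PVC; "source-pvc" is a made-up name.
clone_source = to_k8s_data_source({"kind": "PersistentVolumeClaim", "name": "source-pvc"})
```

Both use cases mentioned above (restoring a snapshot and cloning a volume) go through the same field; only `kind` and `api_group` differ.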


Love this idea? Give it a 👍.

@leseb
Contributor Author

leseb commented Nov 29, 2024

@HumairAK FYI

@leseb leseb changed the title [feature] Support dataSource Field for PVC Creation in KFP Python SDK [feature] Support DatasourceRef Field for PVC Creation in KFP Python SDK Dec 2, 2024
@leseb leseb changed the title [feature] Support DatasourceRef Field for PVC Creation in KFP Python SDK [feature] Support Datasource Field for PVC Creation in KFP Python SDK Dec 2, 2024
@HumairAK
Collaborator

HumairAK commented Dec 3, 2024

thanks for raising this @leseb,

it makes sense to add this as it's part of the pvc api

@HumairAK HumairAK added this to the KFP 2.5.0 milestone Dec 3, 2024
@leseb
Contributor Author

leseb commented Dec 3, 2024

@HumairAK thanks! I'm working on this.

@HumairAK
Collaborator

HumairAK commented Dec 3, 2024

/assigned @leseb

leseb added a commit to leseb/pipelines that referenced this issue Dec 4, 2024
The kfp.kubernetes SDK now supports creating a PVC from a data source.
This feature would enable users to create PVCs with pre-populated data,
aligning with Kubernetes capabilities for cloning or restoring PVCs
from existing volumes or snapshots.

This is how it can be done:

```python
pvc = kfp.kubernetes.CreatePVC(
    pvc_name_suffix="-foo",
    access_modes=["ReadWriteOnce"],
    size="100Gi",
    data_source={
        "api_group": "snapshot.storage.k8s.io",
        "kind": "VolumeSnapshot",
        "name": "my-snap",
    },
)
```

Resolves: kubeflow#11420
Signed-off-by: Sébastien Han <[email protected]>
@leseb
Contributor Author

leseb commented Dec 5, 2024

#11439

@leseb leseb linked a pull request Dec 9, 2024 that will close this issue
2 tasks