This project uses Tekton to generate SLSA provenance for ML models on Google Cloud Platform (GCP). It uses Google Kubernetes Engine (GKE), Artifact Registry, Tekton, and Sigstore.
To get started, you'll need a GCP project. You will also need these CLI tools installed: `gcloud`, `kubectl`, `tkn`, and `cosign`.

- Enable the needed services:

  ```shell
  gcloud services enable \
    container.googleapis.com \
    artifactregistry.googleapis.com
  ```
- Create a GKE cluster:

  - Set the `PROJECT_ID` environment variable to your GCP project:

    ```shell
    export PROJECT_ID=<PROJECT_ID>
    ```

  - Set the `CLUSTER_NAME` environment variable to a cluster name of your choice:

    ```shell
    export CLUSTER_NAME=<CLUSTER_NAME>
    ```

  - Create the cluster:

    ```shell
    gcloud container clusters create $CLUSTER_NAME \
      --enable-autoscaling \
      --min-nodes=1 \
      --max-nodes=3 \
      --scopes=cloud-platform \
      --no-issue-client-certificate \
      --project=$PROJECT_ID \
      --region=us-central1 \
      --machine-type=e2-standard-4 \
      --num-nodes=1 \
      --cluster-version=latest
    ```
- Install Tekton:

  - Install Tekton Pipelines:

    ```shell
    kubectl apply --filename https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
    ```

  - Install Tekton Chains:

    ```shell
    kubectl apply --filename https://storage.googleapis.com/tekton-releases/chains/latest/release.yaml
    ```
- Verify your Tekton installation was successful:

  - Check that Tekton Pipelines Pods are running in Kubernetes:

    ```shell
    kubectl get pods -n tekton-pipelines
    ```

  - Check that Tekton Chains Pods are running in Kubernetes:

    ```shell
    kubectl get pods -n tekton-chains
    ```
- Configure Tekton:

  - Configure Tekton Pipelines to enable parameter enums and alpha features:

    ```shell
    kubectl patch cm feature-flags -n tekton-pipelines -p '{"data":{"enable-param-enum":"true","enable-api-fields":"alpha"}}'
    ```

    Then restart the Tekton Pipelines controller so it picks up the changes:

    ```shell
    kubectl delete pods -n tekton-pipelines -l app=tekton-pipelines-controller
    ```

  - Configure Tekton Chains to enable the transparency log, set the SLSA format, and configure storage:

    ```shell
    kubectl patch configmap chains-config -n tekton-chains -p='{"data":{"transparency.enabled":"true","artifacts.taskrun.format":"slsa/v2alpha2","artifacts.taskrun.storage":"tekton","artifacts.pipelinerun.format":"slsa/v2alpha2","artifacts.pipelinerun.storage":"tekton"}}'
    ```

    Then restart the Tekton Chains controller so it picks up the changes:

    ```shell
    kubectl delete pods -n tekton-chains -l app=tekton-chains-controller
    ```
- Generate an encrypted x509 keypair and save it as a Kubernetes secret:

  ```shell
  cosign generate-key-pair k8s://tekton-chains/signing-secrets
  ```
- (Optional) View the Tekton resources:

  - View the `git-clone` `Task`:

    ```shell
    cat slsa_for_models/gcp/tasks/git-clone.yml
    ```

  - View the `build-model` `Task`:

    ```shell
    cat slsa_for_models/gcp/tasks/build-model.yml
    ```

  - View the `upload-model` `Task`:

    ```shell
    cat slsa_for_models/gcp/tasks/upload-model.yml
    ```

  - View the `Pipeline`:

    ```shell
    cat slsa_for_models/gcp/pipeline.yml
    ```

  - View the `PipelineRun`:

    ```shell
    cat slsa_for_models/gcp/pipelinerun.yml
    ```
- Apply the `Pipeline`:

  ```shell
  kubectl apply -f slsa_for_models/gcp/pipeline.yml
  ```
- Create a generic repository in Artifact Registry:

  - Set the `REPOSITORY_NAME` environment variable to a name of your choice:

    ```shell
    export REPOSITORY_NAME=ml-artifacts
    ```

  - Set the `LOCATION` environment variable to a location of your choice:

    ```shell
    export LOCATION=us
    ```

  - Create the repository:

    ```shell
    gcloud artifacts repositories create $REPOSITORY_NAME \
      --location=$LOCATION \
      --repository-format=generic
    ```

  - If you set a different repository name or location from the example above, make sure to modify the `Parameter` named `model-storage` in the `PipelineRun` with your own values.
- Execute the `PipelineRun`:

  ```shell
  kubectl create -f slsa_for_models/gcp/pipelinerun.yml
  ```
- Observe the `PipelineRun` execution:

  ```shell
  export PIPELINERUN_NAME=$(tkn pr describe --last --output jsonpath='{.metadata.name}')
  tkn pipelinerun logs $PIPELINERUN_NAME --follow
  ```
- When the `PipelineRun` succeeds, view its status:

  ```shell
  kubectl get pipelinerun $PIPELINERUN_NAME --output yaml
  ```
- View the transparency log entry in the public Rekor instance (`open` is a macOS command; on other systems, paste the URL into a browser):

  ```shell
  export TLOG_ENTRY=$(tkn pr describe $PIPELINERUN_NAME --output jsonpath="{.metadata.annotations.chains\.tekton\.dev/transparency}")
  open $TLOG_ENTRY
  ```
- Retrieve the attestation from the `PipelineRun`, which is stored as a base64-encoded annotation:

  ```shell
  export PIPELINERUN_UID=$(tkn pr describe $PIPELINERUN_NAME --output jsonpath='{.metadata.uid}')
  tkn pr describe $PIPELINERUN_NAME --output jsonpath="{.metadata.annotations.chains\.tekton\.dev/signature-pipelinerun-$PIPELINERUN_UID}" | base64 -d > pytorch_model.pth.build-slsa
  ```
- View the attestation (`pbcopy` and `pbpaste` are macOS clipboard utilities; on other systems, pipe the file directly into `jq`):

  ```shell
  cat pytorch_model.pth.build-slsa | tr -d '\n' | pbcopy
  pbpaste | jq '.payload | @base64d | fromjson'
  ```
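The file retrieved above is a DSSE envelope: a JSON wrapper whose `payload` field is itself base64-encoded in-toto JSON, which is why a second decode step (`@base64d | fromjson`) is needed after the outer `base64 -d`. A minimal local sketch of that structure, using a synthetic statement rather than a real attestation:

```shell
# Synthetic in-toto statement (not a real attestation payload)
statement='{"_type":"https://in-toto.io/Statement/v0.1","predicateType":"https://slsa.dev/provenance/v1"}'
# DSSE wraps the statement base64-encoded inside its "payload" field
payload=$(printf '%s' "$statement" | base64 | tr -d '\n')
envelope=$(printf '{"payloadType":"application/vnd.in-toto+json","payload":"%s","signatures":[]}' "$payload")
# Decoding the payload field recovers the statement
printf '%s' "$envelope" | jq -r '.payload | @base64d | fromjson | .predicateType'
```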
- Download the model:

  ```shell
  export MODEL_VERSION=$(tkn pr describe $PIPELINERUN_NAME --output jsonpath='{.status.results[1].value.digest}' | cut -d ':' -f 2)
  gcloud artifacts generic download \
    --package=pytorch-model \
    --repository=$REPOSITORY_NAME \
    --destination=. \
    --version=$MODEL_VERSION
  ```
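As a local sketch of how `MODEL_VERSION` is derived: the `PipelineRun` result holds a `sha256:<hex>` digest string, and `cut -d ':' -f 2` keeps only the hex portion used as the Artifact Registry version (the digest below is a made-up sample, not a real run's output):

```shell
# Made-up sample digest standing in for the PipelineRun result
digest="sha256:0123abcd"
# Keep only the hex part after the "sha256:" prefix
version=$(printf '%s' "$digest" | cut -d ':' -f 2)
echo "$version"   # -> 0123abcd
```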
- Verify the attestation:

  ```shell
  cosign verify-blob-attestation \
    --key k8s://tekton-chains/signing-secrets \
    --signature pytorch_model.pth.build-slsa \
    --type slsaprovenance1 \
    pytorch_model.pth
  ```
Future work:

- Provide a Kubeflow Pipeline that can be compiled into the above Tekton Pipeline using Kubeflow on Tekton.
- Demonstrate how to verify the provenance of the model before deploying and serving it.
- Trigger execution of the `PipelineRun` whenever changes are made to the codebase.
- Demonstrate training ML models that require multiple hours of training and access to accelerators (i.e., GPUs or TPUs).