Merge branch 'main' into Ishaan/nit-image-params
ishaansehgal99 authored Jan 14, 2024
2 parents f958500 + f38c8b1 commit 64050d2
Showing 5 changed files with 121 additions and 37 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -29,3 +29,5 @@ hack/tools/bin/*
.DS_Store
/coverage.txt

# values override file for helm chart installation
values.override.yaml
138 changes: 113 additions & 25 deletions README.md
@@ -33,10 +33,42 @@ Note that the *gpu-provisioner* is not an open sourced component. It can be repl


## Installation
The following guidance assumes **Azure Kubernetes Service(AKS)** is used to host the Kubernetes cluster .

The following guidance assumes **Azure Kubernetes Service(AKS)** is used to host the Kubernetes cluster.

Before you begin, ensure you have the following tools installed:

- [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli) to provision Azure resources
- [Helm](https://helm.sh) to install this operator
- [kubectl](https://kubernetes.io/docs/tasks/tools/) to view Kubernetes resources
- [git](https://git-scm.com/downloads) to clone this repo locally

If you do not already have an AKS cluster, run the following Azure CLI commands to create one:

```bash
export RESOURCE_GROUP="myResourceGroup"
export MY_CLUSTER="myCluster"
export LOCATION="eastus"
az group create --name $RESOURCE_GROUP --location $LOCATION
az aks create --resource-group $RESOURCE_GROUP --name $MY_CLUSTER --enable-oidc-issuer --enable-workload-identity --enable-managed-identity --generate-ssh-keys
```

Connect to the AKS cluster.

```bash
az aks get-credentials --resource-group $RESOURCE_GROUP --name $MY_CLUSTER
```

If you do not have `kubectl` installed locally, you can install it using the following Azure CLI command.

```bash
az aks install-cli
```

#### Enable Workload Identity and OIDC Issuer features
The *gpu-provisioner* controller requires the [workload identity](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview?tabs=dotnet) feature to acquire the access token to the AKS cluster.
The *gpu-provisioner* controller requires the [workload identity](https://learn.microsoft.com/azure/aks/workload-identity-overview?tabs=dotnet) feature to acquire the access token to the AKS cluster.

> Run the following commands only if your AKS cluster does not already have the Workload Identity and OIDC issuer features enabled.
```bash
export RESOURCE_GROUP="myResourceGroup"
@@ -47,39 +79,65 @@ az aks update -g $RESOURCE_GROUP -n $MY_CLUSTER --enable-oidc-issuer --enable-wo
#### Create an identity and assign permissions
The identity `kaitoprovisioner` is created for the *gpu-provisioner* controller. It is assigned the Contributor role on the managed cluster resource so that it can modify `$MY_CLUSTER` (e.g., provision new nodes in it).
```bash
export SUBSCRIPTION="mySubscription"
az identity create --name kaitoprovisioner -g $RESOURCE_GROUP
export IDENTITY_PRINCIPAL_ID=$(az identity show --name kaitoprovisioner -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query 'principalId' | tr -d '"')
export IDENTITY_CLIENT_ID=$(az identity show --name kaitoprovisioner -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query 'clientId' | tr -d '"')
export SUBSCRIPTION=$(az account show --query id -o tsv)
export IDENTITY_NAME="kaitoprovisioner"
az identity create --name $IDENTITY_NAME -g $RESOURCE_GROUP
export IDENTITY_PRINCIPAL_ID=$(az identity show --name $IDENTITY_NAME -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query 'principalId' -o tsv)
export IDENTITY_CLIENT_ID=$(az identity show --name $IDENTITY_NAME -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query 'clientId' -o tsv)
az role assignment create --assignee $IDENTITY_PRINCIPAL_ID --scope /subscriptions/$SUBSCRIPTION/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.ContainerService/managedClusters/$MY_CLUSTER --role "Contributor"

```
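
The `--scope` argument above is the ARM resource ID of the managed cluster. As a minimal sketch, the ID can be assembled and inspected before the role assignment is created. The helper name `aks_cluster_scope` is hypothetical and not part of this repo.

```bash
# Hypothetical helper (not part of this repo): build the ARM resource ID
# of an AKS managed cluster.
# Arguments: subscription ID, resource group, cluster name.
aks_cluster_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.ContainerService/managedClusters/%s\n' \
    "$1" "$2" "$3"
}

# Example call, assuming the variables exported in the steps above:
# aks_cluster_scope "$SUBSCRIPTION" "$RESOURCE_GROUP" "$MY_CLUSTER"
```

Printing the scope first makes it easier to spot an empty `$SUBSCRIPTION` or `$MY_CLUSTER` before it is passed to `az role assignment create`.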

#### Install helm charts
Two charts will be installed in `$MY_CLUSTER`: `gpu-provisioner` chart and `workspace` chart.

> Be sure you've cloned this repo and connected to your AKS cluster before attempting to install the Helm charts.
Install the Workspace controller.

```bash
helm install workspace ./charts/kaito/workspace
```

export NODE_RESOURCE_GROUP=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --query nodeResourceGroup | tr -d '"')
export LOCATION=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --query location | tr -d '"')
export TENANT_ID=$(az account show | jq -r ".tenantId")
yq -i '(.controller.env[] | select(.name=="ARM_SUBSCRIPTION_ID")) .value = env(SUBSCRIPTION)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.controller.env[] | select(.name=="LOCATION")) .value = env(LOCATION)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.controller.env[] | select(.name=="ARM_RESOURCE_GROUP")) .value = env(RESOURCE_GROUP)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.controller.env[] | select(.name=="AZURE_NODE_RESOURCE_GROUP")) .value = env(NODE_RESOURCE_GROUP)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.controller.env[] | select(.name=="AZURE_CLUSTER_NAME")) .value = env(MY_CLUSTER)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.settings.azure.clusterName) = env(MY_CLUSTER)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.workloadIdentity.clientId) = env(IDENTITY_CLIENT_ID)' ./charts/kaito/gpu-provisioner/values.yaml
yq -i '(.workloadIdentity.tenantId) = env(TENANT_ID)' ./charts/kaito/gpu-provisioner/values.yaml
helm install gpu-provisioner ./charts/kaito/gpu-provisioner

Install the Node provisioner controller.
```bash
# get additional values for helm chart install
export NODE_RESOURCE_GROUP=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --query nodeResourceGroup -o tsv)
export LOCATION=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --query location -o tsv)
export TENANT_ID=$(az account show --query tenantId -o tsv)

# create a local values override file
cat << EOF > values.override.yaml
controller:
  env:
  - name: ARM_SUBSCRIPTION_ID
    value: $SUBSCRIPTION
  - name: LOCATION
    value: $LOCATION
  - name: AZURE_CLUSTER_NAME
    value: $MY_CLUSTER
  - name: AZURE_NODE_RESOURCE_GROUP
    value: $NODE_RESOURCE_GROUP
  - name: ARM_RESOURCE_GROUP
    value: $RESOURCE_GROUP
  - name: LEADER_ELECT
    value: "false"
workloadIdentity:
  clientId: $IDENTITY_CLIENT_ID
  tenantId: $TENANT_ID
settings:
  azure:
    clusterName: $MY_CLUSTER
EOF

# install gpu-provisioner using values override file
helm install gpu-provisioner ./charts/kaito/gpu-provisioner -f values.override.yaml
```
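
Because the heredoc above expands shell variables at the moment the file is written, an unset variable silently produces an empty value in `values.override.yaml`. The following is a minimal pre-flight sketch that could be run before the `cat << EOF` step. The function name `require_vars` is hypothetical and not part of this repo.

```bash
# Hypothetical pre-flight check (not part of this repo): report every named
# variable that is unset or empty, and return non-zero if any is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call, assuming the variable names used in the steps above:
# require_vars SUBSCRIPTION LOCATION MY_CLUSTER NODE_RESOURCE_GROUP \
#   RESOURCE_GROUP IDENTITY_CLIENT_ID TENANT_ID || echo "fix the variables above first"
```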

#### Create the federated credential
This step creates the federated identity credential between the managed identity `kaitoprovisioner` and the service account used by the *gpu-provisioner* controller.
```bash
export AKS_OIDC_ISSUER=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query "oidcIssuerProfile.issuerUrl" | tr -d '"')
az identity federated-credential create --name kaito-federatedcredential --identity-name kaitoprovisioner -g $RESOURCE_GROUP --issuer $AKS_OIDC_ISSUER --subject system:serviceaccount:"gpu-provisioner:gpu-provisioner" --audience api://AzureADTokenExchange --subscription $SUBSCRIPTION
export AKS_OIDC_ISSUER=$(az aks show -n $MY_CLUSTER -g $RESOURCE_GROUP --subscription $SUBSCRIPTION --query "oidcIssuerProfile.issuerUrl" -o tsv)
az identity federated-credential create --name kaito-federatedcredential --identity-name $IDENTITY_NAME -g $RESOURCE_GROUP --issuer $AKS_OIDC_ISSUER --subject system:serviceaccount:"gpu-provisioner:gpu-provisioner" --audience api://AzureADTokenExchange --subscription $SUBSCRIPTION
```
Then the *gpu-provisioner* can access the managed cluster using a trust token with the same permissions as the `kaitoprovisioner` identity.
Note that before finishing this step, the *gpu-provisioner* controller pod will constantly fail with the following message in the log:
@@ -88,6 +146,36 @@ panic: Configure azure client fails. Please ensure federatedcredential has been
```
The pod will reach running state once the federated credential is created.
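
The `--subject` value passed to `az identity federated-credential create` follows the pattern `system:serviceaccount:<namespace>:<service account name>`. As a small sketch, the string can be built from its two parts. The helper name `service_account_subject` is hypothetical and not part of this repo.

```bash
# Hypothetical helper (not part of this repo): format the federated-credential
# subject for a Kubernetes service account.
# Arguments: namespace, service account name.
service_account_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Example call for the namespace and service account used in this README:
# service_account_subject gpu-provisioner gpu-provisioner
```

If the chart is installed under a different namespace or service account name, the subject must change to match, otherwise the token exchange will not succeed.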

#### Verify installation
You can run the following commands to verify that the controllers were installed successfully.

Check status of the Helm chart installations.

```bash
helm list -n default
```

Check status of the `workspace`.

```bash
kubectl describe deploy workspace -n workspace
```

Check status of the `gpu-provisioner`.

```bash
kubectl describe deploy gpu-provisioner -n gpu-provisioner
```

#### Troubleshooting
If the `gpu-provisioner` deployment is still not running after some time, some values in your `values.override.yaml` may be incorrect.

Run the following command to check `gpu-provisioner` pod logs for additional details.

```bash
kubectl logs --selector=app.kubernetes.io\/name=gpu-provisioner -n gpu-provisioner
```
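
One common cause is a variable that was empty when `values.override.yaml` was generated. The following is a minimal sketch that scans the file for leaf keys with no value. The function name `find_empty_values` is hypothetical, and the list of keys is an assumption based on the override file shown in this README.

```bash
# Hypothetical check (not part of this repo): print lines of a values file
# where a leaf key (value, clientId, tenantId, clusterName) has no value.
# Returns 0 if none are found, 1 if any are found, 2 if the file is missing.
find_empty_values() {
  if [ ! -f "$1" ]; then
    echo "no such file: $1" >&2
    return 2
  fi
  # grep exits 0 when at least one empty leaf value is matched.
  if grep -nE '^[[:space:]]*(value|clientId|tenantId|clusterName):[[:space:]]*$' "$1"; then
    return 1
  fi
  return 0
}

# Example call:
# find_empty_values values.override.yaml
```

Any line it prints points at a variable that should be re-exported before regenerating the file and reinstalling the chart.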

#### Clean up

```bash
@@ -128,8 +216,8 @@ $ kubectl get svc workspace-falcon-7b
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
workspace-falcon-7b ClusterIP <CLUSTERIP> <none> 80/TCP,29500/TCP 10m
$ kubectl run -it --rm --restart=Never curl --image=curlimages/curl sh
~ $ curl -X POST http://<CLUSTERIP>/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"
export CLUSTERIP=$(kubectl get svc workspace-falcon-7b -o jsonpath="{.spec.clusterIPs[0]}")
$ kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X POST http://$CLUSTERIP/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"
```

## Usage
@@ -154,7 +242,7 @@ contact [[email protected]](mailto:[email protected]) with any additio

## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.

2 changes: 0 additions & 2 deletions presets/models/falcon/README.md
@@ -1,5 +1,3 @@
# Falcon

## Supported Models
|Model name| Model source | Sample workspace|Kubernetes Workload|Distributed inference|
|----|:----:|:----:| :----: |:----: |
8 changes: 3 additions & 5 deletions presets/models/llama2/README.md
@@ -1,5 +1,3 @@
# llama2

## Supported Models
|Model name| Model source | Sample workspace|Kubernetes Workload|Distributed inference|
|----|:----:|:----:| :----: |:----: |
@@ -12,7 +10,7 @@

### Build llama2 private images

#### 1. Clone Kaito Repository
#### 1. Clone kaito repository
```
git clone https://github.com/Azure/kaito.git
```
@@ -32,8 +30,8 @@ Use the following command to build the llama2 inference service image from the r
```
docker build \
--file docker/presets/llama-2/Dockerfile \
--build-arg LLAMA_WEIGHTS=$LLAMA_WEIGHTS_PATH \
--build-arg SRC_DIR=presets/llama2 \
--build-arg WEIGHTS_PATH=$LLAMA_WEIGHTS_PATH \
--build-arg MODEL_PRESET_PATH=presets/models/llama2 \
-t $LLAMA_MODEL_NAME:latest .
```

8 changes: 3 additions & 5 deletions presets/models/llama2chat/README.md
@@ -1,5 +1,3 @@
# llama2chat

## Supported Models
|Model name| Model source | Sample workspace|Kubernetes Workload|Distributed inference|
|----|:----:|:----:| :----: |:----: |
@@ -12,7 +10,7 @@

### Build llama2chat private images

#### 1. Clone Kaito Repository
#### 1. Clone kaito repository
```
git clone https://github.com/Azure/kaito.git
```
@@ -32,8 +30,8 @@ Use the following command to build the llama2chat inference service image from t
```
docker build \
--file docker/presets/llama-2/Dockerfile \
--build-arg LLAMA_WEIGHTS=$LLAMA_WEIGHTS_PATH \
--build-arg SRC_DIR=presets/llama2chat \
--build-arg WEIGHTS_PATH=$LLAMA_WEIGHTS_PATH \
--build-arg MODEL_PRESET_PATH=presets/models/llama2chat \
-t $LLAMA_MODEL_NAME:latest .
```
