Cromwell on AKS Instructions and Troubleshooting
The CoA deployer requires the user to have Helm 3 installed locally to deploy with AKS. Use the flag "--HelmBinaryPath HELM_PATH" to give the deployer the path to the helm binary; if no flag is passed, the deployer will assume Helm is installed at "C:\ProgramData\chocolatey\bin\helm.exe" (Windows) or "/usr/local/bin/helm" (Linux, macOS).
Add the flag "--UseAks true" and the deployer will provision an AKS cluster and run its containers in AKS rather than provisioning a VM.
Add the flags "--UseAks true --AksClusterName {existingClusterName}" to deploy to an existing AKS cluster on which the user has the "Contributor" or "Azure Kubernetes Service Contributor" role. The deployer will deploy blob-csi-driver and aad-pod-identity to the kube-system namespace, and then deploy CoA to the namespace "coa". Add the flag "--AksCoANamespace {namespace}" to override the default namespace.
If the user is required to use an AKS cluster but does not have the required access, add the flags "--UseAks true --ManualHelmDeployment"; the deployer will produce a Helm chart that can then be installed by an admin or an existing CI/CD pipeline. The deployer will also print a postgresql command; this would typically be run on the Kubernetes node to set up the cromwell user, but the user will need to run it manually since the deployer won't directly access the AKS cluster. The manual flow is as follows (example commands for the Helm step are sketched after the list):
- Run the deployer with supplied flags.
- Deployer will create initial resources and pause once it's time to deploy the Helm chart.
- Ensure the blob-csi-driver and aad-pod-identity are installed.
- Install the CoA Helm chart.
- Run the postgresql command to create the cromwell user.
- Press enter in the deployer console to finish the deployment and run a test workflow.
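A minimal sketch of the Helm step, assuming the deployer wrote the generated chart to a local directory; the chart path, release name, and kubeconfig path below are placeholders, and the prerequisite charts are covered in the next section:
# Install the generated CoA chart into the "coa" namespace (paths and release name are illustrative)
helm upgrade --install cromwellonazure ./helm --namespace coa --create-namespace --kubeconfig ./kubeconfig.txt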
These packages will be deployed into the kube-system namespace.
- Blob CSI Driver (https://github.com/kubernetes-sigs/blob-csi-driver/) - used to mount the storage account to the containers.
- AAD Pod Identity (https://github.com/Azure/aad-pod-identity) - used to assign managed identities to the containers.
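If you are installing these prerequisites yourself (for example, in the --ManualHelmDeployment flow), the upstream charts can typically be installed as shown below; the chart repository URLs and chart names come from each project's documentation and may change, so verify them there:
helm repo add blob-csi-driver https://raw.githubusercontent.com/kubernetes-sigs/blob-csi-driver/master/charts
helm install blob-csi-driver blob-csi-driver/blob-csi-driver --namespace kube-system
helm repo add aad-pod-identity https://raw.githubusercontent.com/Azure/aad-pod-identity/master/charts
helm install aad-pod-identity aad-pod-identity/aad-pod-identity --namespace kube-system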
For troubleshooting any of the CoA services, you can log in directly to the pods or get logs using the kubectl program. The deployer will write a kubeconfig to the temp directory; either copy that file to ~/.kube/config or reference it manually for each command with --kubeconfig {temp-directory}/kubeconfig.txt. You can also run the command az aks get-credentials --resource-group {coa-resource-group} --name {aks-account} --subscription {subscription-id} --file kubeconfig.txt to get the file.
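For example, to fetch the kubeconfig and point kubectl at it for the current shell session (resource names below are placeholders):
az aks get-credentials --resource-group <coa-resource-group> --name <aks-cluster> --subscription <subscription-id> --file ./kubeconfig.txt
export KUBECONFIG=$PWD/kubeconfig.txt
kubectl get pods --namespace coa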
- Get the exact name of the pods:
  kubectl get pods --namespace coa
- Get logs for the tes pod:
  kubectl logs tes-68d6dc4789-mvvwj --namespace coa
- SSH to a pod to troubleshoot storage or network connectivity:
  kubectl exec --namespace coa --stdin --tty tes-68d6dc4789-mvvwj -- /bin/bash
If you need to frequently connect to a changing tes or cromwell pod, you can use this script to automatically retrieve the (first) pod name and connect to it.
# Pick one of these lines (or use CROMWELL_POD_NAME and TES_POD_NAME)
# Cromwell pod name:
POD_NAME=$(kubectl get pods -n coa -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | grep "cromwell-" | head -n 1)
# TES pod name:
POD_NAME=$(kubectl get pods -n coa -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | grep "tes-" | head -n 1)
# Now you can SSH to the pod to troubleshoot with:
kubectl exec --namespace coa --stdin --tty $POD_NAME -- /bin/bash
For VM-based CoA deployments, you can ssh into the VM host, update the environment files, and restart the VM. To update settings for AKS, you will need to redeploy the Helm chart. The configuration for the Helm chart is stored in the CoA default storage account at /configuration/aksValues.yaml. Settings in this file, such as the image versions and external storage accounts, can be updated and redeployed with the deployer update command.
deploy-cromwell-on-azure.exe --update true --aksclustername clustername
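One way to edit that file is to download it from the default storage account with the Azure CLI, change it locally, and upload it again before running the update; the storage account name and auth mode below are placeholders, and you need data-plane access to the configuration container:
az storage blob download --account-name <coa-storage-account> --container-name configuration --name aksValues.yaml --file aksValues.yaml --auth-mode login
# edit aksValues.yaml locally, then upload the new version
az storage blob upload --account-name <coa-storage-account> --container-name configuration --name aksValues.yaml --file aksValues.yaml --overwrite --auth-mode login
deploy-cromwell-on-azure.exe --update true --aksclustername <clustername>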
Typically, in CromwellOnAzure you can add storage accounts with input data to the containers-to-mount file. For AKS, the external storage accounts are configured in the aksValues.yaml in the storage account. There are three methods for adding storage accounts to your deployment.
- Managed Identity - "internalContainersMIAuth"
- Storage Key
  a) Plain text - "externalContainers"
  b) Key Vault - "internalContainersKeyVaultAuth"
- SAS Token - "externalSasContainers"
internalContainersMIAuth:
  - accountName: storageAccount
    containerName: dataset1
    resourceGroup: resourceGroup1
internalContainersKeyVaultAuth:
  - accountName: storageAccount
    containerName: dataset1
    keyVaultURL:
    keyVaultSecretName:
externalContainers:
  - accountName: storageAccount
    accountKey: <key>
    containerName: dataset1
    resourceGroup: resourceGroup1
externalSasContainers:
  - accountName:
    sasToken:
    containerName:
To add new storage accounts, update the aksValues.yaml according to the above template and run the update.
The default storage account will be mounted using either internalContainersMIAuth or internalContainersKeyVaultAuth, depending on whether the CrossSubscriptionAKSDeployment flag was used during deployment.
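For the SAS Token method, a read/list token can be generated with the Azure CLI; the account, container, permissions, and expiry below are illustrative:
az storage container generate-sas --account-name storageAccount --name dataset1 --permissions rl --expiry 2030-01-01 --account-key <key> --https-only --output tsv
Paste the resulting token into the sasToken field of an externalSasContainers entry and run the deployer update.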
I'm getting: User "<my account>" cannot list resource <some resource> in API group "" at the cluster scope. "<my account>" does not have the required Kubernetes permissions to view this resource. Ensure you have the correct role/role binding for this user or group.
This is likely related to the deprecation mentioned in the warning box on this page. The fix is to run the following command: az aks update -g <resource-group> -n <kubernetes-name> --aad-admin-group-object-ids <object-id> --aad-tenant-id <tenant-id>. Context for the issue can be found here. If you are using tools like kubectl, you will likely also need to include the --admin flag at the end of the az aks get-credentials command.
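For example (resource group and cluster names are placeholders):
az aks get-credentials --resource-group <resource-group> --name <kubernetes-name> --admin --file kubeconfig.txt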