Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation #3177

Open
marrobi opened this issue Feb 1, 2023 · 5 comments · May be fixed by #3807
Assignees
Labels
bug Something isn't working

Comments

@marrobi
Member

marrobi commented Feb 1, 2023

When deploying the Databricks Workspace service, I get:

2) Main step for ff51fffc-c2c1-4dfe-a88e-e70766f5bc3c
ff51fffc-c2c1-4dfe-a88e-e70766f5bc3c: Error message:
╷
│ Error: waiting for creation of Subnet: (Name "adb-host-subnet-mrtredemo24-ws-5740-svc-bc3c" / Virtual Network Name "vnet-mrtredemo24-ws-5740" / Resource Group "rg-mrtredemo24-ws-5740"): Code="Canceled" Message="Operation was canceled." Details=[{"code":"CanceledAndSupersededDueToAnotherOperation","message":"Operation PutSubnetOperation (a3d81bf2-68dd-4a93-93bb-3f3ad92059d9) was canceled and superseded by operation PutVirtualNetworkOperation (cc8eb20b-2242-4ae7-a1c2-1e74bbda5bfd)."}]
│
│   with azurerm_subnet.host,
│   on network.tf line 90, in resource "azurerm_subnet" "host":
│   90: resource "azurerm_subnet" "host" {
│
╵
error running command /cnab/app/terraform
@marrobi marrobi added the bug Something isn't working label Feb 1, 2023
@marrobi
Member Author

marrobi commented Feb 1, 2023

@guybartal seen this before?

@guybartal
Collaborator

No, I haven't. It looks like it fails when creating the public (host) subnet.
Maybe a transient error? Did you try to redeploy?

@marrobi
Member Author

marrobi commented Oct 17, 2023

Got this again here:

1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a: Error message:
╷
│ Error: waiting for creation of Subnet: (Name "adb-host-subnet-mrtredemo28-ws-8044-svc-459a" / Virtual Network Name "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044"): Code="Canceled" Message="Operation was canceled." Details=[{"code":"CanceledAndSupersededDueToAnotherOperation","message":"Operation PutSubnetOperation (f0dd77c7-05fd-4208-aa55-f62650568667) was canceled and superseded by operation PutVirtualNetworkOperation (b5f36438-7876-4e51-8e3a-36fc10f79daf)."}]
│
│   with azurerm_subnet.host,
│   on network.tf line 90, in resource "azurerm_subnet" "host":
│   90: resource "azurerm_subnet" "host" {
│
╵
╷
│ Error: Subnet: (Name "adb-container-subnet-mrtredemo28-ws-8044-svc-459a" / Virtual Network Name "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044") was not found
│
│   with azurerm_subnet_network_security_group_association.container,
│   on network.tf line 147, in resource "azurerm_subnet_network_security_group_association" "container":
│   147: resource "azurerm_subnet_network_security_group_association" "container" {
│
╵
╷
│ Error: Subnet "adb-container-subnet-mrtredemo28-ws-8044-svc-459a" (Virtual Network "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044") was not found!
│
│   with azurerm_subnet_route_table_association.rt_container,
│   on network.tf line 157, in resource "azurerm_subnet_route_table_association" "rt_container":
│   157: resource "azurerm_subnet_route_table_association" "rt_container" {
│
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var address_space=10.1.8.0/24 -var arm_environment=public -var is_exposed_externally=false -var tre_id=mrtredemo28 -var tre_resource_id=1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a -var workspace_id=14d01527-62d1-4bad-99ad-37d602c08044: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var address_space=10.1.8.0/24 -var arm_environment=public -var is_exposed_externally=false -var tre_id=mrtredemo28 -var tre_resource_id=1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a -var workspace_id=14d01527-62d1-4bad-99ad-37d602c08044: exit status 1
1 error occurred:
* mixin execution failed: package command failed

The issue seems to be related to multiple workspace services being deployed or updated in parallel, and/or multiple private endpoint or network operations happening in parallel in a single bundle.

@marrobi marrobi changed the title Databricks Workspace Service Install Fails with PutSubnetOperation CanceledAndSupersededDueToAnotherOperation Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation Oct 17, 2023
@marrobi
Member Author

marrobi commented Oct 17, 2023

Another example:

Error: waiting for creation of Private Endpoint "pe-mlflow-mrtredemo28-ws-8044-svc-89f1" (Resource Group "rg-mrtredemo28-ws-8044"): Code="RetryableError" Message="A retryable error occurred." Details=[{"code":"ReferencedResourceNotProvisioned","message":"Cannot proceed with operation because resource /subscriptions/7f1036b4-4d01-43a0-9f4d-602f5151dc0f/resourceGroups/rg-mrtredemo28-ws-8044/providers/Microsoft.Network/virtualNetworks/vnet-mrtredemo28-ws-8044/subnets/ServicesSubnet used by resource /subscriptions/7f1036b4-4d01-43a0-9f4d-602f5151dc0f/resourceGroups/rg-mrtredemo28-ws-8044/providers/Microsoft.Network/networkInterfaces/pe-mlflow-mrtredemo28-ws-8044-svc-89f1.nic.b228d946-de36-46c2-81ee-1e6b06155123 is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSubnetOperation."}]

@marrobi marrobi changed the title Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation Databricks Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation Dec 1, 2023
@marrobi marrobi self-assigned this Dec 1, 2023
@marrobi marrobi changed the title Databricks Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation Workspace Service Installs Fails with PutSubnetOperation or CanceledAndSupersededDueToAnotherOperation Dec 1, 2023
@marrobi
Member Author

marrobi commented Dec 1, 2023

OK, this is down to having two operations in progress on the virtual network at the same time. This can happen when one operation is adding an address space to a workspace while another is adding a subnet to the same virtual network.

We need to limit workspace and workspace service operations to one at a time for each workspace.
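To illustrate the "one at a time for each workspace" idea, here is a minimal Python sketch that serialises operations per workspace with an in-process lock. It is only an illustration under stated assumptions: `WorkspaceOperationGate` and its `run` method are hypothetical names, not existing TRE code, and a real deployment with several processor instances would need a shared mechanism rather than an in-memory lock.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class WorkspaceOperationGate:
    """Hypothetical helper: allow one network-affecting operation per workspace.

    Operations on different workspaces can still run concurrently, because
    each workspace ID gets its own lock.
    """

    def __init__(self) -> None:
        # One asyncio.Lock per workspace ID, created lazily on first use.
        self._locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run(self, workspace_id: str, operation: Callable[[], Awaitable[T]]) -> T:
        # Wait until no other operation holds this workspace's lock,
        # then run the operation while holding it.
        async with self._locks[workspace_id]:
            return await operation()
```

With this shape, a workspace update (for example, adding an address space) and a workspace service install on the same workspace would queue behind each other instead of issuing overlapping calls against the same virtual network.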

As user resources do not typically modify the network, I do not believe they are an issue.

Or should the TF provider wait if an operation is in progress?
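As a rough sketch of the "wait if an operation is in progress" alternative, the wrapper below retries a deployment when the failure message contains one of the error codes seen in the logs above. This is an assumption about how retrying could be layered on top of the deployment step, not how the Terraform provider behaves. `run_with_retry`, `is_transient_network_error`, the `RuntimeError` convention and the delay values are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error codes copied from the failures reported in this issue.
TRANSIENT_CODES = (
    "CanceledAndSupersededDueToAnotherOperation",
    "ReferencedResourceNotProvisioned",
)


def is_transient_network_error(message: str) -> bool:
    """Return True if the message mentions one of the known transient codes."""
    return any(code in message for code in TRANSIENT_CODES)


def run_with_retry(
    deploy: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call deploy(), retrying with exponential backoff on transient errors.

    Assumes deploy() raises RuntimeError carrying the Terraform error text.
    """
    for attempt in range(1, attempts + 1):
        try:
            return deploy()
        except RuntimeError as exc:
            # Give up on the last attempt or on errors we do not recognise.
            if attempt == attempts or not is_transient_network_error(str(exc)):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Retrying only hides the race rather than removing it, so it would complement, not replace, limiting operations to one at a time per workspace.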


2 participants