What is the issue?
We are injecting linkerd-proxy into our Argo Workflows pods as a native sidecar using the annotation. One of our workflows that spins up multiple pods works fine through the first two or three pods (each containing multiple steps), but in one of the pods linkerd-proxy exits with code 137, and the following are the last events for that pod:
This causes the Argo server to mark the step as failed, which fails the entire workflow. All other preceding pods in the workflow have only
Normal Killing 37m kubelet Stopping container linkerd-proxy
as their last event. It seems that, for whatever reason, in this particular pod there is a race condition where the health probes are still running as the proxy container is shutting down.
Are there corresponding parameters that should perhaps be tweaked when injecting linkerd-proxy as a native sidecar?
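For concreteness, here is a minimal, hypothetical sketch of the pod-template annotations involved. The exact annotation we apply was not captured above, so the native-sidecar opt-in and the shutdown-related knob shown here are assumptions based on Linkerd's documented proxy configuration annotations, not our actual manifest:

  metadata:
    annotations:
      # Standard Linkerd injection annotation.
      linkerd.io/inject: enabled
      # Assumed opt-in for running the proxy as a Kubernetes native sidecar
      # (an init container with restartPolicy: Always); requires a cluster
      # with the SidecarContainers feature (Kubernetes 1.29 here).
      config.alpha.linkerd.io/proxy-enable-native-sidecar: "true"
      # Candidate knob if probes race with proxy shutdown; whether tuning
      # it (or something else) helps here is exactly the question above.
      config.linkerd.io/shutdown-grace-period: "30s"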
How can it be reproduced?
This isn't clear. I don't yet have a test case as these are fairly complex workflows.
Logs, error output, etc
See above.
output of linkerd check -o short
% linkerd check -o short
linkerd-version
---------------
‼ cli is up-to-date
unsupported version channel: stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 24.11.3 but the latest edge version is 24.11.4
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
‼ control plane and cli versions match
control plane running edge-24.11.3 but cli running stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-5ddc58f9bc-5x9nh (edge-24.11.3)
* linkerd-destination-5ddc58f9bc-7gkdk (edge-24.11.3)
* linkerd-destination-5ddc58f9bc-9c99t (edge-24.11.3)
* linkerd-destination-5ddc58f9bc-brbh5 (edge-24.11.3)
* linkerd-destination-5ddc58f9bc-ffmdx (edge-24.11.3)
* linkerd-identity-85fb8c4b5f-c6l7m (edge-24.11.3)
* linkerd-identity-85fb8c4b5f-ctr4h (edge-24.11.3)
* linkerd-identity-85fb8c4b5f-jhp8q (edge-24.11.3)
* linkerd-identity-85fb8c4b5f-nzx8w (edge-24.11.3)
* linkerd-identity-85fb8c4b5f-vfmkc (edge-24.11.3)
* linkerd-proxy-injector-5497b8cb97-fw85c (edge-24.11.3)
* linkerd-proxy-injector-5497b8cb97-g22xn (edge-24.11.3)
* linkerd-proxy-injector-5497b8cb97-g2m2v (edge-24.11.3)
* linkerd-proxy-injector-5497b8cb97-gjfwv (edge-24.11.3)
* linkerd-proxy-injector-5497b8cb97-jwrnl (edge-24.11.3)
see https://linkerd.io/2.14/checks/#l5d-cp-proxy-version for hints
‼ control plane proxies and cli versions match
linkerd-destination-5ddc58f9bc-5x9nh running edge-24.11.3 but cli running stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-cp-proxy-cli-version for hints
linkerd-ha-checks
-----------------
‼ pod injection disabled on kube-system
kube-system namespace needs to have the label config.linkerd.io/admission-webhooks: disabled if injector webhook failure policy is Fail
see https://linkerd.io/2.14/checks/#l5d-injection-disabled for hints
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-5789bcc5d-2zdck (edge-24.11.3)
* prometheus-9c78c7f55-7q88p (edge-24.11.3)
* tap-6688cddf94-st2jc (edge-24.11.3)
* tap-injector-85b47576fc-9k222 (edge-24.11.3)
* web-8c5b96b6-s7ggv (edge-24.11.3)
see https://linkerd.io/2.14/checks/#l5d-viz-proxy-cp-version for hints
‼ viz extension proxies and cli versions match
metrics-api-5789bcc5d-2zdck running edge-24.11.3 but cli running stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-viz-proxy-cli-version for hints
Status check results are √
Environment
Server Version: v1.29.8-eks-a737599
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
None