Retry setting Out Of Service flags #235
Skipping CI for Draft Pull Request.
/test 4.16-openshift-e2e
How much longer is an OCP/k8s version without the taint still supported for? Unrelated, the handling of |
We have used the taint as the default since OCP 4.14 (we currently support 4.12+).
The main purpose of the retry is to overcome a network issue that prevents fetching the k8s version. It would be tricky to assume that we can get the version needed to decide whether to apply the retry logic while the same network issues that would trigger the retry are present.
In case the err is already set it'll be returned in the previous |
/lgtm to get more reviews and close the other threads
…emporary network issues. Signed-off-by: Michael Shitrit <[email protected]>
a6d4476 to e1981ed
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: mshitrit, slintes.
/hold in case you want to address the nit
Signed-off-by: Michael Shitrit <[email protected]>
e1981ed to 5bc18b5
/lgtm
/unhold
- if err := utils.InitOutOfServiceTaintFlags(mgr.GetConfig()); err != nil {
+ if err := utils.InitOutOfServiceTaintFlagsWithRetry(context.Background(), mgr.GetConfig()); err != nil {
Nice improvement
/retest
Why we need this PR
During release testing of SNR we noticed one case where an SNR template with the Out Of Service taint strategy could not be created.
The reason was that SNR failed to fetch the k8s version during initialization, so it couldn't verify support for the Out Of Service taint.
We couldn't reproduce this, but we suspect it was caused by a temporary network issue.
To minimize similar occurrences in the future, a retry mechanism is added.
Adding links to our slack discussion on that subject. [1] [2]
Changes made
If the Out Of Service flags can't be set on the first attempt (because fetching the k8s version failed, or for any other reason), several retries are made before deciding that Out Of Service flags aren't supported.
Which issue(s) this PR fixes
Test plan