support draining multiple node #469

Open
wants to merge 9 commits into
base: master

Conversation

andrewhu-hcl
Contributor

@andrewhu-hcl andrewhu-hcl commented Nov 24, 2021

refactor the drainNode function to support draining multiple node

Signed-off-by: Andrew Hu [email protected]

What this PR does / why we need it:
In the current node drain experiment, when targetNodes is provided, the experiment splits the value on commas to get all the node names and checks the status of each node. During the node drain stage, however, it passes targetNodes to the node drain command as-is, without checking whether the string contains multiple comma-separated node names. Specifying targetNodes as a comma-separated list of node names therefore causes an error. This PR adds support for draining multiple nodes.
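
The per-node approach described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `drainEach` and its `drain` callback are hypothetical names, and the callback stands in for whatever actually runs the drain command for a single node.

```go
package main

import (
	"fmt"
	"strings"
)

// drainEach is a hypothetical helper: it splits a comma-separated node list
// and calls drain once per node, stopping at the first failure.
func drainEach(targetNodes string, drain func(node string) error) error {
	for _, node := range strings.Split(targetNodes, ",") {
		node = strings.TrimSpace(node)
		if node == "" {
			continue // ignore empty entries such as a trailing comma
		}
		if err := drain(node); err != nil {
			return fmt.Errorf("unable to drain the %s node: %w", node, err)
		}
	}
	return nil
}

func main() {
	// The callback only prints here; a real one would run the drain command.
	err := drainEach("node-1,node-2", func(node string) error {
		fmt.Println("draining", node)
		return nil
	})
	fmt.Println("err:", err)
}
```

Passing one node name per call avoids handing the whole comma-separated string to a command that expects a single node.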

Which issue this PR fixes: fixes #3343

Special notes for your reviewer:

Checklist:

  • Fixes #3343
  • PR messages has document related information
  • Labelled this PR & related issue with breaking-changes tag
  • PR messages has breaking changes related information
  • Labelled this PR & related issue with requires-upgrade tag
  • PR messages has upgrade related information
  • Commit has unit tests
  • Commit has integration tests
  • E2E run Required for the changes

refactor the drainNode function to support draining multiple node

Signed-off-by: Andrew Hu <[email protected]>

common.SetTargets(experimentsDetails.TargetNode, "injected", "node", chaosDetails)
log.Infof("Target nodes list: %v", targetNodes)
for _, targetNode := range targetNodes {
Member

Can we also handle the panic case when the length of the list is zero, and return an error message asking the user to provide the target node name?

Member

@uditgaurav uditgaurav Jan 7, 2022

When we run chaos on multiple nodes, we could perform the pre- and post-chaos checks for all the target nodes.

Member

We can also change experimentsDetails.TargetNode to experimentsDetails.TargetNodes in the structure and other places where it is used.

@calebxu-hcl calebxu-hcl force-pushed the support-multiple-node-drain branch from bfa3c9b to 05726d3 Compare January 17, 2022 19:20
@calebxu-hcl

Thanks for the feedback @uditgaurav, I think I've addressed all three items you mentioned above. Please advise if there are still any concerns.

@calebxu-hcl calebxu-hcl force-pushed the support-multiple-node-drain branch from 916934b to 96a94ec Compare January 19, 2022 13:38
@uditgaurav
Member

Thanks, @andrewhu-hcl, for being patient! Can we also add the logs from a successful experiment run against multiple nodes?

default:
log.Infof("[Inject]: Draining the %v node", experimentsDetails.TargetNode)
targetNodes := strings.Split(experimentsDetails.TargetNodes, ",")
if len(targetNodes) == 0 {
Member

Even for an empty string value in experimentsDetails.TargetNodes, len(targetNodes) will be equal to 1, because strings.Split returns a slice holding a single empty string. We can modify the check as follows:

Suggested change
- if len(targetNodes) == 0 {
+ if experimentsDetails.TargetNodes == "" {
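
To illustrate the point about `strings.Split`, here is a small self-contained sketch. `parseTargetNodes` is a hypothetical helper, not a function from this PR; it goes a little further than the suggested change by also dropping empty entries, which is an assumption about the desired behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTargetNodes is a hypothetical helper (not part of the PR): it splits a
// comma-separated node list, trims whitespace, and drops empty entries.
func parseTargetNodes(raw string) ([]string, error) {
	var nodes []string
	for _, name := range strings.Split(raw, ",") {
		if name = strings.TrimSpace(name); name != "" {
			nodes = append(nodes, name)
		}
	}
	if len(nodes) == 0 {
		return nil, fmt.Errorf("no target node provided, please set the target node name")
	}
	return nodes, nil
}

func main() {
	// Splitting an empty string yields one empty element, not zero elements.
	fmt.Println(len(strings.Split("", ","))) // prints 1

	if _, err := parseTargetNodes(""); err != nil {
		fmt.Println("error:", err)
	}
	nodes, _ := parseTargetNodes("node-1, node-2")
	fmt.Println(nodes) // prints [node-1 node-2]
}
```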

command.Stderr = &stderr
if err := command.Run(); err != nil {
log.Infof("Error String: %v", stderr.String())
return errors.Errorf("Unable to drain the %v node, err: %v", targetNode, err)
Member

We can include stderr in the returned error here, since err will only contain the exit code description (e.g. exit code 1).
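
A rough sketch of that idea, using only the standard library. `runCommand` is a hypothetical wrapper, and the `sh -c` invocation in `main` is just a stand-in for a failing command on a Unix-like system; neither comes from the PR.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// runCommand is a hypothetical wrapper: it runs the command and, on failure,
// returns an error carrying the captured stderr text next to the exit status.
func runCommand(name string, args ...string) error {
	var stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("command failed: %s (%w)", strings.TrimSpace(stderr.String()), err)
	}
	return nil
}

func main() {
	// Stand-in for a failing drain: writes to stderr and exits non-zero.
	err := runCommand("sh", "-c", "echo 'node not found' >&2; exit 1")
	fmt.Println(err)
}
```

With the stderr text in the error itself, the caller sees why the command failed instead of only its exit status.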

nodeSpec, err := clients.KubeClient.CoreV1().Nodes().Get(targetNode, v1.GetOptions{})
if err != nil {
if apierrors.IsNotFound(err) {
return nil
Member

Shall we add a log here to specify that the resource was not found?
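
One way this could look, sketched without the Kubernetes client so that it runs on its own. Everything here is hypothetical: `errNotFound` stands in for an API "not found" error (the check that `apierrors.IsNotFound` performs in the snippet above), and `get` stands in for fetching the node.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errNotFound is a stand-in for a Kubernetes "not found" API error.
var errNotFound = errors.New("node not found")

// uncordonIfPresent is hypothetical: get stands in for fetching the node.
// A missing node is logged and skipped instead of being treated as an error.
func uncordonIfPresent(node string, get func(string) error) error {
	if err := get(node); err != nil {
		if errors.Is(err, errNotFound) {
			log.Printf("[Info]: The %v node was not found, skipping uncordon", node)
			return nil
		}
		return err
	}
	return nil
}

func main() {
	err := uncordonIfPresent("node-1", func(string) error { return errNotFound })
	fmt.Println("err:", err)
}
```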

@@ -183,7 +203,11 @@ func uncordonNode(experimentsDetails *experimentTypes.ExperimentDetails, clients
Times(uint(experimentsDetails.Timeout / experimentsDetails.Delay)).
Wait(time.Duration(experimentsDetails.Delay) * time.Second).
Try(func(attempt uint) error {
-targetNodes := strings.Split(experimentsDetails.TargetNode, ",")
+targetNodes := strings.Split(experimentsDetails.TargetNodes, ",")
if len(targetNodes) == 0 {
Member

Same comment as above: this check will not catch an empty string, since len(targetNodes) will be 1 in that case.
