
The resource spec contents of pp and rb are not synchronized in time. #5996

Open
CharlesQQ opened this issue Dec 30, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Milestone
v1.13

Comments

@CharlesQQ
Member

CharlesQQ commented Dec 30, 2024

What happened:

  1. During a controller restart, the .spec.suspension fields of the RB and PP are different.

https://www.processon.com/v/6772523638625f07c0ec6273
Karmada grayscale release process

During the restart, the resource detector controller processes the currently released workload later than the binding controller, so the pause (suspension) setting becomes invalid. Our custom controller extension (for grayscale release) does not update the op resource, and the number of new pods exceeds the partition.

  2. After setting .spec.preserveResourcesOnDeletion=true on the PP and then deleting it immediately, the resources in the member clusters may still be cascade-deleted.

What you expected to happen:

The relevant fields in the PP, RB, and Work should be consistent. If the fields in the RB and Work are inconsistent with the PP, then subsequent operations on the RB and Work should not be performed until the relevant fields have been synchronized.

How to reproduce it (as minimally and precisely as possible):

  1. Have lots of resource templates and PPs, restart the controller, set .spec.suspension.dispatching=true, and check the RB field .spec.suspension.dispatching.
  2. Set the PP field .spec.preserveResourcesOnDeletion=true, delete it immediately, and check whether the resource has been cascade-deleted in the member cluster.

Anything else we need to know?:

Environment:

  • Karmada version:
    v1.12
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version):
  • Others:
@CharlesQQ CharlesQQ added the kind/bug Categorizes issue or PR as related to a bug. label Dec 30, 2024
@CharlesQQ
Member Author

I have a solution, for reference only:

When the PP changes, the hash value of the current PP spec is calculated and added as an annotation by a webhook; when the resource detector updates the RB, the hash value is synchronized as well. Before reconciling, the binding controller first checks whether the hash values of the RB and PP are consistent; if they are not the same, it returns and waits. The path from RB to Work is similar.

This solves the problem of the gap between the PP and RB fields.

Reference for the hash calculation method, from OpenKruise:
https://github.com/openkruise/kruise/blob/e3e6d471a75737606e8cfe5338ad92bdddc72699/pkg/webhook/sidecarset/mutating/sidecarset_create_update_handler.go#L48-L66
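Below is a minimal sketch of the idea, for reference only. It is not Karmada's actual code: the annotation key policy.karmada.io/spec-hash and the function names are hypothetical, and only the Go standard library is used. The webhook would stamp the PP's spec hash, the resource detector would copy it onto the RB, and the binding controller would requeue until the two match:

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// specHashAnnotation is a hypothetical annotation key, used here for illustration only.
const specHashAnnotation = "policy.karmada.io/spec-hash"

// hashSpec returns a stable hash of an arbitrary spec, similar in spirit to
// the OpenKruise SidecarSet hash linked above.
func hashSpec(spec interface{}) string {
	data, _ := json.Marshal(spec)
	h := fnv.New32a()
	h.Write(data)
	return fmt.Sprintf("%x", h.Sum32())
}

// hashesInSync reports whether the RB already carries the hash the webhook
// stamped on the PP; if not, the binding controller should return and wait
// for the resource detector to finish propagating the PP change.
func hashesInSync(ppAnn, rbAnn map[string]string) bool {
	want, ok := ppAnn[specHashAnnotation]
	return ok && rbAnn[specHashAnnotation] == want
}

func main() {
	ppSpec := map[string]interface{}{"suspension": map[string]bool{"dispatching": true}}
	ppAnn := map[string]string{specHashAnnotation: hashSpec(ppSpec)} // set by the webhook
	rbAnn := map[string]string{}                                     // detector has not synced yet

	fmt.Println("in sync:", hashesInSync(ppAnn, rbAnn)) // false -> requeue and wait
}
```

The same check could be applied on the RB-to-Work path before the work is rewritten.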

@RainbowMango RainbowMango added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Dec 31, 2024
@RainbowMango RainbowMango added this to the v1.13 milestone Jan 3, 2025
Projects
Status: Accepted
Development

No branches or pull requests

2 participants