Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move typed queues to ResourceEvent over 'any' #3475

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

perdasilva
Copy link
Collaborator

@perdasilva perdasilva commented Dec 19, 2024

Description of the change:
This PR updates our usage of client-go work queues to be kubestate.ResourceEvent typed. Before the change, the queues would accept any comparable datatype, and we had different types flowing through the queues. For instance, for the namespace resolution queue, we had strings as well as ResourceEvents. Using the queue in this meant that we could (and had) the same resource being concurrently reconciled twice: for both the semantically identical string and ResourceEvent items. This in turn led to Subscriptions being put in a terminal UnSat state since the reconciliation that created the install plan (and subsequently the CSV) wasn't finished when the concurrent reconciliation for the same namespace resolved with a seemingly dissociated CSV (i.e. .status.CurrentCSV was empty, but the CSV had been stamped out).

Closes #3010

Motivation for the change:

Architectural changes:

Testing remarks:

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Bug fixes are accompanied by regression test(s)
  • e2e tests and flake fixes are accompanied evidence of flake testing, e.g. executing the test 100(0) times
  • tech debt/todo is accompanied by issue link(s) in comments in the surrounding code
  • Tests are comprehensible, e.g. Ginkgo DSL is being used appropriately
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky and have an issue
  • Code is properly formatted

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 19, 2024
@perdasilva perdasilva changed the title [WIP] Experimental: move Typed queues to ResourceEvent over any [WIP] Experimental: move typed queues to ResourceEvent over 'any' Dec 19, 2024
@perdasilva perdasilva force-pushed the queue-fix-2 branch 20 times, most recently from e1a5348 to 1c67076 Compare December 19, 2024 17:53
event, ok := item.(kubestate.ResourceEvent)
if !ok || event.Type() != kubestate.ResourceDeleted {
var event = item
if item.Type() != kubestate.ResourceDeleted {
// Get the key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this indirection on getting a key can probably be removed now that we know the item is a ResourceEvent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another nutty thing going on here (which I dared not touch). The queue items are in the form
EventResource{type, string} (with the exception of the delete event which carries the object), but the events that get sent to the Sync'ers are in the form EventResource{type, obj}.
So, initial intuition that the syncers worked over old/failed versions of the resources was wrong, because the current version gets picked up and injected here.

Or, did you already get that, and there's a better way to massage this that I'm not seeing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason why the delete event doesn't get the same treatment is: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/lib/queueinformer/queueinformer.go#L40.
We may want to revisit this in a follow-up. But, for now, I feel there is sufficient change in this PR that we can let out into the wild before further refactoring.

@perdasilva perdasilva force-pushed the queue-fix-2 branch 2 times, most recently from 065fe04 to 4133943 Compare December 20, 2024 08:59
@perdasilva perdasilva changed the title [WIP] Experimental: move typed queues to ResourceEvent over 'any' Move typed queues to ResourceEvent over 'any' Dec 20, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 20, 2024
Signed-off-by: Per Goncalves da Silva <[email protected]>
@perdasilva perdasilva force-pushed the queue-fix-2 branch 2 times, most recently from c19f185 to 36b3357 Compare January 6, 2025 13:26
Per Goncalves da Silva added 3 commits January 6, 2025 14:51
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
Per Goncalves da Silva added 2 commits January 7, 2025 15:50
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problems arising from deleting and rapidly creating a subscription
2 participants