Filter by namespace intermittently includes all namespaces #388

Open
abean-work opened this issue Nov 29, 2024 · 0 comments
Description:
When Popeye is run against a single namespace with the -n flag, the scan intermittently (around half the time) fails because it reports problems from other namespaces.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a pod that will cause Popeye to fail: kubectl run fail-pod --image=nonexistent/nonexistentimage:latest -n test
  2. Scan a different namespace that is healthy: popeye -n healthy -l error -f ./spinach.yml
  3. Repeat the scan until it fails.
  • Most scans will return healthy with no issues, e.g.:
    PODS (5 SCANNED)                                                             💥 0 😱 0 🔊 0 ✅ 5 100٪
    
    ┅┅┅┅┅┅┅
    · Nothing to report.
    
  • Occasionally, it will fail due to the pod in the other namespace (notice a lot more pods are included in the scan):
    PODS (30 SCANNED)                                                            💥 1 😱 0 🔊 0 ✅ 29 96٪
    ┅┅┅┅┅┅┅
    · test/fail-pod...............................................................................💥
      💥 [POP-207] Pod is in an unhappy phase (Pending).
      🐳 fail-pod
        💥 [POP-203] Pod is waiting [0/1] ImagePullBackOff.
    

Using the following (crude) command, I was able to reproduce the error easily:

> repeat 20 { popeye -n healthy -l error -f ./spinach.yml > /dev/null 2>&1; echo $?}
1
0
0
1
0
0
1
0
0
1
1
0
0
0
0
1
1
1
0
1

The exit codes show that, in this instance, 9 out of 20 scans failed due to including resources from other namespaces. When repeating this command, the number of failures has always been between 8 and 12, so roughly half the time it fails.
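The `repeat` builtin used above is specific to zsh. For anyone reproducing this from another shell, here is a minimal POSIX sh sketch that does the same tally. The function name `count_failures` is a hypothetical helper and not part of Popeye.

```shell
# Run a command N times and print how many of the runs exited non-zero.
# Usage: count_failures N command [args...]
# (count_failures is a hypothetical helper name, not a Popeye feature.)
count_failures() (
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard all output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
)
```

It would be called as `count_failures 20 popeye -n healthy -l error -f ./spinach.yml`, which prints a single number instead of one exit code per run.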

Expected behavior

  1. The namespace flag should restrict the Popeye scan to that namespace.
  2. Scans should be consistent in the resources they include.
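The first expectation can be checked mechanically from the report text. The sketch below assumes the report is plain text in the shape shown in the sample output earlier in this issue (resource lines of the form `· namespace/name.....`, with any color codes stripped). It prints every reported resource whose namespace differs from the one that was requested. The function name `foreign_fqns` is hypothetical.

```shell
# Read a Popeye report on stdin and print each reported resource whose
# namespace is not the one given as $1.
# Assumes resource lines look like "· <namespace>/<name>......<marker>",
# as in the sample output in this issue; other report formats are not handled.
foreign_fqns() {
  awk -v ns="$1" '
    $1 == "·" && index($2, "/") > 0 {
      fqn = $2
      sub(/\.\.\..*$/, "", fqn)   # drop the dot leader and trailing marker
      split(fqn, parts, "/")
      if (parts[1] != ns) print fqn
    }'
}
```

For example, `popeye -n healthy -l error -f ./spinach.yml | foreign_fqns healthy` should print nothing when the namespace filter is honored, and would print `test/fail-pod` for the failing scan shown above.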

Versions (please complete the following information):

  • OS: macOS 14.7 and Ubuntu 22.04
  • Popeye: 0.21.5
  • K8s: 1.29.8

Additional context

Our team owns/manages a number of namespaces on shared Kubernetes (AKS) clusters, which we are scanning individually using the -n flag and then aggregating the JUnit output.

These namespaces are looped through, so the scans happen immediately after one another. I've tried adding sleeps between scans, but this didn't help.
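For context, the loop is roughly equivalent to the following sketch. It is a simplified illustration and not our exact script: the function name `scan_namespaces`, the `POPEYE` override variable, and the output layout are assumptions made for this example, and it assumes `-o junit` prints the JUnit report to stdout.

```shell
# Scan each namespace in turn and write one JUnit file per namespace.
# Usage: scan_namespaces OUT_DIR namespace [namespace...]
# POPEYE may be set to swap out the binary (an assumption for illustration).
scan_namespaces() (
  out_dir=$1
  shift
  status=0
  mkdir -p "$out_dir"
  for ns in "$@"; do
    # Capture the JUnit report from stdout; a non-zero exit marks a failed scan.
    if ! "${POPEYE:-popeye}" -n "$ns" -l error -f ./spinach.yml -o junit \
        >"$out_dir/$ns.xml"; then
      echo "scan failed for namespace: $ns" >&2
      status=1
    fi
  done
  exit "$status"
)
```

Each scan is a separate Popeye process that runs only after the previous one has exited, so the scans do not overlap.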

This could be related to #314, but I've created a new issue as it does work some of the time.

Spinach config:
---
# Popeye configuration using the AKS sample as a base.
# See: https://github.com/derailed/popeye/blob/master/spinach/spinach_aks.yml
popeye:
  allocations:
    cpu:
      # Checks if cpu is under allocated by more than x% at current load.
      underPercUtilization: 200
      # Checks if cpu is over allocated by more than x% at current load.
      overPercUtilization: 50
    memory:
      # Checks if mem is under allocated by more than x% at current load.
      underPercUtilization: 200
      # Checks if mem is over allocated by more than x% at current load.
      overPercUtilization: 50

  # Excludes define rules to exempt resources from sanitization
  excludes:
    global:
      fqns:
        # Exclude kube-system namespace
        - rx:^kube-system/

    linters:
      # Exclude system CRBs
      clusterrolebindings:
        instances:
          - fqns:
              - rx:^aks
              - rx:^omsagent
              - rx:^system

      # Exclude system CRs
      clusterroles:
        instances:
          - fqns:
              - rx:^system
              - admin
              - cluster-admin
              - edit
              - omsagent-reader
              - view
            codes: [400]

      # Exclude unused windows daemonset
      daemonsets:
        instances:
          - fqns: [calico-system/calico-windows-upgrade]
            codes: [508]

      # Exclude due to intermittent false positives
      serviceaccounts:
        codes: ["305"]

  resources:
    # Nodes specific sanitization
    node:
      limits:
        cpu: 90
        memory: 80

    # Pods specific sanitization
    pod:
      limits:
        # Fail if cpu is over x%
        # Set intentionally high to ignore (if you comment it out, it'll default to 80)
        cpu: 250
        # Fail if pod mem is over x%
        # Set intentionally high to ignore (if you comment it out, it'll default to 90)
        memory: 900
      # Fail if more than x restarts on any pods
      restarts: 3