The Grafana dashboards defined in grafana-dashboardDefinitions.yaml include graphs for memory consumption per pod. The memory consumption query currently used is:
When a pod is restarted, the current query adds memory usage data from both the old and new containers simultaneously. This can lead to temporary spikes in the displayed memory consumption. As a result, the dashboard may show memory usage that exceeds the container's memory limit, even though the actual memory consumption is within the limit.
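To make the double counting concrete, here is a minimal sketch in Python rather than PromQL. It assumes two series for the same container that differ only in their `id` label, right after a restart. The sample values, the `id` strings, the 512 MiB limit, and the `sum_by` helper are all hypothetical and only mimic what a `sum by (...)` aggregation does.

```python
from collections import defaultdict

MIB = 1024**2
MEMORY_LIMIT = 512 * MIB  # assumed container memory limit

# Hypothetical samples at one scrape timestamp, shortly after a restart.
# Each tuple is (container, id, working_set_bytes); the ids are made up.
samples = [
    ("app", "old-instance", 500 * MIB),  # killed container, series still present
    ("app", "new-instance", 120 * MIB),  # freshly started container
]


def sum_by(samples, *labels):
    """Sum values grouped by the given labels, like PromQL's `sum by (...)`."""
    index = {"container": 0, "id": 1}
    totals = defaultdict(int)
    for sample in samples:
        key = tuple(sample[index[label]] for label in labels)
        totals[key] += sample[2]
    return dict(totals)


by_container = sum_by(samples, "container")
by_container_and_id = sum_by(samples, "container", "id")

# Grouping by container alone merges both instances: 620 MiB, above the limit.
print(by_container[("app",)] > MEMORY_LIMIT)  # prints True
# Grouping by container and id keeps each instance within the limit.
print(all(v <= MEMORY_LIMIT for v in by_container_and_id.values()))  # prints True
```

Under these assumptions, neither instance exceeds the limit on its own; only the merged value does, which matches the spike seen on the dashboard.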
Steps to Reproduce:
Trigger a pod restart (e.g., an OOM kill or an eviction).
Compare a graph whose expression groups by the `container` label alone with a graph whose expression groups by both `container` and `id`.
Hi! I also ran into this, but I suspect the problem may not be in the dashboard but in the metric itself or in the scraping configuration, no? 🤔 For me, metrics from the previous container run persist for about 4 minutes 30 seconds after the new run starts, and only then disappear.
As seen, the result also includes memory from a container that is no longer running.
Could it be a misconfiguration in the metrics scraper?
The problem is that, for a dashboard showing a single pod, displaying the different container instances separately is viable, but what about a dashboard showing the total memory usage in the cluster? If you still sum the different instances of the same container, you will display a wrong total. 🤔
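One possible way around the cluster-wide problem, sketched in Python under stated assumptions: collapse all instances of the same (pod, container) pair into a single value before summing. The sample data and both helper functions are hypothetical. In PromQL the same shape would be roughly `sum(max by (pod, container) (...))`, though whether that fits the dashboards in question has not been verified here.

```python
from collections import defaultdict

MIB = 1024**2

# Hypothetical samples: (pod, container, id, working_set_bytes).
samples = [
    ("pod-a", "app", "old-instance", 500 * MIB),  # terminated, series not yet gone
    ("pod-a", "app", "new-instance", 120 * MIB),
    ("pod-b", "app", "only-instance", 200 * MIB),
]


def naive_total(samples):
    """Sum every series, so terminated instances are counted too."""
    return sum(value for _, _, _, value in samples)


def deduplicated_total(samples):
    """Collapse instances of the same (pod, container) with max, then sum."""
    per_container = defaultdict(int)
    for pod, container, _id, value in samples:
        key = (pod, container)
        per_container[key] = max(per_container[key], value)
    return sum(per_container.values())


print(naive_total(samples) // MIB)         # prints 820
print(deduplicated_total(samples) // MIB)  # prints 700
```

Taking the maximum still reports the old instance's value while its series lingers, so it is an upper bound rather than the exact current usage, but each container contributes at most one instance to the total.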
The memory consumption query referenced in the issue description is defined at:
https://github.com/prometheus-operator/kube-prometheus/blob/main/manifests/grafana-dashboardDefinitions.yaml#L8300