
Search cache aggregation does not distinguish failed harness results from normal subtests #4159

Open
DanielRyanSmith opened this issue Dec 10, 2024 · 0 comments

Comments

@DanielRyanSmith
Contributor

Rarely, a test has a non-OK harness status but still passes some of its subtests. When results are aggregated, this partial failure is not displayed correctly.

  • NOTE: The harness status is NEVER counted toward the score on the Interop Dashboard, which is why the discrepancy arises. The searchcache that populates the results page stores test run data in a form that lets subsets of it be referenced and aggregated quickly. See the results analysis script for information on interop score aggregation.

  • When the searchcache aggregates subtest scores, it filters out harness status "OK" results, which are stored as subtest results themselves. The "OK" status is unique to the harness status, so these results can be filtered out and not counted toward the test score. Non-passing harness results share their statuses with ordinary subtest results, e.g. "TIMEOUT", "CRASH", etc., so the searchcache aggregation cannot tell which result is the harness status.
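
The filtering described above can be illustrated with a minimal Python sketch. This is not the actual searchcache code: the function name, the flat list of statuses, and the choice of "FAIL" for the one non-passing subtest are all assumptions made for illustration.

```python
def aggregate_ok_filter(statuses):
    """Return (passes, total) after dropping only "OK" entries.

    Hypothetical model of the behavior described in this issue: the harness
    status is stored alongside the subtest statuses, and only "OK" is
    recognized as a harness status.
    """
    counted = [s for s in statuses if s != "OK"]
    passes = sum(1 for s in counted if s == "PASS")
    return passes, len(counted)


# Assumed data: one harness status followed by nine subtest statuses.
healthy = ["OK"] + ["PASS"] * 8 + ["FAIL"]
timed_out = ["TIMEOUT"] + ["PASS"] * 8 + ["FAIL"]

print(aggregate_ok_filter(healthy))    # harness "OK" is dropped
print(aggregate_ok_filter(timed_out))  # harness "TIMEOUT" is counted as a subtest
```

Under these assumptions the first call yields 8 of 9 and the second yields 8 of 10, because the "TIMEOUT" harness status is indistinguishable from a subtest that timed out.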

Example:
Screenshot 2024-12-07 12 11 11 PM

This test has a harness result TIMEOUT for the browser in the right-most column. The single test view shows "8/9" subtests passing, which correctly excludes the harness result TIMEOUT. The error occurs in any view above the single test view.

Screenshot 2024-12-07 12 34 11 PM
Here, the searchcache did not differentiate the harness result from the other subtest results, so the ratio displays 8/10 instead of 8/9. This scenario is rare, but it has come up before.

The best fix is likely to mark the harness result explicitly in the stored data that the search cache aggregates, rather than relying only on filtering out "OK" statuses.
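
One way to picture this idea is a rough Python sketch in which each stored result carries an explicit harness marker. The `Result` type and its `is_harness` field are hypothetical and do not correspond to any existing field in the stored data; the "FAIL" subtest is likewise assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    status: str
    is_harness: bool = False  # hypothetical marker, not an existing field


def aggregate_with_marker(results):
    """Return (passes, total), skipping harness results whatever their status."""
    subtests = [r for r in results if not r.is_harness]
    passes = sum(1 for r in subtests if r.status == "PASS")
    return passes, len(subtests)


# Assumed data: a TIMEOUT harness result plus nine subtests, eight passing.
results = [Result("TIMEOUT", is_harness=True)] + [Result("PASS")] * 8 + [Result("FAIL")]
print(aggregate_with_marker(results))
```

With the marker in place, the harness result is excluded regardless of its status, so the aggregated ratio in this example would match the 8/9 shown in the single test view.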
