
Implement metrics comparison table & plotting #1707

Open
LinasKo opened this issue Dec 3, 2024 · 7 comments
Labels: enhancement (New feature or request)

Comments

LinasKo (Contributor) commented Dec 3, 2024

The metrics system allows users to compute a metric result: a class holding the values of a specific metrics run.

When comparing multiple models, a natural next step is to aggregate the results into a single table and/or plot them on a single chart.

Let's make this step easy!

I propose two new functions:

from __future__ import annotations

from abc import ABC, abstractmethod
from typing import List

import pandas as pd


def aggregate_metric_results(
    metrics_results: List[MetricResult], *, include_object_sizes: bool = False
) -> pd.DataFrame:
    """
    Raises when different types of metric results are passed in.
    """


def plot_aggregate_metric_results(
    metrics_results: List[MetricResult], *, include_object_sizes: bool = False
) -> None:
    ...


class MetricResult(ABC):
    @abstractmethod
    def to_pandas(self) -> pd.DataFrame:
        raise NotImplementedError()

    def plot(self) -> None:
        raise NotImplementedError()
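
For illustration, here is a rough sketch of the intended usage. The two-model setup, the variable names, and the calls to supervision's MeanAveragePrecision metric are assumptions for this example; the two aggregate functions are the proposal above, not an existing API.

from supervision.metrics import MeanAveragePrecision

# Hypothetical: two models already evaluated with the metrics system.
# predictions_a, predictions_b, and targets are placeholder Detections lists.
result_a = MeanAveragePrecision().update(predictions_a, targets).compute()
result_b = MeanAveragePrecision().update(predictions_b, targets).compute()

# Proposed: collapse both results into a single comparison table.
comparison_df = aggregate_metric_results([result_a, result_b], include_object_sizes=True)
print(comparison_df)

# Proposed: draw both results on a single chart.
plot_aggregate_metric_results([result_a, result_b], include_object_sizes=True)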

Several questions we need to address:

  1. Would it be better for plot_aggregate_metric_results to take in a pd.DataFrame? That way, the user could apply their own sorting or preprocessing, but we would no longer be guaranteed to have the fields we need.
  2. Could the naming be improved?

Suggested by @David-rn, discussion

LinasKo (Contributor, Author) commented Dec 3, 2024

Note: Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. You may use the Starter Template. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏

LinasKo added the enhancement (New feature or request) label Dec 3, 2024
David-rn (Contributor) commented Dec 4, 2024

Regarding the first question, and considering the guide that led to this discussion, I think using a List[MetricResult] as input, as first suggested, is probably the better option, since the user will already have worked with supervision classes (Detections, MetricResult).

If you think it's a good idea, I can give it a try in the next few days and share a Colab to evaluate whether it is feasible or whether there is more to consider.

LinasKo (Contributor, Author) commented Dec 4, 2024

That'd be most helpful, @David-rn! I'm assigning it to you.

David-rn (Contributor) commented Dec 8, 2024

I have written both functions, and I came across a few things:

  1. The function aggregate_metric_results is straightforward, since the metrics return a DataFrame, which can be aggregated. If you need to check the implementation, it can be seen here. We may still need to decide on exceptions, naming, etc. (A rough sketch of the idea follows below.)
  2. The second function, plot_aggregate_metric_results, is not as easy or clean to implement. Unlike the to_pandas method, which returns an object, the plot function shows the actual plot. For this reason, I have added a parameter to this function (return_params=False) to get the values needed to plot the results of several models on the same chart. It can be seen here. I have only added the parameter to the MeanAveragePrecision class to showcase the situation and check with you what the best decision would be.

A Colab Notebook is shared here.
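
For anyone following along, here is a minimal sketch of the aggregation idea from point 1 above. This is not the implementation from the linked code; the exception message and the size-column filtering are assumed details.

from typing import List

import pandas as pd


def aggregate_metric_results(
    metrics_results: List["MetricResult"], *, include_object_sizes: bool = False
) -> pd.DataFrame:
    # Refuse to mix different metric types (e.g. mAP with precision).
    if len({type(result) for result in metrics_results}) > 1:
        raise ValueError("All results must come from the same metric type.")

    # Each result already knows how to render itself as a DataFrame.
    frames = [result.to_pandas() for result in metrics_results]
    df = pd.concat(frames, ignore_index=True)

    if not include_object_sizes:
        # Assumed detail: drop the per-object-size breakdown columns if present.
        size_columns = [c for c in df.columns if any(s in c for s in ("small", "medium", "large"))]
        df = df.drop(columns=size_columns)

    return df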

LinasKo (Contributor, Author) commented Dec 9, 2024

Hey David,

  1. The one quick edit I'd make is moving the new method to utils.
  2. I was imagining an aggregate method with more if-else cases, but your approach seems better. How about we add a result._get_plot_details() method to each metric result? The leading _ will hide it from the docs, even if a docstring is added. Then both plot() and plot_aggregate_metric_results can use it (yes, I know it's slightly idiosyncratic for an external function to access a private method); see the sketch below.
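
To make the suggestion concrete, here is a rough sketch of the _get_plot_details() pattern. The PlotDetails container, its fields, the simplified stand-in result class, and the bar-chart rendering are assumptions for illustration, not an agreed interface.

from dataclasses import dataclass
from typing import List

import matplotlib.pyplot as plt


@dataclass
class PlotDetails:
    # Assumed container; the real fields would depend on the metric.
    labels: List[str]
    values: List[float]
    title: str


class MeanAveragePrecisionResult:
    # Simplified stand-in for the real result class, for illustration only.
    def __init__(self, map50_95: float, map50: float, map75: float) -> None:
        self.map50_95 = map50_95
        self.map50 = map50
        self.map75 = map75

    def _get_plot_details(self) -> PlotDetails:
        # The leading underscore keeps it out of the docs; both plot() and
        # plot_aggregate_metric_results() can reuse it.
        return PlotDetails(
            labels=["mAP 50:95", "mAP 50", "mAP 75"],
            values=[self.map50_95, self.map50, self.map75],
            title="Mean Average Precision",
        )

    def plot(self) -> None:
        details = self._get_plot_details()
        plt.bar(details.labels, details.values)
        plt.title(details.title)
        plt.show()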

Preview for any other reviewers stumbling upon this.

It's looking great so far!

[Image: preview of the aggregated metrics comparison plot]

David-rn (Contributor) commented Dec 9, 2024

Hi @LinasKo, thanks for the comments!

  1. Just to make sure: do you mean moving the file with both methods into utils, or adding both methods to the utils.py file inside the utils module? I'm guessing the former.
  2. Perfect! I'll add the method _get_plot_details() into the new MetricResult class.

LinasKo (Contributor, Author) commented Dec 9, 2024

  1. Either option seems fine. A separate file is slightly tidier, I reckon.
