Hello. First of all, thank you for your work and for sharing it!
I am implementing the evaluation process of OWOD (defined in `PascalVOCDetectionEvaluator.evaluate`) and have run into several problems while trying to apply the benchmark correctly to my own work:

1. For every task and every metric (even WI), I assume you always evaluate against `all_task_test.txt`. Is that right?
2. For the Wilderness Impact, WI values are reported at different recall levels. Which one do you choose?
3. Are instances marked as "difficult" used for the metrics?
4. Does the reported number of test instances for T1 in the image below from your paper refer only to the KNOWN classes, or does it account for both KNOWN and UNKNOWN objects?
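For context on question 2, here is a minimal sketch of how I currently compute WI, assuming the definition WI = P_K / P_{K∪U} − 1 evaluated at a single recall level (the function names and the default level R = 0.8 are my assumptions, not taken from your code):

```python
def precision_at_recall(precisions, recalls, level):
    """Return the precision at the first point where recall reaches `level`.

    `precisions` and `recalls` are parallel sequences from a PR curve,
    with `recalls` non-decreasing. Returns 0.0 if `level` is never reached.
    """
    for p, r in zip(precisions, recalls):
        if r >= level:
            return p
    return 0.0


def wilderness_impact(prec_known, rec_known, prec_open, rec_open, level=0.8):
    """Sketch of WI = P_K / P_{K+U} - 1 at a given recall level.

    `prec_known`/`rec_known`: PR curve when evaluated on known classes only.
    `prec_open`/`rec_open`:   PR curve when unknown-class instances are
                              also present (open-world evaluation).
    """
    p_k = precision_at_recall(prec_known, rec_known, level)
    p_ku = precision_at_recall(prec_open, rec_open, level)
    return p_k / p_ku - 1.0


# Toy example: precision drops from 0.8 to 0.6 at recall 0.8 when
# unknowns are added, giving WI = 0.8 / 0.6 - 1 = 1/3.
wi = wilderness_impact(
    prec_known=[0.9, 0.8, 0.7], rec_known=[0.5, 0.8, 1.0],
    prec_open=[0.8, 0.6, 0.5], rec_open=[0.5, 0.8, 1.0],
)
```

Is this roughly what `PascalVOCDetectionEvaluator.evaluate` does, and if so, which recall level corresponds to the numbers in the paper?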