Hardcoded floating-point answers should be verified more exactly #403

Open
mpsijm opened this issue Nov 8, 2024 · 3 comments

Comments

@mpsijm (Collaborator) commented Nov 8, 2024

For the problem Fractal Area of the BAPC preliminaries 2024, the version used on the 21st of September showed sample answers that were slightly inexact: they differed from the true values in the last two digits. The reason was that the input precision was reduced from 9 to 6 digits, but the sample answer hardcoded in generators.yaml was not updated. However, all jury submissions still passed, because the error was smaller than $10^{-7}$.

To prevent this, BAPCtools could perform a more exact match for hardcoded answers. Specifically, the hardcoded answer should only be allowed to differ in the last digit, and only by one. For Fractal Area, the samples show 9 digits after the decimal point, so the maximum allowed absolute error for the canonical jury solution on these hardcoded samples should be $10^{-9}$ rather than $10^{-6}$.

To avoid reinventing the wheel (the wheel here being default_output_validator.cpp), perhaps we could modify the arguments passed to this validator when running the canonical jury solution on a hardcoded .ans file (or any .ans file, to simplify things). The question then becomes how to detect the value of the last significant digit in this answer (in the example above, this "value" is trivially $10^{-9}$ by counting digits, but we should also handle scientific notation correctly).
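A minimal sketch of the digit-detection part (in Python, not actual BAPCtools code; the helper name is made up). Python's decimal module keeps the exponent of a parsed literal around, which handles scientific notation for free:

```python
from decimal import Decimal

def last_digit_magnitude(token: str) -> Decimal:
    """Place value of the last significant digit of a decimal token.
    E.g. "0.666666666" -> 1E-9, "6.66666666e-1" -> 1E-9, "42" -> 1."""
    d = Decimal(token)
    # Decimal stores coefficient and exponent separately, so the exponent
    # is exactly the place value of the last stored digit, even when the
    # token is written in scientific notation.
    return Decimal(1).scaleb(d.as_tuple().exponent)
```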

@mzuenni (Collaborator) commented Nov 9, 2024

I don't understand the first paragraph. Why did decreasing the precision make the hardcoded value wrong/require it to be updated? Or was it just wrong from the beginning, and did the change in precision let this go unnoticed?

My second question is: why was it not generated in the first place? Or, to be more precise: for problems with fixed output (i.e. those using default_output_validator.cpp), is there ever a need to use hardcoded .ans files?
(If not, we could simply discourage using hardcoded .ans files.)

@RagnarGrootKoerkamp (Owner) commented Nov 9, 2024

So, before, the input was, say, 0.333_333_333 (underscores for clarity), and the answer was hardcoded to 0.666_666_666 (because we like showing exactly that number, rather than one with more or fewer decimals).
Then, the input was changed to 0.333_333, but the answer was left untouched.
Submissions still pass because the difference between 0.666_666 and 0.666_666_666 is small enough, but it's weird that the given answer is not 'fully' correct in all its digits.
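A toy check of this, using the illustrative numbers above and assuming the usual $10^{-6}$ default absolute tolerance (this is not the actual validator logic):

```python
expected = 0.666_666        # correct answer after the input change
hardcoded = 0.666_666_666   # stale hardcoded sample answer
# The difference is ~6.7e-7: within a 1e-6 tolerance, so submissions pass...
assert abs(hardcoded - expected) <= 1e-6
# ...but the stale answer would fail a digit-exact 1e-9 bound.
assert not abs(hardcoded - expected) <= 1e-9
```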

So indeed, the fix we applied was to write a generator that prints exactly the number of digits we want to show in the samples.
We could decide that, from now on, that's what you're supposed to do, and warn when ans: is given for default-output-validator problems.
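For instance, the output step of such a generator could look like this (a sketch; the variable name is hypothetical):

```python
answer = 2 / 3  # hypothetical computed answer
# Print exactly the 9 decimals we want to show in the samples.
print(f"{answer:.9f}")  # -> 0.666666667
```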

Or, we could first 'ignore' the given ans: files, run the solution to generate a .ans, and then check that the hardcoded ans: is sufficiently precise: it should have at least the required precision, and all its digits should be correct (with rounding or truncation allowed for the last one).
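A minimal sketch of that second option (a hypothetical helper, not BAPCtools code): accept the hardcoded answer only if it is within one unit in its last printed digit of the freshly generated one.

```python
from decimal import Decimal

def hardcoded_ans_ok(hardcoded: str, generated: str) -> bool:
    """Hypothetical check: the hardcoded answer may differ from the
    generated one by at most one unit in its last printed digit
    (i.e. rounding or truncation of the last digit is allowed)."""
    h = Decimal(hardcoded)
    ulp = Decimal(1).scaleb(h.as_tuple().exponent)  # value of the last digit
    return abs(h - Decimal(generated)) <= ulp

# e.g. hardcoded_ans_ok("0.666666666", "0.6666666664") -> True
# but  hardcoded_ans_ok("0.666666666", "0.666666")     -> False
```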

@mzuenni (Collaborator) commented Nov 9, 2024

Aahhh, the input precision was changed, not the output precision... now I understand the problem.
