Add parse_retrieved_files option for the PP plugin. #1029
The option enables switching on/off the parsing of the output files produced by the PP plugin.
@yakutovicha I tried testing it, but it doesn't work for me. I used:
from aiida.plugins import CalculationFactory
load_profile()  # noqa: E402
from aiida import orm
parameters = {
pp_parameters = orm.Dict(parameters)
code = orm.load_code('pp-7.2@daint-gpu')  # Replace for your computer
builder = code.get_builder()
builder.parameters = pp_parameters
builder.metadata.options = { }
from aiida.engine import run
How are you testing this PR?
@yakutovicha could you update the PR? Some tests are failing; maybe the update will fix them.
The fix for this submission code was to add a label to the metadata. I checked the code against the main branch and the PpCalculation runs, but with this PR it doesn't.
@yakutovicha what is the use case for not wanting to parse output files?
Sometimes we do not want to save cube files as they are too big.
Thanks @yakutovicha
> Sometimes we do not want to save cube files as they are too big.
But then what is the point of running the calculation? Are you still using the output files but directly on the scratch? Or is there still some other partial information that is parsed that is of use?
Regardless of the use case, I wonder if we could come up with a better name for the option. The current name seems to suggest that there is no parsing going on, but that is not the case. What the option disables is the parsing/storing of any output files besides the stdout, which is still parsed regardless of the value of this new input option.
In the code they are referred to as "output data files", so perhaps we can use something like:
spec.input('metadata.options.store_parsed_data_files', valid_type=bool, default=True, help='When set to `False`, only the stdout is retrieved and parsed. All other produced data files are not retrieved, parsed or stored.')
self.out('output_data', data_parsed[0][1])
else:
self.out('output_data_multiple', dict(data_parsed))
if self.node.base.attributes.get('parse_retrieved_files', True):
Instead of moving this entire block into a conditional, we can simply exit early. All the code starting from line 67 is only necessary to parse the additional output data. So I propose we simply add
if not self.node.base.attributes.get('parse_retrieved_files', True):
    return self.exit(logs=logs)
after line 65.
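The early-exit suggestion can be illustrated with a minimal, self-contained sketch. The function `parse_pp_outputs` and its arguments are hypothetical stand-ins, not the plugin's actual parser; only the option name `parse_retrieved_files` and the shape of the guard come from the comment above.

```python
from typing import Any


def parse_pp_outputs(options: dict[str, Any], stdout: str, data_files: dict[str, str]) -> dict[str, Any]:
    """Hypothetical stand-in for the parser's `parse` method (not the real plugin code)."""
    # The stdout is always parsed, regardless of the option.
    outputs: dict[str, Any] = {'output_parameters': {'num_stdout_lines': len(stdout.splitlines())}}

    # Early exit: everything below only concerns the additional data files.
    if not options.get('parse_retrieved_files', True):
        return outputs

    # Only reached when data-file parsing is enabled (the default).
    outputs['output_data'] = {name: len(content) for name, content in data_files.items()}
    return outputs
```

Compared with wrapping the data-file block in a conditional, the guard keeps the rest of the method at its original indentation, so the diff stays small.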
@@ -83,6 +83,7 @@ def define(cls, spec):
spec.input('metadata.options.parser_name', valid_type=str, default='quantumespresso.pp')
spec.input('metadata.options.withmpi', valid_type=bool, default=True)
spec.input('metadata.options.keep_plot_file', valid_type=bool, default=False)
spec.input('metadata.options.parse_retrieved_files', valid_type=bool, default=True)
Please add a docstring as to what this does
That depends on the specific needs of a user. We develop tools allowing them to download and look at low-res copies of the cube files to then decide how to render the original. Another alternative is to stash the file for further use. In the original comment I should have written "... to save cube files locally as they are too big".
I understand the name is confusing, but the option I am adding is really about parsing. Essentially, 2 options affect the parsing/storage of the mesh files: the existing
In the 4th case, the mesh files are not even downloaded. I guess we can call the option
Thanks @yakutovicha , that clarifies things a lot.
I think that would already help things a lot, since it makes it immediately clear that these options affect the same files. Unfortunately, the naming of the existing I think it makes sense to properly name the options if we are adding one and deprecating
Hi @sphuber, Thanks for looping me in. 4 or 5 years is a long time in If you guys want to rename, happy for you to go for it, but one needs to make clear the link with the PP.x docs and how PP works. Otherwise, the experienced QE user will just be frustrated by these new mystery AiiDA-QE options that are different to what they've been using for many years.
Very fair point @ConradJohnston . Mirroring the API/naming of the tool itself as closely as possible definitely has its benefits. But what I am mostly worried about is that we are not really talking about one file. There is a plot file and a data file in most cases, is there not? At least looking at the calcjob/parser implementation it expects (multiple) pairs of files of
To be 100% clear, do we agree on what behaviour we are looking for? From my perspective, it should be this:
If that is the wanted behaviour, I am fine with any naming convention - I have no strong opinion here.
Yes, that is how I currently understand it as well @yakutovicha and that makes sense to me. I would just suggest choosing a name that makes it somewhat clear that it concerns the "data" files and not the
@sphuber I renamed the parameters to The other two things we need to do before merging:
I will do them after we agree on the naming.
Thanks @yakutovicha I am on board with the latest naming. In terms of backwards compatibility, I think it would be good to keep the old option and simply print a deprecation warning if it is set (set the default to
I have to say that the tests need to be improved. Currently, the attributes are passed explicitly instead of being generated automatically. But this is for another PR, I guess.
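A rough sketch of that backwards-compatibility idea, using the option names from the final PR description (`keep_plot_file` deprecated in favor of `keep_data_files`). The helper `resolve_keep_data_files` is hypothetical, and the sketch assumes the old option is unset (`None`) unless the user passes it explicitly; neither detail is confirmed by the plugin code shown here.

```python
import warnings


def resolve_keep_data_files(options: dict) -> bool:
    """Hypothetical helper: map the deprecated `keep_plot_file` onto `keep_data_files`."""
    # Assumption: `None` means the deprecated option was not set by the user.
    old_value = options.get('keep_plot_file')
    if old_value is not None:
        warnings.warn(
            '`keep_plot_file` is deprecated, use `keep_data_files` instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: an explicitly set old option takes precedence.
        return bool(old_value)
    return bool(options.get('keep_data_files', False))
```

With this shape, existing submission scripts keep working and only emit a warning, while new scripts never touch the old name.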
LGTM!
Other than my previous comment, the PR is ready.
Thanks @yakutovicha Setting the options through the attributes directly when mocking the calcjob node is actually the right way to do it, so there is nothing to improve there 👍
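To illustrate why setting attributes directly works for a mocked node, here is a minimal sketch that does not depend on AiiDA. The helper `make_mock_node` is hypothetical; it only mimics the `node.base.attributes.get(key, default)` access pattern seen in the parser diff, with a plain `dict` standing in for the attribute store.

```python
from types import SimpleNamespace


def make_mock_node(attributes: dict) -> SimpleNamespace:
    """Hypothetical stand-in exposing `node.base.attributes.get(key, default)`."""
    # A plain dict already provides `.get(key, default)`.
    return SimpleNamespace(base=SimpleNamespace(attributes=dict(attributes)))


def should_parse_data_files(node) -> bool:
    """Read the option the same way the parser does, defaulting to `True`."""
    return node.base.attributes.get('parse_data_files', True)


# Usage: the test decides the option value explicitly for each case.
node_default = make_mock_node({})
node_disabled = make_mock_node({'parse_data_files': False})
```

Because the parser only reads the option from the node attributes, a test can cover each branch by choosing the attribute values explicitly.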
The `parse_data_files` option is added. When switched to `False`, the parser will not parse the output files but just keep the raw files. The existing option `keep_plot_file` is deprecated in favor of the renamed `keep_data_files` option to make it coherent with the new option.
fixes #945
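One way to picture how the two options interact is the small sketch below. The function `plan_data_file_handling` and the returned keys are hypothetical, and the mapping is an assumption pieced together from the discussion (for example, that with both options off the data files are not even downloaded); it is not taken from the plugin code.

```python
def plan_data_file_handling(keep_data_files: bool, parse_data_files: bool) -> dict[str, bool]:
    """Hypothetical summary of how the two options could combine (an assumption, not plugin code)."""
    return {
        # The files need to be fetched from the remote if they are kept or parsed.
        'retrieved': keep_data_files or parse_data_files,
        # Raw files remain in the stored outputs only when explicitly kept.
        'stored': keep_data_files,
        # Parsed data outputs are produced only when parsing is enabled.
        'parsed': parse_data_files,
    }


# Usage: enumerate the four combinations.
for keep in (True, False):
    for parse in (True, False):
        print(keep, parse, plan_data_file_handling(keep, parse))
```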