Add derivation of surface concentrations for additional trace gases (CH4 and N2O) #2611

Open
jlenh opened this issue Dec 9, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@jlenh

jlenh commented Dec 9, 2024

The variables ch4 and n2o are 3D variables that provide trace gas concentrations on pressure levels. The corresponding surface quantities are often useful for comparison with surface measurements (see the CMORizer and diagnostics in development in the ESMValTool repository, Issue 3838 and Issue 3839). In the spirit of what was done for co2s (#587), a dedicated preprocessor could be added to create the corresponding surface variables.

Is there a way to make this surface derivation method somewhat standard or potentially used as a standalone preprocessor? As it currently stands, a dedicated derivation file would be needed in the esmvalcore/preprocessor/_derive folder for each of the variables ch4s and n2os.

@jlenh jlenh added the enhancement New feature or request label Dec 9, 2024
@bouweandela
Member

Is there a way to make this surface derivation method somewhat standard or potentially used as a standalone preprocessor?

Yes, you can take that bit of code and put it into a new preprocessor function. That would avoid the need for these derived variables altogether. It might be nice to also make it work with ocean variables, e.g. so people could use it to extract the sea floor.

@jlenh
Author

jlenh commented Dec 11, 2024

Yes, you can take that bit of code and put it into a new preprocessor function. That would avoid the need for these derived variables altogether. It might be nice to also make it work with ocean variables, e.g. so people could use it to extract the sea floor.

In this case, can the additional required data be handled in the same way as in the _derive preprocessors? For atmospheric variables, for example, the data must be interpolated to the surface pressure given by the variable ps, which is itself an output of the model simulation. I am a bit unsure how to incorporate this into the preprocessor function. Would it need to be done directly in the recipe, i.e. by listing the required variables there (e.g. ch4 and ps)?
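
For illustration, the interpolation step for a single grid column might look like the sketch below. It is plain Python and is not the co2s implementation in ESMValCore; the function name interpolate_to_surface is hypothetical, and reusing the nearest level's value when ps lies outside the level range is an assumption made only for this example.

```python
def interpolate_to_surface(plev, values, ps):
    """Linearly interpolate a column profile to the surface pressure ps.

    plev: pressure levels in Pa (assumed distinct);
    values: concentrations given on those levels;
    ps: surface pressure in Pa.
    """
    # Sort by increasing pressure so the input order does not matter.
    pairs = sorted(zip(plev, values))
    # Assumption: outside the level range, reuse the nearest level's value.
    if ps <= pairs[0][0]:
        return pairs[0][1]
    if ps >= pairs[-1][0]:
        return pairs[-1][1]
    # Find the two levels that bracket ps and interpolate linearly in pressure.
    for (p_lo, v_lo), (p_hi, v_hi) in zip(pairs, pairs[1:]):
        if p_lo <= ps < p_hi:
            weight = (ps - p_lo) / (p_hi - p_lo)
            return v_lo + weight * (v_hi - v_lo)
    raise ValueError("ps could not be bracketed by the pressure levels")


# Made-up ch4 column on three pressure levels.
plev = [100000.0, 85000.0, 50000.0]
ch4 = [1900.0, 1850.0, 1800.0]
ch4s = interpolate_to_surface(plev, ch4, ps=92500.0)
```

In a real preprocessor the same operation would be applied to every (time, lat, lon) point, with ps taken from the model output rather than passed in by hand.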

@bouweandela
Member

You could probably register ps as a supplementary variable similar to how it is done here:

@register_supplementaries(
    variables=["sftgif"],
    required="require_at_least_one",
)
def mask_landseaice(cube: Cube, mask_out: Literal["landsea", "ice"]) -> Cube:

It will then automatically get added in the recipe and will be available on the loaded Iris cube that gets passed to the preprocessor as an ancillary variable. Documentation on supplementary variables in the recipe is available here and from the Python API here.

@jlenh
Author

jlenh commented Dec 13, 2024

I implemented the derivation of the variable in the way you mentioned, but ran into a problem: for ps to be added as an ancillary variable to the atmospheric variable cube, both cubes need to have the same dimensions, which is not the case here. ps is a 3D variable with dimensions (time, lat, lon), while the atmospheric variable also has a pressure-level dimension (time, plev, lat, lon). The error is raised when the add_supplementary_variables function calls add_ancillary_variable on the Iris cube.

I have run some tests using new dedicated derivation files for ch4s and n2os, and this seems to work properly for now. It does lead to some duplicated preprocessing code, which could perhaps be factored out into /preprocessors/_derive/_shared.py.

@bouweandela
Member

You could update this code so it provides the correct dimensions:

https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/_supplementary_vars.py#L109

@bouweandela
Member

You could update this code so it provides the correct dimensions:

https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/_supplementary_vars.py#L109

Do you know how to do that? Or could you use some help?

@jlenh
Author

jlenh commented Dec 19, 2024

You could update this code so it provides the correct dimensions:

https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/_supplementary_vars.py#L109

I have partly reworked the add_ancillary_variable function at https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/_supplementary_vars.py#L109
with:

coords_cube = cube.coords()
coords_ancillary_var = ancillary_cube.coords()
data_dims = [next(i for i, cc in enumerate(coords_cube) if cc == ca) for ca in coords_ancillary_var]
cube.add_ancillary_variable(ancillary_var, data_dims=data_dims)

Let me know if you have any recommendations to improve this, @bouweandela.

From what I gathered otherwise, I added the preprocessing function in https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/_volume.py and referenced it in https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/preprocessor/__init__.py#L86. I am not sure whether any other modification is needed elsewhere, but for now it seems to work!

@bouweandela
Member

From looking at the code, I think that may only work for 1D coordinates. Maybe something like this would be more general?

    ancillary_var = iris.coords.AncillaryVariable(
        ancillary_cube.core_data(),
        standard_name=ancillary_cube.standard_name,
        units=ancillary_cube.units,
        var_name=ancillary_cube.var_name,
        attributes=ancillary_cube.attributes,
    )

    data_dims = [None] * ancillary_cube.ndim
    for coord in ancillary_cube.coords():
        for ancillary_dim, cube_dim in zip(ancillary_cube.coord_dims(coord), cube.coord_dims(coord)):
            data_dims[ancillary_dim] = cube_dim
    if None in data_dims:
        none_dims = ", ".join(str(i) for i, d in enumerate(data_dims) if d is None)
        msg = (
            f"Failed to add {ancillary_cube} to {cube} as ancillary variable. "
            f"No coordinate associated with ancillary cube dimensions {none_dims}"
        )
        raise ValueError(msg)

    cube.add_ancillary_variable(ancillary_var, data_dims)

@jlenh
Author

jlenh commented Dec 20, 2024

The only problem I would have with this solution is that it only checks the names of the coordinates and not their actual values (although ESMValCore preprocesses the input cube and the ancillary variable in much the same way, so it should be fine, I guess).
Could adding a simple
data_dims[ancillary_dim] = cube_dim if coord == cube.coord(coord) else None
do the trick?

@bouweandela
Member

Yes, I think so, though it will be a bit slow. Maybe it would be sufficient to loop only over DimCoords: those are not Dask arrays and will be relatively fast to compare.

@bouweandela
Member

So replace for coord in ancillary_cube.coords() with for coord in ancillary_cube.dim_coords
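
To make the combined suggestion concrete (map dimensions through dimension coordinates only, and require equal values as well as equal names), here is a minimal sketch of the matching logic in plain Python. DimCoordStub and map_ancillary_dims are hypothetical stand-ins written for this illustration and are not part of Iris or ESMValCore; the sketch also assumes every ancillary dimension has exactly one dimension coordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimCoordStub:
    """Hypothetical stand-in for an Iris DimCoord: name, points, dimension."""

    name: str
    points: tuple
    dim: int


def map_ancillary_dims(cube_coords, ancillary_coords):
    """Return, for each ancillary dimension, the matching cube dimension."""
    cube_by_name = {coord.name: coord for coord in cube_coords}
    data_dims = [None] * len(ancillary_coords)
    for coord in ancillary_coords:
        match = cube_by_name.get(coord.name)
        # Require identical points as well as an identical name.
        if match is not None and match.points == coord.points:
            data_dims[coord.dim] = match.dim
    if None in data_dims:
        none_dims = ", ".join(
            str(i) for i, d in enumerate(data_dims) if d is None
        )
        msg = f"No matching coordinate for ancillary dimensions {none_dims}"
        raise ValueError(msg)
    return data_dims


# ch4 on (time, plev, lat, lon) and ps on (time, lat, lon).
cube_coords = [
    DimCoordStub("time", (0.0, 1.0), 0),
    DimCoordStub("air_pressure", (100000.0, 85000.0), 1),
    DimCoordStub("latitude", (-45.0, 45.0), 2),
    DimCoordStub("longitude", (0.0, 180.0), 3),
]
ps_coords = [
    DimCoordStub("time", (0.0, 1.0), 0),
    DimCoordStub("latitude", (-45.0, 45.0), 1),
    DimCoordStub("longitude", (0.0, 180.0), 2),
]
data_dims = map_ancillary_dims(cube_coords, ps_coords)
```

Here ps maps onto cube dimensions 0, 2 and 3, skipping the pressure-level dimension. A ps cube whose latitude points differ from those of the ch4 cube would raise the ValueError instead of being attached silently.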
