Correctly interpolate seasons in Grouper #2019
base: main
Conversation
I just realised that the factor of 1/6 assumes that all seasons have the same length, which in Gregorian calendars is not necessarily true. I am not sure it matters too much; at the least, the function should be smooth.
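A quick check (an illustrative sketch, using the standard calendar and a non-leap year) confirms that the four meteorological seasons indeed have different lengths:

```python
import calendar

# Months belonging to each meteorological season.
seasons = {
    "DJF": [12, 1, 2],
    "MAM": [3, 4, 5],
    "JJA": [6, 7, 8],
    "SON": [9, 10, 11],
}

# Total days per season in a non-leap year (monthrange returns
# (first weekday, number of days); we only need the day count).
lengths = {
    name: sum(calendar.monthrange(2001, m)[1] for m in months)
    for name, months in seasons.items()
}
print(lengths)
```

DJF is the shortest at 90 days (91 in leap years), so equal-length spacing of seasons is only an approximation.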
Warning: This Pull Request is coming from a fork and must be manually tagged.
Weirdly, and contrary to what I showed yesterday, today I am still getting clear transitions, as if there still wasn't any linear interpolation.
@saschahofmann We recently changed the layout of xclim to use a |
I reinstalled xclim but I am still getting very similar results to before the "fix". Do you have any advice on where else I could look?
Could it be that you have obsolete |
I managed to install the environment; for some reason I only had the branch "main" when I cloned the fork yesterday.

```python
import inspect

print(inspect.getsource(sdba.base.Grouper.get_index))
```
I'll try to have a look later. Maybe the |
I am pretty sure that the |
It's simply:

```python
from xclim import sdba

QM = sdba.EmpiricalQuantileMapping.train(
    ref, hist, nquantiles=15, group="time.season", kind="+"
)
scen = QM.adjust(sim, extrapolation="constant", interp="nearest")
scen_interp = QM.adjust(sim, extrapolation="constant", interp="linear")

outd = {
    "Reference": ref,
    "Model - biased": hist,
    "Model - adjusted - no interp": scen,
    "Model - adjusted - linear interp": scen_interp,
}
for k, da in outd.items():
    da.groupby("time.dayofyear").mean().plot(label=k)
plt.legend()
```

This doesn't reproduce your figure, however. It seems your figure above was matching the reference very well, better than what I have even with the linear interpolation. But it does get rid of obvious discontinuities.
@coxipi I think I only mentioned this in the original issue: my analysis is done with |
Yes, I have seen similar things by playing with the choice of how |
In any case, the mean over the 15-year period should be closer to the target; time.season is way off (and in fact, I maintain, it should be equal to the time adjustment).
Got it, sorry I read too fast. Good observation. It seems |
I think I am getting closer. I tried to replicate what's happening inside:

```python
from scipy.interpolate import griddata

oldx = qdm_season.ds.hist_q.sel(season="MAM")
oldg = np.ones(20)
oldy = qdm_season.ds.af.sel(season="MAM")

value = hist.sel(time="2000-04-15")
newx = value
newg = u.map_season_to_int(value.time.dt.season)

griddata(
    (oldx.values, oldg),
    oldy.values,
    (newx.values, newg),
    method="nearest",
)
```

If I didn't make a mistake, that should mimic the function, but it leads to 0.20145109.
Ok, I just need to find the difference between this code and the one running:

```python
from scipy.interpolate import griddata

oldx = qdm_season.ds.hist_q.sel(season="MAM")
oldg = np.ones(20)
oldy = qdm_season.ds.af.sel(season="MAM")

value = hist_spring
newx = value
newg = u.map_season_to_int(value.time.dt.season)

afs = griddata((oldx.values, oldg), oldy.values, (newx.values, newg), method="nearest")

af.groupby("time.dayofyear").mean().plot(label="interpolated af time")
af_corrected = xr.DataArray(afs, coords=dict(time=hist_spring.time))
af_corrected.groupby("time.dayofyear").mean().plot(
    linestyle="--", label="seasonal af manually computed with griddata"
)
plt.legend()
```
Great, I have reproduced your example:

```python
def _interp_on_quantiles_2d(newx, newg, oldx, oldy, oldg, method, extrap):
    mask_new = np.isnan(newx) | np.isnan(newg)
    mask_old = np.isnan(oldy) | np.isnan(oldx) | np.isnan(oldg)
    out = np.full_like(newx, np.nan, dtype=f"float{oldy.dtype.itemsize * 8}")
    if np.all(mask_new) or np.all(mask_old):
        warn(
            "All-nan slice encountered in interp_on_quantiles",
            category=RuntimeWarning,
        )
        return out
    out[~mask_new] = griddata(
        (oldx[~mask_old], oldg[~mask_old]),
        oldy[~mask_old],
        (newx[~mask_new], newg[~mask_new]),
        method=method,
    )
    # if method == "nearest" or extrap != "nan":
    #     # 'nan' extrapolation implicit for cubic and linear interpolation.
    #     out = _extrapolate_on_quantiles(out, oldx, oldg, oldy, newx, newg, extrap)
    return out


def bad_af_season(afq, hist):
    # note: oldx is taken from the enclosing scope here
    oldg = np.ones(20)
    oldy = afq.sel(season="MAM")
    value = hist.where(hist.time.dt.month.isin([3, 4, 5]), drop=True)
    newx = value
    newg = u.map_season_to_int(value.time.dt.season)
    afs = _interp_on_quantiles_2d(newx, newg, oldx, oldy, oldg, "nearest", "nan")
    return xr.DataArray(afs, coords=dict(time=value.time))


af.groupby("time.dayofyear").mean().plot(label="interpolated af time")
good_af_season(qdm_season.ds.af, hist).groupby("time.dayofyear").mean().plot(
    linestyle="--", label="seasonal af manually computed with griddata"
)
bad_af_season(qdm_season.ds.af, hist).groupby("time.dayofyear").mean().plot(
    linestyle="--", label="seasonal af manually computed with xclim internals"
)
plt.legend()
```

And I get the same (the functions are quite similar, so I'm not surprised, but I still wanted to confirm quickly). I was having some problems with numba, so I commented out the "extrapolate" part, but we have problems when method != "nearest" too, so that should not be the problem. So now I suspect that it might be because of the arguments |
Yes, I'm also still trying to figure out the difference, but I guess I ran into the same problem as you with the extrapolation.
Happy new year everyone! Sorry for disappearing; the holidays came out of nowhere 😏 but I am back now and I really want to get to the bottom of this. I think I am closing in on at least one issue: the behaviour changes a lot depending on whether or not this branch runs:

```python
if method == "nearest" or extrap != "nan":
    # 'nan' extrapolation implicit for cubic and linear interpolation.
    out = u._extrapolate_on_quantiles(out, oldx, oldg, oldy, newx, newg, extrap)
```

I will now try to understand what this function does, but I am pretty sure we will find something there. However, this function is only called for |
Turns out that a large part of the newx values are out of bounds (either too low or too high):

```python
bnds = _first_and_last_nonnull(oldx)
xp = np.arange(bnds.shape[0])
toolow = newx < np.interp(newg, xp, bnds[:, 0])
toohigh = newx > np.interp(newg, xp, bnds[:, 1])
```
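For intuition, here is a minimal sketch (not xclim's actual implementation) of what "constant" extrapolation does with such out-of-bounds points: they are assigned the adjustment factor of the nearest edge quantile.

```python
import numpy as np

oldx = np.array([1.0, 2.0, 3.0])   # fitted quantile values for one group
oldy = np.array([0.1, 0.2, 0.3])   # adjustment factors at those quantiles
newx = np.array([0.5, 2.5, 9.0])   # values to adjust; 0.5 and 9.0 are out of bounds

toolow = newx < oldx[0]
toohigh = newx > oldx[-1]

out = np.interp(newx, oldx, oldy)  # np.interp already clamps to edge values
out[toolow] = oldy[0]              # made explicit to mirror the toolow/toohigh masks
out[toohigh] = oldy[-1]
```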
I am trying to understand how it works and I am a little concerned about this |
Happy new year! Nice work! The months are from 1..12, then with cyclic coords become 0..13, which works correctly with the |
I'm a bit rusty after the holidays. You mention a first issue being solved; what would be the second? Is that related to your comment above?
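As a rough sketch of that cyclic-coordinate idea (simplified; this is not xclim's actual padding code, and `months`/`values` here are made up):

```python
import numpy as np

# Group coordinate: months 1..12 with some periodic signal attached.
months = np.arange(1, 13)
values = np.sin(2 * np.pi * (months - 1) / 12)

# Cyclic padding: add one wrap-around point on each side, so the
# coordinate becomes 0..13 and December-January interpolation works.
padded_months = np.concatenate(([months[0] - 1], months, [months[-1] + 1]))
padded_values = np.concatenate(([values[-1]], values, [values[0]]))

# Interpolating at "month 12.5" now blends December (12) and January (13 == 1):
mid = np.interp(12.5, padded_months, padded_values)
```

Note that the same padding applied to seasons mapped to 0..3 produces -1..4, which is exactly where an assumption that coordinates start at 0 breaks down.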
This finally also looks as expected:
Hmm, in your last plot, the biased and adjusted models are the same?
In theory, the line should only be called when |
Sorry, I was plotting the wrong thing; I edited the comment.
I'm hopeful this will work. The condition is an OR: |
I see your point. Maybe the sufficiently high number of quantiles (20 or 50) and the fact that you average over 15 years are enough to make this smooth. If you look directly at the time series, the "nearest" should be less smooth? Anyways, great work, thanks a lot!
I made the same error above when I commented out the extrapolate, haha... maybe I influenced your reading.
Hmm, in the QDM case, the linear interpolation seems to have some issues?
Dang, now I see it too; somehow I was focusing on the nearest. Let's see.
Ok, here is the linearly interpolated one. I believe this is due to the problem I mentioned in December, better summarised by you. Gonna think about this tomorrow.
Do you have any resources on someone else doing this? The original paper, Cannon et al. (2015), doesn't seem to look at monthly or seasonal adjustments; the only thing I could find was them saying to use a sliding window.
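For context, the sliding-window idea can be sketched like this (illustrative only; `window_mask` is a made-up helper, not xclim code): instead of four fixed seasons, each day of year is adjusted using data from a window around it, wrapping across the year boundary, which avoids hard season edges.

```python
import numpy as np

def window_mask(doy, center, half_width=15, year_len=365):
    # Cyclic distance between each day-of-year and the window center,
    # so a window around Jan 1 wraps into late December.
    d = np.abs(doy - center)
    d = np.minimum(d, year_len - d)
    return d <= half_width

doy = np.arange(1, 366)
mask = window_mask(doy, center=1)  # 31-day window centred on Jan 1
```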
No, unfortunately; I'm searching right now. @aulemahal, Sascha fixed one problem in the extrapolation: it was assumed that values of season/month would start at 0, but for seasons with a periodic condition they can go below zero, so this needed a change. But now we have the problem I describe, e.g.:
Do you agree this is a problem? Do you know if people have explored specifically the use of QDM with seasons in the literature?
I was also thinking the other option for fixing the extrapolation, without needing to change the extrapolate function, is changing the mapping of the seasons to start at 1 (so that cyclic_bounds would add 0 and 5). Not sure which of the two you prefer. I think the current fix might be more robust to future changes because it simply uses the coordinate values.
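A minimal sketch of the two mappings (the `season_to_int_*` names and the simplified `cyclic_bounds` below are illustrative, not xclim's actual code):

```python
import numpy as np

# Current mapping: DJF..SON -> 0..3; cyclic padding then yields -1..4,
# and extrapolation code that assumes coords start at 0 breaks on -1.
season_to_int_0 = {"DJF": 0, "MAM": 1, "JJA": 2, "SON": 3}
# Alternative: map to 1..4, so padding yields 0..5 and stays non-negative.
season_to_int_1 = {s: i + 1 for s, i in season_to_int_0.items()}

def cyclic_bounds(coords):
    # Simplified stand-in: pad a sorted group coordinate with one
    # wrap-around value on each side.
    return np.concatenate(([coords[0] - 1], coords, [coords[-1] + 1]))

bounds0 = cyclic_bounds(np.array(sorted(season_to_int_0.values())))  # includes -1
bounds1 = cyclic_bounds(np.array(sorted(season_to_int_1.values())))  # stays >= 0
```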
I don't think I have ever read such a paper (neither for QDM nor for any other QMs). Maybe @huard remembers if we had sources in mind when implementing it? My "fear" is that we implemented it because it was possible, because |
Pull Request Checklist:

- [ ] … and the relevant issue (:issue:`number`) and pull request (:pull:`number`) has been added

What kind of change does this PR introduce?

This PR adds a line to correctly interpolate seasonal values. It also changes the `test_timeseries` function, which now accepts a `calendar` argument instead of `cftime`. Not providing it, or providing `None`, is equivalent to the previous `cftime=False`, and `calendar='standard'` to the previous `cftime=True`. This allows for testing different calendar implementations, e.g. `360_day` calendars.