Disable bottleneck by default? #7344
Comments
I kinda think correctness by default is more important than performance, especially if the default performance isn't prohibitively slow.
I'd be fine with disabling, since bottleneck doesn't seem to be actively maintained. Though I would say it's numerically unstable rather than incorrect! I would always want it enabled, but it does make sense to default to the conservative option. I had dreams of making numbagg into a better bottleneck: it's just as fast, much more flexible, and integrates really well with xarray. But those dreams have not come to pass (yet!).
The case where Bottleneck really makes a difference is moving window statistics, where it uses a smarter algorithm than our current NumPy implementation, which creates a moving window view. Otherwise, I agree, it probably isn't worth the trouble. That said, we could also switch to smarter NumPy-based algorithms to implement most moving window calculations, e.g., using
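To make the difference between the two approaches concrete, here is a minimal NumPy-only sketch. The function names are hypothetical, and the cumulative-sum trick is just one well-known O(n) technique chosen for illustration; it is not necessarily what the comment above had in mind, nor what Bottleneck or xarray actually do internally.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def moving_mean_view(x, window):
    # Window-view approach: build an (n - window + 1, window) view and
    # reduce over the last axis. Costs O(n * window) arithmetic.
    return sliding_window_view(x, window).mean(axis=-1)


def moving_mean_cumsum(x, window):
    # Cumulative-sum approach: each window sum is a difference of two
    # prefix sums, so the cost is O(n) regardless of window size.
    prefix = np.cumsum(np.concatenate(([0.0], x)))
    return (prefix[window:] - prefix[:-window]) / window


rng = np.random.default_rng(0)
x = rng.normal(size=1_000)

a = moving_mean_view(x, window=10)
b = moving_mean_cumsum(x, window=10)
print(np.allclose(a, b))
```

The cumulative-sum version trades accuracy for speed: differences of large prefix sums can lose precision when the data has a large magnitude or a long length, which is the same kind of numerical-stability concern raised in this issue.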
I fully agree that correctness is the priority. Note however that some functions now require bottleneck, like ffill and bfill (I am not sure if there are more cases). They may need to be modified so they can run without bottleneck.
I think it's OK to still require bottleneck for
Maybe it is just a problem of documenting it more clearly?
I want to add a +1 to disable it by default. It's pretty common to be using
I recently stumbled across a case where bottleneck gives a completely incorrect answer or segfaults when taking the max of a small array (similar to: pydata/bottleneck#381 - a fix was proposed but never merged).
As of #8389
I would support disabling bottleneck by default for now, and eventually permanently removing support for bottleneck. Numbagg seems to be a more than capable and much more maintainable replacement.
As an update: numbagg is in a pretty good place now, and xarray defaults to using numbagg over bottleneck for overlapping functions when both are installed. (I didn't finish my fuzzing work in numbagg, so I'm not as confident as I'd hope to be, but also no one has reported anything for a long time, so the "fuzzing from the world" has been successful...) Numbagg doesn't do all of bottleneck's functions: some aren't practical to write with numba, since they require specialized data structures, such as moving max. ...though I'm not sure if that updates us towards or away from disabling it by default. The substitute is better than it used to be, but the substitute also overrides many of bottleneck's functions by default when installed...
If I don't have numbagg installed but happen to have
What is your issue?
Our choice to enable bottleneck by default results in quite a few issues about numerical stability and funny dtype behaviour: #7336, #7128, #2370, #1346 (and probably more)
Shall we disable it by default?
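For context, here is a rough NumPy-only sketch of the kind of float32 accumulation error that "numerical stability" refers to here. It does not call bottleneck. It assumes a reduction that adds elements one at a time in the input dtype, and uses `np.cumsum` as a stand-in for such a loop; the variable names are purely illustrative.

```python
import numpy as np

# One million copies of 0.1 stored as float32; the true mean is 0.1.
x = np.full(1_000_000, 0.1, dtype=np.float32)

# Sequential accumulation in float32 (assumed stand-in for a simple C loop):
# once the running total is large, each added 0.1 gets rounded noticeably.
naive_mean = float(np.cumsum(x)[-1]) / x.size

# NumPy's own reduction over a contiguous float32 array uses pairwise
# summation, which keeps the rounding error much smaller.
pairwise_mean = float(x.sum()) / x.size

# Accumulating in float64 avoids the problem at the cost of a wider dtype.
wide_mean = float(x.mean(dtype=np.float64))

print(naive_mean, pairwise_mean, wide_mean)
```

The sequential result drifts visibly away from 0.1 while the other two stay close to it, which is why "numerically unstable" may be a fairer description than "incorrect".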