Histogram Scope #114
Comments
Personally, I would use it purely to write and read the bins and data/weights from and to boost-hist, whatever is needed to recreate the boost-hist object without having to re-fill it with data. That is, if you use LGDO to read a histogram from an .lh5 file, it should return a boost-hist, and if you use LGDO to write a boost-hist, it writes it to file in some predetermined way. That way, the histogram can be saved and communicated easily between different people and scripts and does not require adding any supporting methods to LGDO. It is a small price in convenience if the user cannot directly apply some methods to the LGDO object, but I would think it makes implementation much easier and cleaner. So this falls under your Option 1.
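To make the "pure I/O" idea concrete, here is a minimal sketch of the round trip described above. It is not LGDO or boost-histogram code: `HistogramRecord`, `dumps` and `loads` are hypothetical names, the histogram is 1D only, and JSON stands in for the .lh5 file. The point is only that bin edges plus bin contents are enough to rebuild the object without re-filling it.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class HistogramRecord:
    """Hypothetical payload: just enough to rebuild a 1D histogram."""

    edges: List[float]   # bin edges, one more entry than there are bins
    counts: List[float]  # bin contents (weights), one entry per bin

    def __post_init__(self):
        if len(self.edges) != len(self.counts) + 1:
            raise ValueError("need exactly one more edge than bins")


def dumps(record: HistogramRecord) -> str:
    """Serialize the record (JSON is a stand-in for the real file format)."""
    return json.dumps(asdict(record))


def loads(text: str) -> HistogramRecord:
    """Rebuild the record from its serialized form, without any re-filling."""
    return HistogramRecord(**json.loads(text))


original = HistogramRecord(edges=[0.0, 1.0, 2.0, 3.0], counts=[4.0, 0.0, 7.5])
restored = loads(dumps(original))
```

In this sketch the record has no `fill`, `reset` or arithmetic at all. Anything beyond reading and writing would be left to the third-party histogram object it is converted to.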
As Louis said, we should try to keep LGDOs as pure I/O classes as possible. Third-party formats are much better than us at data manipulation and we want to keep our code as lightweight as possible. This is the overall philosophy behind this package. As a side note: I personally think that the approach followed by the LH5IO.jl package (i.e. reading data directly into native Julia data structures) is better than ours, but it would be harder to implement in Python, unfortunately.
I see one problem with option 1, especially with large histograms. The conversion into a For small histograms this would be negligible and not worth the code overhead here.
I think part of the problem with trying to make them pure I/O is that sometimes I/O is best done with somewhat specialized containers (see #109). When dealing with large amounts of data, iterating becomes important, and so having a way to fill in steps becomes useful (or, in the case of other objects, append). In this case, the The documentation makes it sound like I'll also add that if the best way to fill histograms is through boost histogram, we should include filling in the tutorial.
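As an illustration of what "filling in steps" means, the sketch below accumulates counts chunk by chunk with fixed bin edges and checks that the result matches a single pass over all the data. It uses only the Python standard library; the `fill` helper here is hypothetical and is not the lgdo or boost-histogram `fill`. For simplicity, values outside the binning range are dropped.

```python
from bisect import bisect_right


def fill(counts, edges, values):
    """Add values to counts in place; values outside [edges[0], edges[-1]) are dropped."""
    for v in values:
        if edges[0] <= v < edges[-1]:
            # bisect_right gives the index of the first edge greater than v,
            # so the bin index is one less than that.
            counts[bisect_right(edges, v) - 1] += 1
    return counts


edges = [0.0, 1.0, 2.0, 3.0]
data = [0.1, 0.5, 1.2, 2.9, 3.0, -1.0, 1.0, 2.0]

# Fill everything in one call.
all_at_once = fill([0] * 3, edges, data)

# Fill the same data three values at a time, as one would when iterating a large file.
chunked = [0] * 3
for start in range(0, len(data), 3):
    fill(chunked, edges, data[start:start + 3])
```

Because filling only ever adds to existing bin contents, the chunked result is identical to the single-pass one, which is what makes an incremental `fill` attractive when the data does not fit in memory.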
That's right. But there is no way to create a
I recently added a `.fill` function to the lgdo histogram. This turned out to be controversial, under the argument that any non-trivial histogram operations should be done by the `hist` library. So we should figure out the proper scope of this class. @lvarriano @ManuelHu @gipert @jasondet
Option 1: It's purely an I/O class. If we want to do any manipulations at all on it we should look towards other implementations.
Option 2: It's a data container class. This means it should have both I/O functions and functions for setting the contents (this was the logic I was following when adding fill). Under this logic the only other functions from boost-hist (https://boost-histogram.readthedocs.io/en/latest/user-guide/histogram.html) that I could imagine wanting are `reset`, and maybe `add`. A counterargument to this is the risk of scope creep.
Option 3 (which nobody wants but I include for completeness): A full-fledged hist class that does all the things in https://boost-histogram.readthedocs.io/en/latest/user-guide/histogram.html. This seems clearly beyond the scope of what we want, and using `view_as` to get these kinds of manipulations seems clearly preferable.
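For reference, this is roughly what the extra surface under Option 2 would amount to. It is a sketch under assumptions, not the lgdo class: `Hist1D` is a hypothetical 1D container, and `reset` and `+` are given the obvious meanings (zero the contents, sum bin by bin for identical binning).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hist1D:
    """Hypothetical minimal container: binning plus contents."""

    edges: List[float]
    counts: List[float]

    def reset(self) -> None:
        """Zero the contents but keep the binning."""
        self.counts = [0.0] * len(self.counts)

    def __add__(self, other: "Hist1D") -> "Hist1D":
        """Sum two histograms bin by bin; the binning must match exactly."""
        if self.edges != other.edges:
            raise ValueError("cannot add histograms with different binning")
        summed = [a + b for a, b in zip(self.counts, other.counts)]
        return Hist1D(list(self.edges), summed)


a = Hist1D([0.0, 1.0, 2.0], [1.0, 2.0])
b = Hist1D([0.0, 1.0, 2.0], [3.0, 4.0])
total = a + b  # new histogram, inputs untouched
a.reset()      # a is emptied, binning preserved
```

Even this small surface needs decisions (what to do on mismatched binning, whether addition mutates), which is the scope-creep risk mentioned under Option 2.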