Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Support for units objects in rvar data type #385

Open
henningte opened this issue Dec 10, 2024 · 6 comments
Open

Feature Request: Support for units objects in rvar data type #385

henningte opened this issue Dec 10, 2024 · 6 comments
Labels
feature New feature or request

Comments

@henningte
Copy link

henningte commented Dec 10, 2024

I often use the units package to keep track of measurement units and I think it may be interesting to support units objects with the rvar class. I wanted to know whether there is interest to include this feature into the posterior package.

Currently, it is possible to create rvar objects where the draws attribute is a units object:

suppressPackageStartupMessages(library(posterior))
#> Warning: package 'posterior' was built under R version 4.3.2
suppressPackageStartupMessages(library(units))

a <- rvar(units::set_units(1:5, "g"), dim = 1)
draws_of(a)
#> Units: [g]
#>   [,1]
#> 1    1
#> 2    2
#> 3    3
#> 4    4
#> 5    5
class(draws_of(a))
#> [1] "units"

Created on 2024-12-10 with reprex v2.0.2

And simple arithmetic operations work:

class(draws_of(a + a))
#> [1] "units"

But some operations drop the units class. For example:

class(mean(a)) # should return a units object, but returns a numeric vector
#> [1] "numeric"
class(draws_of(rvar_mean(a))) # should keep the units class, but drops it
#> [1] "matrix" "array"

I only did a couple of quick tests, so I have no complete overview, which operations are supported and which are not supported. I can also imagine that adapting some functions may be more difficult.

@paul-buerkner
Copy link
Collaborator

Thank you for opening this issue! @mjskay since you are the developer of rvar, what are your thoughts on this?

@paul-buerkner paul-buerkner added the feature New feature or request label Dec 17, 2024
@mjskay
Copy link
Collaborator

mjskay commented Dec 17, 2024

Hmm, in principle it seems like it should be possible to support other backing datatypes so long as they implement the necessary array operations. If we're going down that road I'd want to think about how to do it in a general enough way that we don't have to explicitly support every array-like datatype under the sun, but instead base it on some consistent extension points.

Part of the problem is that several rvar methods currently use specialized functions under the hood for performance reasons but which only (I think?) work on numeric vectors (e.g. matrixStats::col***, tensorA::mul.tensor()). One solution could be a parameterized type, say rvar[T], where T is a type (like units) implementing the necessary interface. Then we could create a default implementation of rvar methods that do not use specialized functions, and move specialized implementations to (say) rvar[numeric]. A starting point could be to factor out the code that was needed to get factor-like rvars to work, creating an rvar[numeric], rvar[factor], etc.

However, I would not want to do this using multiple dispatch as implemented in {vctrs}, because it is (frankly) a giant hack that will require a lot of boilerplate code (and may not even be doable...). Instead, I think the right solution would be to build it on top of S7. I'm not sure if S7 is quite mature enough yet for some of the things we want to do, but it's close. I have been playing with building parameterized types on top of S7 in another project and it can work pretty well (and I think they may also add some kind of official support for it). The main thing will be making sure we can retain compatibility with {vctrs}, though I assume they will make vctrs S7-compatible at some point since several of the same folks are involved in each.

@paul-buerkner
Copy link
Collaborator

Thank you for your thoughts! In this light, I would say we may want to revisit this issue later on once S7 is more stable. Even then, it may not necessarily be a good idea to go this route, since (I assume) it would make the rvar code even more involved. Currently, we already largely rely on you, @mjskay, to maintain this (really cool) feature. And every increase in complexity of rvar may make it harder to maintain for others in case that would be needed at some point.

@mjskay
Copy link
Collaborator

mjskay commented Dec 18, 2024

@paul-buerkner an excellent point. Perhaps we should make maintainability a criterion of any large overhaul?

I think we could do a shift to S7 where the goal is to make rvar more maintainable, keeping feature parity. Having played with S7 quite a bit now, I think it will be much easier to read and write than vctrs. Then, if the shift to S7 works out, we could consider new features, like supporting other backing data types.

@paul-buerkner
Copy link
Collaborator

paul-buerkner commented Dec 18, 2024 via email

@henningte
Copy link
Author

henningte commented Dec 20, 2024

I also think this is a reasonable plan. Thank you all for discussing my issue so quickly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants