Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add opt_sample that performs optimizations prior to calling pymc.sample #396

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

ricardoV94
Copy link
Member

@ricardoV94 ricardoV94 commented Dec 5, 2024

Includes building blocks for #358

import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    p = pm.Beta("p", 1, 1, shape=(1000,))
    y = pm.Binomial("y", n=100, p=p, observed=[1, 50, 99, 50]*250)

    idata = pm.sample(verbose=True) 

# Applied optimization: beta_binomial_conjugacy 1x
# ConjugateRVSampler: [p]

CC @wd60622 @theorashid

Hopefully it will make it easy for follow-up PRs to add more cases

@ricardoV94 ricardoV94 requested a review from zaxtax December 5, 2024 15:34
@ricardoV94 ricardoV94 force-pushed the add_opt_sample branch 2 times, most recently from 403a6bd to 2525ea8 Compare December 5, 2024 15:49
@ricardoV94 ricardoV94 added the enhancements New feature or request label Dec 5, 2024
@wd60622
Copy link
Contributor

wd60622 commented Dec 7, 2024

Need to try out. This example in description is different than the docstrings. Both work?

import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    p = pm.Beta("p", 1, 1, shape=(1000,))
    y = pm.Binomial("y", n=100, p=p, observed=[1, 50, 99, 50]*250)

    idata = pm.sample(verbose=True) 

# Applied optimization: beta_binomial_conjugacy 1x
# ConjugateRVSampler: [p]

@ricardoV94
Copy link
Member Author

Need to try out. This example in description is different than the docstrings. Both work?

import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    p = pm.Beta("p", 1, 1, shape=(1000,))
    y = pm.Binomial("y", n=100, p=p, observed=[1, 50, 99, 50]*250)

    idata = pm.sample(verbose=True) 

# Applied optimization: beta_binomial_conjugacy 1x
# ConjugateRVSampler: [p]

Yes, I wanted an example that is faster than NUTS, but for testing it would be overkill

pymc_experimental/sampling/optimizations/conjugacy.py Outdated Show resolved Hide resolved

conjugate_a = a + y
conjugate_b = b + (n - y)
extra_dims = range(binomial_rv.type.ndim - beta_rv.type.ndim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work in all cases?

Copy link
Contributor

@wd60622 wd60622 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the cases where there is some type of ExpandDims on one of the parameters would won't detect conjuncy since the owner is not Beta:

N = 100
X = np.repeat(
    [1, 50, 99],
    250,
).reshape(3, 250)


coords = {"idx": range(each), "kind": ["low", "mid", "high"]}
with pm.Model(coords=coords) as model:
    p_raw = pm.Beta("p", 1, 1, dims="kind")
    p = p_raw[:, None]
    y = pm.Binomial("y", n=N, p=p, observed=X, dims=("kind", "idx"))

    idata = pmx.opt_sample(verbose=True)

Or case with shapes like this which actually fails it:

N = 100
X = np.repeat(
    [1, 50, 99],
    250,
).reshape(3, 250)

with pm.Model() as model:
    p = pm.Beta("p", 1, 1, shape=(3, 1))
    y = pm.Binomial("y", n=N, p=p, observed=X, shape=(3, 250))

    idata = pmx.opt_sample(verbose=True)

Copy link
Member Author

@ricardoV94 ricardoV94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm being lazy and only allowing expand_dims on the left. I wanted to keep things simple for the expression of the conjugate beta alpha and beta params, specially with batch dims of the Binomial. We can make it more flexible

Copy link
Member Author

@ricardoV94 ricardoV94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer your first question, it does not. If alpha already had one dimension, I should sum contributions from y/n-y and keep that dimension. Basically I have to unbroadcast uses of the same alpha/beta parameters.

I guess that's what you're seeing as a failure in your second example

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw this sort of things are what's usually missing from the literature on conjugacy and bayesian model rewrites. Things are much more tricky once you start working with arrays of parameters

Copy link
Member Author

@ricardoV94 ricardoV94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the second example. The first one is explicitly excluded as it was before. Consider it not implemented functionality.

The challenge is we would need to map the alpha/beta to the transformations that happen afterwards. and right expand_dims is kind of like a transpose. The left expand dims is supported because that's just default numpy-like broadcasting

Comment on lines 136 to 190
conjugate_a = a + y
conjugate_b = b + (n - y)
extra_dims = range(binomial_rv.type.ndim - beta_rv.type.ndim)
if extra_dims:
conjugate_a = conjugate_a.sum(extra_dims)
conjugate_b = conjugate_b.sum(extra_dims)
conjugate_beta_rv = Beta.dist(conjugate_a, conjugate_b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would summing through data to make summary stats used be more general?

For instance, y.sum(...) and (n - y).sum(...)

Copy link
Member Author

@ricardoV94 ricardoV94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would mix independent dimensions of the beta. The model would no longer be equivalent to the original one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically a could itself not be a scalar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saying that the parameter, p, and its shape defines the model and to keep that. Make n and y broadcastable with p only as that will make broadcastable with a and b. For instance,

a: ()
b: ()
p: ()
n: (100, )
y: (100, )

Then to sum n and y to match dim of p, (). n.sum(axis=0), y.sum(axis=0)

Another example :

a: (3, )
b: ()
p: (4, 3) # Defined from shape
n: ()
y: (100, 4, 1)

n summed to something broadcastable with (4, 3)
n, y.sum(axis=0)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Determined shape from p
y_stat = ...
n_stat = ...

# Cookie-cutter conjugate logic
a_post = a + y_stat
b_post = b + (n_stat - y_stat)
# Translate the conjugate parameters to PyMC distribution
conjugate_beta_rv = Beta.dist(alpha=a_post, beta=b_post) # Same shape as p before. Maybe shape needs to be passed as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that was the goal originally but as you found out it was not implemented correctly. I was only handing new dims. The last push has the sum_bcast_dims that reduces a_post/b_post to the shape of the original beta RV

Copy link
Member Author

@ricardoV94 ricardoV94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a: (3, )
b: ()
p: (4, 3) # Defined from shape
n: ()
y: (100, 4, 1)

This example is not possible/not supported. the (100, 4, 1) y cannot map to the (4, 3) of the p, unless you slice or something, which we don't allow

Copy link
Contributor

@wd60622 wd60622 Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but in the operation to get to posterior parameters, they are fine and broadcast to the desired shape.

Also, why not pass shape to new Beta.dist?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but in the operation to get to posterior parameters, they are fine and broadcast to the desired shape.

It matters how that shape went from 3 to 1. Where they summed/sliced/multiplied. Was there another operation in between like exponentiation/ reciprocal?

By only allowing direct use or left expand dims I rule out all these cases that I wouldn't know how to conjugate.

Also, why not pass shape to new Beta.dist?

Yup, it's doing that now.

@ricardoV94 ricardoV94 force-pushed the add_opt_sample branch 2 times, most recently from 575beee to a5c036a Compare December 8, 2024 16:38
@ricardoV94
Copy link
Member Author

@wd60622 I had forgot to push the updates. I wasn't trying to gaslight you :D

@ricardoV94 ricardoV94 force-pushed the add_opt_sample branch 2 times, most recently from 7bdb03e to c1809e8 Compare December 11, 2024 10:11
@ricardoV94 ricardoV94 requested a review from wd60622 December 11, 2024 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancements New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants