Summary:
Continues a discussion with @avehtari from here. Distilled down: full-rank ADVI is constrained by memory, while the mean-field approximation can be problematic for certain models. A sensible intermediate, a low-rank approximation, would be very helpful for such models. Ong et al. (2017) describe one possible implementation.
Description:
I'll briefly outline the mathematical approach of Ong et al. and leave the Stan-specific implementation details (most of which were kindly outlined in the preceding discussion) for the pull request. Let n be the dimension of the parameters and r the desired rank of the approximation. To generate a draw, we sample eta = (z, eps) from the (r + n)-dimensional standard Gaussian, where z is r-dimensional and eps is n-dimensional. By the reparameterization trick, zeta = mu + Bz + d * eps, where mu and d are n-dimensional, B is n x r and constrained to be lower-triangular, and * denotes the elementwise product. It follows that zeta is distributed according to N(mu, BB^T + diag(d^2)). zeta is then transformed to the model parameters as in ADVI.
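To make the draw concrete, here is a minimal sketch in plain Python. It is not Stan code and not the planned implementation; the function names `draw_low_rank` and `low_rank_covariance` are hypothetical, and mu, B, and d are passed as plain lists. It shows the reparameterized draw zeta = mu + Bz + d * eps and the covariance BB^T + diag(d^2) that the draw implies.

```python
import random


def draw_low_rank(mu, B, d, rng):
    """Draw zeta = mu + B z + d * eps (hypothetical helper, for illustration).

    mu, d: lists of length n. B: n rows of r entries each, assumed
    lower-triangular (B[i][j] == 0 for j > i). rng: a random.Random instance.
    """
    n = len(mu)
    r = len(B[0])
    # eta = (z, eps) is a draw from the (r + n)-dimensional standard Gaussian.
    z = [rng.gauss(0.0, 1.0) for _ in range(r)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [
        mu[i] + sum(B[i][j] * z[j] for j in range(r)) + d[i] * eps[i]
        for i in range(n)
    ]


def low_rank_covariance(B, d):
    """Return B B^T + diag(d^2), the covariance of zeta, as nested lists."""
    n = len(d)
    r = len(B[0])
    return [
        [
            sum(B[i][k] * B[j][k] for k in range(r))
            + (d[i] ** 2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    mu = [0.0, 1.0, -1.0]
    B = [[1.0, 0.0], [2.0, 3.0], [4.0, 5.0]]  # n = 3, r = 2, lower-triangular
    d = [1.0, 2.0, 3.0]
    print(draw_low_rank(mu, B, d, rng))
    print(low_rank_covariance(B, d))
```

The approximation stores mu, B, and d, which is n + nr + n numbers (fewer once the zeros in B are accounted for), compared with n + n(n + 1)/2 for a full-rank Cholesky factor. The n x n covariance is formed here only for illustration; drawing zeta never requires it.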
Additional info:
I've started working on an implementation and will open a PR now.
Current Version:
v2.19.1