added gaussian process #19
base: master
Conversation
added gp, sparse gp, studentsT gp
added tests for gp, studentsT process
updated CHANGELOG
Thanks for the contribution @Emaasit, I think this would be a great addition to the library. I'll try to have a look at it in the next few days.
@Emaasit: Thanks for submitting this! I finally had the time to look at your PR.
I made a bunch of changes to the library recently so could you please merge/rebase in master and resolve any conflicts?
I'm not very familiar with Gaussian processes, but I looked at your notebooks and it seems like the models aren't doing very well. Is that expected behavior? I really like the 'Criticize the model' steps in the notebooks, though.
Most of my other comments are style things to make the code consistent with the rest of the library.
@@ -1,2 +1,9 @@
__version__ = "1.1.3"
I added in a _version.py file in master so you can remove this.
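For reference, the idea is roughly the following (a sketch only; the actual _version.py added in master may look different):

# _version.py -- single source of truth for the package version (illustrative contents)
__version__ = '1.1.3'
# pymc3_models/__init__.py would then import this instead of hard-coding the string, e.g.:
# from pymc3_models._version import __version__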
I think you might have to modify your notebooks too.
""" | ||
|
||
def __init__(self, prior_mean=0.0): | ||
self.ppc = None |
Could these properties be alphabetized?
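For example, something like this (a sketch of the ordering only, using the attribute names visible in this diff; the real __init__ may set other attributes too):

def __init__(self, prior_mean=0.0):
    super(GaussianProcessRegression, self).__init__()
    # Instance attributes listed alphabetically, for consistency with the rest of the library
    self.gp = None
    self.ppc = None
    self.prior_mean = prior_mean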
# cov = signal_variance**2 * pm.gp.cov.ExpQuad(1, length_scale)
cov = signal_variance ** 2 * pm.gp.cov.Matern52(1, length_scale)

# mean_function = pm.gp.mean.Zero()
Is this comment outdated since you allow the user to specify a prior_mean now?
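If the prior mean is user-configurable now, the model would presumably build the mean function from it rather than hard-coding a zero mean, e.g. (a sketch, not the PR's actual code):

import pymc3 as pm

prior_mean = 0.0  # user-supplied value; 0.0 recovers the zero-mean behaviour

# pm.gp.mean.Constant is PyMC3's constant mean function; with prior_mean == 0.0
# it is equivalent to pm.gp.mean.Zero().
mean_function = pm.gp.mean.Constant(c=prior_mean)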
signal_variance = pm.HalfCauchy('signal_variance', beta=5, shape=(1))
noise_variance = pm.HalfCauchy('noise_variance', beta=5, shape=(1))

# cov = signal_variance**2 * pm.gp.cov.ExpQuad(1, length_scale)
Is this line needed?
if self.cached_model is None:
    self.cached_model = self.create_model()

self._set_shared_vars({'model_input': X,
I prefer to format longer lines like this:
self._set_shared_vars({
    'model_input': X,
    'model_output': np.zeros(num_samples)
})
Could you please change your code to match that style of indenting?
                       'model_output': np.zeros(num_samples)})

with self.cached_model:
    f_pred = self.gp.conditional("f_pred", X)
Please switch to single quotes to be consistent with the rest of the code.
    'model_output': model_output,
}

self.gp = None
I believe the gp property is already set to None since it inherits from GaussianProcessRegression. Is this needed here?
                                   self.length_scale)

mean_func = pm.gp.mean.Zero()
f_ = np.random.multivariate_normal(mean_func(X).eval(),
When I started running the unit tests on Travis, I realized I had forgotten to set a random seed, so the tests weren't repeatable. Could you add one to all your tests that generate data? Like here: https://github.com/parsing-science/pymc3_models/blob/master/tests/models/test_LinearRegression.py#L23
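Something along these lines in the test setUp would do it (a sketch only; the class name, seed value, and data shapes here are illustrative, not the PR's actual test code):

import unittest

import numpy as np


class GaussianProcessRegressionTestCase(unittest.TestCase):
    def setUp(self):
        # Fix the NumPy seed so the randomly generated training data,
        # and therefore the fitted model, is reproducible on Travis.
        np.random.seed(123)
        self.num_pred = 1
        self.X_train = np.random.randn(1000, self.num_pred)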
    int(self.test_GPR.summary['mean']['noise_variance__0']),
    0)

# def test_nuts_fit_returns_correct_model(self):
Is this test commented out because it takes too long to run? If so, could you please leave a comment about that?
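If runtime is the reason, an explicit skip with a message keeps that intent visible instead of a commented-out test (a sketch using the standard library's unittest.skip, not code from this PR):

import unittest


class GaussianProcessRegressionFitTestCase(unittest.TestCase):
    # Skipped rather than deleted: NUTS sampling is too slow for CI,
    # but the test can still be run locally when the model changes.
    @unittest.skip('NUTS sampling takes too long to run on CI')
    def test_nuts_fit_returns_correct_model(self):
        ...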
class StudentsTProcessRegressionScoreTestCase(StudentsTProcessRegressionTestCase):
    def test_score_matches_sklearn_performance(self):
I like this test a lot! Good call comparing to sklearn.
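For anyone reading along, the comparison presumably looks roughly like this (a sketch with made-up data, not the PR's actual test; it only illustrates using sklearn's GaussianProcessRegressor score as a reference):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split

np.random.seed(123)
X = np.random.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + np.random.normal(0, 0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

# Fit sklearn's GP as a reference and record its R^2 score; the test would then
# assert that the PyMC3 model's score is within some tolerance of this value.
skl_gp = GaussianProcessRegressor(alpha=0.1).fit(X_train, y_train)
skl_score = skl_gp.score(X_test, y_test)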
[1.1.4] - 2018-09-06
Added