On the positive definiteness of Gram matrix in EASE_R recommender #14

Closed
tohtsky opened this issue Aug 20, 2020 · 2 comments

tohtsky commented Aug 20, 2020

Hi, congratulations on your new paper! I'm really enjoying your work!
Do you have any plans to extend your analysis to graph-convolutional NN methods (as opposed to ConvNCF-like methods)?
(Some of them seem to have the same methodological issues you mentioned in your RecSys 2019 paper, like this.)

I have a question regarding the Gram matrix that is inverted in the EASE_R recommender, shown below:

# Grahm matrix is X X^t, compute dot product
similarity = Compute_Similarity(self.URM_train, shrink=0, topK=self.URM_train.shape[1], normalize=False, similarity = "cosine")
grahm_matrix = similarity.compute_similarity().toarray()
diag_indices = np.diag_indices(grahm_matrix.shape[0])

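I wanted to examine whether I can speed up the matrix inversion by using Cholesky decomposition.
However, the Cholesky routine complained that the Gram matrix, even with the regularization term added, is not positive definite. That should not happen: X^T X is positive semidefinite, so adding a positive regularization term to its diagonal makes it positive definite.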
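
For context, here is a minimal sketch of a Cholesky-based inversion of a regularized Gram matrix, assuming SciPy's `cho_factor` and `cho_solve`. The toy matrix `X`, the value of `l2_norm` and the helper name `invert_spd_via_cholesky` are illustrative assumptions, not code from the repository. `cho_factor` raises `LinAlgError` when its input is not positive definite.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def invert_spd_via_cholesky(A):
    """Invert a symmetric positive definite matrix via its Cholesky factor."""
    # cho_factor raises LinAlgError if A is not positive definite.
    factor = cho_factor(A, lower=True)
    # Solving A Z = I with the factor gives Z = A^{-1}.
    return cho_solve(factor, np.eye(A.shape[0]))

rng = np.random.default_rng(0)
# Toy binary user-item matrix (hypothetical data): 50 users, 8 items.
X = (rng.random((50, 8)) < 0.3).astype(np.float64)
l2_norm = 10.0  # hypothetical regularization strength

gram = X.T @ X                                  # item-item Gram matrix, diagonal intact
gram_reg = gram + l2_norm * np.eye(gram.shape[0])

P_chol = invert_spd_via_cholesky(gram_reg)
P_ref = np.linalg.inv(gram_reg)                 # reference result for comparison
```

With the diagonal intact, the factorization succeeds for any positive `l2_norm`, and `P_chol` agrees with the general-purpose inverse up to floating-point error.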

I then noticed that your Compute_Similarity class sets the similarity between an item and itself to 0, like this:

>>> similarity = Compute_Similarity(X_URM, shrink=0, topK=X_URM.shape[1], normalize=False, similarity = "cosine")
>>> grahm_matrix = similarity.compute_similarity().toarray()
>>> grahm_matrix
array([[ 0.,  5.,  6., ...,  0., 13., 14.],
       [ 5.,  0., 12., ...,  3.,  5., 38.],
       [ 6., 12.,  0., ...,  3.,  7., 17.],
       ...,
       [ 0.,  3.,  3., ...,  0.,  6.,  9.],
       [13.,  5.,  7., ...,  6.,  0., 34.],
       [14., 38., 17., ...,  9., 34.,  0.]], dtype=float32)

while X_URM.T.dot(X_URM) yields

>>> X_URM.T.dot(X_URM).toarray()
array([[ 48,   5,   6, ...,   0,  13,  14],
       [  5, 119,  12, ...,   3,   5,  38],
       [  6,  12,  46, ...,   3,   7,  17],
       ...,
       [  0,   3,   3, ...,  18,   6,   9],
       [ 13,   5,   7, ...,   6, 117,  34],
       [ 14,  38,  17, ...,   9,  34, 312]], dtype=int32)

which are basically the same except for the diagonals.

So accordingly, I think that the resulting W_sparse differs from the original version of EASE.
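
A small sketch of why the zeroed diagonal matters, assuming the usual EASE closed form (invert the regularized Gram matrix to get P, divide each column j by -P_jj, then zero the diagonal). The toy matrix, the value of `l2_norm` and the helper names are hypothetical, chosen only for illustration.

```python
import numpy as np

def is_positive_definite(A):
    """Return True if the Cholesky factorization of A succeeds."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

def ease_weights(gram, l2_norm):
    """EASE-style closed form (assumed): B_ij = -P_ij / P_jj, zero diagonal."""
    P = np.linalg.inv(gram + l2_norm * np.eye(gram.shape[0]))
    B = P / (-np.diag(P))          # broadcasting divides column j by -P_jj
    np.fill_diagonal(B, 0.0)
    return B

# Toy binary user-item matrix (hypothetical data): 4 users, 3 items.
X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1],
              [1, 0, 1]], dtype=np.float64)
l2_norm = 1.0

gram_full = X.T @ X                # diagonal holds the item counts
gram_zero_diag = gram_full.copy()
np.fill_diagonal(gram_zero_diag, 0.0)   # mimics the removed self-similarity

identity = np.eye(3)
full_is_pd = is_positive_definite(gram_full + l2_norm * identity)
zero_is_pd = is_positive_definite(gram_zero_diag + l2_norm * identity)

B_full = ease_weights(gram_full, l2_norm)
B_zero = ease_weights(gram_zero_diag, l2_norm)
```

In this toy case the regularized full Gram matrix is positive definite while the zero-diagonal variant is not, and the two weight matrices differ in their off-diagonal entries.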

MaurizioFD (Owner) commented Aug 26, 2020

Hello,

Thank you! I am glad you found our work interesting :)

We hope that our work will increase awareness of how frequent methodological problems are in published research (even in high-level venues), so that they may be spotted more easily and so that evaluation and reproducibility studies become more common in our community. The issues that seem to be present in the project you mention (information leakage and erroneous NDCG implementation) are not tied to the nature of the method itself (as opposed to those of ConvNCF) and were already identified in our previous analysis as rather common. We may in the future try to do another study like the RecSys 2019 to see if something has changed.

About your question on EASE_R, you are indeed correct. Thank you for spotting the problem!
The Compute_Similarity class is used for the KNNs, and I had forgotten that it removes the self-similarity, which zeroes the diagonal. I have updated EASE_R to fix the problem, recomputed all results and updated the analysis.
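
The exact change made to EASE_R is not shown in this thread. As a hedged sketch, one way to obtain a Gram matrix with its diagonal intact is the direct sparse product `X.T.dot(X)` quoted earlier in the issue. For binary interactions, the diagonal can then be sanity-checked against the per-item interaction counts. The toy `URM_train` below is hypothetical data.

```python
import numpy as np
import scipy.sparse as sps

# Toy binary URM (hypothetical data): rows are users, columns are items.
URM_train = sps.csr_matrix(np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
], dtype=np.float32))

# Direct sparse product keeps the self-similarity on the diagonal.
gram_matrix = URM_train.T.dot(URM_train).toarray()

# For binary data, G_jj equals the number of users who interacted with item j.
item_popularity = np.asarray(URM_train.sum(axis=0)).ravel()
diagonal_is_intact = np.allclose(np.diag(gram_matrix), item_popularity)
```

A check like `diagonal_is_intact` would have flagged the zeroed diagonal before the inversion step.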

tohtsky (Author) commented Aug 27, 2020

I'm amazed at how quickly the bug was fixed and the experiments were updated!
It seems that the fix improves the performance of EASE_R, especially on the Movielens20M dataset,
which confirms the author's original claim.

> We may in the future try to do another study like the RecSys 2019 to see if something has changed.

Great! Looking forward to seeing your future work!

tohtsky closed this as completed Aug 27, 2020