Lonely points #2

shanishalgi opened this issue Jul 18, 2016 · 5 comments
Comments

@shanishalgi

Hi,
I'm trying to use LargeVis to visualize my doc2vec features of 20NG (7532 test documents, 100 features each). I'm using all the default parameters, and I get the following result.
[image: largevis20ng]
I was surprised by the lonely points in the data, as their corresponding documents were not noticeably different from the others in their category. I tried running the algorithm after taking these documents out of the dataset, but got a similar pattern of results: roughly 6-9 lonely points representing seemingly normal documents. I previously modeled this data using various t-SNE methods and none showed such a pattern. Is there a simple explanation, or something I am overlooking?

Also, plot.py only works for me if I change vec[1], vec[2] to vec[0], vec[1] on line 29.

Thanks in advance
Shani
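
The plot.py indexing problem described above comes down to which columns of each output line hold the 2D coordinates. As a minimal sketch (not the fix that was actually committed to plot.py), one way to avoid depending on whether an id column precedes the coordinates is to read the last two fields. The function name `parse_point` is hypothetical, and the assumption that the coordinates are always the final two whitespace-separated fields is mine, not something stated in this thread.

```python
def parse_point(line):
    """Return (x, y) from one whitespace-separated output line.

    Assumption (not confirmed by the thread): the two coordinates are the
    last two fields, whether or not an id column precedes them.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least two fields, got %r" % line)
    return float(fields[-2]), float(fields[-1])


# Both layouts yield the same point.
print(parse_point("0.5 -1.25"))     # line with coordinates only
print(parse_point("17 0.5 -1.25"))  # line with a leading id column
```

Indexing from the end means the same code handles both the vec[0], vec[1] and the vec[1], vec[2] layouts.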

@lferry007
Owner

Hi,

I have fixed the bug in plot.py. Please try it!

Your visualization indeed looks odd. Could you please share with us the feature vectors you were using? Thanks!

@shanishalgi
Author

Attached are my feature vectors and their labels:
ng20test_features.txt
ng20test_labels.txt

@lferry007
Owner

Hi,

Using the feature vectors you shared and default parameter settings, we get the following visualization.
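[image: ng]
https://cloud.githubusercontent.com/assets/15796471/18515282/63320bb2-7ac7-11e6-8335-8d4a86925c11.png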

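Since your data set is relatively small, we tried decreasing the "-neigh" parameter to 50 and the "-perp" parameter to 20, and got a visualization like this:
[image: ng_neigh50_perp20]
https://cloud.githubusercontent.com/assets/15796471/18515328/924aae5e-7ac7-11e6-999f-549cfd2f75ce.png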

It looks better. (We did not try any other settings.)
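
For anyone wanting to repeat the run with the smaller "-neigh" and "-perp" values mentioned above, here is a minimal sketch that only assembles the command line. The "-neigh" and "-perp" flags come from this thread. The `-input` and `-output` flags, the `./LargeVis` binary path, the output file name, and the helper name `largevis_command` are assumptions for illustration, so check them against your own build. The input file must already be in whatever format your LargeVis build expects.

```python
def largevis_command(input_path, output_path, neigh=50, perp=20,
                     binary="./LargeVis"):
    """Build an argument list for a LargeVis run.

    -neigh and -perp are the parameters discussed in this thread.
    -input, -output, and the binary path are assumptions; adjust them
    to match your local installation.
    """
    return [
        binary,
        "-input", input_path,
        "-output", output_path,
        "-neigh", str(neigh),
        "-perp", str(perp),
    ]


cmd = largevis_command("ng20test_features.txt", "ng20test_2d.txt")
print(" ".join(cmd))

# To actually launch it (requires a compiled binary at the given path):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell quoting problems if the file paths contain spaces.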

@shanishalgi
Author

Hi,
Has the code been changed since I uploaded my data? I used the default settings and got a very different result.

(Sent by email in reply to lferry007's comment of Wed, Sep 14, 2016, 5:17 PM.)

@tangjianpku
Collaborator

We've updated the code; please give it another try. If the problem persists, it may be caused by your system configuration.

Thanks,
Jian
https://sites.google.com/site/pkujiantang/home

sin-mike pushed a commit to sin-mike/LargeVis that referenced this issue Nov 21, 2017
…y_to_largeviz_input

Feature/2017 03 27/npy to largeviz input