Relate Killed? #113
Comments
That's a large cohort! Also, to make sure I understand your splits: you are sending a total of 78K samples per run? Is that right?
This is the error output: b'somalier version: 0.2.16\n[somalier] starting read of 78306 samples\nKilled\n'. I have run it both combining two splits (~156,000 samples) and within just one split (78,306 samples). I guess I will just keep splitting until I have enough resources to run.
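A bare "Killed" line usually means the process received SIGKILL. On Linux that is often the out-of-memory killer, but a job scheduler or container memory limit can produce the same message, so it is worth confirming before splitting further. The sketch below is one way to check from Python on a POSIX system; `killed_by_sigkill` and `run_and_check` are hypothetical helper names, and the command to run is left to the caller.

```python
import signal
import subprocess


def killed_by_sigkill(returncode):
    """Return True if a return code indicates the process received SIGKILL.

    subprocess reports death-by-signal as a negative signal number (-9);
    a POSIX shell wrapping the command reports 128 + signal number (137).
    """
    return returncode in (-signal.SIGKILL, 128 + signal.SIGKILL)


def run_and_check(cmd):
    """Run cmd (a list of arguments) and report whether it was SIGKILLed.

    Returns the combined stdout/stderr bytes and a boolean flag.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return result.stdout, killed_by_sigkill(result.returncode)
```

If the flag comes back true even though the machine has free memory, a per-job or per-container limit is a plausible cause to rule out.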
@brentp Another question I am curious about: my group has previously used KING to estimate the relatedness of individuals in our cohort, but our cohort has grown so large that we are testing out Somalier. Using KING, we defined 2nd-degree relatives as those with kinship > 0.0884. Does this threshold fit with the relatedness predictions from Somalier, or is there another threshold we should use? Thank you for all of your hard work on this.
That cutoff should generally be OK. The problem is that if you have 20K samples and a single low-quality sample, that sample may appear to be related at that level to all 19,999 other samples, so you do have to do some filtering. Another thought: it's most likely to die while generating the JSON for the HTML, as that uses a lot of memory. But even in that case, you may still get the text output.
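As a rough illustration of the filtering mentioned above, the sketch below counts how many other samples each sample exceeds the cutoff with, and flags any sample with an implausibly large number of apparent relatives. It assumes the pairwise results have already been parsed into `(sample_a, sample_b, value)` tuples; the function names and the `max_relatives` cutoff are hypothetical and would need tuning for a real cohort.

```python
from collections import Counter


def count_relatives(pairs, threshold):
    """Count, per sample, the pairs whose value exceeds the threshold."""
    counts = Counter()
    for sample_a, sample_b, value in pairs:
        if value > threshold:
            counts[sample_a] += 1
            counts[sample_b] += 1
    return counts


def flag_suspect_samples(pairs, threshold=0.0884, max_relatives=50):
    """Return samples that look related to more than max_relatives others.

    threshold is the cutoff quoted in this thread; max_relatives is an
    arbitrary illustrative value, not a recommendation.
    """
    counts = count_relatives(pairs, threshold)
    return {sample for sample, n in counts.items() if n > max_relatives}


# Toy input: "bad" clears the cutoff with every other sample.
toy_pairs = [
    ("bad", "s1", 0.12),
    ("bad", "s2", 0.10),
    ("bad", "s3", 0.11),
    ("s1", "s2", 0.01),
    ("s2", "s3", 0.30),
]
suspects = flag_suspect_samples(toy_pairs, max_relatives=2)
```

Flagged samples can then be inspected for quality problems before the cutoff is applied to the rest of the cohort.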
Hello,
I am trying to run relate on > 400,000 samples. I know this is quite the task, and I believe I have figured out a way to run it by chunking the samples into 6 different groups and comparing them head to head. For example, I would compare split 1 to split 2, with ~78,000 samples in each group (~156,000 per relate run). I thought this would fix my issues with memory; however, when I run Somalier, it still gets killed within seconds. I calculated I would need at least 42 GB of memory to run this, and my machine has 250 GB available. Any other ideas for how to work around this?
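To make the size of the head-to-head scheme concrete, here is a small sketch of the arithmetic: how many split-versus-split runs 6 groups imply, and how many sample pairs a single run of ~156,000 samples contains. The function names are hypothetical, and this only counts pairs; it says nothing about how much memory Somalier needs per pair.

```python
from itertools import combinations


def pairwise_runs(n_splits):
    """All unordered pairs of 1-based split indices to compare head to head."""
    return list(combinations(range(1, n_splits + 1), 2))


def n_sample_pairs(n_samples):
    """Number of unordered sample pairs within one run of n_samples."""
    return n_samples * (n_samples - 1) // 2


runs = pairwise_runs(6)
print(len(runs))                # 15 head-to-head runs
print(n_sample_pairs(156_000))  # 12167922000 pairs in one combined run
```

Note that each split takes part in 5 of the 15 runs, so the within-split pairs are recomputed each time unless those results are deduplicated afterwards.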