Prolonged running time for PEER #162

Closed
hsun3163 opened this issue Feb 17, 2022 · 3 comments

Comments

@hsun3163
Collaborator

At the moment, a PEER analysis on log2cpm expression data in Ast, with a maximum of 1000 iterations and a maximum of 60 factors, has not completed after 24 hr.

The main time cost is the PEER_update function, which runs iteratively for up to 1000 iterations.

For this particular dataset, it takes ~15 hr to complete ~400 iterations, and the stdout file stops updating after that.
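
As a rough back-of-the-envelope check on these numbers, the sketch below extrapolates the total wall time linearly from the observed ~15 hr for ~400 iterations. It assumes every iteration costs about the same, which may not hold for PEER, and the function name `projected_hours` is purely illustrative.

```python
def projected_hours(observed_hours, observed_iters, target_iters):
    """Linearly extrapolate wall time, assuming a constant cost per iteration."""
    if observed_iters <= 0:
        raise ValueError("observed_iters must be positive")
    # Multiply before dividing to keep the arithmetic exact for these inputs.
    return observed_hours * target_iters / observed_iters


# Figures reported above: ~15 hr for ~400 iterations, 1000 iterations requested.
total = projected_hours(15, 400, 1000)
remaining = total - 15
print(f"Projected total: {total:.1f} hr, remaining after iteration 400: {remaining:.1f} hr")
```

Under that assumption, a full 1000-iteration run would need well over the 24 hr already spent, so the job not finishing within a day would be expected even without the stall at iteration ~400.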

The R wrapper for PEER does not seem to have any option that would use more resources to speed up the analysis, possibly because R uses only one core.

Therefore, I wonder whether there is any way to optimize this process, or a compromise we could make to speed it up, e.g. lowering the number of factors to be estimated?

@gaow
Contributor

gaow commented Feb 22, 2022

As discussed, let's try to hack APEX into taking a fake VCF file ... but I'm still wondering whether we want to run PEER separately by asking for more CPU hours. There are a few other methods out there for the remove unwanted variation (RUV) type of analysis (see here on page 2 for a review). I don't feel motivated to compare these methods by the number of QTLs discovered, as in the APEX paper (where the bi-cv method seems to outperform PEER). But at least we should try to provide the PEER results for the analysis, because they are "GTEx endorsed"?

@hsun3163
Collaborator Author

It is less about the walltime and more about the fact that PEER stops at iteration ~400 every time it runs, which I have no way to debug.

In the PEER repo, the developers suggest using an updated version of PEER (mz2/peer#16), which may be better supported in terms of efficiency.

@gaow
Contributor

gaow commented Feb 24, 2022

Thanks @hsun3163, I had a long offline discussion with the developer's team. We'll implement a version of it based on our discussions. I'll document it in more detail on the PEER and APEX module pages.

@gaow gaow closed this as completed Feb 24, 2022