Prolonged running time for PEER #162
Comments
As discussed, let's try to hack APEX into taking a fake VCF file ... but I'm still thinking about whether we want to run PEER separately by asking for more CPU hours. There are a few other methods out there for this remove-unwanted-variation (RUV) type of analysis (see here on page 2 for a review). I don't feel motivated to compare these methods by the number of QTLs discovered, as in the APEX paper (where the bi-cv method seems to outperform PEER). But at least we should try to provide this data from PEER for the analysis, because it is "GTEx endorsed"?
It is less about the walltime, and more about the fact that PEER stops at iteration ~400 every time it runs, which I have no way to debug. From the PEER repo, the developers do suggest using an updated version of PEER: mz2/peer#16, which may be better supported in terms of efficiency.
Thanks @hsun3163. I had a long offline discussion with the developers. We'll implement a version of it based on our discussions. I'll document it in more detail on the PEER and APEX module pages.
At the moment, a PEER analysis on log2cpm expression data in Ast, with a maximum of 1000 iterations and 60 factors, has not completed after 24 hours.
The main toll on time is the PEER_update function, which runs its variational updates for up to 1000 iterations.
For this particular dataset, it takes ~15 hours to complete ~400 iterations, and the stdout file fails to update afterward.
The R wrapper for PEER does not seem to offer any option to use more resources to speed up the analysis, possibly because R uses only a single core.
Therefore, I wonder whether there is any way we could optimize this process, or make a compromise to speed it up, e.g. by lowering the number of factors to be estimated? A sketch of the knobs involved follows below.
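For concreteness, here is a minimal sketch of where those compromises would be made through the PEER R wrapper. The `expr` matrix name and the specific factor count, iteration cap, and tolerance values are illustrative assumptions, not a tested configuration:

```r
# Minimal sketch (assumptions: `expr` is a samples-by-genes log2cpm matrix;
# the numeric values below are illustrative, not tuned).
library(peer)

model <- PEER()
PEER_setPhenoMean(model, as.matrix(expr))

# Compromise 1: estimate fewer hidden factors (e.g. 30 instead of 60).
PEER_setNk(model, 30)

# Compromise 2: cap the iterations well below 1000.
PEER_setNmax_iterations(model, 500)

# Compromise 3: loosen the convergence tolerances so the run can stop
# early once the bound / residual variance changes become small.
PEER_setTolerance(model, 0.01)
PEER_setVarTolerance(model, 0.001)

# The expensive step: a single call that runs all variational updates
# until convergence or the iteration cap is hit.
PEER_update(model)

factors   <- PEER_getX(model)          # inferred factors (samples x Nk)
residuals <- PEER_getResiduals(model)  # expression with factor effects removed
```

Since the wrapper runs single-threaded, the only other obvious lever would be launching independent PEER runs (e.g. per tissue) as separate jobs rather than expecting parallelism within one run.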