Report feature quality #5
Comments
Is feature quality consistent across various datasets and sources?
Are there groups with overall high or low quality according to median scores across datasets?
Same analysis, but including LINCS:
Feature replicability across datasets
[Plots not shown. Panels: (1) TA-ORF, Jump-ORF and Jump-CRISPR; (2) Jump-Compound and LINCS]
We discussed in the check-in that most JUMP sources do not have replicates of a given compound. The exception is the Target2 plates, which most partners ran in many replicates. source_1 did not run that plate, which explains why its result is quite different, though we are not sure which compounds are shown in its plot, because very few would have replicates at all! So here we are probably looking at the results for around 300 compounds. Another exception is that the three wave 2 partners may have had 2 replicates per compound because they used a different swapping scheme. Overall, Marzieh, if you're able to describe some conclusions from each result, that would be great, because it's hard to grasp what analysis is happening just by looking at the plots. Thx!
@AnneCarpenter sure, these results are not complete yet. In the check-in I just wanted to show you that the mito_radialdistribution category has low quality according to the median over datasets (with the caveat of high variance across datasets). I will go through a complete interpretation once the tasks in this issue are complete. For now I can say that feature quality seems to be consistent among batches within an experiment, but that doesn't hold across datasets. That means we can't say, for example, that a specific group of features is always low quality relative to the rest of the features in all Cell Painting experiments/datasets, but we can make such a statement for different batches within a dataset/source/experiment. But let's pause here and come back to it once I have all I need to reach a conclusion.
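To make the "median over datasets" caveat concrete, here is a minimal sketch of summarizing one quality score per feature category per dataset. Everything in it is an assumption for illustration: the scores are made-up numbers, `hypothetical_other_category` and the `summarize` helper are hypothetical names, and the thread does not say how the actual quality metric is computed.

```python
from statistics import median

# Hypothetical per-dataset quality scores for two feature categories.
# All numbers are invented for illustration; they are not results from this issue.
scores = {
    "mito_radialdistribution": {"TA-ORF": 0.10, "Jump-ORF": 0.55, "Jump-CRISPR": 0.15},
    "hypothetical_other_category": {"TA-ORF": 0.60, "Jump-ORF": 0.65, "Jump-CRISPR": 0.62},
}


def summarize(per_dataset):
    """Return the median score and the min-to-max range across datasets."""
    values = list(per_dataset.values())
    return {"median": median(values), "range": max(values) - min(values)}


summary = {category: summarize(per_dataset) for category, per_dataset in scores.items()}

for category, stats in summary.items():
    print(f"{category}: median={stats['median']:.2f}, range={stats['range']:.2f}")
```

Reporting a spread next to the median shows whether a low median reflects a category that is low everywhere or one that varies a lot from dataset to dataset.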
cc @shntnu @MarziehHaghighi
Here is a list of materials to generate and questions to explore. Please add anything else you think might be useful to check by directly editing my comment:
For each of the three ORF, CRISPR, and Compound datasets, I check the following:
UPDATE:
Basic tables as reference:
Can we trust the ranking?
The current quality metric ranks features based on their consistency within an experiment. Do features replicate across experiments?
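One possible way to probe whether the ranking carries over between experiments is a Spearman rank correlation between per-feature quality scores from two experiments. This is only a sketch under stated assumptions: the score lists are made-up values for the same four features in the same order, the helper names are hypothetical, and nothing here is taken from the analysis code behind this issue.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical quality scores for the same four features in two experiments.
quality_experiment_a = [0.9, 0.7, 0.4, 0.1]
quality_experiment_b = [0.8, 0.75, 0.2, 0.3]

rho = spearman(quality_experiment_a, quality_experiment_b)
print(f"Spearman correlation of feature rankings: {rho:.2f}")
```

A value near 1 would suggest the two experiments order features similarly, while a value near 0 would suggest the ranking is specific to each experiment.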