Coding mutations in non-coding variants table #231

Open
sci-kai opened this issue May 5, 2023 · 6 comments

@sci-kai

sci-kai commented May 5, 2023

Hi, I have an issue with how the tables are split into coding and non-coding variants.
In my dataset there are (germline) variants whose consequence is annotated as "frameshift_variant" or even "coding_sequence_variant", yet they are sorted into the "non-coding" table.
I may not understand the criteria that define coding and non-coding variants; more documentation and clarity about them would help me understand this error.
Here is an example variant that I found in the non-coding table:

chr: 11
position: 87622535
ref: C
alt: <DEL>
gene: Rnf43
impact: HIGH
consequence: frameshift_variant&feature_truncation

@FelixMoelder
Contributor

FelixMoelder commented May 6, 2023

Hi @sci-kai! Thanks for reaching out. Variants for which an HGVSp value has been annotated are considered coding variants. I assume that your variant does not have one? I am not sure whether this is the only criterion, but I will have a look at this after the weekend and get back to you.

Edit: I just checked. Splitting variants into coding and non-coding is done by considering canonical transcripts and the presence or absence of an HGVSp value for each variant.
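
The rule described above can be sketched roughly as follows. This is only an illustration, not the actual implementation in the workflow: the `Annotation` class, its field names, and the `assign_table` helper are hypothetical, and the HGVSp string in the usage example is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical per-transcript annotation of one variant."""
    gene: str
    consequence: str
    canonical: bool       # assumed flag: annotation is on the canonical transcript
    hgvsp: Optional[str]  # None or "" when no protein-level change was annotated


def assign_table(ann: Annotation) -> Optional[str]:
    """Return the table a variant annotation would land in under the rule above."""
    if not ann.canonical:
        # Assumption: only canonical transcripts are considered at all.
        return None
    # The presence of an HGVSp value is what marks a variant as coding.
    return "coding" if ann.hgvsp else "noncoding"


# The variant from this issue: HIGH impact frameshift, but no HGVSp value.
rnf43 = Annotation("Rnf43", "frameshift_variant&feature_truncation", True, None)
print(assign_table(rnf43))
```

Under these assumptions the Rnf43 example ends up in the non-coding table even though its consequence is a frameshift, because the consequence field is never consulted.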

@sci-kai
Author

sci-kai commented May 8, 2023

Thanks for the clarification! It is also good to know that transcript selection is performed at this step.
The variants I found do not have an HGVSp value, as they are mostly structural variants called with delly, which are probably difficult to annotate with HGVSp values. That explains my confusion.

@johanneskoester
Contributor

I think we should rename the tables slightly, so that it becomes clear that noncoding can also contain variants for which no information on the amino acid impact is available.

@johanneskoester
Contributor

Suggestions welcome.

@sci-kai
Author

sci-kai commented Dec 5, 2023

Hi Johannes, I think it is a good idea to add "unknown" or something similar to "noncoding", i.e., to rename the "noncoding" table to "noncoding/unknown".
In general, molecular biologists filter for mutations based on the Ensembl consequence terms so as not to miss frameshift mutations like the one in my example, so maybe these terms should be considered for splitting the tables in more detail.
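
A minimal sketch of what such a consequence-based check could look like is given below. The set of terms is an illustrative, non-exhaustive selection of Ensembl/Sequence Ontology consequence terms, and the function name is hypothetical; it is not part of the workflow.

```python
# Illustrative (not exhaustive) selection of consequence terms that
# affect the coding sequence of a transcript.
CODING_TERMS = frozenset({
    "frameshift_variant",
    "stop_gained",
    "stop_lost",
    "start_lost",
    "missense_variant",
    "inframe_insertion",
    "inframe_deletion",
    "protein_altering_variant",
    "synonymous_variant",
    "coding_sequence_variant",
})


def is_coding_by_consequence(consequence: str) -> bool:
    """True if any of the '&'-joined consequence terms is a coding term."""
    return any(term in CODING_TERMS for term in consequence.split("&"))


# The example variant from this issue would be recognized as coding,
# regardless of whether an HGVSp value is present.
print(is_coding_by_consequence("frameshift_variant&feature_truncation"))
```

Such a check could complement the HGVSp criterion, for instance to route variants without an HGVSp value but with a coding consequence into the coding table or a separate "unknown" table.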

@johanneskoester
Contributor

We are also currently investigating the possibility of merging everything into one table, because it is cumbersome to open a separate view just to see the noncoding/unknown variants.
