Support duplicate (aka parallel) edges in GFA / FASTG files? #239

Open
2 tasks
fedarko opened this issue Mar 31, 2023 · 0 comments

fedarko commented Mar 31, 2023

I'm working on #75 right now (the parsing functions have been updated, but I still need to update the surrounding parts of the codebase). At least in this branch, LastGraph, GML, and DOT files containing duplicate edges are now all supported. The parsing functions for these filetypes return nx.MultiDiGraph objects, rather than nx.DiGraph objects as before.
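
For anyone unfamiliar with the difference, here's a minimal sketch (toy node names, not taken from any real assembly graph) of why the return type matters: nx.DiGraph silently collapses a repeated edge into one, while nx.MultiDiGraph keeps each copy.

```python
import networkx as nx

# Toy edge list in which the edge A -> B appears twice (a "parallel" edge).
edges = [("A", "B"), ("A", "B"), ("B", "C")]

dg = nx.DiGraph()
dg.add_edges_from(edges)

mdg = nx.MultiDiGraph()
mdg.add_edges_from(edges)

# DiGraph stores at most one edge per ordered node pair.
n_di = dg.number_of_edges()
# MultiDiGraph keeps every copy, each under its own integer key.
n_multi = mdg.number_of_edges()
# Number of parallel edges from A to B in the multigraph.
parallel_ab = mdg.number_of_edges("A", "B")

print(n_di, n_multi, parallel_ab)
```

With the DiGraph, the second A -> B edge is lost at parse time, so nothing downstream could ever see it.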

The GFA and FASTG parsing functions also return nx.MultiDiGraph objects, but these functions delegate the actual parsing to other libraries (GfaPy and pyfastg, respectively), neither of which allows duplicate edges. This means that, although we could in principle support duplicate edges for these filetypes, running MetagenomeScope on a GFA or FASTG file that contains duplicate edges will raise an error from the library we delegate to.
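
As a rough sketch of how one might at least detect this situation in a GFA file before handing it to the parsing library: the snippet below scans GFA1 link (`L`) lines and reports any (from, from-orientation, to, to-orientation) combination that occurs more than once. `find_duplicate_gfa_links` is a hypothetical helper, not part of GfaPy or MetagenomeScope, and it assumes tab-separated GFA1 input. It also ignores the fact that a link and its reverse-complement counterpart describe the same connection.

```python
from collections import Counter


def find_duplicate_gfa_links(gfa_text):
    """Return {(from, from_orient, to, to_orient): count} for repeated L lines.

    Hypothetical helper: assumes GFA1, where a link line looks like
    L <from> <from_orient> <to> <to_orient> <overlap> (tab-separated).
    """
    counts = Counter()
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "L" and len(fields) >= 5:
            counts[tuple(fields[1:5])] += 1
    # Keep only the links that appear more than once.
    return {key: n for key, n in counts.items() if n > 1}


example = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tGGCA",
    "L\t1\t+\t2\t+\t0M",
    "L\t1\t+\t2\t+\t0M",  # same link again: a duplicate edge
    "L\t2\t+\t1\t-\t0M",
])
duplicates = find_duplicate_gfa_links(example)
print(duplicates)
```

A check like this wouldn't make duplicate edges work, but it could let us raise a clearer error message than whatever the underlying library produces.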

I don't think addressing this is an urgent or important issue, since I haven't seen many GFA or FASTG files containing duplicate edges (I'm not even sure how you'd express duplicate edges in SPAdes-dialect FASTG files). I'm just documenting this issue here, separately from #75, so that when #75 is addressed we will still have this thread open.

  • Support duplicate edges in GFA files (relatively feasible)
  • Support duplicate edges in FASTG files (uhhhh yeah idk about this one)
fedarko added a commit to fedarko/MetagenomeScope-1 that referenced this issue Mar 31, 2023
... even though the fastg / gfa ones don't actually allow for parallel
edges in parsing (marbl#239). We just wanna *create* a multidigraph so that
in the later stages of pattern identification, etc. (marbl#202), we can
assume safely that the object we are working with is a multidigraph.