Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors on motif with disconnected nodes #43

Open
bdpedigo opened this issue Jun 4, 2024 · 4 comments
Open

Errors on motif with disconnected nodes #43

bdpedigo opened this issue Jun 4, 2024 · 4 comments

Comments

@bdpedigo
Copy link

bdpedigo commented Jun 4, 2024

Encountered an error when feeding in some motifs that had isolated nodes.

Code:

import networkx as nx
from grandiso import find_motifs

host = nx.fast_gnp_random_graph(3, 0.5, directed=True)

motif = nx.DiGraph()
motif.add_node(0)
motif.add_node(1)
motif.add_node(2)
motif.add_edge(0, 1)
motif.add_edge(1, 0)

grandiso_motifs = find_motifs(motif, host, count_only=False)

Error:

File ~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:204, in get_next_backbone_candidates(backbone, motif, host, interestingness, next_node, directed, is_node_structural_match, is_node_attr_match, is_edge_attr_match, isomorphisms_only)
    [201](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:201)             _greatest_backbone_count = motif_backbone_connections_count
    [202](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:202)     # Now we have _node_with_greatest_backbone_count as the best candidate
    [203](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:203)     # for `next_node`.
--> [204](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:204)     next_node = max(
    [205](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:205)         _nodes_with_greatest_backbone_count,
    [206](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:206)         key=lambda node: interestingness.get(node, 0.0),
    [207](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:207)     )
    [209](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:209) # Now we have a node `next_node` which we know is connected to the current
    [210](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:210) # backbone. Get all edges between `next_node` and nodes in the backbone,
    [211](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:211) # and verify that they exist in the host graph:
    [212](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:212) # `required_edges` has the form (prev, self, next), with non-values filled
    [213](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:213) # with None. That way we can easily remember and store the roles of the
    [214](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:214) # node IDs in the next step.
    [215](https://file+.vscode-resource.vscode-cdn.net/Users/ben.pedigo/code/skedits/skedits-app/skedits/~/code/skedits/skedits-app/skedits/.venv/lib/python3.11/site-packages/grandiso/__init__.py:215) required_edges = []

ValueError: max() arg is an empty sequence

Totally understand if you want to punt on how whether/how to execute the search with these disconnected motifs, but some documentation and/or early value erroring around this would be helpful. Took me a while to figure out what I was doing wrong (since the NetworkX will execute for these cases).

@j6k4m8
Copy link
Member

j6k4m8 commented Jun 4, 2024

hey @bdpedigo! that's a great point, we should definitely fail early on these.

The obvious reason we don't support multi-component motifs is that there is an exponential cost to enumerating motif matches for each connected component; what would be the sort of behavior you prefer? docs that make this explicit? a check at call-time that raises an exception? it doesn't raise right now because I've been trying to minimize the number of branches in the find_motifs call. but of course, uninterpretable errors = bad :)

@j6k4m8
Copy link
Member

j6k4m8 commented Jun 4, 2024

PS: excited that grandiso is useful to you :)

I guess a third option is computing matches for each component and returning them either as a product map (A1+B1, A1+B2, ... An+Bn) or as a list of length n with match lists for each...

@bdpedigo
Copy link
Author

bdpedigo commented Jun 5, 2024

PS: excited that grandiso is useful to you :)

Same! 🎉

I feel the simplest answer would be to just throw an error if the input motif is not fully connected. And given that this would be dictating the allowed inputs, a brief line in the documentation that says "input motif must be one connected component" also seems warranted.

In terms of behavior if you were to allow this kind of thing, I would default to whatever networkx is doing currently (since it does at least run):

nx_motifs = list(
    nx.isomorphism.DiGraphMatcher(host, motif).subgraph_monomorphisms_iter()
)

I suspect it in practice it is something like your product map (if I understand you correctly) but I doubt they are doing anything explicit to recognize that this subgraph is disconnected when they compute it. That being said, I have what I need for my application, so no pressure at all here to change the implementation.

@j6k4m8
Copy link
Member

j6k4m8 commented Jun 5, 2024

I like that solution — will add shortly :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants