Approach for handling 4 billion nodes on modern hardware #3
Comments
It's definitely possible, but that would involve preprocessing the graph to analyze its connectivity and identify disconnected sub-graphs that could be calculated independently. Also, when you follow this approach I think you cannot normalize the edge values so that their sum amounts to 1 - not 100% sure. It would also mean you would have to know beforehand after how many iterations to stop. My current needs involve processing a little over 10 million nodes, and for that 32GB of RAM is enough. I suggest you look into https://github.com/GraphChi/graphchi-cpp and the respective papers, as it's an amazing piece of software and the parallel sliding window is a very nice idea. Another alternative could be something like InfiniMem, but from my benchmarks it's nowhere near as efficient. If you want to add graph sharding to this project, PRs are more than welcome though. :)
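A minimal sketch of that preprocessing step, assuming dense `uint32` node IDs and a plain edge list (neither of which is part of this project's API): union-find labels the connected components, and each component could then be ranked as an independent shard.

```go
package main

import "fmt"

// Label connected components with union-find so that each disconnected
// sub-graph could be ranked on its own. Node IDs and the edge-list input
// are assumptions for illustration only.

type unionFind struct {
	parent []uint32
}

func newUnionFind(n uint32) *unionFind {
	uf := &unionFind{parent: make([]uint32, n)}
	for i := range uf.parent {
		uf.parent[i] = uint32(i)
	}
	return uf
}

func (uf *unionFind) find(x uint32) uint32 {
	for uf.parent[x] != x {
		uf.parent[x] = uf.parent[uf.parent[x]] // path halving keeps trees shallow
		x = uf.parent[x]
	}
	return x
}

func (uf *unionFind) union(a, b uint32) {
	if ra, rb := uf.find(a), uf.find(b); ra != rb {
		uf.parent[ra] = rb
	}
}

func main() {
	const n = 6
	edges := [][2]uint32{{0, 1}, {1, 2}, {3, 4}} // node 5 is isolated

	uf := newUnionFind(n)
	for _, e := range edges {
		uf.union(e[0], e[1])
	}

	// Group nodes by component root; each group could become its own shard.
	components := map[uint32][]uint32{}
	for i := uint32(0); i < n; i++ {
		root := uf.find(i)
		components[root] = append(components[root], i)
	}
	fmt.Println(components) // e.g. map[2:[0 1 2] 4:[3 4] 5:[5]]
}
```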
GraphChi looks like a great project. Thanks for those links. Unfortunately I haven't spent any time on C++ in a long time and don't remember much. I guess I just need to rent an EC2 r3.2xlarge for a couple of hours to fit a larger dataset in memory.
@xeoncross Throwing money at the problem is usually the easiest route. :) But there's also a GraphChi version in Java (see https://github.com/GraphChi/graphchi-java/blob/master/src/main/java/edu/cmu/graphchi/apps/WeightedPagerank.java) if you feel more comfortable with that and want to tackle the problem. ;)
I know this is pretty old, but I thought I'd comment in case it helps anyone. There are a couple of ways you could improve the ability to scale, but at the size of the problem you're considering, you're running into fundamental memory requirements.

I see you're storing the nodes in a map and the edges in a two-level map. That's going to result in a good deal of overhead (space and time) from the hash maps, but the bare minimum you could get away with, using 32-bit weights, for a 1 billion x 1 billion matrix with 100 nonzero elements per node is 1.2 TB. That's if you were to use a CSR matrix with 32-bit values (the two index arrays would have to be 64-bit because of the number of edges): 10^9 nodes x 100 nonzeros = 10^11 entries, at 4 bytes per value plus 8 bytes per index, is roughly 1.2 TB. With hash maps, you should probably expect a roughly constant multiple of that for memory usage.

If it's important to speed this up, though, switching to a sparse matrix and using BLAS for the sparse matrix - dense vector product would do wonders. It would immediately reduce memory usage and increase performance, and you'd get parallelism for free, if it would actually benefit your problem. Sparse matrix multiplication is very complex and the communication required often outweighs the benefit.

Lastly, if you need to work on matrices larger than main memory, this can be done as a block matrix multiplication: keep only the block of the matrix that you're working on in memory, load the next block, combine results, and so on. "I/O-efficient techniques for computing PageRank" gives an overview of a few methods of doing that (full-text PDF available).
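As a rough Go sketch of the sparse-matrix idea above, here is one PageRank-style iteration written as a CSR matrix times dense vector product. The field names, the damping factor, and the assumption that the weights are already column-normalized are illustrative choices, not this project's API; a tuned sparse kernel would replace the inner loop in practice, and dangling nodes are ignored to keep the sketch short.

```go
package main

import "fmt"

// csr is a compressed sparse row matrix with 32-bit weights. rowPtr holds
// offsets into colIdx/val, so it must be 64-bit once the edge count passes 2^32.
type csr struct {
	rowPtr []uint64
	colIdx []uint64
	val    []float32
}

// step computes rank' = (1-d)/n + d * (A . rank) for a dense rank vector,
// assuming each column of A sums to 1 (out-weights already normalized).
func step(m csr, rank []float64, d float64) []float64 {
	n := len(rank)
	next := make([]float64, n)
	base := (1 - d) / float64(n)
	for i := 0; i < n; i++ {
		sum := 0.0
		for k := m.rowPtr[i]; k < m.rowPtr[i+1]; k++ {
			sum += float64(m.val[k]) * rank[m.colIdx[k]]
		}
		next[i] = base + d*sum
	}
	return next
}

func main() {
	// Toy 3-node graph: edges 1->0 (1.0), 2->0 (0.5), 2->1 (0.5), 0->2 (1.0).
	m := csr{
		rowPtr: []uint64{0, 2, 3, 4},
		colIdx: []uint64{1, 2, 2, 0},
		val:    []float32{1, 0.5, 0.5, 1},
	}
	rank := []float64{1.0 / 3, 1.0 / 3, 1.0 / 3}
	for i := 0; i < 20; i++ {
		rank = step(m, rank, 0.85)
	}
	fmt.Println(rank)
}
```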
A `uint` can hold values past 4 billion, so this system can handle graphs with up to 4294967295 nodes. Currently, a graph with only 1,000,000 nodes (with 100,000,000 links) takes 2.6GB of memory to calculate. Approaching 1 billion nodes (with 10 or 100 links each) would take a massive server to calculate.

Is there any way we could split the processing up by sharding the graph into smaller parts rather than processing the whole thing in memory? Perhaps we could even combine the `graph.edges` and `graph.nodes` weight values, since they contain duplicate weights (or swap them for pointers to a shared `float64` value).
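A hypothetical sketch of the "pointer to a shared float64" idea: store each weight once on the node and let the edge hold a pointer to it instead of its own copy. The struct shapes below are assumptions for illustration only and do not mirror this project's actual `graph.nodes` and `graph.edges` types.

```go
package main

import "fmt"

// node owns the single copy of a weight.
type node struct {
	weight float64
}

// graph stores edges as pointers into node weights rather than duplicating them.
type graph struct {
	nodes map[uint32]*node
	edges map[uint32]map[uint32]*float64 // source -> target -> shared weight
}

func newGraph() *graph {
	return &graph{
		nodes: map[uint32]*node{},
		edges: map[uint32]map[uint32]*float64{},
	}
}

func (g *graph) link(source, target uint32, weight float64) {
	n, ok := g.nodes[source]
	if !ok {
		n = &node{}
		g.nodes[source] = n
	}
	n.weight = weight
	if g.edges[source] == nil {
		g.edges[source] = map[uint32]*float64{}
	}
	g.edges[source][target] = &n.weight // the edge aliases the node's weight
}

func main() {
	g := newGraph()
	g.link(1, 2, 0.5)
	g.nodes[1].weight = 0.9     // update the node once...
	fmt.Println(*g.edges[1][2]) // ...and the edge sees 0.9
}
```

Note that a `*float64` is the same size as the value it replaces (8 bytes), so this only saves memory where the same weight would otherwise be stored in more than one place; whether that holds depends on how the weights in `graph.edges` and `graph.nodes` actually relate.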