This repository provides an implementation of the Affinity Clustering method for partitioning sparse graphs, introduced by Bateni et al. (2017). Apache Spark is used to perform the parallel computation.
The algorithm, along with testing code, is provided in the following Jupyter Notebook.
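
For orientation, the core idea of Affinity Clustering can be sketched on a single machine: in each round, every current cluster selects its cheapest outgoing edge, and clusters joined by a selected edge are merged (Borůvka-style rounds). The snippet below is a minimal illustration only, not the Spark implementation from the notebook. The function name `affinity_clustering`, its parameters, and the assumption that edge weights are distances (smaller means closer) are hypothetical choices made for this sketch.

```python
def affinity_clustering(num_nodes, edges, num_rounds=1):
    """Single-machine sketch of Borůvka-style affinity clustering.

    num_nodes:  nodes are labelled 0 .. num_nodes - 1
    edges:      list of (u, v, weight); weight is assumed to be a distance
    num_rounds: number of merge rounds to run (hypothetical parameter)
    Returns a list with one cluster label per node.
    """
    parent = list(range(num_nodes))  # union-find forest

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(num_rounds):
        # Step 1: each cluster picks its cheapest edge leaving the cluster.
        best = {}
        for u, v, w in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge is internal to a cluster
            for root in (ru, rv):
                # (w, u, v) gives a deterministic tie-break on equal weights.
                if root not in best or (w, u, v) < best[root]:
                    best[root] = (w, u, v)
        if not best:
            break  # no edges between clusters remain

        # Step 2: merge the clusters connected by the selected edges.
        for w, u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv

    return [find(x) for x in range(num_nodes)]


if __name__ == "__main__":
    # Two tight triangles-in-a-path joined by one expensive bridge edge.
    demo_edges = [(0, 1, 1), (1, 2, 1), (3, 4, 1), (4, 5, 1), (2, 3, 10)]
    print(affinity_clustering(6, demo_edges, num_rounds=1))
```

Running more rounds yields coarser clusterings, so collecting the labels after each round gives one level of a hierarchy per round.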
- Python 3.9
- PySpark 3.0.1
- numpy 1.20.1
- networkx 2.5
- matplotlib 3.3.4
- scikit-learn (imported as `sklearn`)
- scipy 1.6.1
- jupyter
Install the required dependencies:
pip install -r requirements.txt
Recommended: use a dedicated virtual environment or Anaconda for easier dependency management.
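
After installing, it can be useful to confirm that the active environment matches the versions listed above. The following is a small sketch using the standard-library `importlib.metadata` module; the `PINNED` mapping and the `check_versions` helper are hypothetical names introduced here, and the mapping only covers the packages for which this README states a version.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the requirements list in this README.
PINNED = {
    "pyspark": "3.0.1",
    "numpy": "1.20.1",
    "networkx": "2.5",
    "matplotlib": "3.3.4",
    "scipy": "1.6.1",
}


def check_versions(pinned=PINNED):
    """Return {package: (expected, installed, matches)} for each pinned package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```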