smart dependency manager #511

Open
wants to merge 170 commits into
base: master
170 commits
9208e27
naive first pass, not working
dcolinmorgan Oct 5, 2023
1b1a727
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
eb4ac0c
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
ea08c7c
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
e0c7123
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
a41f762
lint
dcolinmorgan Oct 10, 2023
d54ee2e
umap smart dependecies
dcolinmorgan Oct 11, 2023
01abf59
update umap&feature tests
dcolinmorgan Oct 12, 2023
2e58fa5
update umap&feature tests
dcolinmorgan Oct 12, 2023
2960bda
update umap&feature tests
dcolinmorgan Oct 12, 2023
e2fac00
feature_utils build import_min_exn using deps
dcolinmorgan Oct 12, 2023
70d3e9b
feature_utils build import_min_exn using deps
dcolinmorgan Oct 12, 2023
4d8c6c8
add return types
dcolinmorgan Oct 12, 2023
3c2fdcf
add return types
dcolinmorgan Oct 12, 2023
f168a4f
working dgl, progress on embed
dcolinmorgan Oct 13, 2023
5144e3c
smart packages load, subfunctions not yet
dcolinmorgan Oct 13, 2023
f7a8e01
working embed and library function import
dcolinmorgan Oct 13, 2023
3e3d44c
working embed and library function import
dcolinmorgan Oct 13, 2023
e99cbe5
add functional import to feature/umap
dcolinmorgan Oct 14, 2023
c8523ba
review leo lint
dcolinmorgan Oct 16, 2023
c2b0397
loading just libraries
dcolinmorgan Oct 19, 2023
813fde2
lint
dcolinmorgan Oct 19, 2023
caecfba
lint
dcolinmorgan Oct 19, 2023
4af3fad
lint
dcolinmorgan Oct 19, 2023
22e4d18
lint
dcolinmorgan Oct 19, 2023
68537c6
lint
dcolinmorgan Oct 19, 2023
886d51a
add tests
dcolinmorgan Oct 19, 2023
a4ca316
add tests
dcolinmorgan Oct 19, 2023
f6fb4b9
if library then subfunction import
dcolinmorgan Oct 23, 2023
ed0262b
if library then subfunction import
dcolinmorgan Oct 23, 2023
0f9539d
naive first pass, not working
dcolinmorgan Oct 5, 2023
d34fef2
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
65eca98
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
629b648
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
ff7590b
working smart dep manager in feature_utils
dcolinmorgan Oct 10, 2023
4d7b824
lint
dcolinmorgan Oct 10, 2023
fc89beb
umap smart dependecies
dcolinmorgan Oct 11, 2023
6778a16
update umap&feature tests
dcolinmorgan Oct 12, 2023
df5fcae
update umap&feature tests
dcolinmorgan Oct 12, 2023
8c48dcf
update umap&feature tests
dcolinmorgan Oct 12, 2023
c1df5ba
feature_utils build import_min_exn using deps
dcolinmorgan Oct 12, 2023
0c86a7e
feature_utils build import_min_exn using deps
dcolinmorgan Oct 12, 2023
86f51b3
add return types
dcolinmorgan Oct 12, 2023
7230af2
add return types
dcolinmorgan Oct 12, 2023
45415e8
working dgl, progress on embed
dcolinmorgan Oct 13, 2023
9e28265
smart packages load, subfunctions not yet
dcolinmorgan Oct 13, 2023
5e9956b
working embed and library function import
dcolinmorgan Oct 13, 2023
f595dc5
working embed and library function import
dcolinmorgan Oct 13, 2023
5e25907
add functional import to feature/umap
dcolinmorgan Oct 14, 2023
f47b6d7
review leo lint
dcolinmorgan Oct 16, 2023
511187f
loading just libraries
dcolinmorgan Oct 19, 2023
e7ba215
lint
dcolinmorgan Oct 19, 2023
d784537
lint
dcolinmorgan Oct 19, 2023
8e6cd50
lint
dcolinmorgan Oct 19, 2023
fddde77
lint
dcolinmorgan Oct 19, 2023
9aed732
lint
dcolinmorgan Oct 19, 2023
2ee37fc
add tests
dcolinmorgan Oct 19, 2023
0011a73
add tests
dcolinmorgan Oct 19, 2023
e08c16f
if library then subfunction import
dcolinmorgan Oct 23, 2023
e6f29dd
if library then subfunction import
dcolinmorgan Oct 23, 2023
9f95b7e
Merge branch 'dev/dep_man' of https://github.com/graphistry/pygraphis…
dcolinmorgan Nov 23, 2023
1304968
lint
dcolinmorgan Nov 23, 2023
4dd7d0a
lint
dcolinmorgan Nov 23, 2023
a12898b
lint
dcolinmorgan Nov 23, 2023
a1db061
tqdm bugs ??
dcolinmorgan Nov 23, 2023
9199db0
tqdm bugs ??
dcolinmorgan Nov 24, 2023
f3c12e9
tqdm bugs ??
dcolinmorgan Nov 24, 2023
95be2db
tqdm bugs ??
dcolinmorgan Nov 24, 2023
74092fc
tqdm bugs ??
dcolinmorgan Nov 24, 2023
3210019
test_text_utils deps check
dcolinmorgan Nov 24, 2023
abb999e
test_text_utils deps check
dcolinmorgan Nov 24, 2023
5192f79
typos
dcolinmorgan Nov 24, 2023
0d165dd
ignore type
dcolinmorgan Nov 24, 2023
032193a
lint
dcolinmorgan Nov 24, 2023
75207ce
lint
dcolinmorgan Nov 24, 2023
1f539f1
lint
dcolinmorgan Nov 24, 2023
219555b
lint
dcolinmorgan Nov 24, 2023
8b53e6d
lint
dcolinmorgan Nov 24, 2023
3380fa5
lint
dcolinmorgan Nov 24, 2023
c12ed7e
push test logic
dcolinmorgan Nov 24, 2023
ecdd72b
push test logic
dcolinmorgan Nov 24, 2023
181abfa
push test logic
dcolinmorgan Nov 24, 2023
703e923
push test logic
dcolinmorgan Nov 24, 2023
5d7f750
lint
dcolinmorgan Nov 24, 2023
849baae
lint
dcolinmorgan Nov 24, 2023
6935a91
lint
dcolinmorgan Nov 24, 2023
c1f94c2
lint
dcolinmorgan Nov 24, 2023
eeaef0b
dep_flag lint
dcolinmorgan Nov 24, 2023
8d4c1df
assert logic
dcolinmorgan Nov 24, 2023
37ea918
lint
dcolinmorgan Nov 27, 2023
8e32e0c
lint
dcolinmorgan Nov 27, 2023
1f5f243
lint
dcolinmorgan Nov 27, 2023
20430e0
lint
dcolinmorgan Nov 27, 2023
a3bb113
remove conditional
dcolinmorgan Nov 27, 2023
9528e4a
sklearn assert
dcolinmorgan Nov 27, 2023
d170ace
sklearn assert
dcolinmorgan Nov 27, 2023
6a508c4
sklearn assert
dcolinmorgan Nov 27, 2023
f5812bd
sklearn assert
dcolinmorgan Nov 27, 2023
976d1dd
cumml _v_ test
dcolinmorgan Nov 27, 2023
2faf466
cumml _v_ test
dcolinmorgan Nov 27, 2023
2c96419
lint
dcolinmorgan Nov 27, 2023
ab73859
lint
dcolinmorgan Nov 27, 2023
a379787
lint
dcolinmorgan Nov 27, 2023
580ef32
lint
dcolinmorgan Nov 27, 2023
2c35bb2
lint
dcolinmorgan Nov 27, 2023
3d5aa45
lint
dcolinmorgan Nov 27, 2023
260c3b7
remove two too precise tests
dcolinmorgan Nov 27, 2023
23e4257
lint
dcolinmorgan Nov 27, 2023
c6417f9
lint
dcolinmorgan Nov 27, 2023
457ef7a
lint
dcolinmorgan Nov 27, 2023
69e59e7
add sklearn to core dep
dcolinmorgan Nov 27, 2023
6977d67
add sklearn to core dep
dcolinmorgan Nov 27, 2023
bba6c00
add sklearn to core dep
dcolinmorgan Nov 27, 2023
533a750
add sklearn+umap to core dep
dcolinmorgan Nov 27, 2023
20b1f16
add sklearn+umap to core dep
dcolinmorgan Nov 27, 2023
dd23f25
add sklearn+umap to core dep
dcolinmorgan Nov 27, 2023
3b59258
add scipy, dc to core dep
dcolinmorgan Nov 27, 2023
5e63074
add scipy, dc to core dep
dcolinmorgan Nov 27, 2023
6db86a3
revert to working
dcolinmorgan Nov 27, 2023
42f6a75
Merge branch 'dev/dep_man' of https://github.com/graphistry/pygraphis…
dcolinmorgan Nov 27, 2023
aadc84b
clsoe
dcolinmorgan Nov 27, 2023
edbdf37
remove has_
dcolinmorgan Nov 27, 2023
0ec47bb
np.all to allclose
dcolinmorgan Nov 27, 2023
139f7f9
lint
dcolinmorgan Nov 27, 2023
3223a27
revert allclose
dcolinmorgan Nov 27, 2023
c47df98
drop assert
dcolinmorgan Nov 27, 2023
26cd5e9
drop assert
dcolinmorgan Nov 27, 2023
e47fa35
drop assert
dcolinmorgan Nov 27, 2023
d8f9e6d
lint
dcolinmorgan Nov 27, 2023
1904df5
respond to most comments
dcolinmorgan Dec 4, 2023
a9d3d9e
respond to most comments
dcolinmorgan Dec 4, 2023
0dd4ed6
respond to most comments
dcolinmorgan Dec 4, 2023
6007eb7
respond to tqdm, <2 column comments
dcolinmorgan Dec 5, 2023
6d0cb1c
respond to tqdm, <2 column comments
dcolinmorgan Dec 5, 2023
86378eb
respond to tqdm, <2 column comments
dcolinmorgan Dec 5, 2023
5b36dd0
respond to tqdm
dcolinmorgan Dec 5, 2023
90ca97a
Merge branch 'master' into dev/dep_man
dcolinmorgan Dec 5, 2023
08de406
tqdm set_descr error
dcolinmorgan Dec 5, 2023
b236337
tqdm set_descr error
dcolinmorgan Dec 5, 2023
85e1e24
tqdm not trange has "set_description"
dcolinmorgan Dec 5, 2023
c86cb53
tqdm not trange has "set_description"
dcolinmorgan Dec 5, 2023
5d5146f
tqdm not trange has "set_description"
dcolinmorgan Dec 5, 2023
8640971
tqdm.tqdm
dcolinmorgan Dec 5, 2023
58d9810
tqdm.tqdm
dcolinmorgan Dec 5, 2023
d02d480
fallback to lazy import
dcolinmorgan Dec 5, 2023
a39928c
fallback to lazy import
dcolinmorgan Dec 5, 2023
cedd9ad
half lazy import
dcolinmorgan Dec 5, 2023
dcfdd9c
smart import
dcolinmorgan Dec 5, 2023
cc8c4d2
smart import
dcolinmorgan Dec 5, 2023
79045df
smart import
dcolinmorgan Dec 5, 2023
21bf0c9
lint
dcolinmorgan Dec 5, 2023
9801824
refactored 1 column exception workaround
dcolinmorgan Dec 5, 2023
7b86a04
refactored 1 column exception workaround
dcolinmorgan Dec 5, 2023
0eea678
Merge branch 'master' into dev/dep_man
dcolinmorgan May 23, 2024
7441b29
no explicit lazy
dcolinmorgan May 23, 2024
52abe0f
lint
dcolinmorgan May 23, 2024
f87139d
lint
dcolinmorgan May 23, 2024
637a991
lint
dcolinmorgan May 23, 2024
21d2748
lint
dcolinmorgan May 23, 2024
f0db78b
remove defunct lazies
dcolinmorgan May 23, 2024
9189800
lint
dcolinmorgan May 23, 2024
0de2ffa
lint
dcolinmorgan May 23, 2024
071faf1
lint
dcolinmorgan May 23, 2024
62c58bc
lint
dcolinmorgan May 23, 2024
ab49794
lint
dcolinmorgan May 23, 2024
8cb2838
lint
dcolinmorgan May 23, 2024
bb8a258
lint sheesh
dcolinmorgan May 23, 2024
0309329
test gpu-avail
dcolinmorgan May 24, 2024
f37ce87
lint
dcolinmorgan May 27, 2024
83f8fc5
lint2
dcolinmorgan May 27, 2024
30 changes: 9 additions & 21 deletions graphistry/Engine.py
@@ -1,7 +1,7 @@
 import pandas as pd
 from typing import Any, Optional, Union
 from enum import Enum
-
+from .dep_manager import deps

 class Engine(Enum):
     PANDAS : str = 'pandas'
@@ -21,17 +21,6 @@ class EngineAbstract(Enum):
 DataframeLocalLike = Any # pdf, cudf
 GraphistryLke = Any

-#TODO use new importer when it lands (this is copied from umap_utils)
-def lazy_cudf_import_has_dependancy():
-    try:
-        import warnings
-
-        warnings.filterwarnings("ignore")
-        import cudf # type: ignore
-
-        return True, "ok", cudf
-    except ModuleNotFoundError as e:
-        return False, e, None

 def resolve_engine(
     engine: Union[EngineAbstract, str],
@@ -58,15 +47,14 @@ def resolve_engine(
             if isinstance(g_or_df, pd.DataFrame):
                 return Engine.PANDAS

-            has_cudf_dependancy_, _, _ = lazy_cudf_import_has_dependancy()
-            if has_cudf_dependancy_:
-                import cudf
+            cudf = deps.cudf
+            if cudf:
                 if isinstance(g_or_df, cudf.DataFrame):
                     return Engine.CUDF
                 raise ValueError(f'Expected cudf dataframe, got: {type(g_or_df)}')

-        has_cudf_dependancy_, _, _ = lazy_cudf_import_has_dependancy()
-        if has_cudf_dependancy_:
+        cudf = deps.cudf
+        if cudf:
             return Engine.CUDF
         return Engine.PANDAS

@@ -86,7 +74,7 @@ def df_to_engine(df, engine: Engine):
         else:
             return df.to_pandas()
     elif engine == Engine.CUDF:
-        import cudf
+        cudf = deps.cudf
         if isinstance(df, cudf.DataFrame):
             return df
         else:
@@ -97,22 +85,22 @@ def df_concat(engine: Engine):
     if engine == Engine.PANDAS:
         return pd.concat
     elif engine == Engine.CUDF:
-        import cudf
+        cudf = deps.cudf
         return cudf.concat
     raise NotImplementedError("Only pandas/cudf supported")

 def df_cons(engine: Engine):
     if engine == Engine.PANDAS:
         return pd.DataFrame
     elif engine == Engine.CUDF:
-        import cudf
+        cudf = deps.cudf
         return cudf.DataFrame
     raise NotImplementedError("Only pandas/cudf supported")

 def s_cons(engine: Engine):
     if engine == Engine.PANDAS:
         return pd.Series
     elif engine == Engine.CUDF:
-        import cudf
+        cudf = deps.cudf
         return cudf.Series
     raise NotImplementedError("Only pandas/cudf supported")
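One behavioral difference in the Engine.py change: `df_to_engine`, `df_concat`, `df_cons`, and `s_cons` now dereference `deps.cudf` without checking it, and `deps` returns `None` for a package it could not load. With `Engine.CUDF` selected and cudf unavailable, these functions would fail with an `AttributeError` on `None` instead of the `ModuleNotFoundError` that the old `import cudf` raised. A minimal sketch of one way to keep an actionable error follows; `require` and `concat_for` are hypothetical names for illustration and are not part of this PR.

```python
import importlib
from types import ModuleType


def require(pkg: str, feature: str) -> ModuleType:
    """Import an optional package, or fail with a message naming the feature.

    Hypothetical helper for illustration; not part of graphistry.
    """
    try:
        return importlib.import_module(pkg)
    except ImportError as exc:
        raise ImportError(
            f"{feature} needs the optional package '{pkg}', which could not be imported"
        ) from exc


def concat_for(engine: str):
    """Illustrative stand-in for df_concat, keyed on a plain engine name."""
    if engine == "pandas":
        return require("pandas", "engine='pandas'").concat
    if engine == "cudf":
        return require("cudf", "engine='cudf'").concat
    raise NotImplementedError("Only pandas/cudf supported")
```

The import only happens when a branch is taken, so the lazy behavior of the PR is preserved while the failure mode stays an `ImportError`.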
56 changes: 15 additions & 41 deletions graphistry/compute/cluster.py
@@ -10,6 +10,7 @@
 from graphistry.constants import CUML, UMAP_LEARN, DBSCAN # noqa type: ignore
 from graphistry.features import ModelDict
 from graphistry.feature_utils import get_matrix_by_column_parts
+from graphistry.dep_manager import deps

 logger = logging.getLogger("compute.cluster")

@@ -21,36 +22,12 @@
 DBSCANEngineConcrete = Literal["cuml", "umap_learn"]
 DBSCANEngine = Literal[DBSCANEngineConcrete, "auto"]

-
-def lazy_dbscan_import_has_dependency():
-    has_min_dependency = True
-    DBSCAN = None
-    try:
-        from sklearn.cluster import DBSCAN
-    except ImportError:
-        has_min_dependency = False
-        logger.info("Please install sklearn for CPU DBSCAN")
-
-    has_cuml_dependency = True
+dbscan = deps.dbscan
+if deps.cuml:
+    import cuml.DBSCAN as cuDBSCAN
+else:
     cuDBSCAN = None
-    try:
-        from cuml import DBSCAN as cuDBSCAN
-    except ImportError:
-        has_cuml_dependency = False
-        logger.info("Please install cuml for GPU DBSCAN")
-
-    return has_min_dependency, DBSCAN, has_cuml_dependency, cuDBSCAN
-
-def lazy_cudf_import_has_dependancy():
-    try:
-        import warnings
-
-        warnings.filterwarnings("ignore")
-        import cudf # type: ignore
-
-        return True, "ok", cudf
-    except ModuleNotFoundError as e:
-        return False, e, None
+cudf = deps.cudf


 def resolve_cpu_gpu_engine(
@@ -59,15 +36,9 @@ def resolve_cpu_gpu_engine(
     if engine in [CUML, UMAP_LEARN, 'sklearn']:
         return engine # type: ignore
     if engine in ["auto"]:
-        (
-            has_min_dependency,
-            _,
-            has_cuml_dependency,
-            _,
-        ) = lazy_dbscan_import_has_dependency()
-        if has_cuml_dependency:
+        if cuDBSCAN:
             return "cuml"
-        if has_min_dependency:
+        if dbscan:
             return "umap_learn"

     raise ValueError( # noqa
@@ -89,9 +60,8 @@ def safe_cudf(X, y):
             else:
                 new_kwargs[key] = value
         return new_kwargs['X'], new_kwargs['y']
-
-    has_cudf_dependancy_, _, cudf = lazy_cudf_import_has_dependancy()
-    if has_cudf_dependancy_:
+
+    if cudf:
         # print('DBSCAN CUML Matrices')
         return safe_cudf(X, y)
     else:
@@ -209,7 +179,11 @@ def _cluster_dbscan(
 ):
     """DBSCAN clustering on cpu or gpu infered by .engine flag
     """
-    _, DBSCAN, _, cuDBSCAN = lazy_dbscan_import_has_dependency()
+    dbscan = deps.dbscan
+    if deps.cuml:
+        import cuml.DBSCAN as cuDBSCAN
+    else:
+        cuDBSCAN = None

     if engine_dbscan in [CUML]:
         print('`g.transform_dbscan(..)` not supported for engine=cuml, will return `g.transform_umap(..)` instead')
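Two of the added lines in cluster.py are worth a second look. `import cuml.DBSCAN as cuDBSCAN` uses the dotted `import` form, which only works when the last component is a module; the removed code imported `DBSCAN` from `cuml` as a name (`from cuml import DBSCAN as cuDBSCAN`). Likewise, `deps.dbscan` asks `importlib` for a top-level module named `dbscan`, which is not where scikit-learn keeps its class (`sklearn.cluster.DBSCAN` in the removed code). A minimal sketch of resolving a class from an optional package follows; `optional_attr` and `resolve_auto_engine` are hypothetical names, not part of this PR.

```python
import importlib
from typing import Any, Optional


def optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return `module_name.attr`, or None if the module or the name is missing.

    Hypothetical helper for illustration; not part of graphistry.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def resolve_auto_engine() -> str:
    """Mirror the "auto" branch of resolve_cpu_gpu_engine: prefer GPU, then CPU."""
    if optional_attr("cuml", "DBSCAN") is not None:
        return "cuml"
    if optional_attr("sklearn.cluster", "DBSCAN") is not None:
        return "umap_learn"
    raise ValueError("Neither cuml nor sklearn could be imported")
```

Resolving the class by name keeps the check (`is not None`) and the object that is later instantiated in one place, so the two cannot drift apart.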
2 changes: 2 additions & 0 deletions graphistry/constants.py
@@ -52,6 +52,8 @@
 # scikit-learn params
 SKLEARN = "sklearn"

+# gpu-req libs
+GPU_REQ = ['cudf','cupy', 'cuml', 'numba', 'cuda']
 # #############################################################
 # Caching and other internals
 CACHE_COERCION_SIZE = 100
40 changes: 40 additions & 0 deletions graphistry/dep_manager.py
@@ -0,0 +1,40 @@
+import importlib
+import subprocess
+
+from .constants import GPU_REQ
+
+class DepManager:
+    def __init__(self):
+        self.pkgs = {}
+
+    def __getattr__(self, pkg:str):
+        # self._add_deps(pkg)
+        self._proc_import(pkg)
+        try:
+            return self.pkgs[pkg]
+        except KeyError:
+            return None
+
+    def _proc_import(self, pkg:str):
+        if pkg in GPU_REQ and self._is_gpu_available():
+            self._add_deps(pkg)
+        elif pkg not in GPU_REQ:
+            self._add_deps(pkg)
+
+    def _is_gpu_available(self):
+        try:
+            output = subprocess.check_output("nvidia-smi", shell=True)
+            return len(output) > 0
+        except subprocess.CalledProcessError:
+            return False
+
+    def _add_deps(self, pkg:str):
+        try:
+            pkg_val = importlib.import_module(pkg)
+            self.pkgs[pkg] = pkg_val
+            setattr(self, pkg, pkg_val)
+        except:
+            pass
+
+
+deps = DepManager()
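In `DepManager`, a successful import is stored with `setattr`, so later accesses never reach `__getattr__`. A failed lookup is not stored, so every access to a missing package repeats the import attempt, and for names in `GPU_REQ` it also re-runs `nvidia-smi`. The bare `except:` in `_add_deps` additionally swallows every exception type. The sketch below shows the same lazy-import idea with both hits and misses cached and a narrower `except`. It is a sketch under stated assumptions, not the PR's implementation: `LazyDeps` and `GPU_ONLY` are hypothetical names, and GPU detection is swapped for a `shutil.which` check that `nvidia-smi` is on `PATH`, which is weaker than actually running it.

```python
import importlib
import shutil
from types import ModuleType
from typing import Dict, Optional

# Hypothetical stand-in for graphistry.constants.GPU_REQ
GPU_ONLY = frozenset({"cudf", "cupy", "cuml", "numba", "cuda"})


class LazyDeps:
    """Resolve optional packages on first attribute access; None if unavailable."""

    def __init__(self) -> None:
        self._cache: Dict[str, Optional[ModuleType]] = {}
        self._gpu: Optional[bool] = None

    def _gpu_available(self) -> bool:
        # Assumption: nvidia-smi on PATH is a good-enough proxy for a usable GPU.
        if self._gpu is None:
            self._gpu = shutil.which("nvidia-smi") is not None
        return self._gpu

    def __getattr__(self, name: str) -> Optional[ModuleType]:
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            # Never treat private or dunder lookups as package names.
            raise AttributeError(name)
        if name not in self._cache:
            module: Optional[ModuleType] = None
            if name not in GPU_ONLY or self._gpu_available():
                try:
                    module = importlib.import_module(name)
                except ImportError:
                    module = None
            self._cache[name] = module  # cache misses as well as hits
        return self._cache[name]


deps = LazyDeps()
```

Usage matches the PR's call sites: `cudf = deps.cudf` followed by `if cudf:`. Catching only `ImportError` means a package that is installed but broken surfaces its real error instead of silently becoming `None`; whether that is preferable here is a design choice for the maintainers.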
37 changes: 8 additions & 29 deletions graphistry/dgl_utils.py
@@ -17,7 +17,7 @@
 )

 from .util import setup_logger
-
+from .dep_manager import deps

 if TYPE_CHECKING:
     import scipy
@@ -33,31 +33,9 @@
 else:
     MIXIN_BASE = object

-
-def lazy_dgl_import_has_dependency():
-    try:
-        import warnings
-        warnings.filterwarnings('ignore')
-        import dgl # noqa: F811
-        return True, 'ok', dgl
-    except ModuleNotFoundError as e:
-        return False, e, None
-
-
-def lazy_torch_import_has_dependency():
-    try:
-        import warnings
-        warnings.filterwarnings('ignore')
-        import torch # noqa: F811
-        return True, 'ok', torch
-    except ModuleNotFoundError as e:
-        return False, e, None
-
-
 logger = setup_logger(name=__name__)


-
 # #########################################################################################
 #
 # Torch helpers
@@ -73,7 +51,7 @@ def convert_to_torch(X_enc: pd.DataFrame, y_enc: Optional[pd.DataFrame]): # typ
     :param y_enc: DataFrame Matrix of Values for Target
     :return: Dictionary of torch encoded arrays
     """
-    _, _, torch = lazy_torch_import_has_dependency() # noqa: F811
+    torch = deps.torch # noqa: F811

     if not y_enc.empty: # type: ignore
         data = {
@@ -98,7 +76,7 @@ def get_available_devices():
         device (torch.device): Main device (GPU 0 or CPU).
         gpu_ids (list): List of IDs of all GPUs that are available.
     """
-    _, _, torch = lazy_torch_import_has_dependency() # noqa: F811
+    torch = deps.torch # noqa: F811

     gpu_ids = []
     if torch.cuda.is_available():
@@ -181,7 +159,8 @@ def pandas_to_dgl_graph(
         sp_mat: sparse scipy matrix
         ordered_nodes_dict: dict ordered from most common src and dst nodes
     """
-    _, _, dgl = lazy_dgl_import_has_dependency() # noqa: F811
+    dgl = deps.dgl # noqa: F811
+
     sp_mat, ordered_nodes_dict = pandas_to_sparse_adjacency(df, src, dst, weight_col)
     g = dgl.from_scipy(sp_mat, device=device) # there are other ways too
     logger.info(f"Graph Type: {type(g)}")
@@ -196,7 +175,7 @@ def get_torch_train_test_mask(n: int, ratio: float = 0.8):
     :param ratio: mimics train/test split. `ratio` sets number of True vs False mask entries.
     :return: train and test torch tensor masks
     """
-    _, _, torch = lazy_torch_import_has_dependency() # noqa: F811
+    torch = deps.torch # noqa: F811

     train_mask = torch.zeros(n, dtype=torch.bool).bernoulli(ratio)
     test_mask = ~train_mask
@@ -225,8 +204,8 @@ def dgl_lazy_init(self, train_split: float = 0.8, device: str = "cpu"):
         """

         if not self.dgl_initialized:
-            lazy_dgl_import_has_dependency()
-            lazy_torch_import_has_dependency()
+            deps.dgl
+            deps.torch
             self.train_split = train_split
             self.device = device
             self._removed_edges_previously = False