Periodically we need to run GPU-BDB on a new system with local-only storage.
Getting the result directories set up correctly sometimes proves difficult. For instance, with a config containing output_dir: /data/gpu-bdb/results/, and with that directory present on every worker node, q10 fails:
Encountered Exception while running query
Traceback (most recent call last):
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 283, in run_dask_cudf_query
benchmark(
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 61, in benchmark
result = func(*args, **kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 115, in write_result
write_etl_result(
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 147, in write_etl_result
df.to_parquet(output_path, write_index=False)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/core.py", line 263, in to_parquet
return to_parquet(self, path, *args, **kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py", line 653, in to_parquet
out = out.compute(**compute_kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 286, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 568, in compute
results = schedule(dsk, keys, **kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2704, in get
results = self.gather(packed, asynchronous=asynchronous, direct=direct)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2018, in gather
return self.sync(
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 859, in sync
return sync(
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 326, in sync
raise exc.with_traceback(tb)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 309, in f
result[0] = yield future
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
value = future.result()
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 1883, in _gather
raise exception.with_traceback(traceback)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/io/parquet.py", line 136, in write_partition
with fs.open(fs.sep.join([path, filename]), mode="wb") as out_file:
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/spec.py", line 962, in open
f = self._open(
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 144, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 235, in __init__
self._open()
File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 240, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/gpu-bdb/results/q10-results.parquet/part.13.parquet'
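For future reference: when running on nodes with a local-only filesystem, having output_dir itself on every node is not enough. The traceback shows a worker failing to open a part file inside q10-results.parquet, so the per-query results directories for q10 through q30 also need to exist under output_dir on every node. Run this on each node: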
END=30
for i in $(seq 10 $END); do mkdir -p /data/gpu-bdb/results/q$i-results.parquet; done
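If logging in to every node to run the loop is inconvenient, the same thing can be sketched in Python. This is only a rough sketch, not part of gpu-bdb: the function name make_result_dirs and its parameters are made up for illustration, and it assumes the directory naming q<N>-results.parquet seen in the traceback above.

```python
import os
import tempfile


def make_result_dirs(output_dir, first=10, last=30):
    """Create q<N>-results.parquet directories under output_dir.

    Hypothetical helper mirroring the shell loop above; the directory
    naming is taken from the path in the traceback.
    """
    created = []
    for i in range(first, last + 1):
        path = os.path.join(output_dir, f"q{i}-results.parquet")
        # exist_ok=True makes the call safe to repeat on a node that
        # already has some or all of the directories.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than the real
    # /data/gpu-bdb/results/ location.
    with tempfile.TemporaryDirectory() as tmp:
        for d in make_result_dirs(tmp):
            print(d)
```

With a running cluster, one option that should work is passing the function to the distributed client's run method, e.g. client.run(make_result_dirs, "/data/gpu-bdb/results/"), which executes it on every connected worker process. That assumes client is an existing distributed.Client and that the workers have write permission under output_dir.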