Handle result directory creation #221

Open
randerzander opened this issue Jul 13, 2021 · 1 comment
Periodically we need to run GPU-BDB on a new system with local-only storage.

Getting the result directories set up correctly can be difficult. For instance, with a config containing the value
output_dir: /data/gpu-bdb/results/
and that directory present on all worker nodes, q10 fails:

Encountered Exception while running query
Traceback (most recent call last):
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 283, in run_dask_cudf_query
    benchmark(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 115, in write_result
    write_etl_result(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 147, in write_etl_result
    df.to_parquet(output_path, write_index=False)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/core.py", line 263, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py", line 653, in to_parquet
    out = out.compute(**compute_kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 286, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 568, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2704, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2018, in gather
    return self.sync(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 859, in sync
    return sync(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 326, in sync
    raise exc.with_traceback(tb)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 309, in f
    result[0] = yield future
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 1883, in _gather
    raise exception.with_traceback(traceback)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/io/parquet.py", line 136, in write_partition
    with fs.open(fs.sep.join([path, filename]), mode="wb") as out_file:
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/spec.py", line 962, in open
    f = self._open(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 144, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 235, in __init__
    self._open()
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 240, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/gpu-bdb/results/q10-results.parquet/part.13.parquet'


randerzander commented Jul 13, 2021

For future reference: when running on nodes with a local-only filesystem, the q10-q30 results directories need to exist under output_dir on every node:

END=30
for i in $(seq 10 $END); do mkdir -p /data/gpu-bdb/results/q$i-results.parquet; done
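
The same directory creation could also be done from Python, which may be a starting point for handling it inside the benchmark itself. This is only a minimal sketch: make_result_dirs is a hypothetical helper, not part of bdb_tools, and the 10 to 30 range and q<N>-results.parquet naming simply mirror the loop above.

```python
import os


def make_result_dirs(output_dir, first=10, last=30):
    """Create q<N>-results.parquet directories under output_dir.

    Hypothetical helper (not part of bdb_tools); mirrors the shell loop above.
    Returns the list of directory paths that now exist.
    """
    created = []
    for i in range(first, last + 1):
        path = os.path.join(output_dir, f"q{i}-results.parquet")
        # exist_ok=True gives the same idempotent behavior as `mkdir -p`
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Assuming a dask.distributed Client is already connected, something like client.run(make_result_dirs, "/data/gpu-bdb/results/") should run the helper on every worker process, which covers every node that hosts at least one worker.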
