CUDA error - not sure of solution #54

Open
JGBurgess1 opened this issue Nov 1, 2023 · 3 comments

Comments

@JGBurgess1

Dear Uni-Dock developers,

Thanks for producing this software for us to use!
I really appreciate that.

We've run into an error when using multiple ligands as input parameters:

"....
.....
Computing Vina grid ... done.
Total ligands: 283
Batch 1 size: 283
> CUDA error at /apps/chpc/bio/Uni-Dock/unidock/src/cuda/precalculate.cu:198 code=34(cudaErrorStubLibrary) "cudaMalloc(&atom_xs_gpu, thread * max_atom_num * sizeof(sz))"

Do you know what is happening?

By the way, I think we have NVIDIA V100 16GB cards; will this cause an issue? I saw in your code that you check for the 32GB V100 when determining how much memory to allocate. Is this part of the problem?
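
(For context: I understand the free and total device memory can also be queried at runtime with cudaMemGetInfo rather than inferred from the card model; a minimal sketch, assuming only the standard CUDA runtime API, and not Uni-Dock's actual allocation code:)

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        size_t free_bytes = 0, total_bytes = 0;
        // Reports free and total memory on the current device, so a
        // 16GB and a 32GB V100 can be told apart without hard-coding.
        if (cudaMemGetInfo(&free_bytes, &total_bytes) != cudaSuccess) {
            printf("cudaMemGetInfo failed\n");
            return 1;
        }
        printf("free: %zu MiB, total: %zu MiB\n",
               free_bytes >> 20, total_bytes >> 20);
        return 0;
    }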

Regards

Jeremy

@caic99
Member

caic99 commented Nov 1, 2023

Hi Jeremy @JGBurgess1,
I noticed that you are hitting code=34(cudaErrorStubLibrary). This indicates that your CUDA library or driver is not installed correctly. Run nvidia-smi to check whether the system can use the GPU.
A V100 16GB should work. Please let us know if you encounter CUDA out-of-memory errors.
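
For reference, cudaErrorStubLibrary is typically returned when a binary was linked against the stub libcuda.so that ships with the CUDA toolkit (under lib64/stubs) instead of the real driver library. A minimal probe, not part of Uni-Dock, that surfaces the same error code on an affected machine:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        // Any runtime call that has to reach the driver fails with
        // cudaErrorStubLibrary (34) when the stub libcuda.so was linked
        // instead of the driver installed on the node.
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess) {
            printf("code=%d(%s) \"%s\"\n", (int)err, cudaGetErrorName(err),
                   cudaGetErrorString(err));
            return 1;
        }
        printf("found %d CUDA device(s)\n", count);
        return 0;
    }

Build it with nvcc and run it on the GPU node; if it prints a device count, the driver linkage is fine.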

@JGBurgess1
Author

JGBurgess1 commented Nov 2, 2023

I ran nvidia-smi on the GPU node:

Wed Nov  1 21:03:05 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                  Off |
| N/A   32C    P0    34W / 250W |      0MiB / 16384MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:AF:00.0 Off |                  Off |
| N/A   53C    P0   191W / 250W |   1392MiB / 16384MiB |     99%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  Off  | 00000000:D8:00.0 Off |                  Off |
| N/A   45C    P0   119W / 250W |    378MiB / 16384MiB |     72%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    1   N/A  N/A    340553      C   ...u/amber/18/bin/pmemd.cuda     1388MiB |
|    2   N/A  N/A    325488      C   ...pu/gromacs/2020.1/bin/gmx      374MiB |
+-----------------------------------------------------------------------------+

Does this help answer the question?

Regards,

Jeremy

@caic99
Member

caic99 commented Nov 2, 2023

@JGBurgess1
I guess you are using a cluster with Slurm/PBS? If so, the login node likely has the CUDA toolkit but no NVIDIA driver, so the build links against the stub library. Please try compiling and running Uni-Dock on the GPU nodes.
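
A quick way to confirm (a sketch using the standard runtime API, not a Uni-Dock tool): build the snippet below on the login node, then run it on the GPU node. A driver version of 0, or an error, suggests no usable driver was loaded at link time.

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int driver = 0, runtime = 0;
        // Version reported by the driver library the binary actually
        // loaded; 0 indicates no usable driver (e.g. the stub library).
        cudaDriverGetVersion(&driver);
        // Version of the CUDA runtime the binary was built against.
        cudaRuntimeGetVersion(&runtime);
        printf("driver API version: %d, runtime version: %d\n", driver, runtime);
        return 0;
    }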
