
illegal memory access was encountered #52

Open

koparasy opened this issue Mar 22, 2019 · 13 comments
koparasy commented Mar 22, 2019

Hello, I am compiling GPUSPH with `make`, then running `make ProblemExample`. When I then execute `./GPUSPH`, I get the following error:

Device 0 thread 140735808663952 iteration 0 last command: 7. Exception: src/cuda/forces.cu(516) : in unbind_textures() @ thread 0x140735808663952 : cudaSafeCall() runtime API error 77 : an illegal memory access was encountered

The same error also arises when executing different problems.
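
For context: cudaSafeCall is GPUSPH's CUDA error-checking wrapper, and since kernel launches are asynchronous, error 77 (cudaErrorIllegalAddress) is typically detected at the next checked runtime API call (here unbind_textures()), not at the kernel that actually faulted. A minimal sketch of this kind of wrapper, with hypothetical names rather than GPUSPH's actual code:

```c++
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Minimal cudaSafeCall-style wrapper (hypothetical sketch, not GPUSPH's
// actual implementation). An illegal access inside a kernel usually
// surfaces at the NEXT runtime API call checked this way, rather than
// at the launch site itself.
#define CUDA_SAFE_CALL(call)                                        \
    do {                                                            \
        cudaError_t err = (call);                                   \
        if (err != cudaSuccess) {                                   \
            fprintf(stderr, "%s(%d): runtime API error %d : %s\n",  \
                    __FILE__, __LINE__, (int)err,                   \
                    cudaGetErrorString(err));                       \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)
```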

My system information is the following:
g++ : (GCC) 6.4.0
nvcc : release 9.1, V9.1.85
GPU devices: 4 x Tesla V100-SXM2.

Oblomov commented Mar 23, 2019

Hello @koparasy,

Which GPUSPH version (or branch) does this happen with?

koparasy commented:

Hello @Oblomov,

I am working on the main branch.

Oblomov commented Mar 26, 2019

Hello @koparasy,

Can you please try the next branch and see if it's already fixed there?

koparasy commented:

I checked out the next branch and the error persists.

Oblomov commented Mar 27, 2019

I'm afraid I'm unable to reproduce the issue locally. Can you see whether running GPUSPH under cuda-memcheck gives some indication of where the illegal access is coming from?
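
For reference, the invocation would be something along these lines (assuming you run from the directory containing the GPUSPH binary; the output file name is just a suggestion):

```
cuda-memcheck ./GPUSPH 2>&1 | tee cudaMemCheck.txt
```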

Oblomov commented Mar 27, 2019

Also: are you running single- or multi-GPU?

koparasy commented Mar 28, 2019

The error appears in both single- and multi-GPU runs. I am attaching the output of a single-GPU run, without MPI support and without HDF5 support.

I attach the output of cuda-memcheck:
cudaMemCheck.txt

Oblomov commented Mar 28, 2019

Thanks for the report. From the log, it would seem that the issue happens when the forcesDevice kernel tries to fetch the neighbors' positions, but the array it's trying to read from should be valid. I do not have any GPU with Compute Capability 7.0 and cannot reproduce the error on my machine, so I'm afraid debugging will be a bit slow and you'll have to be my hands and eyes 8-)

For starters, I would recommend updating to the latest next that I just pushed, which includes a small fix for neighbor traversal. I don't think it's directly relevant to this case, but you never know.
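
For reference, updating would be something along these lines (assuming an existing clone whose upstream remote is named origin):

```
git fetch origin
git checkout next
git pull --ff-only origin next
make clean && make
```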

If the latest next (currently at commit add5af0) doesn't fix the issue, I would ask you to try the following change: in src/cuda/textures.cuh, replace the line:

#if __COMPUTE__ >= 20 && __COMPUTE__/10 != 3

with

#if 1

and see if you can replicate the error, then try again after replacing it with

#if 0

and see if you can replicate the error. This should help us pinpoint the possible source of the error a bit better.
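
For context: that guard presumably selects between reading particle arrays through texture fetches and through plain global loads, so forcing it to 1 or to 0 tells us which read path the illegal access comes through. An illustrative sketch of such a toggle, with hypothetical macro names rather than the actual textures.cuh contents:

```c++
// Illustrative sketch only (hypothetical names, not GPUSPH's actual code).
// When the condition holds, data is read with plain global loads, which
// are cached in L1 on CC >= 2.0 except Kepler (CC 3.x); otherwise the
// texture path is used.
#if __COMPUTE__ >= 20 && __COMPUTE__/10 != 3
#define FETCH(array, tex, idx) ((array)[idx])        /* direct global load */
#else
#define FETCH(array, tex, idx) tex1Dfetch(tex, idx)  /* texture fetch */
#endif
```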

anthropoy commented:

I had the same issue on a Compute Capability 6.1 CUDA device. I am actually working on the wsl branch, but I did merge that fix from the next branch. I also tried the change in textures.cuh; none of these worked.

Oblomov commented Jul 26, 2019

Can you please provide the output of `make show`? It should be available as info/show.txt, ready for export, if you're on a recent enough branch.

anthropoy commented:

show.txt

As attached, thanks.

Oblomov commented Jul 29, 2019

The Microsoft compiler suffers from this bug, which affects GPUSPH. A large part of the changes introduced in the wsl branch are specifically there to work around it, but it seems you're hitting a case we missed. We'll look into it.

anthropoy commented:

I see, thanks for looking into this.
