Cain stuck file copy #12
Comments
Could it be that these are large files? |
Well, the schema has only a few records since I'm testing on a minimal installation. If I do a du on the minio folder, the total is less than 13M.
By the way, I'm running Kubernetes v1.11.5 with a flannel host-gw setup |
on cassandra-0 the total data amounts to:
same on the other nodes |
Let's try to narrow this down. |
Sure, give me a minute |
OK, skbn seems to be working properly. I created a file on minio
Run skbn to copy that file into k8s container
Check the file copied
|
Maor, looking at your skbn PerformCopy code https://github.com/nuvo/skbn/blob/42781bdb9d5cd81fcda5a6ac44a17e0480fb0e94/pkg/skbn/skbn.go#L139 I see you are using nio buffers. Maybe the hang is due to a race condition between the pipew and piper goroutines. Converting the piper goroutine to a standard function could be a good test to see whether that is the cause. When cain gets stuck I can only see the "copy:" log output; the "done:" output never appears. What do you think? |
These routines are running concurrently, allowing copy to be done using a pipe. This has to be 2 goroutines... |
See nuvo/skbn#3 for details |
Then it is getting stuck in either the Download or the Upload function. |
Probably in download. Can you try the same again, but with a file that gets stuck? |
Unfortunately it is not a particular file: when running cain, it randomly stops on a different (very small) file every time. It finished the job only a couple of times. The funny thing is that backup runs 2x faster and never gets stuck |
If minio is a pod in the cluster, you can try treating it as k8s://... |
Cool idea! I will try it, thanks |
No luck, it got stuck here this time :(
|
I want to assume this is an issue with minio, but can't verify at this time... |
Well, using k8s:// gives the same result. I guess it is something that happens during PerformCopy |
Is this project still active? |
Hi Maor,
I'm trying to restore, but almost every time some file gets stuck during the copy and cain 0.5.1 hangs.
I tried tuning the buffer size and parallelism without success: cain still gets stuck at a random file copy. In this state, after a while, the TCP connection to minio disappears from the netstat output, but the cain process stays alive.
Any idea how to increase the verbosity of the copy process?
Regards