s3: Improve error messages and handling #10125
* Include more information in all error messages so that users can understand what caused the problem
* ListBucketsError is a fatal error, but we were just sending two messages instead of making it a hard error.
fa1be10 to 5f11c57
This seems good!
Is the reason a list error is fatal but a get error isn't basically that we expect gets to sometimes fail as we try to obtain all objects in a bucket, but we expect the bucket to be available?
        err,
    }))
    .await
    .unwrap_or_else(|e| tracing::debug!("Source queue has been shut down: {}", e));
does this only happen during shutdown?
    IoError {
        bucket: String,
        err: std::io::Error,
    },
}

impl std::error::Error for S3Error {}
I think it would be nice to implement source() on the error impl, for the cases that have inner errors (which right now is all of them). Maybe a good TODO?
Yeah, currently we print all causes in the display impl instead of implementing the causes method. There is a Rust-wide discussion about what the best thing to do is, but this is generally what we do in mz.
For now I think it's more confusing to both impl display of causes and the cause() method, but I do think it would be better in general if we impl'd cause() everywhere and used anyhow's {:#?} format for final display of errors.
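The alternative discussed above can be sketched as follows: a minimal example, assuming a simplified one-variant stand-in for the PR's S3Error, in which Display prints only its own context and the inner error is exposed through the standard source() method. The render_chain helper is hypothetical and only imitates the kind of single-line cause chain a final error formatter could produce; it is not anyhow's implementation.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical, simplified stand-in for the PR's S3Error; only one variant is shown.
#[derive(Debug)]
enum S3Error {
    IoError { bucket: String, err: io::Error },
}

impl fmt::Display for S3Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Print only this layer's context; the cause is reachable via source().
            S3Error::IoError { bucket, .. } => {
                write!(f, "IO error reading bucket {}", bucket)
            }
        }
    }
}

impl Error for S3Error {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            S3Error::IoError { err, .. } => Some(err),
        }
    }
}

// Hypothetical helper: joins an error and all of its sources, outermost first.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(inner) = current {
        out.push_str(": ");
        out.push_str(&inner.to_string());
        current = inner.source();
    }
    out
}
```

With this split, each Display impl stays short, and the decision about how much of the chain to show is made once, at the point where the error is finally reported.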
    S3Error::GetObjectError { .. } | S3Error::ListObjectsFailed(_) => {
        Ok(NextMessage::Pending)
    }
Some(Some(Err(e))) => match e {
what does the double-Option mean? another good TODO to change that to an explicit enum
I believe that the double-option is an artifact of using now_or_never() on an async queue: recv() yields an Option, and now_or_never() yields an Option of the thing it's called on.
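The explicit enum suggested in this thread might look like the sketch below. All names here (QueuePoll, classify) are hypothetical and not from the PR. It assumes the semantics described above: the outer None means now_or_never() found the future not ready, and the inner None means recv() reported that the queue is closed.

```rust
// Hypothetical replacement for matching on Option<Option<Result<T, E>>> directly.
#[derive(Debug, PartialEq)]
enum QueuePoll<T, E> {
    // Outer None: the recv() future was not ready when polled.
    NotReady,
    // Inner None: the sending side of the queue has shut down.
    Closed,
    // A message arrived successfully.
    Item(T),
    // The sender delivered an error instead of a message.
    Failed(E),
}

// Flattens the nested result of `queue.recv().now_or_never()` into one enum.
fn classify<T, E>(polled: Option<Option<Result<T, E>>>) -> QueuePoll<T, E> {
    match polled {
        None => QueuePoll::NotReady,
        Some(None) => QueuePoll::Closed,
        Some(Some(Ok(item))) => QueuePoll::Item(item),
        Some(Some(Err(e))) => QueuePoll::Failed(e),
    }
}
```

Matching on the named variants at the call site would make arms such as Some(Some(Err(e))) self-describing without changing behavior.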
Some(Some(Err(e))) => match e {
    S3Error::GetObjectError { .. } => {
        tracing::warn!(
            "when reading source '{}' ({}): {}",
Suggested change:
- "when reading source '{}' ({}): {}",
+ "error when reading source '{}' ({}): {}",
I don't think that adding "error" to warn! logs is helpful; it just makes grep error (and especially case-insensitive search in e.g. less) less useful/more confusing. This will currently show up as WARN <mod> when reading source....
I agree that this is a sentence fragment, though.
Yes, I think that that's the original motivation. I'm not 100% convinced that it's the right decision, but right now there's no way to get a dataflow out of an error state, so I'm hesitant to error one just because any given single item was missing. If we implemented MaterializeInc/database-issues#2060 it would be possible for people to be more precise about what they want to ingest than currently, which would make it more reasonable to error on anything invalid.
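The policy described here, where a failed fetch of a single object is logged and skipped while a failure to list the bucket stops the source, could be sketched as below. The variant payloads and the Action and action_for names are assumptions made for illustration; only the variant names ListBucketsError and GetObjectError come from the PR.

```rust
// Hypothetical, simplified error type; payload fields are assumptions.
#[derive(Debug)]
enum S3Error {
    ListBucketsError { bucket: String, reason: String },
    GetObjectError { bucket: String, key: String, reason: String },
}

// Hypothetical outcome of handling one error from the download queue.
#[derive(Debug, PartialEq)]
enum Action {
    // Log a warning and keep reading; one missing object is tolerated.
    LogAndContinue,
    // Put the source into an error state; nothing further can be read.
    FailSource,
}

// Maps each error kind to the handling policy described in the thread.
fn action_for(err: &S3Error) -> Action {
    match err {
        S3Error::GetObjectError { .. } => Action::LogAndContinue,
        S3Error::ListBucketsError { .. } => Action::FailSource,
    }
}
```

Keeping the decision in one exhaustive match means that adding a new error variant forces an explicit choice about whether it is fatal.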
Motivation
This PR fixes a previously unreported bug:
We were providing useless error messages for S3 errors.
Checklist