Fallback when the checkpoint indicated by the _last_checkpoint hint is missing #582
Comments
Hi @Sevenannn, thanks for raising! This has been a TODO for a while; I will try to get to it soon :) Looks like
There are two different issues here:
Thanks for the replies! @scovich Yeah, I can see the error
Yeah, if some external entity is messing up the table's files while kernel and/or engine are trying to access them, a near-infinite set of things could go wrong. Delta spark is full of special-case code that attempts to compensate, but there's always one more case it missed, or ambiguous cases there's no good answer for. I'd much rather not get on a treadmill that encourages people to do bad/illegal things to their tables. Fail clean, and discourage users from breaking their tables.
Thanks! That makes sense. However, I found it a bit hard to handle checkpoint-related errors, since they are all currently wrapped in Error::Generic. I created a PR to add an Error variant, InvalidCheckpoint, so that it will be easier to handle checkpoint-specific errors when using the delta kernel crate. Would you help take a look? #593
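To illustrate why a dedicated variant helps, here is a minimal sketch. It is not the kernel's actual `Error` definition: the enum below is a hypothetical stand-in with only two variants, the `String` payloads are an assumption, and `is_checkpoint_error` is a made-up helper. The point is only that a caller can match on a variant instead of inspecting the message inside `Generic`.

```rust
use std::fmt;

/// Hypothetical error type for illustration only; NOT the real delta kernel `Error`.
#[derive(Debug, PartialEq)]
pub enum Error {
    /// Catch-all variant that carries nothing but a message.
    Generic(String),
    /// Dedicated variant for checkpoint problems (name taken from the comment above;
    /// the String payload is an assumption).
    InvalidCheckpoint(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Generic(msg) => write!(f, "generic error: {}", msg),
            Error::InvalidCheckpoint(msg) => write!(f, "invalid checkpoint: {}", msg),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical caller-side helper: decides by variant, with no string parsing.
pub fn is_checkpoint_error(err: &Error) -> bool {
    matches!(err, Error::InvalidCheckpoint(_))
}
```

With only `Generic`, the same check would have to search the message text, which breaks as soon as the wording changes.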
Please describe why this is necessary.
When the checkpoint indicated by the _last_checkpoint file is missing, snapshot creation simply fails. A fallback mechanism that constructs the snapshot from a previous checkpoint version, or a simpler fallback that constructs the snapshot entirely from log files, would be useful in this case, along with warning messages informing users that an older checkpoint file is being used.
Describe the functionality you are proposing.
When the checkpoint indicated by the _last_checkpoint file is missing, construct the snapshot from the last valid checkpoint plus the subsequent log files, or, as a more naive implementation, purely from log files.
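The proposed selection logic could be sketched roughly as follows. This is an illustration of the request, not kernel code: `ReplayStart` and `choose_replay_start` are hypothetical names, checkpoint versions are modeled as plain `u64` values, and the list of checkpoints actually present is assumed to be known already. The log-only branch also assumes the commit files back to version 0 still exist.

```rust
/// Hypothetical description of where log replay would start.
#[derive(Debug, PartialEq)]
pub enum ReplayStart {
    /// Start from the checkpoint at this version, then apply later commits.
    Checkpoint(u64),
    /// No usable checkpoint: replay every commit file from the beginning.
    LogOnly,
}

/// Picks a replay start given the version named by `_last_checkpoint` (`hinted`)
/// and the checkpoint versions actually found (`available`).
/// Returns the choice plus an optional warning for the user.
pub fn choose_replay_start(hinted: u64, available: &[u64]) -> (ReplayStart, Option<String>) {
    // Normal case: the hinted checkpoint exists, so no fallback and no warning.
    if available.contains(&hinted) {
        return (ReplayStart::Checkpoint(hinted), None);
    }
    // Fallback 1: the newest checkpoint older than the hinted one.
    match available.iter().copied().filter(|&v| v < hinted).max() {
        Some(older) => (
            ReplayStart::Checkpoint(older),
            Some(format!(
                "checkpoint {} named by _last_checkpoint is missing; falling back to checkpoint {}",
                hinted, older
            )),
        ),
        // Fallback 2 (the naive option): no older checkpoint, use commit files only.
        None => (
            ReplayStart::LogOnly,
            Some(format!(
                "checkpoint {} named by _last_checkpoint is missing; replaying from log files only",
                hinted
            )),
        ),
    }
}
```

For example, `choose_replay_start(20, &[5, 10])` selects checkpoint 10 and returns a warning, while an empty list of available checkpoints yields `ReplayStart::LogOnly`.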
Additional context
N/A