Fix for inability to read some parquet files (issue #816) #817
…bles Signed-off-by: David Wood <[email protected]>
Approving, but please add the comment.
logger.error(f"Failed to convert byte array to arrow table, exception {e}. Skipping it")
return None
logger.warning(f"Could not convert bytes to pyarrow: {e}")
Can you please put a comment here explaining why polars is used? Just copy the blurb from where you found this solution.
Done.
…iles Signed-off-by: David Wood <[email protected]>
Let me know if the two comments can be addressed. Thanks
data-processing-lib/python/src/data_processing/utils/transform_utils.py
…init(). Signed-off-by: David Wood <[email protected]>
Thanks
Why are these changes needed?
To allow proper reading and processing of some parquet files that are otherwise unreadable.
Related issue number (if any).
#816
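For readers following the review thread, here is a minimal sketch of the kind of fallback discussed above: try pyarrow first, log a warning if it fails, then try polars and convert the result to an arrow table. It is not the code merged in this PR. The function names (`bytes_to_arrow_table`, `read_with_pyarrow`, `read_with_polars`) and the `readers` parameter are hypothetical, chosen only for illustration, and the log messages are adapted from the diff snippet rather than copied from the final change.

```python
import io
import logging
from typing import Any, Callable, Optional, Sequence

logger = logging.getLogger(__name__)


def read_with_pyarrow(data: bytes) -> Any:
    # Primary reader (hypothetical helper): pyarrow's own parquet parser.
    import pyarrow.parquet as pq

    return pq.read_table(io.BytesIO(data))


def read_with_polars(data: bytes) -> Any:
    # Fallback reader (hypothetical helper): parse with polars,
    # then convert the DataFrame to a pyarrow Table.
    import polars as pl

    return pl.read_parquet(io.BytesIO(data)).to_arrow()


def bytes_to_arrow_table(
    data: bytes,
    readers: Sequence[Callable[[bytes], Any]] = (read_with_pyarrow, read_with_polars),
) -> Optional[Any]:
    """Try each reader in order; return the first result, or None if all fail."""
    for reader in readers:
        try:
            return reader(data)
        except Exception as e:
            # A failed reader is only a warning; the next reader may succeed.
            logger.warning(f"Could not convert bytes with {reader.__name__}: {e}")
    # Every reader failed, so the caller should skip this file.
    logger.error("Failed to convert byte array to arrow table. Skipping it")
    return None
```

Passing the readers as a parameter is a design choice made for this sketch only: it keeps the fallback order explicit and lets the logic be exercised without real parquet data.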