Fix erroneous columns in basic 10k extraction #128

Open
3 tasks
katie-lamb opened this issue Dec 21, 2024 · 0 comments

Overview

During the basic 10k extraction, some column names are added to the output with a stray leading bracket, such as ]fiscal_year_end, ]irs_number, and ]state_of_incorporation, instead of fiscal_year_end, irs_number, and state_of_incorporation. I haven't dug into this yet, but it probably comes from a bad field name in the raw basic 10k text itself (a stray bracket in front of the field name). I think this can be fixed in one of two ways: stripping special characters from the field names, or defining the schema/field names ahead of time and enforcing that extracted field names fit into these preset fields.
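
As a rough illustration of the first option (stripping special characters), here is a minimal sketch. The function name `clean_field_name` is hypothetical and is not part of the existing extraction code; it also assumes that valid field names only ever contain lowercase letters, digits, and underscores, which has not been verified against the raw filings.

```python
import re


def clean_field_name(raw_name: str) -> str:
    """Normalize a raw field name (hypothetical helper, not existing project code)."""
    # Trim surrounding whitespace and lowercase the name.
    name = raw_name.strip().lower()
    # Replace any run of characters other than letters, digits, or
    # underscores (for example a stray "[" or "]") with one underscore.
    name = re.sub(r"[^a-z0-9_]+", "_", name)
    # Drop underscores left over at the start or end of the name.
    return name.strip("_")
```

With this sketch, `clean_field_name("]fiscal_year_end")` would give `fiscal_year_end`. Note that it silently merges `]irs_number` into `irs_number`, so if both variants appear for the same filing, the caller would need to decide which value wins.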

Success Criteria

How will we know that we're done?

  • Extracted field names fit into a standard set of columns
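
One way this criterion could be checked is sketched below, assuming a preset collection of allowed column names. `EXPECTED_FIELDS` and `split_by_schema` are hypothetical names, and the set only contains the three fields mentioned in this issue; the real schema would presumably be longer.

```python
# Hypothetical preset schema; only the three field names from this issue
# are listed here, the real list of basic 10k fields is an open question.
EXPECTED_FIELDS = frozenset(
    {"fiscal_year_end", "irs_number", "state_of_incorporation"}
)


def split_by_schema(record, expected=EXPECTED_FIELDS):
    """Separate a record's fields into those in the schema and those that are not."""
    # Keep only the key/value pairs whose key is an expected column.
    known = {key: value for key, value in record.items() if key in expected}
    # Collect the unexpected names so they can be logged or raised on.
    unexpected = sorted(set(record) - set(expected))
    return known, unexpected
```

For example, a record with the keys `irs_number` and `]fiscal_year_end` would return only `irs_number` in the first value and report `]fiscal_year_end` as unexpected, which makes bad field names visible instead of letting them become new output columns.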

Next steps

Projects
Status: Icebox