TODO
- Look into nonexistent time error: 1997-03-30 02:00:00
- Document that nonexistent times will return NaT...
- Add tests for forecasts code
- Enable limiting forecast data to specific parameters!
- Add tests to forecast import
- Unique station IDs prefixed with domain, e.g. CDC:00005
- Correspondence table between observation and forecast measurement parameters; convert to consistent units
- Set up ongoing data collection via Azure Pipelines (hourly?)
- Start on blog post about the library
- Work on a couple of visualizations (with forecasts & how they change over time...)
- Add badges to readme
- Move to cloud (Azure?)
- https://pypi.org/project/azure-storage-blob/
- Process data files -> parquet with partitioning
- Put together exploratory visualizations
- Blog post about sleeping weather
- Blog post about heat & duration
- Blog post about cloudiness (overall vs daytime)
- Migrate resource URL search to use link lists, not link crawling...
- Add metadata import...
- Change "height" field to "elevation"
- Identify and correct time zones! (UTC starting in 2000; CET up until 1999)
- For correct parsing, metadata needs to be extracted
- Add file check step for parquet files; delete corrupted files...
- Log processed files (to avoid reprocessing!) (check update timestamps to avoid special case for 'latest' files)
- Clean up directory after downloading files...
- Adapt parser for zipped files in hourly+ measurements