divisions theme - small bounding boxes #234

Closed
missinglink opened this issue Sep 25, 2024 · 7 comments

Comments

@missinglink

missinglink commented Sep 25, 2024

Hi 👋 I'm trying to use the bbox field of the divisions theme for a CONTAINS-style query, but all of the bboxes appear to be very small:

SELECT * FROM read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=divisions/type=division/*')
WHERE country='NZ'
AND bbox.xmin <= 174.767595
AND bbox.xmax >= 174.767595
AND bbox.ymin <= -41.285379
AND bbox.ymax >= -41.285379;

... zero results

When I search for a specific city in the area, I get bboxes that should match the query above, but each one is less than 0.00005 degrees wide in both dimensions:

SELECT bbox FROM read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=divisions/type=division/*')
  WHERE country='NZ' AND names.primary='Wellington';
  
┌──────────────────────────────────────────────────────────────────────────────┐
│                                     bbox                                     │
│            struct(xmin float, xmax float, ymin float, ymax float)            │
├──────────────────────────────────────────────────────────────────────────────┤
│ {'xmin': 174.7772, 'xmax': 174.77724, 'ymin': -41.288795, 'ymax': -41.28879} │
│ {'xmin': 174.7772, 'xmax': 174.77724, 'ymin': -41.288795, 'ymax': -41.28879} │
└──────────────────────────────────────────────────────────────────────────────┘

I tested this in other countries and it seems to be the case for all bboxes.

Is this possibly a bug, or maybe a WIP?

@missinglink
Author

missinglink commented Sep 25, 2024

SELECT * FROM read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=divisions/type=division/*')
WHERE country='NZ'
AND bbox.xmax > bbox.xmin + 0.00003;

... zero results

@phgn0

phgn0 commented Sep 26, 2024

I just hit the same issue -- I think the bbox for division is just encompassing the center point for each feature.

division_area has the full polygon geometry and encompassing bbox.
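To make the zero-result query concrete, here is a minimal Python sketch that applies the same containment predicate to the bbox values printed above for the Wellington rows. The `BBox` class and its method names are hypothetical helpers for illustration, not part of any Overture tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Hypothetical helper mirroring the bbox struct (xmin, xmax, ymin, ymax)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def contains(self, x: float, y: float) -> bool:
        # Same predicate as the WHERE clause in the first query of this issue.
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    @property
    def width(self) -> float:
        return self.xmax - self.xmin

    @property
    def height(self) -> float:
        return self.ymax - self.ymin


# bbox values reported for the 'Wellington' division rows.
wellington = BBox(xmin=174.7772, xmax=174.77724, ymin=-41.288795, ymax=-41.28879)

# Point used in the original query.
query_x, query_y = 174.767595, -41.285379

inside = wellington.contains(query_x, query_y)
print(inside, wellington.width, wellington.height)
```

The point lies roughly 0.01 degrees west of the box, so a bbox that only wraps the division's center point cannot match it. The same predicate applied to a bbox that wraps the full polygon would.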

@missinglink
Author

Awesome, thanks @phgn0. I had considered that the boxes might cover only the centroid, but they still have some area, which is odd: for point geometries I'd expect the min/max values to be equal 🤷‍♂️

@jwass
Contributor

jwass commented Sep 26, 2024

@missinglink Point geometries have min/max values that differ because the geometry coordinates in the WKB are doubles, but we store the bbox as 32-bit floats to save space. So when we downcast, we have to ensure the downcast min is less than the original coordinate and the downcast max is greater.

Link to the original discussion in the geoparquet bbox proposal: opengeospatial/geoparquet#188 (reply in thread)
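The outward-rounding idea described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not Overture's actual implementation: the helper names (`to_f32`, `next_f32`, `f32_floor`, `f32_ceil`) are hypothetical, the sample longitude is made up, and infinities and NaN are not handled.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def next_f32(x: float, up: bool) -> float:
    """Step a finite float32 value to the adjacent float32 toward +inf or -inf."""
    if x == 0.0:
        # Smallest subnormal with the appropriate sign.
        bits = 0x00000001 if up else 0x80000001
    else:
        bits = struct.unpack("<I", struct.pack("<f", x))[0]
        # For positive floats a larger bit pattern is a larger value;
        # for negative floats a larger bit pattern is a more negative value.
        bits += 1 if (x > 0) == up else -1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_floor(x: float) -> float:
    """Largest float32 that is <= x (a safe bbox minimum)."""
    r = to_f32(x)
    return r if r <= x else next_f32(r, up=False)


def f32_ceil(x: float) -> float:
    """Smallest float32 that is >= x (a safe bbox maximum)."""
    r = to_f32(x)
    return r if r >= x else next_f32(r, up=True)


lon = 174.7772123  # hypothetical double-precision longitude
lon_min, lon_max = f32_floor(lon), f32_ceil(lon)
print(lon_min <= lon <= lon_max, lon_max - lon_min)
```

With this variant a coordinate that is not exactly representable as a float32 gets a box one float32 step wide, and an exactly representable coordinate gets a zero-width box. An implementation that always steps to the previous and next float would produce slightly wider boxes.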

@missinglink
Author

Agh, thanks Jacob, that makes sense. A 4-byte float should have about 7 significant decimal digits of precision (6 in the worst case, 9 in the best), but something about this process of finding the prev/next float seems to reduce that to 5. Or maybe I just happened on a degenerate case 🤔
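One way to check the precision question is to compute the spacing between adjacent float32 values at these magnitudes. The sketch below is an illustration using only the standard library; `f32_ulp` is a hypothetical helper name, and it is valid only for nonzero values in the normal float32 range. The roughly 7 significant digits include the digits before the decimal point, so a longitude such as 174.77 keeps only about 4 to 5 decimal places.

```python
import math


def f32_ulp(x: float) -> float:
    """Spacing between adjacent float32 values near x (x nonzero, normal range)."""
    _, exp = math.frexp(abs(x))  # abs(x) == m * 2**exp with 0.5 <= m < 1
    return 2.0 ** (exp - 24)     # float32 has a 24-bit significand


lon_ulp = f32_ulp(174.7772)    # longitude magnitude from the Wellington bbox
lat_ulp = f32_ulp(-41.288795)  # latitude magnitude from the Wellington bbox

print(f"{lon_ulp:.3e}")  # 1.526e-05
print(f"{lat_ulp:.3e}")  # 3.815e-06
```

A step of about 0.000015 degrees in longitude and 0.000004 degrees in latitude is consistent with the box sizes reported at the top of the issue, so the Wellington rows look like ordinary float32 behavior at this magnitude rather than a degenerate case.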

Sounds like division_area is what I need so I'll close this ticket and give that a crack.

This issue motivated me to start publishing WOF in Parquet, which is now available if you're interested:
https://geocode.earth/blog/2024/whosonfirst-geoparquet/

@phgn0

phgn0 commented Sep 27, 2024

> This issue motivated me to start publishing WOF in Parquet, which is now available if you're interested: https://geocode.earth/blog/2024/whosonfirst-geoparquet/

@missinglink WOW this is exactly what I'm looking for to query some data that's not present in Overture.

By chance, could you also publish Parquet files for the global downloads? It's so much easier to read the data from one file than to uncompress the GeoJSON, particularly from BigQuery.

Thanks!!

@missinglink
Author

> By chance, could you also publish Parquet files for the global downloads?

@phgn0 ask & you shall receive:

SELECT id, name, placetype
FROM read_parquet('https://data.geocode.earth/wof/dist/parquet/whosonfirst-data-admin-latest.parquet')
WHERE (
  geometry_bbox.xmin <= 99.06875 AND
  geometry_bbox.xmax >= 99.06875 AND
  geometry_bbox.ymin <= 7.501039 AND
  geometry_bbox.ymax >= 7.501039
)
AND ST_ContainsProperly(geometry, ST_Point(99.06875, 7.501039));

┌───────────┬──────────────┬───────────┐
│    id     │     name     │ placetype │
│  varchar  │   varchar    │  varchar  │
├───────────┼──────────────┼───────────┤
│ 890465695 │ Ko Lanta     │ county    │
│ 85678585  │ Krabi        │ region    │
│ 85632293  │ Thailand     │ country   │
│ 102047613 │ Asia/Bangkok │ timezone  │
│ 102191569 │ Asia         │ continent │
└───────────┴──────────────┴───────────┘
Run Time (s): real 7.015 user 4.758053 sys 0.428248
