paleolimbot's comments

paleolimbot · 2025-09-25T02:42:31 1758768151

Agreed that the polars interface is far superior to SQL! There are a few ways to do this if there's interest...polars wasn't an option because we needed Arrow extension types (https://github.com/pola-rs/polars/issues/9112).

paleolimbot · 2025-09-25T02:27:24 1758767244

It's built on a separate stack but conceptually it's very similar (DataFusion shares a number of idioms with Spark and has a number of projects implementing various Spark compatibility)...I think the idea was to bring the successful pieces of Sedona Spark to a wider audience.

paleolimbot · 2025-09-24T18:24:09 1758738249

Currently, lazier GeoParquet reads, a K-nearest neigbours join, Coordinate Reference System tracking, and built-in GeoPandas IO. These aren't things that DuckDB spatial can't or won't do, but they are things that DuckDB hasn't prioritized over the last year that are essential to a lot of spatial pipelines.

paleolimbot · 2025-09-24T18:18:25 1758737905

SedonaDB can decode PROJJSON and authority:code CRSes at the moment, although the underlying representation is just a string. In this case you might want something like CZBOND:999 or

{ "type": "EngineeringCRS", "name": "Fish, Towel, Mouse", "datum": {"name": "Wet Kitty + Mouse In Peril"}, "coordinate_system": { "subtype": "Cartesian", "axis": [ {"name": "Fish", "abbreviation": "F", "direction": "east"}, {"name": "Towel", "abbreviation": "T", "direction": "north"}, {"name": "Mouse", "abbreviation": "M", "direction": "up"}, ] } }

(Subject to the limitations of PROJJSON, such as a 4D CRS having a temporal axis and a limited set of acceptable "direction" values)

czbond · 2025-09-24T19:28:24 1758742104

Baller references and customization. Thank you for taking the time to craft that, I really appreciate it. Looking now because that was a main requirement of mine)

paleolimbot · 2025-09-24T17:15:11 1758734111

PostGIS is great when your data is already in a Postgres table! SedonaDB and DuckDB are much faster when your data starts elsewhere (e.g., GeoParquet files).

MrPowers · 2025-09-24T17:22:12 1758734532

The "DuckDB is probably the most important geospatial software of the last decade" post has a nice related discussion: https://news.ycombinator.com/item?id=43881468

WD-42 · 2025-09-24T18:09:02 1758737342

Oh I see. So if you have some kind of a pipeline in which you don’t need or want to load the data into a DB first. That makes total sense, thanks!