The author mentions PostgreSQL's JIT compiler. Up to this day, I've only seen it...

baba92 · on April 17, 2024

A customer had the worst performance issues after switching to postgres. Weirdly it only happened in the docker and test server setup and not on the developers machine (he ran postgres via homebrew).

Turns out homebrew installed postgres without JIT support and one query on the developer machine ran in 200ms, while with JIT the query took 4-5 seconds. Took some time to figure out, since I'm not a heavy postgres user, but since then I've always disabled JIT and never looked back.

LunaSea · on April 17, 2024

The JIT compiler is great for analytical queries.

You can configure thresholds for the JIT activation in PostgreSQL as well if you want to elevate the bar from which the JIT is enabled.

refset · on April 17, 2024

For analytical queries note that you really have to learn how to express the queries efficiently with Postgres - unfortunately the optimizer is still missing lots of tricks found in more sophisticated engines [0][1][2]. JIT compilation alone can't get close to making up for those gaps.

[0] https://pganalyze.com/blog/5mins-postgres-optimize-subquerie...

[1] https://duckdb.org/2023/05/26/correlated-subqueries-in-sql.h...

[2] "Unnesting Arbitrary Queries" https://cs.emis.de/LNI/Proceedings/Proceedings241/383.pdf

Sesse__ · on April 17, 2024

Postgres isn't really hot on implementing the latest stuff from academia (most of what they do seem to be fairly home-grown, not the least because some of it was never really written about before they started doing it). Arbitrary unnesting is certainly one of the optimizations they don't do; fancy sketches and other estimation techniques generally likewise. But what they're really good at is polish. A thousand small tweaks and optimizations to iron out issues for 0.1% of queries. It tends to have fairly few sharp edges, less so than any other RDBMS I've used.

fuy · on April 24, 2024

Thomas Neumann and Alfons Kemper papers can hardly be called "home-grown", their stuff has been implemented in multiple industrial systems. Postgres optimizer, though, is very bad even with simple correlated subqueries, let alone "arbitrary", so it'd be useful to have at least some version of unnesting.

dur-randir · on April 17, 2024

But PG is much more commonly used for OLTP, not OLAP. I'm baffled to this day that they enabled it by default.

LunaSea · on April 17, 2024

OLAP is a very common use case for PostgreSQL as well.

masklinn · on April 17, 2024

Yeah pg’s JIT is a pretty good demonstration that LLVM is not great at JITs, worsened by postgres not having a good persistent shared query cache.

It would probably be less detrimental if it could perform asynchronous compilation for future queries (which really is closer to how normal JITs actually behave, at least for optimising backends)

MaxBarraclough · on April 17, 2024

> LLVM is not great at JITs

That seems like a generalisation. It's true LLVM will never be a lightweight toolkit, but if you want to generate highly optimised code it seems like a solid choice, assuming you're doing relatively 'vanilla' code-gen work.

SigmundA · on April 17, 2024

PG just needs to do like the other databases and cache query plans, also add hints please.

masklinn · on April 17, 2024

> PG just needs to do like the other databases and cache query plans

The inline blocking compilation step will still fuck your query.

> also add hints please

I’d much rather they allowed bypassing the planner entirely and had an interface to feed plans directly to the executor.

RMarcus · on April 17, 2024

PostgreSQL has coarse-grain query hints (like `enable_hashjoin`), and the excellent `pg_hint_plan` extension allows you to specify a complete or partial execution plan: https://github.com/ossc-db/pg_hint_plan

SigmundA · on April 17, 2024

>The inline blocking compilation step will still fuck your query

Only on the first execution, the long plan/jit time is usually only an issue when running a fast query over and over again.

However if plans where cached then you could also plan without jitting then background jit and update cached plan for next time which would be even smarter.

>I’d much rather they allowed bypassing the planner entirely and had an interface to feed plans directly to the executor

Other database have plan locking where you can load an existing plan from somewhere else in serialized form (XML) and force it to use that. Most of the time I prefer just a few hints in the query source solves almost all issue with the planner.

nextaccountic · on April 17, 2024

Can't postgres JIT a query once and run the compiled query multiple times?

jeltz · on April 24, 2024

Nope, because nobody has implemented re-locatable symbols for the JIT so plans cannot be reused.