Syscall latency was a much bigger deal back when we had spinning disks, but it still matters today (you can get dramatically different plans depending on slightly different costs of pulling a page from disk), and I find it silly that we've never even measured those costs. An even bigger impact might be for SQL functions...all of the Postgres SQL functions have configured cost values, but those could easily be measured. Also, a single cost number per function can be a dramatic oversimplification. For example, some PostGIS functions have O(n) or O(n^x) behavior depending on the size and complexity of the input geometry. If we could measure exact costs, or model costs with statistical distributions, or possibly predict them with ML, that would be a huge improvement.
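For concreteness, this is the knob in question: every Postgres function carries a single planner cost constant, in units of cpu_operator_cost, with no dependence on the input. A minimal sketch (the function name is hypothetical; the COST syntax is standard Postgres, and the geometry type assumes PostGIS is installed):

```sql
-- Default COST is 1 for built-in/C functions and 100 for others;
-- it can be set, but it's still a single scalar per function.
CREATE FUNCTION slow_validity_check(geom geometry) RETURNS boolean
    AS $$ SELECT ST_IsValid(geom) $$
    LANGUAGE SQL
    COST 10000;  -- fixed estimate, regardless of geometry complexity

-- The same knob can be retrofitted onto an existing function:
ALTER FUNCTION slow_validity_check(geometry) COST 500;
```

Note that nothing here varies with the argument, which is exactly the O(n)/O(n^x) problem above.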
My opinion on ML is that there is nothing on the execution planning side that couldn't be modeled and solved as a linear program, with extremely fast and mathematically optimal results. By trying to use ML for the planning part, you're really just using ML to reverse-engineer LP solvers, and that's a poor use of compute resources.
The reason why some ML planners might have better results than typical SQL query planners is that typical SQL engines are optimized towards OLTP workloads that require small transactions executed very quickly. In order to do that, they purposefully don't explore the true planning space...they might explore 3-10 alternative ways of executing, whereas there might be hundreds or thousands of ways to do the same thing. While Postgres has explicitly chosen to not implement planning pragmas to override planner behavior, it would be really cool if you could have multiple planners optimized for different types of workloads, and be able to explicitly choose a query planner that takes 3 seconds to plan a query that takes 1 hour to execute and for which a better plan could save several minutes. I would even love a fairly naive query planner that first runs index scans to get deterministic, exact cardinalities before planning the joins.
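Postgres doesn't offer pluggable planners, but it does expose a few settings that trade planning time for search-space coverage, which gets part of the way there. A sketch of nudging a single query toward more exhaustive planning (table names hypothetical; the GUCs are real, with defaults of 8, 8, and 12 respectively):

```sql
BEGIN;
-- Let the planner consider larger join orders exhaustively instead of
-- collapsing the search or falling back to the genetic optimizer (GEQO):
SET LOCAL join_collapse_limit = 20;
SET LOCAL from_collapse_limit = 20;
SET LOCAL geqo_threshold = 24;  -- raise the cutoff for heuristic planning

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM orders o
JOIN line_items li ON li.order_id = o.id
JOIN products   p  ON p.id = li.product_id;
COMMIT;
```

That's still one cost model and one search strategy, though, not the "spend 3 seconds planning to save minutes of execution" mode described above.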
BTW, I really like your blog and your research focuses. You're working on exceptionally hard problems that have a huge impact on computing.
Where I think ML would be much better than what we (postgres) do is iteratively improving selectivity estimation. In today's postgres there's zero feedback loop: nothing notices at runtime that the collected statistics led to bad estimates. In a better world we'd use that knowledge to improve future selectivity estimates.
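To make the gap concrete: EXPLAIN ANALYZE already surfaces the misestimate, but nothing consumes it automatically; the current remedy is manual, e.g. extended statistics for correlated columns. A sketch (table, columns, and row counts are illustrative):

```sql
-- Estimated vs. actual rows diverge badly when columns are correlated:
EXPLAIN ANALYZE
SELECT * FROM addresses WHERE city = 'Portland' AND state = 'OR';
-- e.g.  Seq Scan on addresses (... rows=12 ...) (actual ... rows=48211 ...)
--       ^ planner multiplied per-column selectivities as if independent

-- Today's manual fix: declare the dependency, then re-collect stats.
CREATE STATISTICS addr_city_state (dependencies) ON city, state FROM addresses;
ANALYZE addresses;
```

The feedback-driven version would notice the rows=12 vs. rows=48211 discrepancy itself and adjust.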
> In order to do that, they purposefully don't explore the true planning space...they might explore 3-10 alternative ways of executing, whereas there might be hundreds or thousands of ways to do the same thing.
FWIW, postgres' planner often explores many more plan shapes than that (although not as complete plans; different subproblems are compared on a cost basis).
> While Postgres has explicitly chosen to not implement planning pragmas to override planner behavior, it would be really cool if you could have multiple planners optimized for different types of workloads,
FWIW, it's fully customizable by extensions. There's a hook to take over planning, and that hook can still invoke postgres' normal planner for queries it doesn't handle. Obviously that's not the same as actually providing pragmas.
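pg_hint_plan is a real extension built on exactly this hook: it reads pragma-style comments and overrides the planner's choices. A sketch of its use (table and index names hypothetical; the hint syntax is pg_hint_plan's):

```sql
-- Load per session (or add pg_hint_plan to shared_preload_libraries):
LOAD 'pg_hint_plan';

/*+ IndexScan(o orders_created_at_idx) Leading((o c)) */
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '1 day';
```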