Most people that went after this tried for text-to-sql (e.g. ask a question and ...

camjw · on June 11, 2024

How is this any different (text-to-sql vs text-to-semantic-query)? Isn't this just comparing text-to-sql to text-to-slightly-simpler-sql?

mritchie712 · on June 11, 2024

Yes, it's simpler, but there are a few key differences:

1. You have complete control over what the LLM can do / access through the semantic layer (e.g. you can remove tables that the LLM shouldn't consider for analytical questions).

2. One of the biggest choke points for text-to-sql is constructing joins. All the joins are already built into the semantic layer.

3. Calculating metrics / measures is handled in the semantic layer instead of on the fly with SQL (e.g. if you ask something like "how much revenue did we generate from product X", you wouldn't want the LLM to invent its own calculation for revenue. Instead, revenue is clearly defined in the semantic layer).

4. The query format for our semantic layer (we use cube.dev) is JSON, which is much easier to control than free-form SQL.

The semantic layer gives the LLM a well-defined and constrained space to operate within, whereas there are hundreds of ways for it to fail when writing raw SQL.
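To make points 1 and 4 concrete, here's a rough sketch of how a JSON query from the LLM could be checked before it ever reaches the semantic layer. This is not our actual code: the member names (`orders.revenue`, etc.), the allowlists, and `validate_query` are all made up for illustration, and the query shape only loosely follows Cube's `measures` / `dimensions` / `filters` layout.

```python
import json

# Hypothetical catalog: the only members the LLM is allowed to reference.
ALLOWED_MEASURES = {"orders.revenue", "orders.count"}
ALLOWED_DIMENSIONS = {"orders.product", "orders.status"}
ALLOWED_KEYS = {"measures", "dimensions", "filters", "limit"}


def _unknown(values, allowed, label):
    """List every entry in `values` that is not a known member name."""
    if not isinstance(values, list):
        return [f"{label} entries must be given as a list"]
    return [
        f"unknown {label}: {v}"
        for v in values
        if not (isinstance(v, str) and v in allowed)
    ]


def validate_query(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the query is acceptable."""
    try:
        query = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(query, dict):
        return ["query must be a JSON object"]

    problems = [f"unknown key: {k}" for k in sorted(query.keys() - ALLOWED_KEYS)]
    problems += _unknown(query.get("measures", []), ALLOWED_MEASURES, "measure")
    problems += _unknown(query.get("dimensions", []), ALLOWED_DIMENSIONS, "dimension")

    filters = query.get("filters", [])
    if not isinstance(filters, list):
        problems.append("filters must be a list")
    else:
        # Each filter is expected to be an object with a "member" field.
        members = [f.get("member") if isinstance(f, dict) else None for f in filters]
        problems += _unknown(
            members, ALLOWED_MEASURES | ALLOWED_DIMENSIONS, "filter member"
        )
    return problems


if __name__ == "__main__":
    good = json.dumps({
        "measures": ["orders.revenue"],
        "filters": [
            {"member": "orders.product", "operator": "equals", "values": ["X"]}
        ],
    })
    print(validate_query(good))
    print(validate_query('{"measures": ["users.password_hash"]}'))
    print(validate_query("SELECT * FROM users"))
```

The whole check is a few set lookups on a parsed object. Doing the equivalent for raw SQL means parsing the SQL and reasoning about joins, subqueries, and hand-rolled revenue formulas.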