Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
In Favor of SQL: Add Follow Foreign Key Syntax
3 points by JoelJacobson on July 10, 2021 | hide | past | favorite | 11 comments
As noted by others [1], a very common case for joins is to follow foreign keys.

I suggest using the fact foreign keys are constraints with unique names, and using these names to explicitly specify what column(s) to join between the two foreing key tables.

In PostgreSQL [2], foreign key contraint names only need to be unique per table, which allows using the foreign table "as is" as the constraint name, which allows for nice short names. In other databases, the names will just need to be a little longer, since they need to be unique per schema according to the SQL spec.

Given this schema:

  CREATE TABLE baz (
  id integer NOT NULL,
  PRIMARY KEY (id)
  );
  
  CREATE TABLE bar (
  id integer NOT NULL,
  baz_id integer,
  PRIMARY KEY (id),
  CONSTRAINT baz FOREIGN KEY (baz_id) REFERENCES baz
  );
  
  CREATE TABLE foo (
  id integer NOT NULL,
  bar_id integer,
  PRIMARY KEY (id),
  CONSTRAINT bar FOREIGN KEY (bar_id) REFERENCES bar
  );
We could write a normal SQL query like this:

  SELECT
    bar.id AS bar_id,
    baz.id AS baz_id
  FROM foo
  JOIN bar ON bar.id = foo.bar_id
  LEFT JOIN baz ON baz.id = bar.baz_id
  WHERE foo.id = 123
I suggest adding a new binary operator, taking the table alias to join from as left operand, and the name of the foreign kery contraint to follow as the right operand.

Perhaps "->" could be used for this purpose, since it's currently not used by the SQL spec in the FROM clause.

This would allow rewriting the above query into this:

  SELECT
    bar.id AS bar_id,
    baz.id AS baz_id
  FROM foo
  JOIN foo->bar
  LEFT JOIN bar->baz
  WHERE foo.id = 123
Where e.g. "foo->bar" means:

  follow the foreign key constraint named "bar" on the table/alias "foo"
If the same join type is desired for multiple joins, another idea is to allow chaining the operator:

  SELECT
    bar.id AS bar_id,
    baz.id AS baz_id
  FROM foo
  LEFT JOIN foo->bar->baz
  WHERE foo.id = 123
Which would cause both joins to be left joins.

[1] https://scattered-thoughts.net/writing/against-sql/ [2] https://www.postgresql.org/



This assumes a single foreign reference What if you have two? I would suggest this syntax:

  SELECT *
  FROM foo
  JOIN bar VIA foo.bar_id
  LEFT JOIN baz VIA bar.baz_id


Your VIA suggestion seems to use the column name(s). The problem with that is when you have multiple foreign keys on the same column, or more likely, a foreign key on multiple columns.

My suggestion instead uses the constraint name of the foreign keys, which is defined using "CONSTRAINT constraint_name FOREIGN KEY ...", which avoids these two problems.

Some examples:

  CREATE TABLE users (
  id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
  parent_user_id bigint,
  PRIMARY KEY (id),
  CONSTRAINT parent FOREIGN KEY (parent_user_id) REFERENCES users
  );

  SELECT parent.id FROM users JOIN users->parent WHERE users.id = 123

  CREATE TABLE a (
  id1 integer NOT NULL,
  id2 integer NOT NULL,
  PRIMARY KEY (id1,id2)
  );
  
  CREATE TABLE b (
  id integer NOT NULL,
  a_id1 integer,
  a_id2 integer,
  PRIMARY KEY (id),
  CONSTRAINT a FOREIGN KEY (a_id1,a_id2) REFERENCES a
  );

  SELECT a.id1, a.id2 FROM b JOIN b->a WHERE b.id = 123;


Ah! Clever. I somehow missed that it was the constraint name.


I have seen a couple SQL variants which do implicit left outer joins. One uses "->"; the user uses "@".


Interesting. Can you please try to find some links to these?


are people even writing their own sql queries anymore?


Yes, a lot of people actually do for sure.

I would say especially if you also define your data models in SQL, and let the SQL database be the primary source of truth for what the data model is.


You can use database first in ef core and probably in other orms if you want the sql db model to be the source of truth. Then you can achive the same thing without having to write your own SQL. For new systems its still better to use code first. Database first is for legazy systems.


Nonsense, it's a very valid approach to design your data before your code. Especially since your database tends to far outlive any code you write against it.


thats pretty hard core. like writing machine code. i just use the wizard for creating my db. then i can use my c# class objects directly. but i don't know sql.


I’m in the opposite boat. What is this “wizard” you speak of?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: