Wednesday, August 5, 2020

Home / IFTTT / Java / SQL and jOOQ. Tutorial / Use NATURAL FULL JOIN to compare two tables in SQL

Use NATURAL FULL JOIN to compare two tables in SQL

by T3W Admin on August 05, 2020 in IFTTT, Java, SQL and jOOQ. Tutorial

There are a few ways to compare two similar tables in SQL. Assuming PostgreSQL syntax, we might have this schema:

CREATE TABLE t1 (a INT, b INT, c INT);
CREATE TABLE t2 (a INT, b INT, c INT);
INSERT INTO t1 VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);
INSERT INTO t2 VALUES            (4, 5, 6), (7, 8, 9), (10, 11, 12);

It is now possible to use UNION and EXCEPT as suggested by Chris Saxon:

You can compare two tables with

t1 minus t2 union t2 minus t1

There's a better way where you query each table once… but you need to list out all the columns

Luckily @StewAshton has built a neat package to generate the #SQL for you :)https://t.co/7OroPV6JdY

— Chris Saxon (@ChrisRSaxon) August 5, 2020

In PostgreSQL, we can write:

(TABLE t1 EXCEPT TABLE t2) 
UNION
(TABLE t2 EXCEPT TABLE t1)
ORDER BY a, b, c

Notice how TABLE x is just standard SQL, and PostgreSQL, syntax sugar for SELECT * FROM x.

And we’ll get:

a |b |c |
--|--|--|
 1| 2| 3|
10|11|12|

Unfortunately, this requires two accesses to each table. Can we do it with a single access?

Using NATURAL FULL JOIN

Yes! Using NATURAL FULL JOIN, another rare use-case for this esoteric operator.

We can write this:

SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 NATURAL FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 
WHERE NOT (t1, t2) IS NOT NULL;

This produces:

a |b |c |t1|t2|
--|--|--|--|--|
 1| 2| 3|t1|  |
10|11|12|  |t2|

Why? Because a NATURAL JOIN is syntax sugar for joining using all the shared column names of the two tables, and the FULL JOIN makes sure we can retrieve also the columns that are not matched by the join predicate. Another way to write this is:

-- Use JOIN .. USING, instead of NATURAL JOIN
SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 USING (a, b, c)
WHERE NOT (t1, t2) IS NOT NULL;

Or:

-- Use JOIN .. ON, instead of JOIN .. USING
SELECT 
  coalesce(t1.a, t2.a) AS a,
  coalesce(t1.b, t2.b) AS b,
  coalesce(t1.c, t2.c) AS c,
  t1.t1,
  t2.t2
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 ON (t1.a, t1.b, t1.c) = (t2.a, t2.b, t2.c) 
WHERE NOT (t1, t2) IS NOT NULL;

Pros and cons

Pros and cons compared to the set operator solution using UNION and EXCEPT:

Pros

Each table is accessed only once
Comparison is now name based, not column index based, i.e. it can still work if only parts of the columns are the shared

Cons

If index based column comparison was desired (because the tables are the same structurally, but do not share the exact same column names), then we’d have to rename each individual column to a common column name.
If there’s duplicate data, there’s going to be a cartesian product, which might make this solution quite slower

Row value expression NULL predicate

Observe the usage of the esoteric NULL predicate for row value expressions, which uses the following truth table:

+-----------------------+-----------+---------------+---------------+-------------------+
| Expression            | R IS NULL | R IS NOT NULL | NOT R IS NULL | NOT R IS NOT NULL |
+-----------------------+-----------+---------------+---------------+-------------------+
| degree 1: null        | true      | false         | false         |  true             |
| degree 1: not null    | false     | true          | true          |  false            |
| degree > 1: all null  | true      | false         | false         |  true             |
| degree > 1: some null | false     | false         | true          |  true             |
| degree > 1: none null | false     | true          | true          |  false            |
+-----------------------+-----------+---------------+---------------+-------------------+

Yes. R IS NULL and NOT R IS NOT NULL are not the same thing in SQL…

It’s just another way of writing:

SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 NATURAL FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 
WHERE t1 IS NOT NULL
OR t2 IS NOT NULL;

from Java, SQL and jOOQ. https://ift.tt/3fx7ULW
via IFTTT

About

Wednesday, August 5, 2020

Use NATURAL FULL JOIN to compare two tables in SQL

Using NATURAL FULL JOIN

Pros and cons

Row value expression NULL predicate

No comments:

Post a Comment

Follow Us

Facebook

About Admin

Tags

Blog Archive

Slider Widget

Search This Blog

Categories

Music

Categories

Advertisement

Nature

Fashion

Popular Posts

Technology

Pages

Tw3 - Tutorial

Menu Footer Widget

Social Plugin

Recent

Featured

Subscribe Us

Blog Archive

Recent Post

Tags

Recent Comments