About

Monday, December 6, 2021

Why You Should Use jOOQ With Code Generation

I’m answering many jOOQ questions on Stack Overflow, and a lot of times. The problem has the same cause: People not using jOOQ’s code generator. The main reason people seem not to be using it, is because it takes some extra time to set up, but as with anything well designed, the initial investment will always pay off.

In this article, I want to briefly summarise what the code generator is, why you should use it, and what is the main reason not to use it (hint: you probably don’t have that case).

What’s the jOOQ code generator?

jOOQ is an internal DSL that pretends that you can write type safe, embedded, dynamic SQL directly in Java. Just like you can do that in PL/SQL, PL/pgSQL, T-SQL, SQL/PSM and all the other procedural dialects. Being internal, the big difference is that dynamic SQL is very easy, because jOOQ makes SQL “composable”.

As an internal DSL, it models the SQL language as a chain of Java method calls. The technique behind the fluent API design is described here. A typical jOOQ query looks like this:

var author =
ctx.select(AUTHOR.FIRST_NAME, AUTHOR.LAST_NAME)
   .from(AUTHOR)
   .where(AUTHOR.ID.eq(10))
   .fetchOne();

Now those AUTHOR table references, as well as those AUTHOR.ID column references are generated code. They are Java representations of your database meta data, and include a variety of information:

  • The catalog / schema they belong to
  • The table they belong to (if it’s a column)
  • The data type
  • Some attached converters and data type bindings
  • Other useful meta data, such as constraints (primary key, unique key, foreign key), identities, default expressions, generated expressions, etc.
  • Auxiliary DSL methods, such as type safe table aliasing methods

As you can see, you get a lot of out of the box features!

What’s the price to pay for code generation?

As I understand, a lot of users (especially new ones) are eager to try out jOOQ immediately and don’t want to bother setting up the code generator, which seems to be an unnecessary step at first. E.g. when using JAXB, you don’t have to use XJC to generate annotated bindings for your XSD files (you might not even have an XSD). But with jOOQ, it’s different.

Yes, the manual can be overwhelming, but the tutorial isn’t, and neither are the examples. You don’t even have to use the Maven or Gradle plugins for your first attempts with jOOQ. Why not just use jbang. With just a few lines of CLI, you can start right away.

To try out basic things, you can also use the DDLDatabase, which generates code from SQL files directly, without a live connection to an actual database, though that isn’t really a big issue anymore with testcontainers.org and/or docker, which allows you to spin up a sample database in a matter of seconds.

What I’m trying to say here is: Don’t be overwhelmed. Yes, there’s some initial investment, but it will pay off massively (see below), will prevent tons of problems (see below), and it just requires a few more minutes of setting up your project. Heck, you could even start playing around with the bug reporting template, which includes code generation configuration for Java, Scala, and Kotlin!

What do you get in return?

Let’s start with the obvious:

1. Compile time type safety

You already get some degree of compile time type safety from using jOOQ’s DSL alone, namely, you cannot mistype keywords, such as SELECT, or forget your parentheses around an IN predicate, such as A IN (B, C). And you get auto-completion for your SQL syntax

But without code generation, you won’t get type safety for your schema objects. This is a big deal, and once you’ve gotten used to it, you will never want to miss it again!

Every table, every column, every constraint, every index, every sequence, every procedure, every function, every package, every schema, every catalog, every domain, every enum type, every object type gets a generated Java representation in jOOQ. You don’t have to go to your ERD, or database, or wherever it is to look up your schema. You have it right there in your code. As a first step, you can use this for auto completion in your favourite IDE!

Not just that. You can also never make a mistake with your meta data:

Yep. No more typos. But also! No more regressions when someone renames a column in the database. As soon as a column name changes, or is deleted, your Java code stops compiling.

No more wrong data types. See the above screenshot? It says that FIRST_NAME is a TableField<AuthorRecord, String>. So your Java compiler, through jOOQ’s elaborate generic API, will already know that you’re projecting a String column. It’s not 2003 anymore. We have generics to prevent type casts or unsafe conversions. Check out the compilation error on this:

Why ever worry about such data type problems again? You already declared the data type once, when you created the table:

CREATE TABLE author (
  id INTEGER NOT NULL PRIMARY KEY,
  first_name TEXT NOT NULL,
  last_name TEXT NOT NULL,
  ...
);

Why go through all that busywork of re-declaring the data type again in Java? In fact, why go through all that busywork of re-declaring anything of the above again in Java? No one enjoys typing all that stuff twice or keeping it in sync. In fact, it’s an outright bad idea. There should only be one source of truth for your meta model, and that’s the DDL you ran in your database to create the schema.

Sure, you can have other representations of your data, e.g. some DDD version, some JSON version for your frontend, whatever. But whenever you query your database, there’s absolutely no reason not to use the exact meta model which you’ve already declared in your database.

I think that’s kind of obvious? Why waive all that for a little (initial) extra setup convenience?

2. Schema introspection

You don’t just get that benefit when writing your queries. You also get it when reading them. You can always quickly navigate to the column you’re interested in and read its SQL comments in the Javadoc. Maybe you have this specified? (Totally underused SQL feature!)

COMMENT ON COLUMN author.first_name IS 'The author''s first name';

Now look what jOOQ’s code generator does with it:

Seems obvious, no? Or, you want to see who is using this column? Just check the call hierarchy or whatever IDE search tool you’re interested in:

This is much better than just text searching for FIRST_NAME, which might be case sensitive and matches all strings like that, not just that particular column of the AUTHOR table.

3. Runtime meta model: Data types

In fact, you won’t just profit from this type safety at compile time, but also at runtime. Trust me. There are so many edge cases in some JDBC drivers and/or SQL dialects where the data type must be communicated to the database explicitly.

Sure, you can write something like this in jOOQ, and it works:

var author =
ctx.select(field("author.first_name"), field("author.last_name"))
   .from("author")
   .where(field("author.id").eq(10))
   .fetchOne();

The above is using plain SQL templates, a feature that is normally used by users to extend jOOQ with custom functionality. Sure, it can be used instead of using code generation, too, and occasionally, that’s just the right thing to do.

When bind variables have insufficient type contexts

But there are many cases where you should give the SQL engine more context about the bind value. For example, in Derby (or older versions of Db2), you can’t just do this:

select null
from sysibm.sysdummy1;

You’ll get an error:

SQL Error [30000] [42X01]: Syntax error: Encountered “null” at line 1, column 8.

Instead, you have to CAST.

select cast(null as int)
from sysibm.sysdummy1;

jOOQ always adds those casts for you, if we find out through our extensive integraiton tests that in any given situation, a dialect might require it. But jOOQ can only do that if you provide it with a type, and often you do, implicitly, but sometimes, you don’t, and then you have to debug a query that used to work before. That can be very unexpected.

When a Field<Object> leads to compilation errors

This one is tricky. The Java language version 8 has made a peculiar decision around overload resolution with generics. Assuming you have overloads like this (and jOOQ is full of these, for convenience reasons):

public <T> void setValue(Parameter<T> parameter, T value) {}
public <T> void setValue(Parameter<T> parameter, Field<T> value) {}

Then there has been a backwards incompatible change of compilation behaviour in Java 8.

While this works perfectly fine:

Parameter<String> p1 = ...
Field<String> f1 = ...
setValue(p1, f1);

This call doesn’t compile:

Parameter<Object> p2 = ...
Field<Object> f2 = ...
setValue(p2, f2);

It seems that Field<Object> is not more specific than Object in this case, which seems weird, but that’s how the JLS designed generic overload resolution. In fact, it’s still more specific, but another rule declares the two types as ambiguous before we get to the specificity resolution. Read the above link for details.

Let’s look again at the previous jOOQ query that doesn’t use code generation:

Record2<Object, Object> author =
ctx.select(field("author.first_name"), field("author.last_name"))
   .from("author")
   .where(field("author.id").eq(10)) // Field<Object>
   .fetchOne();

It’s all full of Field<Object>, so you’ll run into this problem eventually, trust me, mostly with the UPDATE .. SET clause. And it will be quite the puzzler.

The workaround is to always attach type information to your columns:

// Assuming this static import
import static org.jooq.impl.SQLDataType.*;

Record2<String, String> author =
ctx.select(
        field("author.first_name", VARCHAR),
        field("author.last_name", VARCHAR))
   .from("author")
   .where(field("author.id", INTEGER).eq(10)) // Field<Integer>
   .fetchOne();

But why do all that, when the code generator could have already done it for you, automatically? And what if you change your AUTHOR.ID column to be a BIGINT instead of an INTEGER? Will you manually update every query?

4. Emulations

Some emulations are not possible without generated meta data. They mostly involve the emulation of the INSERT .. RETURNING syntax, which relies on knowing both primary key and identity information, but other syntaxes may be affected as well.

You wouldn’t believe how many times I’ve supported users who then wanted to use internal API to tell jOOQ about primary key or identity meta information in convoluted ways to make this work, rather than just generate code. With generated code, the following query works in a lot of RDBMS out of the box. Out. Of. The. Box:

AuthorRecord author =
ctx.insertInto(AUTHOR, AUTHOR.FIRST_NAME, AUTHOR.LAST_NAME)
   .values("John", "Doe")
   .returning(AUTHOR.ID)
   .fetchOne();
  • Can this be run in a single query (e.g. in Firebird, Oracle, PostgreSQL as above, in SQL Server using OUTPUT, in Db2, H2 using data change delta tables)?
  • Can it be run in a single query using JDBC’s Statement.getGeneratedKeys()?
  • Does it require a second query to fetch the identity and/or other columns?

jOOQ can always find a way for you in this case, but only if it knows which column the identity is.

5. Converters

Basic SQL only really has a few built-in data types:

  • Various numeric types
  • Various string and binary string types
  • Various temporal types

But is it a good idea to work with these low level types, everywhere throughout your application? Do you want to use BigDecimal for your monetary amounts all over the place? Or would it be better to create a more useful Amount domain class? One where you can attach functionality, currency, etc.?

With jOOQ, you can use converters for that (or bindings, if that affects how you bind the values to JDBC). Have a fancy FirstName class that models various types of first names?

/**
 * Not just any ordinary first name string!
 */
record LeFirstName(String firstName) {}

Yes, you could insist on still not using code generation and attach that to your plain SQL templating query:

Record2<LeFirstName, String> author =
ctx.select(
        field("author.first_name", VARCHAR.asConvertedDataType(
            LeFirstName.class, LeFirstName::new, LeFirstName::firstName
        )), 
        field("author.last_name", VARCHAR))
   .from("author")
   .where(field("author.id", INTEGER).eq(10))
   .fetchOne();

Yes, you could extract that field definition in an auxiliary class for better reuse:

class LeAuthor {
    static Field<LeFirstName> firstName = field("author.first_name", 
        VARCHAR.asConvertedDataType(
            LeFirstName.class, LeFirstName::new, LeFirstName::firstName
        ));
    static Field<String> lastName = field("author.last_name", VARCHAR));
}

Record2<LeFirstName, String> author =
ctx.select(LeAuthor.firstName, LeAuthor.lastName)
   .from("author")
   .where(field("author.id", INTEGER).eq(10))
   .fetchOne();

And now, it looks almost like jOOQ’s code generator was used. Except it wasn’t. Why not? You can attach your converter to all of your FIRST_NAME columns of all your tables automatically with just a bit of extra configuration in the code generator, so you’ll never forget. Especially, if your converter implements some must-use logic, such as e.g. hashing or encrypting, etc.

Why write manually, what the machine can write for you? You wouldn’t manually write other derived code, such as byte code, or assembly either?

6. Type safe aliasing

Have a complex query and want to alias your tables? No problem with jOOQ. Just declare the aliases up front and you’re all set:

// Declare the alias:
var a = AUTHOR.as("a");

// Use the alias. Columns are still there, type safe
var author =
ctx.select(a.FIRST_NAME, a.LAST_NAME)
   .from(a)
   .where(a.ID.eq(10))
   .fetchOne();

While aliasing also works without code generation (everything works without code generation, which is just convenience for more complex API calls), you won’t get type safe access to your column names from aliases like the above. And you can still call your type hierarchy in the IDE to check where the FIRST_NAME column is referenced:

Or get compilation errors on typos, or auto completion, etc. etc. What’s not to love?

7. Implicit joins

One of my favourite jOOQ features is implicit joins. Look at this jOOQ query:

ctx.select(
          BOOK.author().FIRST_NAME,
          BOOK.author().LAST_NAME,
          BOOK.TITLE,
          BOOK.language().CD.as("language"))
   .from(BOOK)
   .fetch();

Starting from a child table (e.g. BOOK), jOOQ can implicitly (left) join your parent tables AUTHOR and LANGUAGE, because life’s too short to repetitively type out all the same joins and join predicates again and again. The above is equivalent to:

ctx.select(
          AUTHOR.FIRST_NAME,
          AUTHOR.LAST_NAME,
          BOOK.TITLE,
          LANGUAGE.CD.as("language"))
   .from(BOOK)
   .leftJoin(AUTHOR).on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
   .leftJoin(LANGUAGE).on(BOOK.LANGUAGE_ID.eq(LANGUAGE.ID))
   .fetch();

It’s a matter of taste and style, yes. You don’t have to use implicit joins everywhere. Some joins cannot be expressed as implicit joins, but very often, this style greatly simplifies your queries by reducing noise.

And it’s only possible with generated code!

Why? Because those BOOK.author() and BOOK.language() methods are generated, and they return again instances of generated tables containing generated columns (with implicit table aliases), and data types, and possibly converters, and what not.

Are you still categorically excluding working with the code generator just because it takes a little extra time to set up?

8. CRUD

jOOQ isn’t a full fledged ORM like JPA implementations, but you can still get some convenience to avoid having to spell out CRUD queries all the time. jOOQ calls these UpdatableRecord.

When you fetch the UpdatableRecord like this:

BookRecord book =
ctx.selectFrom(BOOK)
   .where(BOOK.ID.eq(17))
   .fetchOne();

Then, you can change its values (e.g. in a rich client), and store the record again to generate an UPDATE statement:

book.setTitle("New title");
book.store();

There are a few opt in extra features, like optimistic locking, and you guessed it:

All available only with generated code.

9. Stored procedures

If you have those (and I highly recommend you do if only for performance reasons), then using code generation is even more of a no brainer.

If you have, in PostgreSQL:

CREATE FUNCTION p_count_authors_by_name (
  author_name VARCHAR, 
  result OUT INTEGER
)
AS $$
DECLARE
  v_result INT;
BEGIN
  SELECT COUNT(*)
  INTO v_result
  FROM author
  WHERE first_name LIKE author_name
  OR last_name LIKE author_name;

  result := v_result;
END;
$$ LANGUAGE plpgsql;

Now jOOQ’s code generator generates the following routine call for you:

public static Integer pCountAuthorsByName(
      Configuration configuration
    , String authorName
) {
    // Do you really care?
}

Yep, that’s it. Do you really care what happens behind the scenes? This binding logic is about as exciting as sorting your socks. It’s the same story every time you have some RPC technology. We didn’t care about the implementation in CORBA, with SOAP, with RMI, with remote EJBs, and we don’t care how to bind to a stored procedure.

It’s just some remote callable that accepts a String and returns an Integer. Code generation is your friend here, again. And yes, as always, if you think this function is missing a second parameter, you’ll just add it and your client code stops compiling offering you a chance to refactor!

And the best thing is: This supports all of your weird edge cases including:

  • SQL types (enums, arrays, object types, UDTs)
  • PL/SQL types (record types, table types, associative arrays)
  • Ref cursor types
  • Table valued functions

10. Multitenancy

A cool feature in jOOQ is that jOOQ supports schema level multi tenancy out of the box for you where you can dynamically rename your catalogs, schemas, and table names at runtime depending on whatever (e.g. user tenant).

This feature is called schema mapping, and it obviously… doesn’t work if you use plain SQL templates without code generation. (because a plain SQL template can contain any kind of SQL string, that’s what they’re for).

Not just that. With generated code, all of your objects are fully qualified by default, and you can omit qualification when you like. With your own templates, you can’t easily change that. Want to port your schema from A to B? Well, you’ll hand-rewrite everything, good luck. With jOOQ’s generated code, it’s just flags, configurations, etc.

11. Embedded types

Remember converters? Even more sophisticated are embedded types, i.e. jOOQ’s way of wrapping multiple database columns into a single client side value, pretending your database supports UDTs (such as Oracle, PostgreSQL).

Because when you work with currencies, what you really want to do is combine an AMOUNT column with a CURRENCY column, because after all, USD 1.00 isn’t really the same thing as EUR 1.00, nor can they be directly compared or combined in arithmetics.

Embedded types are currently only available using the code generator, which produces all the meta data required for jOOQ’s runtime to map / unmap your flat result sets.

This is especially powerful when you use:

  • Embedded keys (the only thing you ever compare an BOOK.AUTHOR_ID with is AUTHOR.ID, or maybe another foreign key that points to the same primary key, so why not use a type to enforce type safety?)
  • Embedded domains (you already declared a semantic type in your database, so you want to reuse that name also in client code)

12. Data change management

Another thing that people often don’t want to setup correctly and cleanly up front because it does take some extra time is data change management (e.g. using Flyway or Liquibase). And the regrets are similar once the project grows.

Using jOOQ’s code generator kind forces you to also think about data change management early on, and that is a good thing!

Your data model changes over time, and so does your client code working on that model. jOOQ’s code generation process should be embedded in your CI/CD pipeline:

  • You automatically apply a change in your model
  • You automatically re-generate your code
  • You automatically run your integration tests
  • You automatically deploy the tested result as a unit

Here’s how to do all of that with testcontainers and jOOQ, see an example in action here.

Let me stress this once more.

If you don’t cleanly set up your CI/CD pipeline including your database early on in your project, you’ll do it later on, and it’ll be much harder. But if you do it, then adding jOOQ’s code generation to the game is a no-brainer, and you get everything above for free!

Conclusion

There are many other minor things that you get when you use jOOQ’s code generator. It’s such a powerful tool to get the most out of both jOOQ and your database. It actively encourages you to use all of your RDBMS features, including:

  • Complex models (because joining is much simpler once you have all the meta data, and even tools like implicit joins)
  • Views (those are generated too)
  • Data types (generated too)
  • Procedures (same)
  • Etc.

Using code generation in jOOQ is a clear path towards becoming a power user of your own database, which you (hopefully) design with a lot of love and craftsmanship.

Exception

I promised an exception. You can’t use the code generator when your schema is dynamic (i.e. unknown at compile time). But only few systems are designed like that.



from Java, SQL and jOOQ. https://ift.tt/3DqCu5R
via IFTTT

No comments:

Post a Comment