3 ways to auto-increment with Postgres

· 968 words · 5 minute read

Using a generated key is something DBAs often do, mainly for performance reasons. In an ideal world, we would rely on the table natural primary key and we’ll be over. However, using an artificial primary key has been proven beneficial for performance.

In this blog post, I won’t explain what’s a primary key or what’s a natural or artificial key. If you’re interested in database design, you’ll find great books out there on the matter. If you think database design is some ancient thing that shouldn’t be done anymore, you might be interested in Doctor Stonebraker’s blog post here: https://www.enterprisedb.com/blog/schema-later-considered-harmful

In this blog post, I will explain 3 ways to have an auto-increment with Postgres.

Sequences 🔗

The first obvious way to have an auto-increment number is to use sequences. (Actually, we’ll see later that the other ways also rely on sequences).

So here is how to use simply a sequence:

laetitia=# create table test(id integeri primary key, value text);
CREATE TABLE
laetitia=# create sequence my_seq;
CREATE SEQUENCE
laetitia=# insert into test (select nextval('my_seq'), 'blabla');
INSERT 0 1
laetitia=# select * from test;
 id | value
----+--------
  1 | blabla
(1 row)

But, we can do better. What if we wanted the id column to be automatically filled with the sequence next number?

laetitia=# create sequence my_seq;
CREATE SEQUENCE
laetitia=# create table test (id integer default nextval('my_seq') primary key, value text);
CREATE TABLE
laetitia=# insert into test(value) values ('blabla');
INSERT 0 1
laetitia=# select * from test;
 id | value
----+--------
  1 | blabla
(1 row)

So, that was the first way to have auto-incremented ids with Postgres.

The serial datatype 🔗

Since Postgres 8.2 (released in 2006), Postgres added serial data types. As Postgres documentation states, it will do the same thing we did before but you will do that with only one word. (see https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-SERIAL.

laetitia=# create table test (id serial primary key, value text);
CREATE TABLE
laetitia=# insert into test (value) values ('blabla');
INSERT 0 1
laetitia=# select * from test;
 id | value
----+--------
  1 | blabla
(1 row)

If you look at the table, you’ll find out a sequence has been created and is used as the default value for the id column:

laetitia=# \d test
                            Table "public.test"
 Column |  Type   | Collation | Nullable |             Default              
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | nextval('test_id_seq'::regclass)
 value  | text    |           |          | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
laetitia=# \ds
             List of relations
 Schema |    Name     |   Type   |  Owner   
--------+-------------+----------+----------
 public | test_id_seq | sequence | laetitia
(1 row)

The identity column 🔗

Sequences are standard SQL compliant. However, the serial data type is not. But SQL has a way to create auto-increment for a column that does not involve creating a sequence explicitly. It’s the generated as identity.

There are two flavors of that syntax, depending on what you want to do.

The first way to use this syntax is when you want to use the sequence value as the default, but allow to enter another value manually. In that case, the syntax is generated by default as identity. But this won’t prevent someone to insert a value without using the sequence:

laetitia=# create table test (id integer generated by default as identity primary key, value text);
CREATE TABLE
laetitia=# \d test
                            Table "public.test"
 Column |  Type   | Collation | Nullable |             Default              
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | generated by default as identity
 value  | text    |           |          | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

laetitia=# insert into test (value) values ('blabla');
INSERT 0 1
laetitia=# select * from test;
 id | value  
----+--------
  1 | blabla
(1 row)

laetitia=# insert into test (id, value) values (2,'blabla');
INSERT 0 1
laetitia=# select * from test;
 id | value  
----+--------
  1 | blabla
  2 | blabla
(2 rows)

laetitia=# insert into test (value) values ('blabla');
ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (id)=(2) already exists.

As you see, you can add a value manually, and then the insert order will error due to a duplicate key violation because the sequence is lagging behind the last number in the id column.

To prevent that to happen, we’ll create a constraint that will prevent anyone to insert manually data into that column. that’s the other generated as identity syntax.

laetitia=# create table test (id integer generated always as identity primary key, value text);
CREATE TABLE
laetitia=# \d test
                          Table "public.test"
 Column |  Type   | Collation | Nullable |           Default            
--------+---------+-----------+----------+------------------------------
 id     | integer |           | not null | generated always as identity
 value  | text    |           |          | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

laetitia=# insert into test (value) values ('blabla');
INSERT 0 1
laetitia=# insert into test (id, value) values (2,'blabla');
ERROR:  cannot insert a non-DEFAULT value into column "id"
DETAIL:  Column "id" is an identity column defined as GENERATED ALWAYS.
HINT:  Use OVERRIDING SYSTEM VALUE to override.

We can still check that it’s using a sequence under the hood:

laetitia=# \ds
Did not find any relations.
laetitia=# create table test (id integer generated always as identity primary key, value text);
CREATE TABLE
laetitia=# \ds
             List of relations
 Schema |    Name     |   Type   |  Owner   
--------+-------------+----------+----------
 public | test_id_seq | sequence | laetitia
(1 row)

To sum up, there are several ways to create auto-increment values for a column. I’d suggest using almost all the time the generated always as identity as this syntax will add a constraint to prevent any sequence desynchronization, due to manual inserts. Of course, should someone want to mess with the sequence behind the identity column, with the appropriate permission, they will find a way to do that.

Sequence Serial Identity column
Nextval automatically as default value No Yes Yes
Not null constraint No Yes Yes
Prevent manual insert No No With `always`