Generate Invisible Primary Key (GIPK) MySQL 8.0

https://lh3.googleusercontent.com/kzS5w9VmGnAz2f5H70FnI8TiFbepEQM4940WQlEPz5R6nWEZ8axWFMRPIGL4Z2pKnFTvadFdtG4KLYt5MXGzIoUZZZhjAPD5K7SPdB-9ZXHiPM9yzts2NrcKjoI0s6kJ2kAt05Mx1ix95ACq6LHOTOsibcOjAiiDxkVblglYOOetWRbD0SXdaGTJR63eug

The Primary key is like the hero of a row, which has more beneficial features in the table while performing any task on the table.

The DBA knows the importance of the primary key in the table and how to handle it.

  1. Notable features of having a primary key:
  2. Requirements:
  3. Enabling GIPK:
  4. Handling GIPK:
  5. Benchmarking
    1. Data loading :
  6. Limitations:
  7. Conclusion:

Notable features of having a primary key:

  1. Performing any online alter or archival 
  2. Faster replication (Row Based Replication)
  3. Table Partitioning.
  4. The primary key is mandatory in cluster environments (InnoDB Cluster / Galera / Xtradb Cluster).
  5. Better query performance 

From 8.0.30, no need to maintain a separate column for the primary key in the table. We have a feature of sql_generate_invisible_primary_key (GIPK), a dynamic global variable. We can enable it online without any downtime.

By enabling this variable, the primary key and invisible column will be automatically created in the table if any of the created tables have an absent primary key. The default name of the auto-generated column name is my_row_id.

The main advantage of this feature is that we can ease cluster migrations and faster replication synchronization. 

The structure will be replicated to the replicas only for the ROW-based replication.

Requirements:

Binlog format ROW
MySQL Version >= 8.0.30
Engine InnoDB

Enabling GIPK:

It is a global variable, we can enable it dynamically without any downtime. By default, it will be off.

+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| sql_generate_invisible_primary_key | OFF   |
+------------------------------------+-------+

mysql> set global sql_generate_invisible_primary_key=1;
Query OK, 0 rows affected (0.00 sec)

+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| sql_generate_invisible_primary_key | ON    |
+------------------------------------+-------+

Working with GIPK :

mysql> CREATE TABLE `gipk` (`name` varchar(50) DEFAULT NULL,  `number` int DEFAULT NULL ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.07 sec)

mysql>  show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

Post enabling the GPIK restart the connection to get it applied. 

In the above example, in the create statement I have mentioned two columns name and number. But MySQL has automatically created 1 more invisible primary key column named my_row_id.

We can make the column invisible or visible based on our use case. We just need to perform the alter statement to switch between invisible and visible columns.

mysql> ALTER TABLE gipk ALTER COLUMN my_row_id SET VISIBLE;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>  show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

Even though it is an auto-generated column, It will be visible in the show create a statement and in information_schema.columns as well.

mysql>  show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)
mysql> SELECT COLUMN_NAME,DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = "gipk";
+-------------+-----------+
| COLUMN_NAME | DATA_TYPE |
+-------------+-----------+
| my_row_id   | bigint    |
| name        | varchar   |
| number      | int       |
+-------------+-----------+
3 rows in set (0.00 sec)

By turning off show_gipk_in_create_table_and_information_schema, we may fully obscure it. By doing so, the column details will fully disappear from the show and create statements as well as from information schema.columns.

It is a dynamic variable, by default it will be on.

+--------------------------------------------------+-------+
| Variable_name                                    | Value |
+--------------------------------------------------+-------+
| show_gipk_in_create_table_and_information_schema | ON    |
+--------------------------------------------------+-------+

mysql> set global show_gipk_in_create_table_and_information_schema=0;
Query OK, 0 rows affected (0.00 sec)

+--------------------------------------------------+-------+
| Variable_name                                    | Value |
+--------------------------------------------------+-------+
| show_gipk_in_create_table_and_information_schema | OFF   |
+--------------------------------------------------+-------+
mysql> SELECT COLUMN_NAME,DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = "gipk";
+-------------+-----------+
| COLUMN_NAME | DATA_TYPE |
+-------------+-----------+
| name        | varchar   |
| number      | int       |
+-------------+-----------+
2 rows in set (0.00 sec)

mysql> show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

Now the column is completely invisible.

Handling GIPK:

We can’t change the column name when the column is in an invisible state. 

mysql> ALTER TABLE gipk RENAME COLUMN my_row_id to id;

ERROR 4110 (HY000): Altering generated invisible primary key column 'my_row_id' is not allowed.

To achieve this, first, we need to make the column visible and then we need to perform the rename column to change the column name based on our convenience. 

mysql>  show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

mysql> ALTER TABLE gipk RENAME COLUMN my_row_id to id;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  `number` int DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

Benchmarking

We have done a benchmark on the same to identify if there is any issue occurring post-enabling GIPK.

Table structure :

mysql> show create table gipk\G
*************************** 1. row ***************************
       Table: gipk
Create Table: CREATE TABLE `gipk` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `id` int unsigned NOT NULL,
  `k` int unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

1 row in set (0.01 sec)

mysql> show create table non_gipk\G
*************************** 1. row ***************************
       Table: non_gipk
Create Table: CREATE TABLE `non_gipk` (
  `id` int unsigned NOT NULL,
  `k` int unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=latin1

1 row in set (0.00 sec)

Data loading :

Table size :

+----------+----------+------------+
| Database | Table    | Size in GB |
+----------+----------+------------+
| mydbops  | non_gipk |      20.76 |
+----------+----------+------------+

+----------+-------+---------------+
| Database | Table | Table size GB |
+----------+-------+---------------+
| mydbops  | gipk  |         21.83 |
+----------+-------+---------------+

We have created tables with GIPK and without Primary key. I have used mysql random data load for loading data to tables. The surprise is, the time taken for data loading is the same with GIPK and without the Primary key. So there won’t be much latency when bulk loading is happening even if GIPK is enabled.

Full table scan :

mysql> select * from gipk order by id limit 1;
+----+------------+------------------------------------------------------------------------+------------------------------+
| id | k          | c                                                                      | pad                          |
+----+------------+------------------------------------------------------------------------+------------------------------+
|  9 | 1542554247 | fugit sapiente consectetur ab non repudiandae ducimus laboriosam quas! | dolore veritatis asperiores. |
+----+------------+------------------------------------------------------------------------+------------------------------+
1 row in set (2 min 56.14 sec)

mysql> select * from non_gipk order by id limit 1;
+----+------------+---------------------------------------+--------------------------------------+
| id | k          | c                                     | pad                                  |
+----+------------+---------------------------------------+--------------------------------------+
|  9 | 1542554247 | voluptas facere sed dolore iure nisi. | at ipsam id voluptatem et excepturi. |
+----+------------+---------------------------------------+--------------------------------------+
1 row in set (4 min 22.99 sec)

We have done the full table query execution on both with GIPK and without Primary key table, the performance improvement is there in the GIPK table. The time taken for the execution has been reduced by half. 

Online alter and archival :

For performing safer online alter and archival of data chunk by chunk, the percona toolkit plays a vital role. For percona toolkit operations ( pt-osc / archiver) , The basic requirement is the primary key. If there is no primary key on the table, the tool won’t work on that table.  

The advantage of enabling GPIK is, we will have the invisible primary key. By using that primary key, the Percona tool is able to perform like online alter or archival, etc.

[root@localhost mydbopslabs]# pt-archiver --source h=localhost,D=mydbops,t=non_gipk,u=root,p='*****' --where "1=1" --limit 5000 --progress 5000 --statistics --no-check-charset --commit-each --bulk-delete --purge --file '/home/mydbopslabs/non_gipk_%d_%m_%Y_%H_%m_%s.csv' --output-format=csv --dry-run
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Cannot find an ascendable index in table at /bin/pt-archiver line 3261.

[root@localhost mydbopslabs]# pt-archiver --source h=localhost,D=mydbops,t=gipk,u=root,p='******' --where "1=1" --limit 5000 --progress 5000 --statistics --no-check-charset --commit-each --bulk-delete --purge --file '/home/mydbopslabs/non_gipk_%d_%m_%Y_%H_%m_%s.csv' --output-format=csv --dry-run
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
/home/mydbopslabs/non_gipk_06_10_2022_02_10_05.csv
SELECT /*!40001 SQL_NO_CACHE */ `my_row_id`,`id`,`k`,`c`,`pad` FROM `mydbops`.`gipk` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`my_row_id` < '100000000') ORDER BY `my_row_id` LIMIT 5000
SELECT /*!40001 SQL_NO_CACHE */ `my_row_id`,`id`,`k`,`c`,`pad` FROM `mydbops`.`gipk` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`my_row_id` < '100000000') AND ((`my_row_id` >= ?)) ORDER BY `my_row_id` LIMIT 5000
DELETE FROM `mydbops`.`gipk` WHERE (((`my_row_id` >= ?))) AND (((`my_row_id` <= ?))) AND (1=1) LIMIT 5000

While performing archival on the Non-primary key table, the archival got failed, but it got succeeded on the GIPK table since it has an invisible primary key.

[root@localhost mydbopslabs]# pt-online-schema-change h=localhost,D=mydbops,t=non_gipk --user='root' --password='*****' --no-check-alter  --critical-load "Threads_running=900" --recursion-method=none --max-load  "Threads_running=1000" --no-check-plan --alter "engine=innodb" --dry-run
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
# A software update is available:
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Starting a dry run.  `mydbops`.`non_gipk` will not be altered.  Specify --execute instead of --dry-run to alter the table.
Creating new table...
Created new table mydbops._non_gipk_new OK.
Altering new table...
Altered `mydbops`.`_non_gipk_new` OK.
The new table `mydbops`.`_non_gipk_new` does not have a PRIMARY KEY or a unique index required for the DELETE trigger.
Please check you have at least one UNIQUE and NOT NULLABLE index.
2022-10-06T02:48:59 Dropping new table...
2022-10-06T02:48:59 Dropped new table OK.
Dry run complete.  `mydbops`.`non_gipk` was not altered.
[root@localhost mydbopslabs]# pt-online-schema-change h=localhost,D=mydbops,t=gipk --user='root' --password='*****' --no-check-alter  --critical-load "Threads_running=900" --recursion-method=none --max-load  "Threads_running=1000" --no-check-plan --alter "engine=innodb" --dry-run
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Starting a dry run.  `mydbops`.`gipk` will not be altered.  Specify --execute instead of --dry-run to alter the table.
Creating new table...
Created new table mydbops._gipk_new OK.
Altering new table...
Altered `mydbops`.`_gipk_new` OK.
Not creating triggers because this is a dry run.
Not copying rows because this is a dry run.
Not swapping tables because this is a dry run.
Not dropping old table because this is a dry run.
Not dropping triggers because this is a dry run.
2022-10-06T02:49:15 Dropping new table...
2022-10-06T02:49:15 Dropped new table OK.
Dry run complete. `mydbops`.`gipk` was not altered.

Online alter has failed on the Non-primary key table, and the archival failed, but it succeeded on the GIPK table since it has an invisible primary key.

Limitations:

  • GIPK will fail if the CREATE table statement has an auto-increment column. 
  • It supports only InnoDB Engine.
  • GIPK supports only row-based replication. 

Conclusion:

No more need to worry about creating and maintaining a primary key separately. GPIK also solves the problem in migration to Inn0DB Cluster, where the Primary key is mandatory. By enabling sql_generate_invisible_primary_key , we have an auto primary key in place now as a lifesaver. 

Planet MySQL