Insert Random Data into Tables Using mysql_random_data_load

In this short blog post, we'll show you how to use the mysql_random_data_load tool to insert random data into tables. This is a great aid in testing when you have empty tables and need them populated with data. We've all done it manually (the INSERT INTO … VALUES … way), but that is surely a waste of time, and it adds up quickly if you need to test tables with many columns or, even worse, with foreign keys.

Before saying anything else, we would like to mention that this tool is still under development, and that it will insert random data into your tables. Be careful when running it in a production environment!

mysql_random_data_load is a tool created by PerconaLabs.

Note: PerconaLabs and Percona-QA are open source GitHub repositories for unofficial scripts and tools created by Percona staff. While not covered by Percona support or services agreements, these handy utilities can help you save time and effort.

Now, back to the disclaimer-free zone, the tool in question is named mysql_random_data_load, and can currently be found in the PerconaLabs GitHub repository:

http://ift.tt/2CWxXtH

The README.md file already has a lot of information, so feel free to pause here and come back after you've skimmed through it. If you are eager to start using the tool, though, we give you the crash course below.

We offer two binaries in the releases tab, one for Darwin (macOS) and one for Linux. We will show examples of the linux_amd64 build here, but the steps are the same for the darwin_amd64 one.

First, get the latest build (0.1.6 as of this writing):

shell> cd ~/bin/
shell> wget http://ift.tt/2Dnxl15
shell> tar xzf mysql_random_data_loader_linux_amd64.tar.gz
shell> chmod +x mysql_random_data_loader

We will use the sakila database schema for the tests, so the second and final step is to download and install it.

shell> wget http://ift.tt/15xYnRs
shell> tar xzf sakila-db.tar.gz && rm -f sakila-db.tar.gz
shell> mysql < sakila-db/sakila-schema.sql
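
To confirm the schema loaded correctly, we can list the newly created tables; they should all be empty at this point, since we only loaded the schema file (this quick check is our own addition, not part of the tool's workflow):

mysql> SHOW FULL TABLES FROM sakila;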

Ok, we are now ready to test inserting some rows with random data! Let's try populating the actor table first. Since its primary key is a SMALLINT UNSIGNED, we can insert a maximum of 65535 rows.
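
If you want to double-check that limit before loading the data, the column type is easy to inspect (again, this verification step is just our own suggestion):

mysql> SHOW COLUMNS FROM sakila.actor LIKE 'actor_id';

With that confirmed, let's load the rows: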

shell> mysql_random_data_load sakila actor 65535 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password
2017/12/22 00:17:32 Starting
  4s [===============================================] 100%
Total rows inserted: 65535

Let’s see what the data looks like:

mysql> SELECT * FROM sakila.actor LIMIT 2\G
*************************** 1. row ***************************
   actor_id: 1
 first_name: dolores nulla sunt velit placeat minima adipi
  last_name: quo non similique enim inventore cupiditate.
last_update: 2017-01-08 13:06:44
*************************** 2. row ***************************
   actor_id: 2
 first_name: at commodi tenetur est maiores nobis accusamu
  last_name: quod suscipit provident est voluptatem quis t
last_update: 2017-03-07 00:05:01
2 rows in set (0.00 sec)
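
A simple count should match the "Total rows inserted" figure the tool reported (one more sanity check of our own):

mysql> SELECT COUNT(*) FROM sakila.actor;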

Adding data to tables with foreign keys

Now that we have the actor table filled with data, let’s try the foreign key functionality, which is one of the tool’s biggest strengths (since coming up with data that makes sense in this context is even more laborious).

If we look at the sakila structure, a good candidate for trying this is film_actor. Apart from the actor table, it also needs the film table to have data (due to FK constraints), which in turn needs the language table to have data. Easy enough!
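
If you want to verify that dependency chain yourself, the foreign keys involved are visible in information_schema; a query along these lines (our own sketch, not something the tool needs) lists them:

mysql> SELECT TABLE_NAME, COLUMN_NAME, REFERENCED_TABLE_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = 'sakila'
 AND TABLE_NAME IN ('film_actor', 'film')
 AND REFERENCED_TABLE_NAME IS NOT NULL;

Now, let's populate these tables in dependency order: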

shell> mysql_random_data_load sakila language 15 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password
2017/12/22 00:48:30 Starting
  0s [===============================================] 100%
Total rows inserted: 15
shell> mysql_random_data_load sakila film 10000 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password
2017/12/22 00:51:18 There are triggers on the film table that might affect this process:
2017/12/22 00:51:18 Trigger "ins_film", AFTER INSERT
[...output trimmed...]
2017/12/22 00:51:18 Trigger "upd_film", AFTER UPDATE
[...output trimmed...]
2017/12/22 00:51:18 Trigger "del_film", AFTER DELETE
[...output trimmed...]
2017/12/22 00:51:18 Starting
  1s [===============================================] 100%
Total rows inserted: 10000

In this last example, we already used the foreign key functionality without even modifying the command we used for “regular” tables. You can disregard the notes about the triggers in this case.
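
If you are curious about those triggers, you can also list them yourself before (or after) the load; the tool prints them as a heads-up either way:

mysql> SHOW TRIGGERS FROM sakila LIKE 'film';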

We are now ready to fill the film_actor table. Let's look at two new options: --bulk-size and --max-fk-samples. The former lets us create INSERT statements with as many rows as we want, which helps expedite the process when many rows are needed. The latter sets how many samples are taken from the film and actor tables to satisfy the FK constraints. Note that both options have defaults, so we can leave them out and the tool will still work.

shell> mysql_random_data_load sakila film_actor 20000 --bulk-size=5000 --max-fk-samples=8000 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password
2017/12/22 01:05:28 Starting
  1s [===============================================] 100%
Total rows inserted: 20000
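
Out of curiosity, we can also check how the FK sampling played out. Given --max-fk-samples=8000, we would expect at most 8000 distinct parent keys per referenced table to show up in film_actor (a hedged expectation based on the option's description):

mysql> SELECT COUNT(DISTINCT actor_id) AS actors_sampled,
 COUNT(DISTINCT film_id) AS films_sampled
FROM sakila.film_actor;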

A quick test shows that things went well:

mysql> SELECT a.last_name, f.title, f.special_features, f.rental_rate
FROM film_actor fa
JOIN actor a
 USING (actor_id)
JOIN film f
 USING (film_id)
LIMIT 1\G
*************************** 1. row ***************************
       last_name: enim sequi recusandae dolores voluptatem!
           title: quia provident nemo labore architecto totam.
special_features: Behind the Scenes
     rental_rate: 3.15
1 row in set (0.00 sec)

Specifying queries per second

Starting from version 0.1.6, there is a new experimental feature to insert rows at a specific rate: --qps <queries per second>. This option is useful when you need not only data, but data arriving at a specific rate (for example, to test a monitoring system like PMM). Since the total number of rows to be inserted is still in effect, this option makes more sense when --bulk-size=1 is used.
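
For example, an invocation along these lines (the table, row count, and rate are arbitrary choices of ours, and remember the feature is experimental) would trickle 100 single-row INSERTs into the language table at roughly 5 queries per second:

shell> mysql_random_data_load sakila language 100 --bulk-size=1 --qps=5 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password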

Summary

As we have seen, it is really easy to insert random data into tables so you can later use them in your tests. There is still more work to do on the tool, but we have found it to be of great help already. We would love to hear back from you, either here or, even better, directly via the GitHub issues page. If there is any use case or functionality you can think of, let us know!
