A Lion Bonks a Tree and More in the 2022 Comedy Wildlife Photo Awards

https://i.kinja-img.com/gawker-media/image/upload/c_fill,f_auto,fl_progressive,g_center,h_675,pg_1,q_80,w_1200/5c82ef56cac875ac6429401962bf72c4.jpg

One meerkat appears to strangle another in cold blood.
Photo: © Emmanuel Do Linh San / Comedywildlifephoto.com.

“I was following a group of meerkats on foot in the Kalahari Trails Game Reserve, in South Africa. Most individuals, including adults, were in a playful mood. It gave me a unique opportunity to capture very interesting and dynamic interactions between some members of the group. In the photo that I have selected, there is no aggression between individuals, but rather an interaction that reminds us of humans when one of your friends jokes about you and you pretend to strangle them and, in response, they open their mouth like a simpleton,” said photographer Emmanuel Do Linh San.

Gizmodo

Laravel 9 Model Events Every Developer Should Know

Posted by: Mahedi Hasan
Category: Laravel 9
Published: September 14, 2022

Hello artisan,

If you have worked with React, Vue, or WordPress, you will have noticed that they provide hooks for tapping into the framework’s lifecycle. Laravel has model events that are fired automatically when a model changes.

In this tutorial, I will share some knowledge of Laravel model events and how we can use them in our code. Assume you would like to trigger some logic whenever a model’s data changes; model events are exactly the tool for that situation.

There are many model events in Laravel 9, such as creating, created, updating, updated, saving, saved, deleting, deleted, restoring, and restored. Events allow you to easily execute code each time a specific model class is saved or updated in the database.

Take a brief look:

  • creating: called before a record is created.
  • created: called after a record is created.
  • updating: called before a record is updated.
  • updated: called after a record is updated.
  • deleting: called before a record is deleted.
  • deleted: called after a record is deleted.
  • retrieved: called after a record is retrieved from the database.
  • saving: called before a record is created or updated.
  • saved: called after a record is created or updated.
  • restoring: called before a soft-deleted record is restored.
  • restored: called after a soft-deleted record is restored.
  • replicating: called when a record is being replicated.

 


 

Let’s see an example of the use case of model events in Laravel:

namespace App\Models;
  
use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Database\Eloquent\Model;
use Log;
use Str;
   
class Product extends Model
{
    use HasFactory;

    public static function boot() {
        parent::boot();

        static::creating(function($item) {            
            Log::info('This event will call before creating data'); 

            // you can write any logic here, for example:
            $item->slug = Str::slug($item->name);
        });
  
        static::created(function($item) {           
            Log::info('This event will call after creating data'); 
        });
  
        static::updating(function($item) {            
            Log::info('This event will call before updating data'); 
        });
  
        static::updated(function($item) {  
            Log::info('This event will call after updating data'); 
        });

        static::deleted(function($item) {            
            Log::info('This event will call after deleting data'); 
        });
    }
}

 

You can also use the $dispatchesEvents property of a model to map a model event to your own event class, like:

protected $dispatchesEvents = [
   'saved' => \App\Events\TestEvent::class,
];

 

This TestEvent will be dispatched after the data is saved. You write your logic inside the TestEvent class.

 

Read also: How to Use Laravel Model Events

 

Hope it can help you.

 

Laravel News Links

Require Signatures and Associate Them With Eloquent Models

https://laravelnews.imgix.net/images/laravel-signature-pad.jpg?ixlib=php-3.3.1

Laravel Sign Pad is a package that lets you require a signature associated with an Eloquent model and optionally generate certified PDFs.

This package works by using a RequiresSignature trait (provided by the package) on an Eloquent model you want to associate with a signature. Taken from the readme, that might look like the following:

namespace App\Models;

use Creagia\LaravelSignPad\Concerns\RequiresSignature;
use Creagia\LaravelSignPad\Contracts\CanBeSigned;
use Illuminate\Database\Eloquent\Model;

class Delivery extends Model implements CanBeSigned
{
    use RequiresSignature;

    // ...
}

You can also generate PDF documents with the signature by implementing the package’s ShouldGenerateSignatureDocument interface. You can use a blade file or a PDF file as the basis for the signature document template. See the readme for details.

Once you have the package set up, a blade component provides the needed HTML to render the signature pad (shown with customizations):

<x-creagia-signature-pad
    border-color="#eaeaea"
    pad-classes="rounded-xl border-2"
    button-classes="bg-gray-100 px-4 py-2 rounded-xl mt-4"
    clear-name="Clear"
    submit-name="Submit"
/>

Once you’ve collected signatures from users, you can access the signature image and document on the model like so:

// Get the signature image path
$myModel->signature->getSignatureImagePath();

// Get the signed document path
$myModel->signature->getSignedDocumentPath();

This package also supports certifying the PDF, and instructions are provided in the readme. You can learn more about this package, get full installation instructions, and view the source code on GitHub.


This package was submitted to our Laravel News Links section. Links is a place the community can post packages and tutorials around the Laravel ecosystem. Follow along on Twitter @LaravelLinks

Laravel News

Learn how to upload files in Laravel like a Pro

https://laravelnews.imgix.net/images/uploading-files-like-a-pro.png?ixlib=php-3.3.1

One of the things that I see many people struggling with is file uploads. How do we upload a file in Laravel? What is the best way to upload a file? In this tutorial, I will walk through from a basic version using blade and routes, getting more advanced, and then how we might handle it in Livewire too.

To begin with, let’s look at how we might do this in standard Laravel and Blade. There are a few packages that you can use for this – however, I am not a fan of installing a package for something as simple as uploading a file. However, suppose you want to upload a file and associate it with a model and have different media collections for your model. In that case, Spatie has a great package called MediaLibrary and MediaLibrary Pro, which takes a lot of the hassle out of this process for you.

Let’s assume we want to do this ourselves and not lean on a package for this. We will want to create a form that allows us to upload a file and submit it – and a controller will accept this form, validate the input and handle the upload.

Before that, however, let us create a database table in which we can store our uploads. Imagine a scenario where we want to upload and attach files to different models. We wish to have a centralized table for our media that we can attach instead of uploading multiple versions for each model.

Let’s create this first using the following artisan command:

php artisan make:model Media -m

This will create the model and migration for us to get started. Let’s take a look at the up method on the migration so we can understand what we will want to store:

Schema::create('media', function (Blueprint $table) {
    $table->id();

    $table->string('name');
    $table->string('file_name');
    $table->string('mime_type');
    $table->string('path');
    $table->string('disk')->default('local');
    $table->string('file_hash', 64)->unique();
    $table->string('collection')->nullable();

    $table->unsignedBigInteger('size');

    $table->timestamps();
});

Our media will require a name so that we can pull the client’s original name from the upload. We then want a file name, which will be a generated name. Storing uploaded files under their original file names can be a significant security issue, especially if you are not validating strongly enough. The mime type is then required so that we can understand what was uploaded, whether it was a CSV or an image. The path to the upload is also handy to store, as it allows us to reference it more easily. We record the disk we are storing this on so that we can dynamically work with it within Laravel, however we might be interacting with our application. We store the file hash as a unique column to ensure we do not upload the same file more than once; if the file changes, that would be a new variation and OK to upload again. Finally, we have collection and size: collection lets us save a file to a group such as “blog posts”, creating a virtual directory/taxonomy structure, while size is there mainly for informational purposes but will allow you to ensure that your digital assets aren’t too large.

Now we know where we want to store these uploads, we can look into how we want to upload them. We will start with a simple implementation inside a route/controller and expand from there.

Let’s create our first controller using the following artisan command:

php artisan make:controller UploadController --invokable

This will be where we route uploads to, for now, an invokable controller that will synchronously handle the file upload. Add this as a route in your web.php like so:

Route::post('upload', App\Http\Controllers\UploadController::class)->name('upload');

Then we can look at how we want this process to work. To start with, like all other endpoints – we want to validate the input early on. I like to do this in a form request as it keeps things encapsulated nicely. You can do this part however you feel appropriate; I will show you the rules below I use:

use Illuminate\Validation\Rules\File;

return [
    'file' => [
        'required',
        File::types(['png', 'jpg'])
            ->max(5 * 1024),
    ],
];

So we must send a file across in our request, and it must be either a PNG or JPG and be no bigger than 5MB (5 * 1024 kilobytes). You can use configuration to store your default rules for this if you find it more accessible. However, I usually create a Validator class specific for each use case, for example:

class UserUploadValidator
{
    public function avatars(): array
    {
        return [
            'required',
            File::types(['png', 'jpg'])
                ->max(5 * 1024),
        ];
    }
}

Once you have your validation in place, you can handle this in your controller however you need to. Assume that I am using a Form Request and injecting it into my controller. Now that we have validated, we need to process the upload.

In an API, I process in the background, which usually means dispatching a job – but on the web, that isn’t always convenient. Let’s look at how we might process a file upload.

class UploadController
{
    public function __invoke(UploadRequest $request)
    {
        Gate::authorize('upload-files');

        $file = $request->file('file');
        $name = $file->hashName();

        // Store the upload under the generated name so we control the final path.
        Storage::putFileAs('avatars', $file, $name);

        Media::query()->create(
            attributes: [
                'name' => "{$name}",
                'file_name' => $file->getClientOriginalName(),
                'mime_type' => $file->getClientMimeType(),
                'path' => "avatars/{$name}",
                'disk' => config('app.uploads.disk'),
                'file_hash' => hash_file(
                    config('app.uploads.hash'),
                    Storage::path("avatars/{$name}"),
                ),
                'collection' => $request->get('collection'),
                'size' => $file->getSize(),
            ],
        );

        return redirect()->back();
    }
}

So what we are doing here is first ensuring that the logged-in user is authorized to upload files. Then we want to get the file uploaded and the hashed name to store. We then upload the file and store the record in the database, getting the information we need for the model from the file itself.

I would call this the default approach to uploading files, and I will admit there is nothing wrong with this approach. If your code looks something like this already, you are doing a good job. However, we can, of course, take this further – in a few different ways.

The first way we could achieve this is by extracting the upload logic to an UploadService, where it generates everything we need and returns a Data Transfer Object (which I call a Data Object) so that we can use the properties of the object to create the model. First, let us make the object we want to return.

class File
{
    public function __construct(
        public readonly string $name,
        public readonly string $originalName,
        public readonly string $mime,
        public readonly string $path,
        public readonly string $disk,
        public readonly string $hash,
        public readonly null|string $collection = null,
        public readonly int $size = 0,
    ) {}

    public function toArray(): array
    {
        return [
            'name' => $this->name,
            'file_name' => $this->originalName,
            'mime_type' => $this->mime,
            'path' => $this->path,
            'disk' => $this->disk,
            'file_hash' => $this->hash,
            'collection' => $this->collection,
            'size' => $this->size,
        ];
    }
}

Now we can look at the upload service and figure out how we want it to work. If we look at the logic within the controller, we know we will want to generate a new name for the file and get the upload’s original name. Then we want to put the file into storage and return the Data Object. As with most code I write, the service should implement an interface that we can then bind to the container.

class UploadService implements UploadServiceContract
{
    public function avatar(UploadedFile $file): File
    {
        $name = $file->hashName();

        // Store the upload as avatars/{$name} on the default disk.
        $path = Storage::putFileAs('avatars', $file, $name);

        return new File(
            name: "{$name}",
            originalName: $file->getClientOriginalName(),
            mime: $file->getClientMimeType(),
            path: $path,
            disk: config('app.uploads.disk'),
            hash: hash_file(
                config('app.uploads.hash'),
                Storage::path($path),
            ),
            collection: 'avatars',
            size: $file->getSize(),
        );
    }
}

Let us refactor our UploadController now so that it is using this new service:

class UploadController
{
    public function __construct(
        private readonly UploadServiceContract $service,
    ) {}

    public function __invoke(UploadRequest $request)
    {
        Gate::authorize('upload-files');

        $file = $this->service->avatar(
            file: $request->file('file'),
        );

        Media::query()->create(
            attributes: $file->toArray(),
        );

        return redirect()->back();
    }
}

Suddenly our controller is a lot cleaner, and our logic has been extracted to our new service – so it is repeatable no matter where we need to upload a file. We can, of course, write tests for this, too, because why do anything you cannot test?

it('can upload an avatar', function () {
    Storage::fake(); // fake the default disk the service writes to

    $file = UploadedFile::fake()->image('avatar.jpg');

    post(
        action(UploadController::class),
        [
            'file' => $file,
        ],
    )->assertRedirect();

    Storage::assertExists("avatars/{$file->hashName()}");
});

We are faking the storage facade, creating a fake file to upload, and then hitting our endpoint and sending the file. We then asserted that everything went ok, and we were redirected. Finally, we want to assert that the file now exists on our disk.

How could we take this further? This is where we are getting into the nitty gritty, depending on your application. Let’s say, for example, that in your application, there are many different types of uploads that you might need to do. We want our upload service to reflect that without getting too complicated, right? This is where I use a pattern I call the “service action pattern”, where our service calls an action instead of handling the logic. This pattern allows you to inject a single service but call multiple actions through it – keeping your code clean and focused, and your service is just a handy proxy.

Let us first create the action:

class UploadAvatar implements UploadContract
{
    public function handle(UploadedFile $file): File
    {
        $name = $file->hashName();

        $path = Storage::putFileAs('avatars', $file, $name);

        return new File(
            name: "{$name}",
            originalName: $file->getClientOriginalName(),
            mime: $file->getClientMimeType(),
            path: $path,
            disk: config('app.uploads.disk'),
            hash: hash_file(
                config('app.uploads.hash'),
                Storage::path($path),
            ),
            collection: 'avatars',
            size: $file->getSize(),
        );
    }
}

Now we can refactor our service to call the action, acting as a useful proxy.

class UploadService implements UploadServiceContract
{
    public function __construct(
        private readonly UploadContract $avatar,
    ) {}

    public function avatar(UploadedFile $file): File
    {
        return $this->avatar->handle(
            file: $file,
        );
    }
}

This may feel like over-engineering for a small application. Still, for more extensive media-focused applications, this will enable you to handle uploads through one service that can be well-documented, instead of fragmenting knowledge throughout your team.

Where can we take it from here? Let’s step into user-land for a moment and assume we are using the TALL stack (because why wouldn’t you!?). With Livewire, we have a slightly different approach where Livewire will handle the upload for you and store this as a temporary file, which gives you a somewhat different API to work with when it comes to storing the file.

Firstly we need to create a new Livewire component that we can use for our file upload. You can create this using the following artisan command:

php artisan livewire:make UploadForm --test

Now we can add a few properties to our component and add a trait so that the component knows it handles file uploads.

final class UploadForm extends Component
{
    use WithFileUploads;

    public null|string|TemporaryUploadedFile $file = null;

    public function upload()
    {
        $this->validate();
    }

    public function render(): View
    {
        return view('livewire.upload-form');
    }
}

Livewire comes with a handy trait that allows us to work with File uploads straightforwardly. We have a file property that could be null, a string for a path, or a Temporary File that has been uploaded. This is perhaps the one part about file uploads in Livewire that I do not like.

Now that we have a basic component available, we can look at moving the logic from our controller over to the component. One thing we would do here is to move the Gate check from the controller to the UI so that we do not display the form if the user cannot upload files. This simplifies our component logic nicely.

Our next step is injecting the UploadService into our upload method, which Livewire can resolve for us. Alongside this, we will want to handle our validation straight away. Our component should now look like the following:

final class UploadForm extends Component
{
    use WithFileUploads;

    public null|string|TemporaryUploadedFile $file = null;

    public function upload(UploadServiceContract $service)
    {
        $this->validate();
    }

    public function rules(): array
    {
        // Key the rules by the public property they validate.
        return ['file' => (new UserUploadValidator())->avatars()];
    }

    public function render(): View
    {
        return view('livewire.upload-form');
    }
}

Our validation rules method returns our avatar validation rules from our validation class, and we have injected the service from the container. Next, we can add our logic for actually uploading the file.

final class UploadForm extends Component
{
    use WithFileUploads;

    public null|string|TemporaryUploadedFile $file = null;

    public function upload(UploadServiceContract $service)
    {
        $this->validate();

        try {
            $file = $service->avatar(
                file: $this->file,
            );
        } catch (Throwable $exception) {
            // Report or handle the failure here before rethrowing.
            throw $exception;
        }

        Media::query()->create(
            attributes: $file->toArray(),
        );
    }

    public function rules(): array
    {
        return ['file' => (new UserUploadValidator())->avatars()];
    }

    public function render(): View
    {
        return view('livewire.upload-form');
    }
}

We need minimal changes to how our logic works – we can move it almost straight into place, and it will work.

This is how I find uploading files works for me; there are, of course, many ways to do this same thing – and some/most of them are a little simpler. It wouldn’t be a Steve tutorial if I didn’t go a little opinionated and overboard, right?

How do you like to handle file uploads? Have you found a way that works well for your use case? Let us know on Twitter!

Laravel News

Top Gun but with Cats

https://theawesomer.com/photos/2022/10/top_gun_with_cats_t.jpg

Top Gun but with Cats

Link

When the local pigeons wreak havoc on his hometown, Prince Michael and his furry pals decide to take action. After a failed attempt at a ground offensive, the cats (and dog) of Aaron’s Animals take to the skies in tiny fighter jets on a mission to defeat their avian, poop-bombing foes.

The Awesomer

MySQL Workbench Keys

https://blog.mclaughlinsoftware.com/wp-content/uploads/2022/10/lookup_erd.png

As I teach students how to create tables in MySQL Workbench, it’s always important to review the meaning of the checkbox keys. Then, I need to remind them that, as covered in our prior discussion on normalization, every table requires a natural key. I explain that a natural key is a compound candidate key (made up of two or more column values), and that it naturally defines uniqueness for each row in a table.

Then, we discuss surrogate keys, which are typically ID column keys. I explain that surrogate keys are driven by sequences in the database. While a number of databases disclose the name of sequences, MySQL treats the sequence as an attribute of the table. In Object-Oriented Analysis and Design (OOAD), that makes the sequence a member of the table by composition rather than aggregation. Surrogate keys are also unique in the table but should never be used to determine uniqueness the way the natural key does. Surrogate keys are also candidate keys, just as a VIN uniquely identifies a vehicle.

In a well-designed table you always have two candidate keys: one describes the unique row and the other assigns a number to it. While you can perform joins using either candidate key, you should always use the surrogate key in join statements. This means you elect, or choose, the surrogate candidate key as the primary key. Then, you build a unique index on the natural key, which lets you query any unique row with human-decipherable words.
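
To make this concrete, here is a minimal sketch of the two candidate keys in one table. The system_user table and its columns are hypothetical; only the pattern matters: an unsigned auto-incrementing surrogate key elected as the primary key, and a natural key enforced by a unique index.

CREATE TABLE system_user
( system_user_id     INT UNSIGNED  NOT NULL AUTO_INCREMENT   -- surrogate key, elected primary key
, system_user_name   VARCHAR(30)   NOT NULL                  -- natural key, part 1
, system_user_group  VARCHAR(30)   NOT NULL                  -- natural key, part 2
, PRIMARY KEY (system_user_id)
, UNIQUE INDEX system_user_uq (system_user_name, system_user_group));

Joins between tables would use system_user_id, while human-readable lookups go through the unique index on the natural key.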

The column attribute table for MySQL Workbench is:

Key: Meaning
PK: Designates a primary key column.
NN: Designates a not-null column constraint.
UQ: Designates a column that contains a unique value for every row.
BIN: Designates a VARCHAR data type column whose values are stored in a case-sensitive fashion. You can’t apply this constraint to other data types.
UN: Designates a column that contains an unsigned numeric data type. The possible values range from 0 to the maximum of the data type, such as integer, float, or double. The value 0 isn’t possible when you also select the PK and AI checkboxes, which cause the column to auto-increment up to the maximum value of the data type.
ZF: Designates zero-fill, which pads a number data type with leading zeros until all space is consumed, acting like a left-pad function with zeros.
AI: Designates AUTO_INCREMENT and should only be checked for a surrogate primary key column.

All surrogate key columns should have the PK, NN, UN, and AI checkboxes checked. The default behavior checks only the PK and NN checkboxes and leaves the UN and AI boxes unchecked, so you need to check the UN and AI boxes yourself for every surrogate key column. The AI checkbox enables AUTO_INCREMENT behavior, and the UN checkbox ensures you get the maximum range of integers before you would have to migrate the table to a double-precision number.

Active tables grow quickly, and using a signed int means you run out of rows more quickly. This is an important design consideration because starting with a signed int adds a maintenance task later. The maintenance task requires changing the data type of all dependent foreign key columns before changing the primary key column’s data type. Assuming your design uses referential integrity constraints, implemented as foreign keys, you will need to:

  • Remove any foreign key constraints before changing the referenced primary key and dependent foreign key column’s data types.
  • Change the primary and foreign key column’s data types.
  • Add back foreign key constraints after changing the referenced primary key and dependent foreign key column’s data types.

While fixing a less optimal design is a relatively simple scripting exercise for most data engineers, you can avoid this maintenance task entirely by implementing all surrogate primary key columns and foreign key columns with an unsigned int as their initial data type.
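
For reference, here is a sketch of that maintenance script, assuming a hypothetical invoice table whose customer_id column references a customer table that was originally built with signed keys:

-- 1. Remove the foreign key constraint that depends on the primary key.
ALTER TABLE invoice DROP FOREIGN KEY invoice_customer_fk;

-- 2. Change the primary key and foreign key column data types.
ALTER TABLE customer MODIFY customer_id INT UNSIGNED NOT NULL AUTO_INCREMENT;
ALTER TABLE invoice  MODIFY customer_id INT UNSIGNED NOT NULL;

-- 3. Add the foreign key constraint back.
ALTER TABLE invoice
  ADD CONSTRAINT invoice_customer_fk FOREIGN KEY (customer_id)
  REFERENCES customer (customer_id);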

The following small ERD displays a multi-language lookup table, which is preferable to a monolingual enum data type:

A design uses a lookup table when there are known lists of selections to make, and such lists occur in most if not all business applications. Maintaining those lists of values is an application setup task that requires the development team to build an entry and update form to input and maintain the lists.

Some MySQL examples demonstrate these types of lists by using the MySQL enum data type. However, the enum type doesn’t support multilingual implementations, isn’t readily portable to other relational databases, and has a number of limitations.

A lookup table is a better solution than an enum data type. It typically follows this pattern:

  • Identify the target table and column where a list is useful. Use the table_name and column_name columns as a super key to identify the location where the list belongs.
  • Identify a unique type identifier for the list. Store the unique type value in the type column of the lookup table.
  • Use a lang column to enable multilingual lists.

The combination of the table_name, column_name, type, and lang columns lets you identify unique sets. You can find a monolingual implementation in two of my older blog posts.
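
A sketch of the lookup table itself may help; the column sizes and the referenced language table are illustrative, not taken from the original ERD:

CREATE TABLE lookup
( lookup_id    INT UNSIGNED  NOT NULL AUTO_INCREMENT   -- surrogate key
, table_name   VARCHAR(64)   NOT NULL                  -- target table for the list
, column_name  VARCHAR(64)   NOT NULL                  -- target column for the list
, type         VARCHAR(30)   NOT NULL                  -- unique type identifier within the list
, lang         VARCHAR(30)   NOT NULL                  -- language of the meaning value
, meaning      VARCHAR(30)   NOT NULL                  -- display value for HTML forms
, PRIMARY KEY (lookup_id)
, UNIQUE INDEX lookup_uq (table_name, column_name, type, lang)      -- natural key
, CONSTRAINT lookup_fk FOREIGN KEY (lang) REFERENCES language (lang));  -- copies the human-friendly lang value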

The column view of the lookup table shows the appropriate design checkboxes:

While most foreign keys use copies of surrogate keys, there are instances when you copy the natural key value from another table rather than the surrogate key. This is done when your application will frequently query the dependent lookup table without a join to the lang table, which means the foreign key value should be a human-friendly value that also works as a super key.

A super key is a column or set of columns that uniquely identifies a row in the scope of a relation. For this example, the lang column identifies rows that belong to a language in a multilingual data model. Belonging to a language is the relation between the lookup and language tables. It is also a key when filtering rows with a specific lang value from the lookup table.

You navigate to the foreign key tab to create a lookup_fk foreign key constraint, like:

With this type of foreign key constraint, you copy the lang value from the language table when inserting the lookup table values. Then, your HTML forms can use the lookup table’s meaning column in any of the supported languages, like:

SELECT lookup_id
,      type
,      meaning
FROM   lookup
WHERE  table_name = 'some_table_name'
AND    column_name = 'some_column_name'
AND    lang = 'some_lang_name';

The type column value isn’t used in the WHERE clause to filter the data set because it is unique within the relation of the table_name, column_name, and lang column values. It is always non-unique when you exclude the lang column value, and potentially non-unique for another combination of the table_name and column_name column values.

If I’ve left any questions unanswered, let me know. Otherwise, I hope this helps explain a best design practice.

Planet MySQL

Analyzing queries in MySQL Database Service – Slow Query Log (part 1)

https://i0.wp.com/lefred.be/wp-content/uploads/2022/10/fn_workflow.png?w=758&ssl=1

In my previous post, I explained how to deal with Performance_Schema and Sys to identify the candidates for Query Optimization but also to understand the workload on the database.

In this article, we will see how we can create an OCI Fn Application that will generate a slow query log from our MySQL Database Service instance and store it to Object Storage.

The creation of the function and its use is similar to the one explained in the previous post about creating a logical dump of a MySQL instance to Object Storage.

The Application

We need to create an application and 2 functions, one to extract the data in JSON format and one in plain text. We also need to deploy an API Gateway that will allow us to call those functions from anywhere (publicly):

Let’s start by creating the application in OCI console:

We need to use the Public Subnet of our VCN to be able to reach it later from our API Gateway:

After we click on Create, we can see our new Application created:

We then need to follow the statement displayed on the rest of the page. We use Cloud Shell:

This looks like this:

fdescamp@cloudshell:~ (us-ashburn-1)$ fn update context registry iad.ocir.io/i***********j/lefred
Current context updated registry with iad.ocir.io/i***********j/lefred
fdescamp@cloudshell:~ (us-ashburn-1)$ docker login -u 'idinfdw2eouj/fdescamp' iad.ocir.io
Password: **********
WARNING! Your password will be stored unencrypted in /home/fdescamp/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
fdescamp@cloudshell:~ (us-ashburn-1)$ fn list apps
NAME            ID
slow_query_log  ocid1.fnapp.oc1.iad.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxq

After that in Cloud Shell, we initialize our two new functions:

fdescamp@cloudshell:~ (us-ashburn-1)$ fn init --runtime python mysql_slowlog_txt
Creating function at: ./mysql_slowlog_txt
Function boilerplate generated.
func.yaml created.
fdescamp@cloudshell:~ (us-ashburn-1)$ fn init --runtime python mysql_slowlog
Creating function at: ./mysql_slowlog
Function boilerplate generated.
func.yaml created.

Both functions are initialized; we will start with the one dumping the queries in JSON format.

Function mysql_slowlog

As this is the first function of our application, we will define a Dockerfile and the requirements in a file (requirements.txt) in the folder of the function:

fdescamp@cloudshell:~ (us-ashburn-1)$ cd mysql_slowlog
fdescamp@cloudshell:mysql_slowlog (us-ashburn-1)$ ls
Dockerfile  func.py  func.yaml  requirements.txt

We need to add the following content in the Dockerfile:

FROM fnproject/python:3.9-dev as build-stage
WORKDIR /function
ADD requirements.txt /function/

RUN pip3 install --target /python/  --no-cache --no-cache-dir -r requirements.txt && rm -fr ~/.cache/pip /tmp*  requirements.txt func.yaml Dockerfile .venv && chmod -R o+r /python

ADD . /function/

RUN rm -fr /function/.pip_cache

FROM fnproject/python:3.9
WORKDIR /function

COPY --from=build-stage /python /python
COPY --from=build-stage /function /function

RUN chmod -R o+r /function && mkdir -p /home/fn && chown fn /home/fn

ENV PYTHONPATH=/function:/python
ENTRYPOINT ["/python/bin/fdk", "/function/func.py", "handler"]

The requirements.txt file needs to contain the following lines:

fdk>=0.1.48
oci
mysql-connector-python

We also need to modify the content of func.yaml file to increase the memory to 2048:

memory: 2048

All the magic of the function resides in the Python file func.py.

Modify the content of the file with the code of the file linked above.

Once done, we can deploy the function:

fdescamp@cloudshell:mysql_slowlog (us-ashburn-1)$ fn -v deploy --app slow_query_log
Deploying mysql_slowlog to app: slow_query_log
Bumped to version 0.0.1
Using Container engine docker
Building image iad.ocir.io/i**********j/lefred/mysql_slowlog:0.0.1
Dockerfile content
-----------------------------------
FROM fnproject/python:3.9-dev as build-stage
WORKDIR /function
ADD requirements.txt /function/

RUN pip3 install --target /python/  --no-cache --no-cache-dir -r requirements.txt && rm -fr ~/.cache/pip /tmp*  requirements.txt func.yaml Dockerfile .venv && chmod -R o+r /python

ADD . /function/

RUN rm -fr /function/.pip_cache

FROM fnproject/python:3.9
WORKDIR /function

COPY --from=build-stage /python /python
COPY --from=build-stage /function /function

RUN chmod -R o+r /function && mkdir -p /home/fn && chown fn /home/fn

ENV PYTHONPATH=/function:/python
ENTRYPOINT ["/python/bin/fdk", "/function/func.py", "handler"]


-----------------------------------
FN_REGISTRY:  iad.ocir.io/i**********j/lefred
Current Context:  us-ashburn-1
Sending build context to Docker daemon  9.728kB
Step 1/13 : FROM fnproject/python:3.9-dev as build-stage
 ---> 808c3fde4a95
Step 2/13 : WORKDIR /function
 ---> Using cache
 ---> 7953c328cf0e
Step 3/13 : ADD requirements.txt /function/
 ---> Using cache
 ---> 5d44308f3376
Step 4/13 : RUN pip3 install --target /python/  --no-cache --no-cache-dir -r requirements.txt && rm -fr ~/.cache/pip /tmp*  requirements.txt func.yaml Dockerfile .venv && chmod -R o+r /python
 ---> Using cache
 ---> 608ec9527aca
Step 5/13 : ADD . /function/
 ---> ae85dfe7245e
Step 6/13 : RUN rm -fr /function/.pip_cache
 ---> Running in 60421dfa5e4d
Removing intermediate container 60421dfa5e4d
 ---> 06de6b9b1860
Step 7/13 : FROM fnproject/python:3.9
 ---> d6c82f055722
Step 8/13 : WORKDIR /function
 ---> Using cache
 ---> b6bf41dd40e4
Step 9/13 : COPY --from=build-stage /python /python
 ---> Using cache
 ---> c895f3bb74f7
Step 10/13 : COPY --from=build-stage /function /function
 ---> b397ec7769a1
Step 11/13 : RUN chmod -R o+r /function && mkdir -p /home/fn && chown fn /home/fn
 ---> Running in 5af6a775d055
Removing intermediate container 5af6a775d055
 ---> fac578e4290a
Step 12/13 : ENV PYTHONPATH=/function:/python
 ---> Running in fe0bb2f24d6e
Removing intermediate container fe0bb2f24d6e
 ---> c0460b0ca6f9
Step 13/13 : ENTRYPOINT ["/python/bin/fdk", "/function/func.py", "handler"]
 ---> Running in 0ed370d1b391
Removing intermediate container 0ed370d1b391
 ---> 6907b3653dac
Successfully built 6907b3653dac
Successfully tagged iad.ocir.io/i************j/lefred/mysql_slowlog:0.0.1

Parts:  [iad.ocir.io i*************j lefred mysql_slowlog:0.0.1]
Using Container engine docker to push
Pushing iad.ocir.io/i********j/lefred/mysql_slowlog:0.0.1 to docker registry...The push refers to repository [iad.ocir.io/i**********j/lefred/mysql_slowlog]
50019643244c: Pushed 
b2b65f9f6bdd: Pushed 
4ae76999236e: Layer already exists 
9dbf415302a5: Layer already exists 
fcc297df3f46: Layer already exists 
79b7117c006c: Layer already exists 
05dc728e5e49: Layer already exists 
0.0.1: digest: sha256:e0a693993c7470557fac557cba9a2a4d3e828fc2d21789afb7ebe6163f4d4c14 size: 1781
Updating function mysql_slowlog using image iad.ocir.io/i**********j/lefred/mysql_slowlog:0.0.1...

Function mysql_slowlog_txt

Now we go into the ../mysql_slowlog_txt directory and modify the func.yaml file to increase the memory to the same amount as the previous function (2048).

Then we copy the content of this file into func.py.

When done we can deploy that function too:

fdescamp@cloudshell:mysql_slowlog (us-ashburn-1)$ cd ../mysql_slowlog_txt/
fdescamp@cloudshell:mysql_slowlog_txt (us-ashburn-1)$ vi func.yaml 
fdescamp@cloudshell:mysql_slowlog_txt (us-ashburn-1)$ vi func.py 
fdescamp@cloudshell:mysql_slowlog_txt (us-ashburn-1)$ fn -v deploy --app slow_query_log

Variables

Our application requires some variables to work. Some will be sent every time the function is called, like which MySQL instance to use, the user’s credentials, which Object Storage bucket to use, … Others will be “hardcoded” so that we don’t have to specify them each time (like the tenancy, the OCI user, …).

We again use Cloud Shell to set the ones that won’t be specified each time:

fn config app slow_query_log oci_fingerprint "fe:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:3d"
fn config app slow_query_log oci_tenancy 'ocid1.tenancy.oc1..xxxxx'
fn config app slow_query_log oci_user "ocid1.user.oc1..xxxxxx"
fn config app slow_query_log namespace "i********j"
fn config app slow_query_log bucket "lefred-bucket"
fn config app slow_query_log oci_region "us-ashburn-1"

We also need to provide an OCI key as a string. The content of the string can be generated using base64 command line program:

And then we add it in Cloud Shell like this:

fn config app slow_query_log oci_key '<THE CONTENT OF THE STRING ABOVE>'

Testing

If we have the security list well configured (the Private Subnet accepting connections on port 3306 from the Public Subnet) and if we have an Object Storage bucket ready (with the name configured earlier), we can already test our functions directly from Cloud Shell:

fdescamp@cloudshell:~ (us-ashburn-1)$ echo -n '{"mds_host": "10.0.1.127",
 "mds_user": "admin", "mds_port": "3306", "mds_password": "Passw0rd!", 
 "mds_name": "lefred-mysql"}' | fn invoke slow_query_log mysql_slowlog
{"message": "MySQL Slow Log saved: slow_lefred-mysql_202210132114.json"}

fdescamp@cloudshell:~ (us-ashburn-1)$ echo -n '{"mds_host": "10.0.1.127",
 "mds_user": "admin", "mds_port": "3306", "mds_password": "Passw0rd!",
 "mds_name": "lefred-mysql"}' | fn invoke slow_query_log mysql_slowlog_txt
{"message": "MySQL Slow Log saved: slow_lefred-mysql_202210132124.log"}

We can see the files in Object Storage:

In the next article, we will see how to configure the API Gateway to call our application and store the statements on Object Storage.

And finally, we will see the content of those files and how to use them.

Stay tuned !

Planet MySQL

Using ClickHouse as an Analytic Extension for MySQL

https://www.percona.com/blog/wp-content/uploads/2022/10/altinity-6.png

MySQL is an outstanding database for online transaction processing. With suitable hardware, it is easy to execute more than 1M queries per second and handle tens of thousands of simultaneous connections. Many of the most demanding web applications on the planet are built on MySQL. With capabilities like that, why would MySQL users need anything else?

Well, analytic queries for starters. Analytic queries answer important business questions like finding the number of unique visitors to a website over time or figuring out how to increase online purchases. They scan large volumes of data and compute aggregates, including sums, averages, and much more complex calculations besides. The results are invaluable but can bog down online transaction processing on MySQL. 

Fortunately, there’s ClickHouse: a powerful analytic database that pairs well with MySQL. Altinity is working closely with our partner Percona to help users add ClickHouse easily to existing MySQL applications. You can read more about our partnership in our recent press release as well as about our joint MySQL-to-ClickHouse solution. 

This article provides tips on how to recognize when MySQL is overburdened with analytics and can benefit from ClickHouse’s unique capabilities. We then show three important patterns for integrating MySQL and ClickHouse. The result is more powerful, cost-efficient applications that leverage the strengths of both databases. 

Signs that indicate MySQL needs analytic help

Let’s start by digging into some obvious signs that your MySQL database is overburdened with analytics processing. 

Huge tables of immutable data mixed in with transaction tables 

Tables that drive analytics tend to be very large, rarely have updates, and may also have many columns. Typical examples are web access logs, marketing campaign events, and monitoring data. If you see a few outlandishly large tables of immutable data mixed with smaller, actively updated transaction processing tables, it’s a good sign your users may benefit from adding an analytic database. 

Complex aggregation pipelines

Analytic processing produces aggregates, which are numbers that summarize large datasets to help users identify patterns. Examples include unique site visitors per week, average page bounce rates, or counts of web traffic sources. MySQL may take minutes or even hours to compute such values. To improve performance it is common to add complex batch processes that precompute aggregates. If you see such aggregation pipelines, it is often an indication that adding an analytic database can reduce the labor of operating your application as well as deliver faster and more timely results for users. 
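
For illustration, this is the kind of aggregate such a pipeline typically precomputes; the web_events table and its columns are hypothetical:

-- Unique visitors per week: cheap on an analytic store, punishing on a busy row store.
SELECT YEARWEEK(event_time)        AS week,
       COUNT(DISTINCT visitor_id)  AS unique_visitors
FROM web_events
GROUP BY week
ORDER BY week;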

MySQL is too slow or inflexible to answer important business questions

A final clue is the in-depth questions you don’t ask about MySQL-based applications because it is too hard to get answers. Why don’t users complete purchases on eCommerce sites?  Which strategies for in-game promotions have the best payoff in multi-player games? Answering these questions directly from MySQL transaction data often requires substantial time and external programs. It’s sufficiently difficult that most users simply don’t bother. Coupling MySQL with a capable analytic database may be the answer. 

Why is ClickHouse a natural complement to MySQL? 

MySQL is an outstanding database for transaction processing. Yet the features of MySQL that make it work well–storing data in rows, single-threaded queries, and optimization for high concurrency–are exactly the opposite of those needed to run analytic queries that compute aggregates on large datasets.

ClickHouse on the other hand is designed from the ground up for analytic processing. It stores data in columns, has optimizations to minimize I/O, computes aggregates very efficiently, and parallelizes query processing. ClickHouse can answer complex analytic questions almost instantly in many cases, which allows users to sift through data quickly. Because ClickHouse calculates aggregates so efficiently, end users can pose questions in many ways without help from application designers. 

These are strong claims. To understand them it is helpful to look at how ClickHouse differs from MySQL. Here is a diagram that illustrates how each database pulls in data for a query that reads all values of three columns of a table. 

MySQL stores table data by rows. It must read the whole row to get data for just three columns. MySQL production systems also typically do not use compression, as it has performance downsides for transaction processing. Finally, MySQL uses a single thread for query processing and cannot parallelize work. 

By contrast, ClickHouse reads only the columns referenced in queries. Storing data in columns enables ClickHouse to compress data at levels that often exceed 90%. Finally, ClickHouse stores tables in parts and scans them in parallel.

The amount of data you read, how greatly it is compressed, and the ability to parallelize work make an enormous difference. Here’s a picture that illustrates the reduction in I/O for a query reading three columns.  

MySQL and ClickHouse give the same answer. To get it, MySQL reads 59 GB of data, whereas ClickHouse reads only 21 MB. That’s close to 3000 times less I/O, hence far less time to access the data. ClickHouse also parallelizes query execution very well, further improving performance. It is little wonder that analytic queries run hundreds or even thousands of times faster on ClickHouse than on MySQL. 

ClickHouse also has a rich set of features to run analytic queries quickly and efficiently. These include a large library of aggregation functions, the use of SIMD instructions where possible, the ability to read data from Kafka event streams, and efficient materialized views, just to name a few. 

There is a final ClickHouse strength: excellent integration with MySQL. Here are a few examples. 

  • ClickHouse can ingest mysqldump and CSV data directly into ClickHouse tables. 
  • ClickHouse can perform remote queries on MySQL tables, which provides another way to explore as well as ingest data quickly (see the sketch just after this list). 
  • The ClickHouse query dialect is similar to MySQL, including system commands like SHOW PROCESSLIST, for example. 
  • ClickHouse even supports the MySQL wire protocol on port 3306. 
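
For example, a remote query can use ClickHouse's mysql() table function; the host, schema, and credentials below are placeholders:

SELECT customer_id,
       count(*) AS rentals
FROM mysql('mydb:3306', 'sakila', 'rental', 'user', 'password')
GROUP BY customer_id
ORDER BY rentals DESC
LIMIT 10;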

For all of these reasons, ClickHouse is a natural choice to extend MySQL capabilities for analytic processing. 

Why is MySQL a natural complement to ClickHouse? 

Just as ClickHouse can add useful capabilities to MySQL, it is important to see that MySQL adds useful capabilities to ClickHouse. ClickHouse is outstanding for analytic processing but there are a number of things it does not do well. Here are some examples.  

  • Transaction processing – ClickHouse does not have full ACID transactions. You would not want to use ClickHouse to process online orders. MySQL does this very well. 
  • Rapid updates on single rows – Selecting all columns of a single row is inefficient in ClickHouse, as you must read many files. Updating a single row may require rewriting large amounts of data. You would not want to put eCommerce session data in ClickHouse. It is a standard use case for MySQL. 
  • Large numbers of concurrent queries – ClickHouse queries are designed to use as many resources as possible, not share them across many users. You would not want to use ClickHouse to hold metadata for microservices, but MySQL is commonly used for such purposes. 

In fact, MySQL and ClickHouse are highly complementary. Users get the most powerful applications when ClickHouse and MySQL are used together. 

Introducing ClickHouse to MySQL integration

There are three main ways to integrate MySQL data with ClickHouse analytic capabilities. They build on each other. 

  • View MySQL data from ClickHouse. MySQL data is queryable from ClickHouse using native ClickHouse SQL syntax. This is useful for exploring data as well as joining on data where MySQL is the system of record.
  • Move data permanently from MySQL to ClickHouse. ClickHouse becomes the system of record for that data. This unloads MySQL and gives better results for analytics. 
  • Mirror MySQL data in ClickHouse. Make a snapshot of data into ClickHouse and keep it updated using replication. This allows users to ask complex questions about transaction data without burdening transaction processing. 

Viewing MySQL data from ClickHouse

ClickHouse can run queries on MySQL data using the MySQL database engine, which makes MySQL data appear as local tables in ClickHouse. Enabling it is as simple as executing a single SQL command like the following on ClickHouse:

CREATE DATABASE sakila_from_mysql
ENGINE = MySQL('mydb:3306', 'sakila', 'user', 'password')

Here is a simple illustration of the MySQL database engine in action. 

The MySQL database engine makes it easy to explore MySQL tables and make copies of them in ClickHouse. ClickHouse queries on remote data may even run faster than in MySQL! This is because ClickHouse can sometimes parallelize queries even on remote data. It also offers more efficient aggregation once it has the data in hand. 
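
Here is a quick sketch of making such a copy, using the sakila_from_mysql database created above; the column list and types are a simplified subset chosen for illustration:

-- Create a native ClickHouse table...
CREATE TABLE rental_local
(
    rental_id    Int32,
    rental_date  DateTime,
    inventory_id Int32,
    customer_id  Int32
)
ENGINE = MergeTree
ORDER BY rental_id;

-- ...and fill it from the remote MySQL table.
INSERT INTO rental_local
SELECT rental_id, rental_date, inventory_id, customer_id
FROM sakila_from_mysql.rental;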

Moving MySQL data to ClickHouse

Migrating large tables with immutable records permanently to ClickHouse can give vastly accelerated analytic query performance while simultaneously unloading MySQL. The following diagram illustrates how to migrate a table containing web access logs from MySQL to ClickHouse. 

On the ClickHouse side, you’ll normally use MergeTree table engine or one of its variants such as ReplicatedMergeTree. MergeTree is the go-to engine for big data on ClickHouse. Here are three important features that will help you get the most out of ClickHouse. 

  1. Partitioning – MergeTree divides tables into parts using a partition key. Access logs and other big data tend to be time ordered, so it’s common to divide data by day, week, or month. For best performance, it’s advisable to pick a number that results in 1000 parts or less. 
  2. Ordering – MergeTree sorts rows and constructs an index on rows to match the ordering you choose. It’s important to pick a sort order that gives you large “runs” when scanning data. For instance, you could sort by tenant followed by time. That would mean a query on a tenant’s data would not need to jump around to find rows related to that tenant. 
  3. Compression and codecs – ClickHouse uses LZ4 compression by default but also offers ZSTD compression as well as codecs. Codecs reduce column data before turning it over to compression. A sketch of a table that uses all three features follows this list. 
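
Here is a sketch of an access-log table that uses all three features; the column list and codec choice are illustrative, not a recommendation for every workload:

CREATE TABLE access_log
(
    tenant_id   UInt32,
    ts          DateTime CODEC(Delta, ZSTD),   -- codec shrinks the column before compression
    url         String,
    status      UInt16,
    bytes_sent  UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)          -- monthly partitions keep the partition count manageable
ORDER BY (tenant_id, ts);          -- long runs per tenant when scanning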

These features can make an enormous difference in performance. We cover them and add more performance tips in Altinity videos as well as blog articles.

The ClickHouse MySQL database engine can also be very useful in this scenario. It enables ClickHouse to “see” and select data from remote transaction tables in MySQL. Your ClickHouse queries can join local tables on transaction data whose natural home is MySQL. Meanwhile, MySQL handles transactional changes efficiently and safely. 

Migrating tables to ClickHouse generally proceeds as follows. We’ll use the example of the access log shown above. 

  1. Create a matching schema for the access log data on ClickHouse. 
  2. Dump/load data from MySQL to ClickHouse using any of the following tools:
    1. Mydumper – A parallel dump/load utility that handles mysqldump and CSV formats. 
    2. MySQL Shell – A general-purpose utility for managing MySQL that can import and export tables. 
    3. Copy data using SELECT on MySQL database engine tables.
    4. Native database commands – Use MySQL SELECT ... INTO OUTFILE to dump data to CSV and read it back in using ClickHouse INSERT ... SELECT FROM file(), as sketched after this list. ClickHouse can even read mysqldump format.  
  3. Check performance with suitable queries; make adjustments to the schema and reload if necessary. 
  4. Adapt front end and access log ingest to ClickHouse. 
  5. Run both systems in parallel while testing. 
  6. Cut over from MySQL only to MySQL + ClickHouse Extension. 
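
As an example of the last dump/load option, here is a sketch of the CSV round trip for the access_log table sketched above; the paths are illustrative, and the CSV file is assumed to have been copied into ClickHouse's user_files directory:

-- On MySQL: dump the table to CSV.
SELECT * FROM access_log
INTO OUTFILE '/var/lib/mysql-files/access_log.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- On ClickHouse: read the CSV back in with the file() table function.
INSERT INTO access_log
SELECT *
FROM file('access_log.csv', 'CSV',
          'tenant_id UInt32, ts DateTime, url String, status UInt16, bytes_sent UInt64');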

Migration can take as little as a few days but it’s more common to take weeks to a couple of months in large systems. This helps ensure that everything is properly tested and the roll-out proceeds smoothly.  

Mirroring MySQL Data in ClickHouse

The other way to extend MySQL is to mirror the data in ClickHouse and keep it up to date using replication. Mirroring allows users to run complex analytic queries on transaction data without (a) changing MySQL and its applications or (b) affecting the performance of production systems. 

Here are the working parts of a mirroring setup. 

ClickHouse has a built-in way to handle mirroring: the experimental MaterializedMySQL database engine, which reads binlog records directly from the MySQL primary and propagates data into ClickHouse tables. The approach is simple but is not yet recommended for production use. It may eventually be important for 1-to-1 mirroring cases but needs additional work before it can be widely used. 

Altinity has developed a new approach to replication using Debezium, Kafka-compatible event streams, and the Altinity Sink Connector for ClickHouse. The mirroring configuration looks like the following. 

The externalized approach has a number of advantages. They include working with current ClickHouse releases, taking advantage of fast dump/load programs like mydumper or direct SELECT using MySQL database engine, support for mirroring into replicated tables, and simple procedures to add new tables or reset old ones. Finally, it can extend to multiple upstream MySQL systems replicating to a single ClickHouse cluster. 

ClickHouse can mirror data from MySQL thanks to the unique capabilities of the ReplacingMergeTree table. It has an efficient method of dealing with inserts, updates, and deletes that is ideally suited for use with replicated data. As mentioned already, ClickHouse cannot update individual rows easily, but it inserts data extremely quickly and has an efficient process for merging rows in the background. ReplacingMergeTree builds on these capabilities to handle changes to data in a “ClickHouse way.”  

Replicated table rows use version and sign columns to represent the version of changed rows as well as whether the change is an insert or delete. The ReplacingMergeTree will only keep the last version of a row, which may in fact be deleted. The sign column lets us apply another ClickHouse trick to make those deleted rows inaccessible. It’s called a row policy. Using row policies we can make any row where the sign column is negative disappear.  

Here’s an example of ReplacingMergeTree in action that combines the effect of the version and sign columns to handle mutable data. 
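
A minimal sketch of the idea, using a hypothetical orders_mirror table (an illustration, not the exact schema the Altinity Sink Connector generates); the ver and sign columns are maintained by the replication pipeline:

CREATE TABLE orders_mirror
(
    order_id  UInt64,
    status    String,
    amount    Decimal(10, 2),
    ver       UInt64,              -- increases with every change to the row
    sign      Int8                 -- +1 for inserts/updates, -1 for deletes
)
ENGINE = ReplacingMergeTree(ver)   -- keeps only the highest version of each order_id
ORDER BY order_id;

-- Row policy: hide rows whose latest version is a delete.
CREATE ROW POLICY orders_mirror_live ON orders_mirror
FOR SELECT USING sign > 0 TO ALL;

-- Until background merges catch up, FINAL collapses row versions at query time.
SELECT order_id, status, amount
FROM orders_mirror FINAL
WHERE status = 'shipped';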

Mirroring data into ClickHouse may appear more complex than migration but in fact is relatively straightforward because there is no need to change MySQL schema or applications and the ClickHouse schema generation follows a cookie-cutter pattern. The implementation process consists of the following steps. 

  1. Create schema for replicated tables in ClickHouse. 
  2. Configure and start replication from MySQL to ClickHouse. 
  3. Dump/load data from MySQL to ClickHouse using the same tools as migration. 

At this point, users are free to start running analytics or build additional applications on ClickHouse whilst changes replicate continuously from MySQL. 

Tooling improvements are on the way!

MySQL to ClickHouse migration is an area of active development, both at Altinity and in the ClickHouse community at large. Improvements fall into three general categories.  

Dump/load utilities – Altinity is working on a new utility to move data that reduces schema creation and data transfer to a single command. We will have more to say on this in a future blog article. 

Replication – Altinity is sponsoring the Sink Connector for ClickHouse, which automates high-speed replication, including monitoring as well as integration into Altinity.Cloud. Our goal is similarly to reduce replication setup to a single command. 

ReplacingMergeTree – Currently users must include the FINAL keyword on table names to force the merging of row changes. It is also necessary to add a row policy to make deleted rows disappear automatically.  There are pull requests in progress to add a MergeTree property to add FINAL automatically in queries as well as make deleted rows disappear without a row policy. Together they will make handling of replicated updates and deletes completely transparent to users. 

We are also watching carefully for improvements on MaterializedMySQL as well as other ways to integrate ClickHouse and MySQL efficiently. You can expect further blog articles in the future on these and related topics. Stay tuned!

Wrapping up and getting started

ClickHouse is a powerful addition to existing MySQL applications. Large tables with immutable data, complex aggregation pipelines, and unanswered questions on MySQL transactions are clear signs that integrating ClickHouse is the next step to provide fast, cost-efficient analytics to users. 

Depending on your application, it may make sense to mirror data onto ClickHouse using replication or even migrate some tables into ClickHouse. ClickHouse already integrates well with MySQL and better tooling is arriving quickly. Needless to say, all Altinity contributions in this area are open source, released under Apache 2.0 license. 

The most important lesson is to think in terms of MySQL and ClickHouse working together, not as one being a replacement for the other. Each database has unique and enduring strengths. The best applications will build on these to provide users with capabilities that are faster and more flexible than using either database alone. 

Percona, well-known experts in open source databases, partners with Altinity to deliver robust analytics for MySQL applications. If you would like to learn more about MySQL integration with ClickHouse, feel free to contact us or leave a message on our forum at any time. 

Percona Database Performance Blog