MySQL or PostgreSQL: Which is Better?

https://www.percona.com/blog/wp-content/uploads/2023/06/MySQL-or-PostgreSQL-150x150.png

For more than a quarter of a century, people have been discussing “Which is better, MySQL or PostgreSQL?” — with no resolution. When people ask me which is better, I have to ask them what they want to do and how they want to do it. 

I’ll explain using a bad analogy: 

What type of car is best? This depends on your needs. If you want to go fast, a top fuel dragster will set you back close to a million dollars by the time you buy the chassis, spare engines, tooling, and transporter. That is for a car that goes a quarter-mile at a time. If you want fast, it will do more than 300 miles per hour for about 1,000 feet. But you cannot parallel park it or use it to run down to the store for some chips and beer.

A small economy car is better for that store run and is operated at a fraction of the cost of the dragster. You do not need to wear a flame-resistant suit to drive it or repack the parachutes each time you want to stop after a trip. For most purposes, this is enough of a car for most people. If you are a drag racer, that car will not be competitive. 

Both MySQL and PostgreSQL do the basics very well

From a high level, one relational database management system is pretty much like every other relational database management system. Pick either PostgreSQL or MySQL, and you can be happy, leading a fulfilling and satisfying life full of joy. Both store data more than adequately and will do at least eighty percent of what you want to do with ease.

However, if you have certain data processing needs, budget requirements, limitations in the skill level of support staff, or infrastructure issues, then you need to be a little pickier.  

MySQL was criticized for years for doing dumb things with data, such as allowing bad calendar dates and truncating data without warning, along with other idiosyncrasies that its users learned to avoid. These problems have been rectified, but the old reputation lives on in the annals of the Internet and the memories of critics. MySQL still does not follow the SQL Standard as closely as other databases, and some useful statements, such as MERGE, are missing.

PostgreSQL has enjoyed a reputation of being the open source database closest to the SQL Standard, but close might not be good enough for you. When you look at the multi-version concurrency control (MVCC) design that PostgreSQL uses and compare it to the designs of other databases, it looks chaotic. Dealing with dead tuples can be tricky but is much improved with automatic vacuuming, as long as the automated process runs well. And frankly, index bloat seems like something that should have been fixed years ago.

Neither is perfect, and each has its peccadillos that must be honored. Those will be covered a little later herein, but first, in honest fashion, you should ask yourself: “What do I need in a database?”

Determining whether MySQL or PostgreSQL better suits your needs

The vast majority of databases being used today do not get close to using all of their capabilities. Most of the work is CRUD — Create, Read (SELECT), Update, and Delete queries — which will probably never scratch the surface of the advanced features found in that database. Both of the databases being considered here do this type of work exceedingly well.

If you need a feature found in one but not the other, such as JSON_TABLE() in MySQL or the MERGE statement in PostgreSQL, then your choice has been made for you, maybe. JSON_TABLE() may make it into PG 17 in 2024 after being dropped at the last moment from PG 15. And if you process a lot of JSON-formatted data, PostgreSQL gives you two types to choose from, JSON and JSONB, each with its quirks, while MySQL has a single JSON data type.

The question becomes this: Is either database good enough for your needs? And these: Does it have the functions you need? Window Functions are great for analytics, but is that something you are going to use? Both MySQL and PostgreSQL have Window Functions, but PostgreSQL is a little more elaborate in its offerings. Both have various ways of replicating data to other servers, but the implementation details of one particular approach might be unpalatable for you. The more exacting your needs, the easier it becomes to identify your choice.

If at this point you realize that you could conceivably use either database, at least in some abstract theoretical way, then we should make the next big step — to the care and feeding of your database.

Care and feeding — which database is easier to manage?

Databases are the toddlers of the software world. While other products can be installed and ignored without worry, databases need attention, constant attention, and lots of it. Ignoring your database can be disastrous.

MySQL is easier to take care of and administer in most cases. MySQL distributions, like Percona’s, tend to be one-stop offerings. The server, client, connectors, etc., are usually offered in one place.

With PostgreSQL, it can be harder to get all the requisite components because you might have to visit several websites. There are several options for connection poolers, load balancers, and replication packages from various sources. Installing extensions to the server is easy, but does that new extension work with other parts of your server? In this case, you have to do the testing, whereas MySQL tests its components as a group.

PostgreSQL has benefitted from a lot of engineering in the past several years, which makes its overhead tasks much easier to perform. MySQL has automated most of those overhead tasks so you do not have to worry about them. For example, in PostgreSQL, outdated copies of rows in a table must be vacuumed as a separate operation to avoid bloat, while InnoDB in MySQL purges them automatically. That alone makes MySQL easier to administer than PostgreSQL.

There are differences in how each server accepts a connection to submit a query. MySQL serves connections with threads, which is much less work for the server than PostgreSQL’s forking a new process for each connection. Forking puts a higher load on the server, but this can be mitigated by using a connection pooler.

Backups are a necessary part of owning a database. Both databases have many backup tools, which again requires you to make another choice. In the MySQL arena, Percona XtraBackup is the best tool hands down, and I am not just saying that as a Percona employee, but as someone who has used the product. 

PostgreSQL has many options, but no one product stands head and shoulders above the rest. If your database instance is in the cloud, then you can peruse the feature set of your cloud vendor’s offering. But I advise you to make copies of your backups off-premise or off-cloud, no matter your choice.

I used to be the Certification Manager for MySQL AB (and Sun Microsystems and Oracle) and spoke frequently to hiring managers who routinely told me it was very hard to find qualified MySQL DBAs. And they said it was impossible to find qualified PostgreSQL DBAs. If you and your staff have experience and skill in one of the databases, then you will probably skew your criteria in that direction.

MySQL’s InnoDB Cluster is the best thought-out and easiest-to-implement replication architecture. PostgreSQL is playing catchup, as its alternatives are not as simple to implement. Both do logical replication well, but Oracle’s product is more polished.

PostgreSQL is a richer environment with more data types and more operators, and its implementation is closer to the SQL standard. I am a big fan of the MERGE statement, as I spent part of my career processing cash register transaction logs, where this statement shines. This might seem like a trivial thing unless you are processing similar data, and then it becomes of major importance. PostgreSQL has an almost embarrassing number of index types and the ability to index only some values in a column.

The PostgreSQL and MySQL communities

Both MySQL and PostgreSQL have large, thriving communities. There are meetups, conferences, mailing lists, Slack channels, and tutorials galore for both. One big difference is that PostgreSQL is pretty much developed by contributors using mailing lists while MySQL is mainly produced by Oracle’s MySQL Engineers. The difference is also notable in that Oracle determines the future of upstream MySQL, while PostgreSQL is vendor-neutral.

In both cases, a few hundred individuals work on the main server code. The main difference in the development is that PostgreSQL’s new functionality is open for observation (if you are on the right mailing list) while Oracle often provides little or no notice of something new. 

Conclusion: Do I choose PostgreSQL or MySQL?

Right now, both PostgreSQL and MySQL are great choices for a database. MySQL is easier to implement and run but might lack the features you need. PostgreSQL is feature-rich but needs more care to configure and operate.

Another option is Percona software for MySQL, which has enterprise features such as data masking, at-rest encryption, RocksDB, and an improved connection pooler. The software is also freely available. Percona software for PostgreSQL is also a high-quality offering with many of the most popular extensions already available, making it easier to run PostgreSQL in your production and mission-critical environments.

Percona Database Performance Blog

No Place to Place: Inside China’s Bicycle Graveyards

https://theawesomer.com/photos/2023/06/no_place_to_place_china_bike_graveyards_t.jpg

With over 1.4 billion people, China generates a lot of waste. In 2017, it created a new problem. After shared bicycle programs cluttered streets with more than 25 million bikes, the government enforced fleet size limits, and countless bikes ended up in massive graveyards. Guoyong Wu’s short film uses aerial photography to show just how big the problem got.

The Awesomer

MySQL Key Performance Indicators (KPI) With PMM

https://percona.com/blog/wp-content/uploads/2023/06/mysql-replication-1024x520.png

As a MySQL database administrator, you need to keep a close eye on the performance of your MySQL server to ensure optimal database operations. Percona Monitoring and Management (PMM) is a popular open source choice for monitoring MySQL performance effectively.

However, simply deploying a monitoring tool is not enough; you need to know which Key Performance Indicators (KPIs) to monitor to gain insights into your MySQL server’s health and performance.

In this blog, we will explore various MySQL KPIs that are basic and essential to track using monitoring tools like PMM. We will also discuss related configuration variables to consider that can impact these KPIs, helping you gain a comprehensive understanding of your MySQL server’s performance and efficiency.

Let’s dive in and learn how (and what) to effectively monitor MySQL performance, along with examples from PMM, by understanding the critical KPIs to watch for.

Database uptime and availability

Monitoring database uptime and availability is crucial as it directly impacts the availability of critical data and the performance of applications or websites that rely on the MySQL database.

PMM monitors MySQL uptime:

[Figure: Percona Monitoring and Management KPI]

show global status like 'uptime';

The Uptime status variable reports the number of seconds the MySQL server has been running since the last restart.
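A raw seconds counter is hard to read at a glance. The following is a minimal sketch, in Python, of turning an Uptime value into days, hours, and minutes; the function name `format_uptime` and the sample value are assumptions made for illustration, not part of PMM or MySQL.

```python
def format_uptime(seconds: int) -> str:
    """Convert an Uptime value (in seconds) into days, hours, and minutes."""
    days, remainder = divmod(seconds, 86_400)    # 86,400 seconds per day
    hours, remainder = divmod(remainder, 3_600)  # 3,600 seconds per hour
    minutes = remainder // 60                    # leftover seconds are dropped
    return f"{days}d {hours}h {minutes}m"


# Hypothetical value, as if read from SHOW GLOBAL STATUS LIKE 'Uptime'
print(format_uptime(200_000))  # 2d 7h 33m
```

An Uptime value that is lower than the previous sample is a simple sign that the server restarted between the two readings.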

Query performance

Query performance is a key performance indicator (KPI) in MySQL, as it measures the efficiency and speed of query execution. This includes metrics such as query execution time, the number of queries executed per second, and the utilization of query cache and adaptive hash index.

Too many slow, inefficient, or long-running queries can indicate potential issues that may degrade the database’s performance, which is why monitoring query performance is crucial.

On PMM, the following panels show the gist of query execution and summarize the pattern. This is not an exhaustive list but an example of what to watch for.

  • Number of slow queries recorded [Figure: MySQL slow queries]
  • Select types, sorts, locks, and total questions against a database
  • Command counters and handlers used by queries, which give an overall traffic summary [Figure: MySQL traffic summary]

Along with this, PMM also comes with Query Analytics, which gives much more detailed information about the queries being executed.

[Figure: MySQL Query Analytics]

You should also watch the related status variables, for example:

show global status like '%sort%';
show global status like '%slow%';

Some configuration variables to note:

  • slow_query_log: Enables or disables the slow query log.
  • long_query_time: Defines the threshold for query execution time; queries that take longer than this threshold are considered slow and are logged for further analysis.
  • innodb_buffer_pool_size: Sets the size of the InnoDB buffer pool.
  • query cache: Disable it (query_cache_size = 0, query_cache_type = OFF). The query cache was removed entirely in MySQL 8.0.
  • innodb_adaptive_hash_index: Check adaptive hash index usage to determine its efficiency.

Indexing efficiency

Monitoring indexing efficiency in MySQL involves analyzing query performance, using EXPLAIN statements, utilizing performance monitoring tools, reviewing error logs, performing regular index maintenance, and benchmarking/testing. This KPI is also directly related to Query Performance and helps improve it.

MySQL maintains several internal tables and views that help identify inefficient indexes, for example sys.schema_unused_indexes and information_schema.index_statistics (the latter is available in Percona Server for MySQL).

pt-duplicate-key-checker, a Percona Toolkit utility, finds duplicate indexes so that you can remove them and improve indexing efficiency.

Connection usage

Connection usage is a critical key performance indicator (KPI) in MySQL, as it involves tracking the number of concurrent connections to the MySQL server and ensuring that it does not exceed the allowed limit. Improper configuration of connection settings can have catastrophic effects on the production database, resulting in it becoming inaccessible during connection spikes or even experiencing out-of-memory (OoM) kills of the MySQL daemon.

By monitoring and managing connection usage, you can proactively identify and address potential issues such as connection spikes, resource exhaustion, or improper configuration that may impact the availability and performance of the MySQL database.

PMM captures the MySQL connection matrix:

[Figure: MySQL connection matrix]

It is important to set an appropriate max_connections value and to monitor the Max_used_connections and Max_used_connections_time status variables, which record the peak number of simultaneous connections and when that peak occurred, so that you can estimate the traffic.
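One way to read these numbers together is as a ratio of peak usage to the configured limit. The sketch below is a hedged illustration in Python: the function name `connection_usage_pct` and the 80 percent alert threshold are assumptions chosen for the example, and 151 is used because it is the MySQL default for max_connections.

```python
def connection_usage_pct(max_used_connections: int, max_connections: int) -> float:
    """Return peak connection usage as a percentage of the configured limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 100.0 * max_used_connections / max_connections


# Hypothetical readings: Max_used_connections = 120, max_connections = 151
usage = connection_usage_pct(120, 151)
ALERT_THRESHOLD_PCT = 80.0  # assumed threshold; tune it to your own workload

if usage >= ALERT_THRESHOLD_PCT:
    print(f"Peak usage {usage:.1f}% is close to the limit")
else:
    print(f"Peak usage {usage:.1f}% leaves headroom")
```

A peak that repeatedly approaches the limit suggests either raising max_connections, if memory allows, or putting a connection pooler in front of the server.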

Implementing appropriate connection pooling or choosing appropriate connection settings can help optimize resource utilization and reduce downtime or performance degradation. Hint: ProxySQL helps.

CPU and memory usage

Monitor CPU and memory utilization of the MySQL server to ensure efficient resource utilization. It is advisable to have a dedicated production MySQL server that can independently claim the system resources as needed. That said, its resource usage should still be monitored, because it shows how much pressure the traffic is putting on the server.

PMM dashboard: CPU utilization and memory details

[Figure: CPU utilization and memory details]

The most influential memory configuration variable, innodb_buffer_pool_size, sets the size of the InnoDB buffer pool.

Another related variable, innodb_buffer_pool_instances, determines the number of buffer pool instances for the InnoDB storage engine, which can improve performance on multi-core systems by reducing contention on the buffer pool latch.

That said, CPU and memory usage are not determined by these two variables alone, and further analysis is required to track down what is causing usage spikes. The point we’re making here is that they are critical for MySQL performance.

Disk space usage

Monitor the disk space usage of MySQL data files, log files, and temporary files.

You may review fragmented tables, binary logs, and duplicate or redundant indexes to reclaim space. As a best practice, it is advisable to have different mounts for MySQL data and log files with specific system configurations.

PMM Disk Details, which includes disk usage as well as disk performance charts

[Figure: MySQL disk performance charts]

Replication lag

Monitoring replication lag is important as it can affect the consistency and reliability of data across multiple database instances in a replication setup. 

Replication lag can occur due to various factors such as network latency, system resource limitations, complex transactions, or heavy write loads on the primary/master database. If replication lag is too high, it can result in stale or outdated data on the replica/slave databases, leading to data inconsistencies and potential application issues.

PMM – MySQL Replication Summary dashboard

Related configuration or status variables to consider:

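  • Seconds_Behind_Master: In the output of SHOW SLAVE STATUS, this value indicates the replication lag in seconds. In SHOW REPLICA STATUS, available from MySQL 8.0.22, the same field is named Seconds_Behind_Source.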
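When you alert on this value, remember that it can be NULL when replication is not running, which is a different problem from a large lag. The following Python sketch shows one way to separate the two cases; the function name `classify_lag` and the 30-second warning threshold are assumptions for illustration.

```python
from typing import Optional


def classify_lag(seconds_behind: Optional[int], warn_after: int = 30) -> str:
    """Classify a Seconds_Behind_Master / Seconds_Behind_Source reading.

    None stands for SQL NULL, which the server reports when the replica
    is not actively replicating.
    """
    if seconds_behind is None:
        return "not replicating"
    if seconds_behind > warn_after:
        return "lagging"
    return "ok"


# Hypothetical readings taken from the replica status output
for reading in (0, 45, None):
    print(reading, "->", classify_lag(reading))
```

Treating NULL as zero would hide a stopped replica, so the sketch checks for it before comparing against the threshold.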

One way to reduce lag is to use parallel replication.

Backup and recovery metrics

Backup and recovery metrics are key performance indicators (KPIs) for MySQL that provide insights into the reliability, efficiency, and effectiveness of backup and recovery processes. They include backup success rate, backup duration, recovery time objective (RTO), recovery point objective (RPO), and backup storage utilization. Monitoring these metrics helps ensure data protection, minimize downtime, and ensure business continuity.

You should not only monitor the backup mount for disk space and the backup log for errors but also regularly test restores and record the results to confirm that you meet your RPO and RTO.
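As a rough illustration of checking those objectives, the Python sketch below compares the age of the newest backup with the RPO and a measured restore time with the RTO. It is a simplification that assumes backups are the only recovery source (no binary log replay), and the function names and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup_finished: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if the newest successful backup is recent enough for the RPO."""
    return now - last_backup_finished <= rpo


def meets_rto(restore_duration: timedelta, rto: timedelta) -> bool:
    """True if the last tested restore finished within the RTO."""
    return restore_duration <= rto


# Hypothetical figures: nightly backup, 24-hour RPO, 2-hour RTO
now = datetime(2023, 6, 15, 12, 0)
last_backup = datetime(2023, 6, 15, 1, 30)

print(meets_rpo(last_backup, now, rpo=timedelta(hours=24)))
print(meets_rto(timedelta(minutes=95), rto=timedelta(hours=2)))
```

The restore duration only means something if it comes from an actual test restore, which is why regular restore drills matter as much as the backups themselves.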

Error rates

The MySQL error log contains information about errors, warnings, and other issues that occur within the MySQL database. By monitoring the error log, you can quickly identify and resolve any problems that may arise, such as incorrect queries, missing or corrupt data, or database server configuration issues.

  • log_error: Specifies the location of the MySQL error log.

Conclusion

In this blog, we discussed the basics of MySQL’s key performance indicators (KPIs) using PMM. By monitoring these KPIs, such as database uptime and availability, query performance, indexing efficiency, connection usage, CPU and memory usage, disk space usage, replication lag, backup and recovery metrics, and error rates, you can gain valuable insights into your MySQL server’s health and performance.

Note that the specific configuration variables and their optimal values may vary depending on the MySQL version, system hardware, workload, and other factors. It’s important to thoroughly understand the MySQL configuration variables and their impacts before making any changes and to carefully monitor the effects of configuration changes to ensure they improve the desired KPIs.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

 

Percona Database Performance Blog

Migrate passwords from a legacy PHP application to Laravel

https://leopoletto.com/assets/images/migrate-legacy-passwords-to-laravel.png

Migrating a legacy PHP application to Laravel will probably require a custom hashing driver.

This happens because Laravel’s default hashing driver is bcrypt, with Argon2 as the other built-in option, while MD5, SHA-1, SHA-256, and SHA-512 were and still are widely used, especially when the application does not rely on a modern framework.

Considering that we already have a table storing the hashed passwords, we need to make Laravel use the correct hash algorithm to compare the users’ raw passwords when authenticating.

Create a custom hash driver in Laravel

It should implement the Illuminate\Contracts\Hashing\Hasher interface and extend the Illuminate\Hashing\AbstractHasher class:

app/Hashing/Md5Hasher.php

namespace App\Hashing;

use Illuminate\Contracts\Hashing\Hasher;
use Illuminate\Hashing\AbstractHasher;

class Md5Hasher extends AbstractHasher implements Hasher
{
    public function make($value, array $options = []): string
    {
        return md5($value . config('hashing.md5.salt'));
    }

    public function check($value, $hashedValue, array $options = []): bool
    {
        // hash_equals() compares in constant time, unlike ===
        return hash_equals((string) $hashedValue, $this->make($value));
    }

    public function needsRehash($hashedValue, array $options = []): bool
    {
        return false;
    }
}
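For readers who want to see the check outside Laravel, here is a minimal language-neutral sketch in Python of the same salted-MD5 scheme (password followed by salt, hex digest). The function names are hypothetical, and the constant-time comparison is an addition in the spirit of PHP's hash_equals(), not something the driver above requires.

```python
import hashlib
import hmac


def legacy_md5(password: str, salt: str = "") -> str:
    """Mirror md5($value . $salt): hex digest of the password followed by the salt."""
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()


def check_legacy(password: str, stored_hash: str, salt: str = "") -> bool:
    """Compare the recomputed hash with the stored one in constant time."""
    return hmac.compare_digest(legacy_md5(password, salt), stored_hash)


# Hypothetical stored value, as it might sit in the legacy users table
stored = legacy_md5("secret", "my_salt")
print(check_legacy("secret", stored, "my_salt"))
print(check_legacy("wrong", stored, "my_salt"))
```

MD5 is kept here only to verify existing legacy hashes; new hashes should come from a modern algorithm, which is what the rest of this article sets up.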

Register the new driver in your application

Register it in the boot method of the following class:

app/Providers/AuthServiceProvider.php

namespace App\Providers;

use App\Hashing\Md5Hasher;
use Illuminate\Support\Facades\Hash;
use Illuminate\Support\ServiceProvider;

class AuthServiceProvider extends ServiceProvider
{
    // ...

    public function boot(): void
    {
        // ...

        Hash::extend('md5', static function () {
            return new Md5Hasher();
        });
    }
}

Define the hashing SALT (Optional)

Your legacy application may concatenate a salt with the password before hashing it. We can define the salt in the config and delegate its value to the .env file.
If your legacy application does not use a salt, you won’t need to add it to the .env file.

config/hashing.php

return [
    // ...

    'md5' => [
        'salt' => env('MD5_SALT'),
    ],
];

.env

MD5_SALT=my_salt

Update the passwords

To rehash the password, we can intercept the user’s login attempt and check whether the MD5 hash of the submitted password matches the one in the database.
We can do that by listening to the Illuminate\Auth\Events\Attempting event.

php artisan make:listener UpdateMd5Password

app/Providers/EventServiceProvider.php

class EventServiceProvider extends ServiceProvider
{
    //...

    protected $listen = [

        //...

        \Illuminate\Auth\Events\Attempting::class => [
            \App\Listeners\UpdateMd5Password::class,
        ],
    ];

    //...
}

The following implementation checks whether the credentials match the legacy algorithm (MD5) and, if so, updates the stored hash to the new one.
The authentication flow then continues, and the user is authenticated using the default driver (bcrypt).

app/Listeners/UpdateMd5Password.php

namespace App\Listeners;

use App\Models\User;
use Illuminate\Support\Facades\Hash;

class UpdateMd5Password
{
    public function handle(object $event): void
    {
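        // Look up the user by the email submitted in the login attempt
        $user = User::where('email', $event->credentials['email'])->first();

        // Hash the submitted password with the legacy MD5 driver
        $md5Password = Hash::driver('md5')->make($event->credentials['password']);

        // On a match, replace the legacy hash with one from the default driver
        if ($user && hash_equals((string) $user->getAuthPassword(), $md5Password)) {
            $user->password = Hash::make($event->credentials['password']);
            $user->save();
        }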
    }
}

In closing

You may have another hashing algorithm on your legacy PHP application. You can make the necessary changes to achieve the same behavior.

Laravel News Links

Laravel File Uploads: Save Filename in DB with Folder and URL?

https://laraveldaily.com/storage/423/Copy-of-Copy-of-ModelpreventLazyLoading();-(6).png

When uploading files with Laravel, how should you store the filename in the DB? Should you store filename.png? Or include the folder, as in avatars/filename.png? Or the full URL, https://website.com/avatars/filename.png? Let me tell you my opinion.

This tutorial will have an avatar field on the register page and an avatar column in the users table. Let’s see how we can save it.


The easiest way is to store just the filename and create a separate storage disk.

config/filesystems.php:

return [
    // ...

    'disks' => [
        // ...

        'avatars' => [
            'driver' => 'local',
            'root' => storage_path('app/public/avatars'),
            'url' => env('APP_URL').'/storage/avatars',
            'visibility' => 'public',
            'throw' => false,
        ],
    ],

    // ...
];

Then, when storing the file we need to specify the new avatars disk.

app/Http/Controllers/Auth/RegisteredUserController.php:

class RegisteredUserController extends Controller
{
    // ...

    public function store(Request $request): RedirectResponse
    {
        $request->validate([
            'name' => ['required', 'string', 'max:255'],
            'email' => ['required', 'string', 'email', 'max:255', 'unique:'.User::class],
            'password' => ['required', 'confirmed', Rules\Password::defaults()],
            'avatar' => ['nullable', 'image'],
        ]);

        if ($request->hasFile('avatar')) {
            $avatar = $request->file('avatar')->store(options: 'avatars');
        }

        $user = User::create([
            'name' => $request->name,
            'email' => $request->email,
            'password' => Hash::make($request->password),
            'avatar' => $avatar ?? null,
        ]);

        // ...
    }
}

This way, your full URL filename https://website.com/storage/avatars/filename.png consists of three things:

  • Domain: https://website.com is stored as APP_URL in your .env file, so it can differ between your local and production servers
  • Folder: /storage/avatars comes from config('filesystems.disks.avatars.url'), which corresponds to the internal structure of /storage/app/public/avatars described in the same config file. Both can also be changed flexibly if needed.
  • Filename: filename.png is the only thing actually landing in the DB column
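To make the three-part composition concrete, here is a small language-neutral sketch in Python. The function name `avatar_url` is hypothetical; it only illustrates how the domain, the disk's URL path, and the stored filename combine, and is not how Laravel builds the URL internally.

```python
def avatar_url(app_url: str, disk_url_path: str, filename: str) -> str:
    """Join the domain, the disk's public path, and the stored filename."""
    parts = [app_url.rstrip("/"), disk_url_path.strip("/"), filename.lstrip("/")]
    return "/".join(parts)


# Same filename from the DB column, different environments
print(avatar_url("https://website.com", "/storage/avatars", "filename.png"))
print(avatar_url("http://localhost:8000", "/storage/avatars", "filename.png"))
```

Because only the filename is stored, moving to another domain or folder changes the configuration and leaves the database rows untouched.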

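To get the URL for the image in the Blade file, we would use the url method on the Storage facade, specifying the disk (this assumes the view has a $user variable):

<img src="{{ Storage::disk('avatars')->url($user->avatar) }}" alt="" />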

But what if, after some time, you would need to go from local disks to, let’s say, Amazon S3?

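The only change you would need to make is to switch the disk and, perhaps, use the temporaryUrl method instead of url to give the link an expiration time (the s3 disk name and the five-minute lifetime here are only examples):

<img src="{{ Storage::disk('s3')->temporaryUrl($user->avatar, now()->addMinutes(5)) }}" alt="" />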

Laravel News Links