https://media.notthebee.com/articles/6a4565e7d3b046a4565e7d3b05.jpg
This might be one of the best videos on the internet right now.
Not the Bee
Just another WordPress site
https://media.notthebee.com/articles/6a4565e7d3b046a4565e7d3b05.jpg
This might be one of the best videos on the internet right now.
Not the Bee
https://ollieread.com/storage/og/articles/you-probably-dont-need-a-database-per-tenant.png
You could argue that the term "database per tenant" actually
covers two possible approaches, separate instances and
separate databases in the same instance. However, the majority of the time the term
is used, it is referring to the latter, and is often assumed to be the only way, or at least, the best way, to implement
multitenancy. It’s not. Most of the time at least.
This does, however, raise the question, what is the best way to implement data isolation in multitenancy? The answer,
like most things, is that it depends, but with the caveat that it will most often not be a database per tenant. It
depends on the application, the tenants, the database engine, the requirements, the scope, the resources, the scale, and
so on and so forth. There is no one-size-fits-all approach, though there is no harm in starting with something simple
and then working your way up when necessary. Don’t optimise for situations that may never happen, and don’t
overcomplicate things when you don’t need to.
Rather than me writing some sort of hit piece on "database per tenant", and those that advocate for it, this article
is instead going to cover the spectrum of approaches to data isolation in multitenancy. I’ll be laying it all out,
details and all, with some opinion sprinkled in here and there, and then you can make your own decision on
what is best for your application. The goal of this article is to help you make an informed decision.
The whole assumption that "database per tenant" is the best approach is based on people confusing isolation
and separation. Sure, they’re related, but they are not the same thing. Isolation is an issue of security and
correctness, making it so that Tenant A cannot see Tenant B. Separation is a storage and operational concern, making
it so that Tenant A is stored in a different place to Tenant B. You can achieve isolation through separation,
but you can also achieve isolation without separation.
When it comes to data isolation, there are a number of approaches, each with their own trade-offs. The best way to look
at them all is to consider them on a spectrum, from the heaviest and most isolated, with the highest operational cost,
to the lightest and least isolated, with the lowest operational cost. Developers often end up taking a far heavier
approach to achieve a guaranteed level of isolation, when a lighter approach would have been sufficient.
Below are the most common approaches, listed from heaviest to lightest. Each approach is described in terms of its
isolation, operational cost, how to work with it, and the best fit for its use. Each approach also includes a simple
diagram to illustrate the architecture; here’s a quick legend to help you understand them.
instance / server
shared infrastructure
tenant A
tenant B
With this approach, each tenant has its own database instance, each with its own credentials and connection information.
It is the heaviest, and most isolated approach, but also the most operationally expensive. If you were to ask me,
I’d say that this should be the true approach for "database per tenant".
Tenant data is completely physically isolated from other tenants. All processes, resources, networking, and data
are completely separate. Not only will no data leak between tenants, but no tenant can even affect the performance of
another tenant. This is the absolute strongest and best level of isolation you can achieve.
As you can probably imagine, that means that this is also the most operationally expensive approach. You have
N instances (where N is the number of tenants), to provision, monitor, patch, scale, and back up. Migrating becomes
more complex, having to run N times over N hosts. You won’t run into connection pooling issues, though, and backing
up, restoring, and even fully deleting a tenant is trivial.
Cross-tenant queries are virtually impossible, and if you had to do any sort of admin or analytics across tenants, you
would have to build some sort of pipeline and/or aggregation system to do it. Onboarding a new tenant requires you to
provision a new instance, which can be slow and expensive. You also need a secure mechanism for storing the
credentials and connection details for each tenant.
This approach is best suited when there is a legal requirement for physical isolation. Since the instances are
completely separate, they can be placed in different regions, which is an important thing for certain regulatory
requirements. It’s also useful for resource isolation when you have a tenant that is a noisy neighbour, and you
want to make sure that it doesn’t affect the performance of other tenants.
This approach is similar to the previous one, except each tenant has its own database within the same instance. This is
a lighter approach, for the most part, and it’s what people mean when they say "database per tenant". It is still a
fairly heavy approach and is often overkill for most applications. This is also the first approach where the database
engine in use is a factor.
With this approach there is no resource isolation between tenants, meaning a noisy neighbour tenant can affect
the performance of others. As for data isolation, that depends entirely on the engine you’re using.
{db}.{table} syntax. The isolation here is only as good as your privilege management.Migrations on this approach need to be run N times, once for each tenant. Backing up, restoring, and deleting a
tenant is trivial, as you can just back up, restore, or delete the database. Monitoring is also pretty
straightforward as you’re only dealing with a single instance and can often get stats per database. The engine you’re
using, though, will have an operational impact.
Changing the database on the fly won’t cause a reconnection when using MySQL or MariaDB, but will
when using PostgreSQL.
Onboarding is the same for every engine, as you just need to create the database and migrate it. Developers will
often make the database name deterministic, so you don’t need to store it and can derive it from some data on the
tenant. If you’re also using separate users per tenant, which you should be, then you’ll need to create those and
securely store the credentials. Administering or aggregating across tenants depends on the engine.
ATTACH to load an SQLite database file.Shared users with MySQL and MariaDB get you all the operational cost, with none of the isolation, and SQLite has no real
enforceable isolation at all. So, if you’re using either SQLite or shared users, then this approach is not a good fit
for multitenancy.
If you have users per tenant, or are using PostgreSQL, then this approach works best for small applications with a small
number of tenants, ideally one where the tenants are fully known and controlled, and where the number of tenants is not
expected to grow significantly. Anything larger, or with the possibility of exponential growth, will very quickly hit a
ceiling with connections.
This approach is an interesting one, as it’s really two different approaches, with one that only works on a
particular database engine. You can simplify it by thinking of this approach as namespacing. All the tenants share
the same database, but their data is split by a namespace. For PostgreSQL, that namespace is a schema, and for MySQL,
MariaDB, and pretty much every other engine, that namespace is just a prefix on the table name.
This approach isolates data by namespace, which is a much softer form of isolation than the previous approach. In
most cases, the isolation is conventional and must be enforced at the query level. There is also absolutely no
resource isolation between tenants, so noisy neighbours are an issue. The last deciding factor, for isolation,
is the engine in use.
search_path in the connection, which will isolate it to that schema. It’s possible to strengthen theUsing a single database means that connection pooling issues are effectively gone, at least, from a per-tenant
perspective. This approach does, unfortunately, still incur the ongoing operational cost of migrations, needing to
be run N times, once for each tenant, regardless of whether it’s schemas or prefixes. There are a few
considerations for the engine in use, though.
Cross-tenant queries are easy, as you can just query across schemas or tables, meaning administration tasks and
analytics are far easier than every other approach so far. Onboarding is also easier than the previous approaches,
though there is still some complexity, as you’re still running all the migrations for the new tenant.
This approach best fits applications that aren’t large and need tenant data to be separate, though not
necessarily isolated, and can’t pay the operational cost of the heavier approaches. With the following caveats based
on engine.
Partitioned tables are a feature of some database engines that allow you to split a table into multiple physical
storage tables, based on a partitioning key. This can be used to implement multitenancy by partitioning on a
discriminator column. Without PostgreSQL and RLS, this approach adds
nothing in terms of data isolation, beyond what the below discriminator column approach does.
Partitioning is a storage concern, so the isolation is as good as the query convention. If you miss the WHERE
clause with the discriminator column, then you’re querying the entire dataset across all tenant boundaries. There
are, however, additional enforcement mechanisms that can be put in place, depending on the engine.
Unlike all other approaches so far, migrations only need to be run once, exactly the same way that you would for a
single-tenant application. It even gets rid of the possible connection pooling issue that has appeared in almost
every approach so far. It does, however, come with some new problems that we’ve not seen before. Partitioning
requires additional thought and planning when it comes to designing your database schema.
The key used to partition the table across all engines that support it must be part of the primary key and
every unique key. This is a particular issue with things like Eloquent, Laravel’s ORM, which has limited support for
composite keys.
Partition management is also a concern, which differs based on the strategy you’ve taken. You can partition by list
or by hash. The former would be a partition per tenant, and the latter would distribute tenants across a fixed set
of partitions. If you’re taking the former approach, you can back up, restore, and delete tenants by deleting their
partition. If you take the latter approach, you lose all of that, and everything is done at the table level.
There are also a few things to consider based on the engine in use.
Since queries are filtered using a discriminator column, cross-tenant queries are as simple as not adding a WHERE
clause. Administration and aggregation operations become as easy as they would be if you had no multitenancy at all.
This approach has a better performance footprint, as the database engine can optimise queries to only hit the relevant
partitions, rather than scanning the entire dataset. Onboarding depends on the strategy you’ve taken. Partitioning by
list requires you to create a new partition for the new tenant, which is an operation that needs to be done on the
database. Partitioning by hash does not require any additional operations, as the new tenant will be automatically
distributed across the existing partitions.
Reach for this approach when you’re using one of the below approaches, but you’ve got a table that is large enough
that the physical layout is important, i.e. per-tenant pruning for speed, and/or per-tenant partition operations for
backups and the likes. This approach is better suited as a scaling decision for the cheaper options, than one for
isolation. Again, there are caveats based on the engine in use.
The discriminator column approach is the lightest and least isolated approach, but also the cheapest and easiest to
implement. It’s also all that most applications need. All tables that contain tenant data have a discriminator column,
typically a foreign key to the tenant’s table, which is used to filter the queries to only return data for the current
tenant. For the purpose of explanation, the discriminator column will be referred to as tenant_id, though in reality
it would have a domain appropriate name, like account_id, organization_id, or company_id. This approach is
also the only one that is truly engine agnostic, as it can be implemented on any engine, even without foreign keys.
This approach has the weakest isolation of all the approaches, as it is entirely convention-based. There’s no
enforcement in the database, or in the framework or underlying functionality. It’s something you have to enforce in
your application code, and if you ever miss the WHERE clause, you’re going to get data leakage. That being said,
providing that you’re using a proper testing strategy and ORM, the approach is pretty simple to implement, and is
more often than not, sufficient. There is only one note to be made in regard to the engine.
Migrations are straightforward, as only one version of the schema exists, and it only needs to be run once. Connection
pooling is also not an issue, as all tenants share the same database and connection. Backing up, restoring, and deleting
tenants becomes more complex, as you’ve got to unpick the data for that tenant from all the tables, which is a
non-trivial operation. Monitoring is also a little more difficult, as you can’t monitor per tenant as default, and
instead need to build that in yourself.
There are also a few considerations for this particular approach. The first is to do with indexing. Since all tenants
share the same table, you need to make sure that the discriminator column is part of the index for any queries that
filter on it. The second is to do with table volume. Since all tenants share the same table, the table can grow to be
quite large, which can have a performance impact, especially for smaller tenants. This is the particular problem that
usually leads to people reaching for partitioning.
Cross-tenant queries are trivial, you just don’t add the WHERE clause, and onboarding a tenant is as simple as
create the tenant row, something you’d have to do anyway. This is the cheapest and simplest approach you could
implement, and as mentioned above, it can be upgraded to the partitioned tables approach if you
need to scale it.
The only real cost with this approach is the burden it puts on the application code to enforce the isolation, which is
an ongoing continual burden, rather than simply a one-off operational cost. In practice, you’d centralise the logic
that handles the isolation, whether by using a global query scope, or a default query constraint/criteria. You
wouldn’t ever expect anyone to have to manually add the WHERE clause to every query, as that would be a recipe for
disaster. It does make the enforcement a bit fragile, though, as even with centralised logic, it’s split across the
whole codebase, and if you ever miss it, you’re going to have a problem.
Honestly, this approach is the best fit for most applications, unless you have some strict criteria that requires a
higher level of isolation. At the very least, this is a great place to start, as it’s easier to migrate to a heavier
approach when the time comes than it is to migrate to a lighter approach. This particular approach is the one that was
used by Shopify, though they also sharded tenants across multiple instances to spread the load better, and they’re
operating at a scale that most applications will never reach. The only caveat is that you need to be using an ORM or
framework that allows you to centralise the logic that enforces the isolation.
The only real weakness of this approach is that the isolation is enforced at the application level, which honestly,
can be solved by using PostgreSQL and RLS, which is covered below.
Row-level security (RLS) isn’t an approach to multitenancy, but rather a particular feature of some database engines,
most notably PostgreSQL. It is used alongside the discriminator column, and by association,
the partitioned tables approach, to provide a stronger level of isolation. It allows you to
move the enforcement from a query convention in the application code to a security policy enforced by the database
engine.
All that is required to make use of the features, once the policies are defined, is to set the tenant
in the connection, which is usually done by setting a session variable. While this does close the gap on isolation
and will do wonders to make you feel better about the risk of data leakage, there are caveats and potential issues
to be aware of.
FORCE ROW LEVEL SECURITY option on the table.BYPASSRLS, regardlessFORCE ROW LEVEL SECURITY set or not.However, since PostgreSQL is the gift that keeps on giving, there are ways to mitigate all the above issues.
For the table owner and migration issues, the commonly recommended way of setting this up
is to have two different users. The first is the table owner and isn’t subject to RLS policies, allowing them to
run migrations without issue. The second is the application user, who is subject to RLS policies and is used for all
application queries. This does add some slight operational overhead, but it’s a one-off cost during setup, and pales
in comparison to the ongoing operational cost of the heavier approaches.
And for the transaction mode connection pooler issue, you can scope the session variable to the current transaction
using SET LOCAL instead of SET, ensuring that the session variable is reset when the transaction ends. This does
mean that you need to ensure that every query is wrapped in a transaction, otherwise the session variable will not be
set, and the RLS policies will not be applied. When creating the policy, there’s a second argument that tells the
engine what needs to happen when the session variable is not set, with the choices being between returning an
empty result or erroring. In almost every case, you want it to error, as an empty result will be indistinguishable from
a query that simply has no results, which is a recipe for disaster.
It’s pretty clear from the title of this article, and pretty much everything above, that I’m of the opinion that you
can achieve perfectly adequate isolation of tenant data, without having to separate it. I do stand by this, but
there are some things that just can’t be achieved, at least easily, without separation. So, when these things are a
genuine requirement, something that’s important enough that you pay the operational and complexity cost of
implementing separation, then you should go for it.
All of these have been briefly mentioned, mainly in the "Best fit" section for the relevant approaches, but here they
are.
While the above are all genuinely valid reasons to separate tenants, in most cases they are not applicable to every
tenant. The operational overhead of separation is significant, but it scales linearly with the number of tenants. So,
if you can only separate the tenants that need it and keep the rest on a lighter approach, you can get the best of both
worlds. After all, it’s a spectrum, and you don’t have to pick just one.
If you’ve read through even some of the above approaches, you’ll have noticed that many of them reference others
and either build on top of them or sit alongside them. This is because they aren’t mutually exclusive, and they aren’t
simply a linear choice of one or the other. As I said at the start of the article, it’s a spectrum, and how you pick
depends on what you need.
This strategy is one of exceptions, meaning that all tenants follow the default approach you picked for
data-isolation, with a few that are exceptions to that rule.
Let’s say that your application uses the
discriminator column approach, but you have a handful of tenants that are noisy neighbours,
and one that is a large enterprise with additional requirements. For your everyday tenants, they stay in the shared
database instance using the discriminator column, but for the noisy neighbours, they each have their own database on
a shared instance, following the separate database, same instance approach. Your
large enterprise tenant has its own dedicated instance in a jurisdiction that meets its regulatory requirements,
following the separate instance approach.
How you combine approaches is entirely up to you and depends entirely on your needs. The one thing to consider is that
combining approaches has a cost, though how much depends on the approaches being combined.
All the approaches covered in this article operate on one of two layers, either the connection layer, or the query
layer. Combining approaches that operate on the same layer is trivial, but combining approaches that operate across
layers adds complexity.
The connection layer covers approaches that are handled at the connection level, so
separate instance, separate database, same instance,
schemas (schema only), and even RLS.
The query layer covers approaches that are handled at the query level, so
partitioned tables, prefixes (prefixes only), and the discriminator
column.
Some of you eagle-eyed individuals will have noticed that the schemas and/or prefixes
approach is listed in both layers, which I assure you is intentional. That particular approach is a bit of a hybrid, and
the layer it sits in depends on the mechanism being used. If you’re using schemas with PostgreSQL, it’s connection
level, but if you’re using regular old table prefixes, it’s at the query level.
This strategy is almost entirely identical to the above, except that instead of using different approaches based on
specific tenants, you use different approaches based on specific datasets. This particular strategy is often overlooked,
despite it being something that many multitenanted applications will implement without realising.
Imagine you’re isolating tenants at the connection level, each with
a separate database on the same instance, but you want users to be able to
authenticate
against tenants that they are added to, with the same credentials. You’re already going to have a global, or
"landlord", database that contains the tenant table, so you can look up the correct tenant before creating the
connection to the tenant’s database. So to achieve this, rather than duplicate user data across multiple databases,
you’d add the users table to the global database, along with
a tenant_user pivot table that contains a discriminator column to filter the users by tenant.
The only real downside to this strategy comes when the approaches you combine cross a database or instance boundary:
you sacrifice referential integrity.
In the above example, the tenant_user table can have a foreign key to
the tenant table because they both live in the global database, but none of the tenant-specific databases can have a
foreign key to the users table. Without foreign keys, there is no referential integrity at the database level.
Another example, but going the other way would be when your application has a small subset of data that has particular
regulatory requirements and needs to either stay in a particular jurisdiction, or needs to be isolated to a higher
degree.
Rather than isolating every tenant’s data to that level, you could isolate just the data that needs it. Your whole
application makes use of the discriminator column approach, but the subset of data uses
the separate database, same instance, or even
the separate instance approach.
When you separate by data rather than by tenant, you avoid the cross-layer cost completely. With this approach, your
models, entities, or whatever you call them, can belong to different connections or have different scopes. That’s
something easily done by almost every modern ORM or framework. This is much cheaper than the previous strategy,
which would require your models, entities, etc to support multiple different ways simultaneously, based on the
current context.
You do also incur additional operational costs for more complex/heavier approaches; however, since you’re only
isolating a small subset of data, it is slightly reduced compared to the previous strategy, which would require you to
isolate all of a tenant’s data.
Sharding is entirely a concern of scale and has nothing to do with isolation, unlike the previous two strategies.
While data isolation in multitenancy is a spectrum, as I keep saying, they’re intentionally laid out in this article
to be vertical. Sharding is horizontal, which is why it has been left out of the discussion until now. It has no
fixed isolation level, and inherits the isolation level of whatever approach it is that you’re sharding.
The only approach that you can’t shard, at least, not in the way that I’m talking about it here, is the
separate instance approach, as each tenant is already isolated to its own instance. However,
all the other approaches can be sharded.
To really get the most out of sharding a multitenanted application, you’d split tenant data across multiple database
instances, with the particular structure dependent on the approach. For the
separate database, same instance approach, each tenant would still have their
own database, but they’d be equally spread out across the instances. Sharding tenants for this approach is the
closest to the generic concept of data sharding, outside a multitenanted context.
Schemas and/or prefixes,
partitioned tables, and discriminator column approaches would still
have the tenants split across the instances; the difference being that the structure is identical on every instance,
with only the tenants differing.
The cost of sharding depends entirely on how you’re routing tenants to their shard. Evenly distributing tenants across
shards is simple, but basing it on an attribute of the tenant makes it more complex.
The important consideration, however, is whether a tenant’s shard is fixed or can change. If the shard is derived
from an attribute of the tenant, that tenant is effectively locked to that shard, making the process of adding new
shards or balancing for performance far more complex and costly. If tenants are instead mapped to a shard based on
some arbitrary logic during onboarding, it becomes much easier to add new shards and balance, as the cost and
complexity are entirely in the migration of the tenant’s data. Mapping does incur the cost of maintenance, though
it’s trivial and often worth paying a little bit every now and then to save a lot in the future.
All the bits above deal with the how and why you’d combine approaches, but the most important thing to consider is
the when. You don’t have to start with the most isolated approach, and in fact, you probably shouldn’t. Start with
the cheapest and simplest approach that meets your requirements and then escalate to a heavier approach when the
need arises. You move your tenants to a heavier approach in response to a need, not in anticipation of one. Don’t
try to preemptively solve a problem that doesn’t, and may not ever exist.
It is always going to be cheaper and easier to move a tenant to a heavier approach than it is to move them to a
lighter one. In almost all cases, the heavier approach is additive when it comes to complexity, and as every one of
us knows far better than we’d comfortably like to admit, it’s always easier to make something more complex than it
is to make it simpler.
Any of you that have worked in any sort of agency or sales/marketing heavy situation will know that there are two
types of requirements for any project. The first are the technical requirements, the things that are actually
required to make the project work. The second are the "requirements" that are just things that people want or think
they should have.
I can tell you, from experience, that most people that think they need full separation from the start don’t
actually need it at all. This is something that you or your leadership team will need to determine, and isn’t something
that I can dictate to you. Unless, of course, you want to hire me to do exactly that.
Each approach covered in this article has two costs associated with it. The first is the
architectural cost of implementing the approach, and the second is the ongoing operational cost of maintaining it.
The architectural cost is paid once, per approach. You pay it during the initial building of the application, and
then again each time you implement a new approach, which is unlikely to be a common occurrence.
The operational cost, however, is paid for every tenant, every day, for the lifetime of the application.
When you’re picking an approach, it’s the second operational cost that you’re choosing, and it’s the one that you need
to be most aware of.
Do you want to be able to run a new migration once, or do you want to have it run N times, once per tenant? If
you’re thinking that waiting for a migration to run for every tenant is no big deal, then consider that the time it
takes to run a migration isn’t the only thing here. If you’re running the migration once, and it breaks, you can
either roll back the migration and/or roll back the application. That’s going to be relatively quick and simple. If
you’re running the migration N times, and it breaks on a tenant, then you’ve got a much more complex problem. Sure,
you roll back the migration and/or the application, but now you’ve got to figure out which tenants the migration ran
successfully for, and which ones it didn’t. Building a migration pipeline that can handle this is non-trivial, and
even the best implementation can be fragile.
What about backups? Sure, it’s nice to be able to back up every tenant separately, but again, you’ve got to consider
the operational cost of that. It’s the same as with the migrations, waiting for the backups to run isn’t that much
of an issue, it’s the complexity of the backup pipeline that you need to consider, and the management of the backups.
If you’ve got fully separated tenants, and then you’re backing them all up to the same location, you’ve undermined
the entire point of separating them in the first place. I know that I listed backups as a genuinely good reason to
separate tenants, but that was in the context of a requirement, not a nice-to-have. So, be honest; do you really
need a backup per tenant? Sure, it ticks a box and looks good on a slide, or feature list, but in reality, is it
worth it? Is it frequent enough to pay both the architectural cost of setting it up and the ongoing
operational cost of maintaining it? I’d be willing to bet that if you even need it at all, it’ll be so infrequent and
such a special case that you could just manually do it.
Too many people approach multitenancy and try to answer the question "how isolated do I need it to be?", when the
real question is, and always was, "what am I willing to pay, per tenant, forever, for isolation?". Unless you
absolutely, genuinely, have to have some of the things I
listed above, a PostgreSQL database with a
discriminator column and row-level security will buy
you all the isolation you need, without the operational cost of separation. Compared to this, the
separate database, same instance approach, without any of the requirements, will
incur all the cost, and provide none of the benefits.
If you take only one thing away from this article, let it be this:
The best data isolation approach for any given application will vary from application to application, but it’s
almost never going to be "database per tenant".
Laravel News Links
https://d2908q01vomqb2.cloudfront.net/887309d048beef83ad3eabf2a79a64a389ab1c9f/2026/06/25/4899.png
In this post, we show you how long-running transactions affect AWS Database Migration Service (AWS DMS) change data capture (CDC) latency, walk through monitoring approaches for Oracle, PostgreSQL, MySQL, and SQL Server, and provide ready-to-use scripts to identify and resolve problematic transactions before they impact your replication performance.
Long-running transactions are one of the most common and often overlooked causes of rising CDC source latency during AWS Database Migration Service (AWS DMS) migrations. When a database transaction stays open for hours instead of seconds, DMS must hold its replication position, buffering all subsequent changes until that transaction commits. The result is a cascading delay across all pending changes that can stall your migration pipeline right when low latency matters most, during your final cutover window. At that point, every minute of replication delay translates directly to extended downtime and higher migration risk.
By detecting these transactions early, you can avoid unnecessary replication instance scaling, reduce cutover downtime, and keep migration on schedule.
When working with AWS DMS, source latency (CDCLatencySource) is a key metric. It measures the delay between the commit time of the last event captured from the source endpoint and the current system timestamp of the replication instance. In other words, it measures how far behind the replication instance is from the latest committed change at the source. Lower latency indicates healthier replication performance.
Several factors contribute to elevated source latency: an uncommitted transaction prevents DMS from applying/forwarding changes to the target, network throughput limitations between source and replication instance, source database performance bottlenecks, and heavy workload during peak processing times. These issues often interconnect, compounding their impact on replication performance.
To troubleshoot effectively, analyze latency patterns over time at hourly, daily, and weekly intervals. This helps you distinguish between temporary spikes during expected peak periods and systematic issues that indicate underlying problems.
The path to resolving source latency isn’t always straightforward, but it typically involves investigating six key areas:
Each of these factors requires a different troubleshooting approach. Pinpointing the exact root cause before initiating remediation is essential for an effective fix. For the scope of this post, we focus on long-running transactions.
Long-running transactions are database operations that remain open for minutes or even hours. In AWS DMS, they can cause significant replication delays.
When a transaction stays open, DMS must retain all transaction log entries from the point that transaction began. For example, if a transaction opens at 9:00 AM and does not commit until 11:00 AM, DMS preserves two hours of log data. This increases storage consumption and memory usage on the replication instance.
The open transaction also blocks DMS from advancing its log-reading position. All changes that commit after the open transaction must wait in queue, causing CDC source latency to climb steadily. In high-transaction environments, a single long-running transaction can create a backlog that takes hours to clear after the transaction finally completes.
Common causes include complex queries processing large datasets, batch operations that run longer than expected, application code that holds transactions open unnecessarily, and missing commit or rollback statements. Each cause calls for a different troubleshooting approach.
To reduce the impact, set up automated monitoring to detect open transactions that exceed a defined threshold. Keep transaction duration as short as possible by optimizing application code and committing frequently. Avoid user interaction within open transactions. Also make sure the replication instance has sufficient storage and computing resources to handle temporary backlogs.
Some long-running transactions are unavoidable for certain business operations. The goal is not to eliminate them entirely but to detect them early and prevent them from stalling your replication pipeline.
Long-running transactions directly impact your AWS DMS CDC source latency. Identifying them is the first step to troubleshooting performance issues.
When DMS operates in CDC mode, it captures changes from the source database’s transaction logs (such as Oracle redo logs, SQL Server transaction logs, PostgreSQL WAL, or MySQL binlogs). However, DMS can only capture and replicate committed transactions to maintain data consistency.
A bottleneck forms because DMS cannot skip past the open transaction:
CDCLatencySource increases from seconds to potentially hours.CDCIncomingChanges accumulates.CDCChangesMemorySource or CDCChangesDiskSource increases as DMS buffers pending changes.For instance, consider a batch job that starts a transaction at 10:00 AM and runs for 2 hours without committing. Any changes captured during that window cannot be applied to the target until the transaction commits at 12:00 PM. The DMS replication task is the unit of work responsible for reading the source transaction log and applying changes to the target. During this period, the task buffers all subsequent changes it has read, regardless of whether those changes involve related tables. It maintains transactional ordering within its scope, so no captured changes can be applied until the open transaction completes at noon. This creates a significant replication delay that grows proportionally with the transaction’s duration.
Identifying long-running transactions helps you differentiate between DMS configuration issues and database-level problems when diagnosing source latency. This identification process follows a systematic approach:
CDCLatencySource to identify when replication is falling behind, then investigate replication instance metrics and source database open transactions to determine the root cause.CDCLatencySource spikes. Track specific identifiers such as Oracle SCN (such as 12345678) or SQL Server LSN (such as 0000000A:00000B00:0001).Without identifying these long-running transactions, teams often waste time tuning DMS task settings (like
MemoryLimitTotalor parallel threads) or scaling up replication instances unnecessarily, when the actual solution requires addressing the source database’s transaction management practices. The monitoring scripts thus serve as a diagnostic bridge between observing DMS performance symptoms and pinpointing their database-level causes.
To proactively manage this issue, implementing alerts for long-running transactions is essential. Here are the monitoring queries for each database engine:
SELECT
CONCAT('PID: ', p.id,
' | User: ', p.user, '@', p.host,
' | DB: ', IFNULL(p.db, 'None'),
' | Duration: ', p.time, 's',
' | State: ', IFNULL(p.state, 'N/A'),
' | Locked Rows: ', IFNULL(t.trx_rows_locked, 0),
' | Modified Rows: ', IFNULL(t.trx_rows_modified, 0),
' | Query: PID_', p.id)
FROM information_schema.processlist p
LEFT JOIN information_schema.innodb_trx t ON p.id = t.trx_mysql_thread_id
WHERE p.time > 900
AND p.command != 'Sleep'
AND p.command != 'Daemon'
AND p.user NOT IN ('event_scheduler', 'rdsadmin', 'system user')
ORDER BY p.time DESC;
We’ve developed monitoring scripts for Oracle, PostgreSQL, MySQL, and SQL Server. These scripts alert when long-running transactions stall DMS replication. They detect and alert on long-running transactions (default: 15 minutes). Each script is designed for its specific database engine while following a consistent monitoring framework.
The full scripts are available on GitHub: sample-dms-long-running-transaction-detection.
sqlplus, psql, mysql, or sqlcmd)All four scripts share a consistent design pattern with the following features:
| Feature | Description |
| Interactive Prompts | All connection parameters are collected through interactive prompts at runtime. Credentials are securely retrieved from AWS Secrets Manager |
| TLS/SSL Encryption | All database connections are encrypted by default: TCPS for Oracle, -N flag for SQL Server, sslmode=require for PostgreSQL, --ssl-mode=REQUIRED for MySQL |
| AMAZON SNS Alerts | Alerts are published to a configurable AMAZON SNS topic, enabling distribution to multiple subscribers (email, SMS, AWS Lambda, and more) |
| Prerequisite Checks | Each script validates that required tools (AWS CLI, database client) are installed before proceeding |
| False Positive Filtering | Internal/system processes and client warnings are filtered out to prevent false alerts |
Each script implements these core functions:
collect_inputs(): Interactive prompt-based parameter collection.validate_inputs(): Input validation and prerequisite checks.check_long_sessions() / check_long_transactions(): Core monitoring query execution.send_alert(): SNS alert publishing with formatted message.log_message(): Timestamped logging to file.build_*_connstr() / build_*_args(): TLS/SSL connection string construction.main(): Orchestration with configuration summary and log directory setup.This bash script automatically detects Oracle database sessions running longer than the configured threshold and publishes alerts to Amazon SNS. It connects using the TCPS protocol for TLS/SSL encryption, with optional Oracle Wallet support for certificate-based authentication.
Usage:
Interactive prompts:
TLS/SSL configuration: The script uses TCPS protocol by default (port 2484). For Oracle Wallet-based SSL, set the following environment variables before running:
Key filters: Excludes BACKGROUND processes and the RDSADMIN user. Only monitors ACTIVE sessions.
This bash script automatically detects PostgreSQL transactions exceeding the configured threshold and publishes alerts to Amazon SNS. It enforces TLS/SSL via the sslmode connection parameter.
Usage:
Interactive prompts:
TLS/SSL configuration: Default sslmode=require. For stricter verification, set environment variables:
False positive prevention: The script separates stderr from stdout to prevent psql client warnings (such as libpq.so version messages) from being treated as transaction data.
This bash script automatically detects MySQL transactions running longer than the configured threshold, providing detailed reports and publishing alerts to Amazon SNS. It enforces TLS/SSL via the --ssl-mode parameter.
Usage:
Interactive prompts:
TLS/SSL configuration: Default --ssl-mode=REQUIRED. For stricter verification with the RDS CA bundle:
False positive prevention: The script filters out MySQL internal processes (event_scheduler, rdsadmin, system user, Daemon commands) and strips the mysql: [Warning] Using a password on the command line interface can be insecure warning from output.
This bash script automatically detects SQL Server transactions exceeding the configured threshold and publishes alerts to Amazon SNS. It enforces TLS/SSL via the -N flag with sqlcmd.
Usage:
Interactive prompts:
TLS/SSL configuration: The script uses -N (encrypt) and -C (trust server certificate) by default. For strict certificate verification, install the RDS CA bundle into the system trust store and set:
False positive prevention: Filters to is_user_process = 1 only and strips (N rows affected) noise from sqlcmd output. SQL text is truncated to 200 characters.
When a long-running transaction is detected, the script publishes a formatted alert to the configured SNS topic. Here is an example alert from the Oracle monitor:
All subscribers to the SNS topic (email, SMS, Lambda, and more) receive this alert, so the operations team can respond quickly.
For continuous monitoring, configure the scripts as cron jobs. Since the scripts use interactive prompts, you can create a wrapper script that pipes the inputs, or modify the scripts to retrieve credentials from AWS Secrets Manager. Alternatively, for crontab usage, you can set the variables directly in a non-interactive wrapper:
Note: The scripts use bash-specific syntax (such as, [[ =~ ]] for regex matching). Always invoke them with bash script_name.sh rather than sh script_name.sh.
We are going to perform a demo of the script in action by choosing Oracle as our source endpoint and check if the script is indeed able to capture the long-running sessions. For the demo we have used Amazon Relational Database Service (Amazon RDS) for Oracle 19c Enterprise Edition as source:
EVENTS_LOB table. Table structure:
CDCLatencySource as the long-running transaction holds up replication.The monitoring scripts incorporate several security best practices:
ps output, shell history, or process listings.read -s), preventing exposure in shell history or process listings. For automated scheduling, the scripts support --secret-id to retrieve credentials directly from AWS Secrets Manager, eliminating plaintext passwords entirely.SQL_ID/query_id references, not actual SQL text. Use these identifiers to look up queries in database monitoring views if needed.Long-running transactions can increase your CDC source latency from seconds to hours and cause replication delays that grow until the transaction commits. The monitoring scripts provided for Oracle, PostgreSQL, MySQL, and SQL Server offer a proactive solution to identify these problematic transactions before they become critical issues. As demonstrated in our Oracle example, these tools effectively detect and alert administrators about long-running sessions, enabling timely intervention and optimization.
The scripts enforce TLS/SSL encryption on all database connections, use Amazon SNS for scalable alert distribution, and include false-positive filtering to make sure alerts are actionable. By implementing this monitoring suite, you transform from reactive troubleshooting to proactive management. This post helps you identify potential latency issues before they impact database operations or AWS DMS replication, ensuring smooth, efficient database operations across all your platforms.
Planet for the MySQL Community
https://media.notthebee.com/articles/6a43fd453533d6a43fd453533e.jpg
A prankster at East Brook Middle School in Paramus, New Jersey, just pulled off what might be the most insane yearbook prank you’ll ever see.
Not the Bee
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmktJunYqMacdP812wQ-9ETWDfxDZhk0VxnTrc39wD8RbK4uxjd_K-6KpYbz1jaY_zOkkBLjvLCX-h9PWV2QFYwqvd94jwy-PMrwck9JeovAbkhVRrtoZCRereTigTuxcNAsPHVTQhArHfxTiplocSl6VoE8xLur1yUJ7PEtN7tGxnjv9dsUJP7OtgIPye/w1200-h630-p-k-no-nu/Screenshot%202026-06-26%20072921.png
First, there were the MySQL Query Browser and the MySQL Administrator, collectively known as the MySQL GUI Tool Bundle. But they were given End-of-Life status in 2009. Next was MySQL Workbench, but that tool is now EOL as well.
![]() |
| MySQL Workbench provides a warning if you try to use it with later versions of MySQL. |
You could still run Workbench, but it will grumble if you try to use it with any version other than 5.6, 5.7, or 8.0. Heck, it warns you that 8.4 may be a bit sketchy. On, and you need an older version for MySQL 5.5 and earlier.
Many people swore by Workbench as the tool for checking on system status, wiring queries, and making backups. Others swore at it.
Yes, it had quirks. But Workbench did a lot of this right.
There is a free, open-source database tool you should consider as a replacement. DBeaver Community Edition is also a database tool, but it supports many databases, not just MySQL. It uses JDBC connectors. DBeaver is an IDE that offers smarter auto-completion, advanced sorting, cell inline-editing, and a "Copy As" configuration feature. Additionally, the community version allows developers to integrate OpenAI or Copilot to translate regular text descriptions into SQL queries. Built with superior data-handling agility. You can directly import data from CSV, XML, or Excel XLSX files. It also features robust utilities to migrate data smoothly between two different database formats with minimal manual intervention.
![]() |
| Some of the available database connectors for the DBeaver Community Edition. |
Gone are the days when you had one database. In this cloudy and Kubed world, you may be lucky if you have only a dozen databases to work with. Finding a universal interface, like DBeaver, keeps you from having to try to remember that the psql equivalent of \G is in the MySQL Shell.
A big bonus is the ability to open a CSV file, transform the data, and then load it into a table in a database. Or compare that data to decide what subset of that data to move into that table.
Establishing a connection from DBeaver is going to be familiar for Workbench fans.
![]() |
| You will need the hostname, database name, username and password for DBeaver CE. |
Your connection details are going to be the same, bout double check that you did not ‘fat finger’ the information
On the right, you will see my configuration. On the bottom left is a MySQL ‘Sakila’ icon and the World database. Ignore the rest of the database connections, at least for now.
The World database has been around for decades. It is the standard for MySQL documentation, examples, demos, and tutorials. And it is a great place to start for the rest of this blog.
Expanding the Greater Than character to the left of that icon and name, DBeaver will reveal some details about the world database.
![]() |
| World Database at a high-level view |
We see Databases, Users, Administer, and System Info.
As we expand, we see even more details about the server. We see the World database, the various users, a session manager, and general information about the instance.
Right about now, you probably see a lot of overlap between the two products. Both are data tools, with DBeaver having a wider scope than the MySQL-only Workbench.
Let’s look at the tables in that world database.
![]() |
| The world database has three tables. |
We can drill down to see the tables. Then we can look at the columns in a specific table.
![]() |
| The city table has five columns |
Double-click on the city table icon, and the data will appear.
![]() |
| The Data |
Or you can view the entity-relationship map.
![]() |
| ERM |
And you may want to see the table structure.
![]() |
| The city table |
You can edit data or structure with these views.
You need to be able to make queries.
SQL Editor -> Open SQL Console
![]() |
| The Query |
You get command completion, highlight coloring, and all the stuff you expect. Click the top-left orange triangle to see the query results.
![]() |
| The results |
What if you
This has been a quick intro to DBeaver Community Edition for those who are used to MySQL Workbench. I always found Workbench useful. Hopefully, you will be pleasantly surprised with DBeaver.
Planet for the MySQL Community
https://blog.holoviz.org/posts/panel_live_server/images/panel-live-server.pngPlanet Python
https://minervadb.com/wp-content/uploads/2026/06/shutterstock_2360159259-1024×520.jpg
MySQL query optimization is one of the most critical skills a database administrator or developer can possess. Whether you are managing a high-traffic e-commerce platform, a data warehouse with billions of rows, or a transactional OLTP system, poorly optimized queries are the leading cause of performance degradation, increased I/O, excessive CPU usage, and frustrated end users. At the heart of MySQL’s query optimization toolkit lies the EXPLAIN statement — a powerful diagnostic command that reveals how the MySQL query optimizer intends to execute a given SQL statement.
In this comprehensive guide, we will explore MySQL query optimization from the ground up: understanding the query execution lifecycle, dissecting every column of the EXPLAIN and EXPLAIN ANALYZE output, identifying common anti-patterns, and applying proven optimization strategies that MySQL DBAs and developers rely on in production environments every day. By the end of this article, you will be equipped with the knowledge to analyze execution plans, eliminate slow queries, and design indexes that drive maximum throughput.
Before diving into EXPLAIN, it is essential to understand what the MySQL query optimizer does. The optimizer is a cost-based component within the MySQL server that evaluates multiple possible execution plans for a given query and selects the one with the lowest estimated cost. This cost is calculated based on statistics about tables and indexes stored in the Information Schema and the InnoDB storage engine‘s internal data dictionary.
The optimizer considers factors such as row estimates, index selectivity, join order, and available access methods before producing an execution plan. However, the optimizer is not perfect — it relies on statistics that may be stale or inaccurate, which is why understanding EXPLAIN and knowing how to guide the optimizer with hints is an indispensable skill for any serious MySQL DBA or developer.
MySQL provides several variants of the EXPLAIN statement, each offering different levels of detail about query execution. Understanding when to use each variant is key to efficient query diagnostics.
-- Basic EXPLAIN EXPLAIN SELECT * FROM orders WHERE customer_id = 1001; -- EXPLAIN with FORMAT=JSON for richer, structured output EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE customer_id = 1001; -- EXPLAIN ANALYZE (MySQL 8.0.18+) - executes query and returns real metrics EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 1001; -- EXPLAIN for DML statements EXPLAIN UPDATE orders SET status = 'shipped' WHERE order_date < '2024-01-01'; EXPLAIN DELETE FROM audit_log WHERE created_at < NOW() - INTERVAL 90 DAY; EXPLAIN INSERT INTO archive_orders SELECT * FROM orders WHERE status = 'closed';
Throughout this guide, we use a realistic e-commerce schema to demonstrate every optimization technique hands-on.
CREATE TABLE customers (
customer_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL,
country_code CHAR(2) NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
status TINYINT(1) NOT NULL DEFAULT 1,
UNIQUE KEY uk_email (email),
KEY idx_country_status (country_code, status),
KEY idx_created_at (created_at)
) ENGINE=InnoDB;
CREATE TABLE orders (
order_id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
customer_id INT UNSIGNED NOT NULL,
order_date DATE NOT NULL,
total_amount DECIMAL(12,2) NOT NULL,
status ENUM('pending','processing','shipped','delivered','cancelled') NOT NULL,
KEY idx_customer_id (customer_id),
KEY idx_order_date_status (order_date, status),
CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
) ENGINE=InnoDB;
CREATE TABLE order_items (
item_id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
order_id BIGINT UNSIGNED NOT NULL,
product_id INT UNSIGNED NOT NULL,
quantity SMALLINT UNSIGNED NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
KEY idx_order_id (order_id),
KEY idx_product_id (product_id),
CONSTRAINT fk_items_order FOREIGN KEY (order_id) REFERENCES orders(order_id)
) ENGINE=InnoDB;
CREATE TABLE products (
product_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
sku VARCHAR(50) NOT NULL,
category_id INT UNSIGNED NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock_qty INT NOT NULL DEFAULT 0,
UNIQUE KEY uk_sku (sku),
KEY idx_category_id (category_id)
) ENGINE=InnoDB;
The id column represents the sequential identifier of each SELECT within the query. Simple queries have a single id of 1. Subqueries and unions produce multiple rows with different id values. Rows with the same id execute as a join; rows with higher id values represent inner subqueries executed before the outer query.
The select_type column describes the type of SELECT involved. Key values include: SIMPLE (no subqueries or unions), PRIMARY (the outermost SELECT), SUBQUERY (a subquery in SELECT or WHERE), DERIVED (a subquery in the FROM clause), UNION (subsequent SELECT in a UNION), and DEPENDENT SUBQUERY (a correlated subquery — a critical performance red flag indicating the subquery re-evaluates for each outer row).
-- SIMPLE: No subqueries or unions
EXPLAIN SELECT customer_id, email FROM customers WHERE country_code = 'US';
-- PRIMARY + SUBQUERY: Subquery in WHERE clause
EXPLAIN
SELECT order_id, total_amount FROM orders
WHERE customer_id IN (
SELECT customer_id FROM customers WHERE country_code = 'DE'
);
-- PRIMARY + DERIVED: Subquery in FROM clause (derived table)
EXPLAIN
SELECT d.country_code, COUNT(*) AS order_count
FROM (
SELECT c.country_code, o.order_id
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
WHERE o.status = 'delivered'
) d
GROUP BY d.country_code;
-- UNION: Multiple SELECT statements combined
EXPLAIN
SELECT customer_id, 'active' AS label FROM customers WHERE status = 1
UNION ALL
SELECT customer_id, 'inactive' AS label FROM customers WHERE status = 0;
The type column — also called the join type or access type — is the most important field in the entire EXPLAIN output. It tells you how MySQL accesses rows in a table. From best to worst performance:
-- const: Primary key lookup EXPLAIN SELECT * FROM customers WHERE customer_id = 42; -- type: const, rows: 1 -- eq_ref: Unique index join (best for joins) EXPLAIN SELECT c.email, o.order_id, o.total_amount FROM orders o JOIN customers c ON c.customer_id = o.customer_id WHERE o.order_date = '2024-06-01'; -- type for customers: eq_ref (primary key join) -- ref: Non-unique index lookup EXPLAIN SELECT order_id, order_date, status FROM orders WHERE customer_id = 1001; -- type: ref -- range: Index range scan EXPLAIN SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'; -- type: range -- ALL: Full table scan (must be fixed for large tables!) EXPLAIN SELECT * FROM orders WHERE total_amount > 5000; -- type: ALL if no index on total_amount -- Solution: CREATE INDEX idx_total_amount ON orders(total_amount);
The possible_keys column lists all indexes MySQL considered; key shows the index actually chosen. When key is NULL despite available indexes in possible_keys, MySQL chose a full table scan — often because statistics suggest too many rows match. Run ANALYZE TABLE to refresh statistics.
The key_len column shows how many bytes of the chosen index are used. For composite indexes, this reveals how many columns are utilized. The rows column is MySQL’s estimated row examination count — minimize this product across joined tables for optimal performance. The filtered percentage shows what fraction of rows examined actually pass the WHERE clause.
The Extra column contains the most actionable diagnostic signals: Using index (covering index — ideal), Using temporary (temp table — investigate), Using filesort (sort without index — add covering index), Using index condition (Index Condition Pushdown active — good), and Using MRR (Multi-Range Read active — good for range scans).
-- Using index: Covering index (zero table row access)
ALTER TABLE orders ADD INDEX idx_cust_covering
(customer_id, order_id, order_date, total_amount, status);
EXPLAIN
SELECT order_id, order_date, total_amount, status
FROM orders WHERE customer_id = 1001;
-- Extra: Using index
-- Using temporary + Using filesort: Performance red flag
EXPLAIN
SELECT country_code, COUNT(*) AS cnt
FROM customers GROUP BY country_code ORDER BY cnt DESC;
-- Fix: add index on (country_code) to avoid temp table
-- Using filesort on non-indexed ORDER BY
EXPLAIN SELECT order_id, total_amount FROM orders
ORDER BY total_amount DESC LIMIT 20;
-- Fix: CREATE INDEX idx_total_amount ON orders(total_amount);
-- Using index condition: ICP optimization
EXPLAIN SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'
AND status = 'shipped';
-- Extra: Using index condition
EXPLAIN ANALYZE, introduced in MySQL 8.0.18, executes the query and returns both estimated and actual metrics for each node in the execution plan tree. This is critical for identifying cardinality estimation errors — cases where the optimizer’s row estimates diverge wildly from reality, leading to suboptimal plan selection.
EXPLAIN ANALYZE
SELECT
c.country_code,
COUNT(DISTINCT o.order_id) AS total_orders,
SUM(oi.unit_price * oi.quantity) AS total_revenue
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
JOIN order_items oi ON oi.order_id = o.order_id
WHERE c.status = 1
AND o.order_date >= '2024-01-01'
AND o.status = 'delivered'
GROUP BY c.country_code
ORDER BY total_revenue DESC;
-> Sort: total_revenue DESC (actual time=142.5..142.7 rows=48 loops=1)
-> Aggregate using temporary table (actual time=142.2..142.2 rows=48 loops=1)
-> Nested loop inner join (cost=18540.23 rows=9820)
(actual time=0.8..138.6 rows=87342 loops=1)
-> Nested loop inner join (cost=5421.12 rows=3240)
(actual time=0.5..22.4 rows=28918 loops=1)
-> Filter: (c.status = 1) (cost=1240.80 rows=8400)
(actual time=0.3..8.7 rows=71230 loops=1)
-> Index scan on c using idx_country_status
(cost=1240.80 rows=84000)
(actual time=0.2..6.9 rows=84000 loops=1)
-> Filter: (o.order_date >= '2024-01-01') and (o.status='delivered')
(cost=0.25 rows=1) (actual time=0.00019..0.00019 rows=0 loops=71230)
-> Index lookup on o using idx_customer_id
(customer_id=c.customer_id) (cost=0.25 rows=1)
(actual time=0.00017..0.00017 rows=1 loops=71230)
-> Index lookup on oi using idx_order_id (order_id=o.order_id)
(cost=1.12 rows=3) (actual time=0.003..0.004 rows=3 loops=28918)
Key analysis points: compare the estimated rows against actual rows. When these diverge by orders of magnitude, consider running ANALYZE TABLE or increasing innodb_stats_persistent_sample_pages. The actual time=start..end values are in milliseconds. The loops value shows how many times each node executed — high loop counts on expensive inner operations are the primary target for optimization.
Wrapping an indexed column inside a function prevents MySQL from using the index, forcing a full table scan. This is one of the most common and damaging anti-patterns found in production SQL workloads — and the fix is almost always straightforward.
-- BAD: Function prevents index usage
EXPLAIN SELECT * FROM orders
WHERE YEAR(order_date) = 2024 AND MONTH(order_date) = 6;
-- type: ALL (full table scan on potentially millions of rows)
-- GOOD: Rewrite as range condition (uses index)
EXPLAIN SELECT * FROM orders
WHERE order_date >= '2024-06-01' AND order_date < '2024-07-01';
-- type: range, Extra: Using index condition
-- BAD: LIKE with leading wildcard (no index possible)
EXPLAIN SELECT * FROM products WHERE sku LIKE '%ABC%';
-- Consider FULLTEXT index for arbitrary substring searches
ALTER TABLE products ADD FULLTEXT INDEX ft_sku (sku);
SELECT * FROM products WHERE MATCH(sku) AGAINST('ABC' IN BOOLEAN MODE);
-- GOOD: LIKE with trailing wildcard (uses index prefix scan)
EXPLAIN SELECT * FROM products WHERE sku LIKE 'ABC%';
-- type: range
-- BAD: Function on indexed column breaks index usage
EXPLAIN SELECT * FROM customers WHERE LOWER(email) = 'user@example.com';
-- GOOD: Functional index (MySQL 8.0+) preserves index access
ALTER TABLE customers ADD INDEX idx_email_lower ((LOWER(email)));
EXPLAIN SELECT * FROM customers WHERE LOWER(email) = 'user@example.com';
-- type: ref, key: idx_email_lower
The N+1 problem occurs when an application executes one query to retrieve N records and then fires an additional query for each record — N+1 total round trips. This is catastrophic at scale and entirely preventable with proper JOIN usage or batch fetching.
-- BAD: N+1 pattern (500 pending orders = 501 queries!)
-- Query 1: SELECT order_id FROM orders WHERE status = 'pending';
-- Then for each order_id:
-- Queries 2..501: SELECT * FROM order_items WHERE order_id = ?;
-- GOOD: Single JOIN eliminates N+1 completely
EXPLAIN
SELECT
o.order_id, o.order_date, o.total_amount,
oi.item_id, oi.product_id, oi.quantity, oi.unit_price
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
WHERE o.status = 'pending'
ORDER BY o.order_id, oi.item_id;
-- type for orders: ref (idx_status)
-- type for order_items: ref (idx_order_id)
-- One query, complete result set
Using SELECT * prevents covering index usage, transfers unnecessary data across the network, and makes execution plans less predictable as schemas evolve. Always project only the columns your application actually needs.
-- BAD: SELECT * forces table row access even when index could cover query
EXPLAIN SELECT * FROM orders WHERE customer_id = 1001;
-- GOOD: Project only needed columns enables covering index
ALTER TABLE orders ADD INDEX idx_cust_cover
(customer_id, order_id, order_date, total_amount, status);
EXPLAIN
SELECT order_id, order_date, total_amount, status
FROM orders WHERE customer_id = 1001;
-- type: ref, Extra: Using index (all data from index - zero table access)
Composite indexes follow the left-prefix rule: MySQL can only use an index starting from the leftmost column. A composite index on (A, B, C) supports queries on A, A+B, or A+B+C — but not B or C alone. Design composite indexes with equality columns first, range condition columns second, and ORDER BY / GROUP BY columns last to eliminate filesort operations.
-- Query: WHERE status = 'shipped' AND order_date BETWEEN x AND y ORDER BY order_date
-- Optimal: equality first, range second, ORDER BY aligned with range column
ALTER TABLE orders ADD INDEX idx_status_date_opt (status, order_date);
EXPLAIN
SELECT order_id, customer_id, total_amount
FROM orders
WHERE status = 'shipped'
AND order_date BETWEEN '2024-01-01' AND '2024-06-30'
ORDER BY order_date;
-- type: range, key: idx_status_date_opt
-- Extra: Using index condition (NO filesort! ORDER BY uses index)
-- Verify index columns being used via key_len
-- status ENUM NOT NULL = 1 byte
-- order_date DATE NOT NULL = 3 bytes
-- key_len = 4 means BOTH columns are utilized
-- Covering composite index for aggregate queries
ALTER TABLE orders ADD INDEX idx_grp_covering
(status, order_date, customer_id, total_amount);
EXPLAIN
SELECT status, order_date, COUNT(*) AS cnt, SUM(total_amount) AS revenue
FROM orders
WHERE status IN ('shipped', 'delivered')
AND order_date >= '2024-01-01'
GROUP BY status, order_date;
-- Extra: Using index (full covering index - no table access whatsoever)
MySQL 8.0 introduced invisible indexes, which the optimizer ignores while InnoDB continues maintaining them. This allows DBAs to safely validate the impact of removing an index before permanently dropping it — an indispensable tool for production index lifecycle management.
-- Make an index invisible to test impact of removing it ALTER TABLE orders ALTER INDEX idx_status INVISIBLE; -- EXPLAIN now shows optimizer ignoring this index EXPLAIN SELECT * FROM orders WHERE status = 'pending'; -- possible_keys: NULL (invisible index ignored) -- Re-enable the index ALTER TABLE orders ALTER INDEX idx_status VISIBLE; -- Allow session to see invisible indexes for targeted testing SET SESSION optimizer_switch = 'use_invisible_indexes=on'; EXPLAIN SELECT * FROM orders WHERE status = 'pending'; SET SESSION optimizer_switch = 'use_invisible_indexes=off'; -- Check visibility status of all indexes SELECT index_name, is_visible FROM information_schema.STATISTICS WHERE table_schema = 'ecommerce' AND table_name = 'orders' GROUP BY index_name, is_visible;
When the MySQL optimizer makes a poor index selection — often due to outdated statistics or unusual data distributions — index hints and optimizer hints allow targeted intervention. Use them sparingly and always validate with EXPLAIN, as they bypass the optimizer’s cost model.
-- FORCE INDEX: Optimizer must use this index (ignores all others)
EXPLAIN SELECT * FROM orders FORCE INDEX (idx_order_date_status)
WHERE order_date >= '2024-01-01' AND status = 'delivered';
-- USE INDEX: Suggests an index (optimizer may still ignore)
EXPLAIN SELECT * FROM orders USE INDEX (idx_customer_id)
WHERE customer_id = 1001;
-- IGNORE INDEX: Prevents use of a specific index
EXPLAIN SELECT * FROM orders IGNORE INDEX (idx_status)
WHERE status = 'pending' AND order_date >= '2024-01-01';
-- Optimizer hints (MySQL 8.0+ preferred method)
SELECT /*+ NO_HASH_JOIN(o, c) */
o.order_id, c.email, o.total_amount
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending';
-- SET_VAR hint: Change variable scope for a single query
SELECT /*+ SET_VAR(sort_buffer_size=4194304) */
customer_id, SUM(total_amount) AS revenue
FROM orders
GROUP BY customer_id
ORDER BY revenue DESC
LIMIT 100;
Subqueries can be highly efficient or devastating for performance depending on how they are written. The most dangerous anti-pattern is the correlated subquery — a subquery with a DEPENDENT SUBQUERY select_type that re-evaluates for every row of the outer query. MySQL 8.0’s Common Table Expressions (CTEs) provide both performance parity with JOINs and dramatically improved readability for complex multi-step queries.
-- BAD: Correlated subquery (re-evaluated N times for N outer rows)
EXPLAIN
SELECT o.order_id, o.total_amount,
(SELECT SUM(oi.unit_price * oi.quantity)
FROM order_items oi
WHERE oi.order_id = o.order_id) AS calculated_total
FROM orders o
WHERE o.order_date >= '2024-01-01';
-- select_type: DEPENDENT SUBQUERY (executed once per outer row!)
-- GOOD: JOIN with aggregation (single pass over data)
EXPLAIN
SELECT o.order_id, o.total_amount, oi_agg.calculated_total
FROM orders o
JOIN (
SELECT order_id, SUM(unit_price * quantity) AS calculated_total
FROM order_items GROUP BY order_id
) oi_agg ON oi_agg.order_id = o.order_id
WHERE o.order_date >= '2024-01-01';
-- BEST: CTE for readability with equivalent performance (MySQL 8.0+)
WITH order_totals AS (
SELECT order_id, SUM(unit_price * quantity) AS calculated_total
FROM order_items GROUP BY order_id
)
SELECT o.order_id, o.total_amount, ot.calculated_total
FROM orders o
JOIN order_totals ot ON ot.order_id = o.order_id
WHERE o.order_date >= '2024-01-01';
-- Recursive CTE: Hierarchical queries (category trees, org charts)
WITH RECURSIVE category_tree AS (
SELECT category_id, parent_id, name, 0 AS depth
FROM categories WHERE parent_id IS NULL
UNION ALL
SELECT c.category_id, c.parent_id, c.name, ct.depth + 1
FROM categories c
JOIN category_tree ct ON ct.category_id = c.parent_id
)
SELECT category_id, CONCAT(REPEAT(' ', depth), name) AS indented_name
FROM category_tree ORDER BY category_id;
Naive pagination using high OFFSET values is a classic performance trap. As OFFSET grows, MySQL must scan and discard increasingly large numbers of rows before returning the requested page — a problem known as deep pagination. For large datasets, cursor-based pagination using the last seen primary key delivers constant-time performance regardless of page depth.
-- BAD: High offset forces full scan of 1,000,100 rows
EXPLAIN SELECT order_id, order_date, total_amount
FROM orders ORDER BY order_id
LIMIT 100 OFFSET 1000000;
-- rows: 1000100 (scans and discards 1,000,000 rows)
-- GOOD: Cursor-based (keyset) pagination - constant performance
-- First page:
SELECT order_id, order_date, total_amount
FROM orders WHERE order_id > 0
ORDER BY order_id LIMIT 100;
-- Next page (pass last_order_id from previous result set):
SELECT order_id, order_date, total_amount
FROM orders
WHERE order_id > :last_order_id
ORDER BY order_id LIMIT 100;
-- type: range, rows: 100 (reads exactly what is needed)
-- Alternative: Late row lookup for complex multi-column sort
SELECT o.*
FROM orders o
JOIN (
SELECT order_id FROM orders
ORDER BY total_amount DESC, order_id
LIMIT 100 OFFSET 50000
) ids ON ids.order_id = o.order_id
ORDER BY o.total_amount DESC, o.order_id;
-- Inner query works only with index pages; outer fetches only 100 full rows
The MySQL optimizer’s decisions are only as good as the statistics it uses. Stale or inaccurate statistics lead to poor plan choices — wrong join orders, missed index usage, and cardinality estimation errors. As a MySQL DBA, proactively managing statistics is a core operational responsibility, especially after bulk data loads or large DELETE operations.
-- Refresh table statistics
ANALYZE TABLE orders, customers, order_items, products;
-- View table statistics and sizes
SELECT table_name,
table_rows,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb,
update_time
FROM information_schema.TABLES
WHERE table_schema = 'ecommerce'
ORDER BY data_length DESC;
-- Check index cardinality (higher = more selective = better)
SELECT index_name, column_name, seq_in_index, cardinality, nullable
FROM information_schema.STATISTICS
WHERE table_schema = 'ecommerce' AND table_name = 'orders'
ORDER BY index_name, seq_in_index;
-- Increase sample pages for better statistics on large tables
ALTER TABLE orders STATS_SAMPLE_PAGES = 50;
ANALYZE TABLE orders;
-- InnoDB persistent statistics settings
SHOW VARIABLES LIKE 'innodb_stats%';
-- innodb_stats_persistent = ON (recommended for production)
-- innodb_stats_persistent_sample_pages = 20 (increase for accuracy)
-- Check when InnoDB table statistics were last updated
SELECT * FROM mysql.innodb_table_stats
WHERE database_name = 'ecommerce';
MySQL’s Performance Schema provides comprehensive instrumentation tables for real-time query performance monitoring. For MySQL DBAs, mastering the Performance Schema is essential for identifying the highest-impact optimization targets in production — revealing far more than the slow query log alone.
-- Top 10 slowest queries by total execution time
SELECT
DIGEST_TEXT AS query_template,
COUNT_STAR AS exec_count,
ROUND(SUM_TIMER_WAIT / 1e12, 3) AS total_time_sec,
ROUND(AVG_TIMER_WAIT / 1e12, 6) AS avg_time_sec,
ROUND(MAX_TIMER_WAIT / 1e12, 6) AS max_time_sec,
SUM_ROWS_EXAMINED AS total_rows_examined,
ROUND(SUM_ROWS_EXAMINED / COUNT_STAR, 0) AS avg_rows_examined,
SUM_NO_INDEX_USED AS full_scans
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'ecommerce'
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
-- Queries performing full table scans in production
SELECT
DIGEST_TEXT,
COUNT_STAR,
SUM_NO_INDEX_USED,
ROUND(AVG_TIMER_WAIT / 1e12, 6) AS avg_sec
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'ecommerce' AND SUM_NO_INDEX_USED > 0
ORDER BY SUM_NO_INDEX_USED DESC LIMIT 10;
-- sys schema: Simplified top-level performance view
SELECT * FROM sys.statement_analysis
WHERE db = 'ecommerce'
ORDER BY total_latency DESC LIMIT 10;
-- sys schema: All queries doing full table scans
SELECT * FROM sys.statements_with_full_table_scans
WHERE db = 'ecommerce'
ORDER BY no_index_used_count DESC;
When EXPLAIN and EXPLAIN ANALYZE do not provide sufficient insight, the Optimizer Trace delivers a complete JSON log of every decision the optimizer made — including all alternative plans considered and their cost estimates. This is the ultimate diagnostic instrument for resolving the most difficult query optimization problems.
-- Enable optimizer trace SET SESSION optimizer_trace = 'enabled=on'; SET SESSION optimizer_trace_max_mem_size = 1048576; -- Run the query to analyze SELECT order_id, customer_id, total_amount FROM orders WHERE status = 'shipped' AND order_date BETWEEN '2024-01-01' AND '2024-06-30' ORDER BY total_amount DESC LIMIT 50; -- Retrieve the trace (JSON format) SELECT QUERY, TRACE FROM information_schema.OPTIMIZER_TRACE\G -- Key JSON sections to examine: -- "considered_execution_plans": All plans evaluated -- "best_access_path": Index chosen and why -- "rows_estimation": Cardinality estimates per table -- "cost_info": read_cost, eval_cost, prefix_cost per plan -- Disable optimizer trace SET SESSION optimizer_trace = 'enabled=off';
Beyond index design, several MySQL server variables directly influence query execution performance. Understanding and tuning these variables is a critical complement to query-level optimization in production environments.
-- Sort buffer: used when ORDER BY/GROUP BY cannot use an index SHOW VARIABLES LIKE 'sort_buffer_size'; -- Default: 256KB SET SESSION sort_buffer_size = 4 * 1024 * 1024; -- 4MB for heavy sorts -- Join buffer: used for Block Nested Loop joins (non-indexed joins) SHOW VARIABLES LIKE 'join_buffer_size'; -- Default: 256KB SET SESSION join_buffer_size = 2 * 1024 * 1024; -- 2MB for large joins -- Temporary table memory thresholds (exceeding causes disk spill) SHOW VARIABLES LIKE 'tmp_table_size'; -- Default: 16MB SHOW VARIABLES LIKE 'max_heap_table_size'; -- Default: 16MB -- Set both equal to prevent disk-based temp tables -- InnoDB buffer pool: the single most impactful performance variable SHOW VARIABLES LIKE 'innodb_buffer_pool_size'; -- Target: 70-80% of total RAM -- Enable slow query log for continuous production monitoring SET GLOBAL slow_query_log = ON; SET GLOBAL long_query_time = 1; -- Capture queries > 1 second SET GLOBAL log_queries_not_using_indexes = ON; -- Capture queries without indexes SHOW VARIABLES LIKE 'slow_query_log_file'; -- Check log file location -- Read buffer: sequential scan performance SHOW VARIABLES LIKE 'read_buffer_size'; -- Default: 128KB SHOW VARIABLES LIKE 'read_rnd_buffer_size'; -- Default: 256KB
The following checklist provides a systematic approach to diagnosing and resolving slow queries in MySQL production environments. Apply these steps in order for every optimization engagement.
performance_schema.events_statements_summary_by_digest, or sys.statement_analysis to identify the highest-impact queries by total execution time and examination count.type (eliminate ALL and index scans), rows (minimize the cross-join product), and Extra (eliminate Using filesort and Using temporary where feasible).ANALYZE TABLE after bulk data changes to ensure the optimizer works with accurate cardinality estimates.
MySQL query optimization is both a science and an art. The science lies in understanding how the cost-based optimizer works, how indexes are structured and accessed internally by InnoDB, and how to interpret every field of the EXPLAIN and EXPLAIN ANALYZE output with precision. The art lies in applying this knowledge pragmatically — knowing when to add a composite index, when to rewrite a correlated subquery as a JOIN, when to refresh statistics, and when to override the optimizer with targeted hints.
Mastering the techniques in this guide — from dissecting EXPLAIN columns and eliminating full table scans, to designing optimal composite and covering indexes, avoiding deep pagination traps, leveraging invisible indexes for safe lifecycle management, and using the Performance Schema for continuous monitoring — equips you to build MySQL-backed systems that scale confidently to hundreds of millions of rows and thousands of concurrent connections.
The return on investment in MySQL query optimization skills is exceptional: reduced infrastructure costs, dramatically improved user experience, fewer on-call incidents, and a more resilient, predictable database tier. Every millisecond shaved from a high-frequency query executed millions of times daily translates directly into meaningful savings and competitive advantage. Start every optimization engagement with EXPLAIN, follow the evidence rigorously, and let the data guide every decision you make.
Planet for the MySQL Community
Ford executives said they’ve hired 350 veteran engineers — some of them former employees — after AI and automated systems failed to deliver the desired quality, reports TechCrunch:
Bloomberg reports the company’s chief operating officer Kumar Galhotra told journalists that Ford had been "relying more and more on automated quality systems" with disappointing results. So the company "brought back technical specialists," and those specialists "hunt for failure points before a part ever reaches the plant floor." Charles Poon, Ford’s vice president of vehicle hardware engineering, added, "Mistakenly we thought that by just introducing artificial intelligence and ingesting the design requirements that we had, that that would produce a high-quality product."
The article points out that Ford is using the rehired gray beard engineers to train younger staff — and, to reprogram its AI tools.
Read more of this story at Slashdot.
Slashdot
Today we released RonDB 26.04.1, a beta release. It contains a
lot of new features, but the most interesting one is that RonSQL now
supports pushdown join aggregation and CTEs, so that complex queries run
with low, predictable latency.
RonDB has always been able to answer complex queries through a MySQL Server.
The problem with that path is predictability. The application asks for an
answer, but it has no guarantee about how fast that answer arrives: the MySQL
optimizer picks a plan that may or may not parallelise the query across the RonDB
data nodes, and a plan that looks fine on a small table can fall off a cliff as
the data grows.
RonSQL takes a different contract. The rule is simple:
Anything RonSQL accepts can be pushed down to the RonDB data nodes for
parallel execution.
If a query parses and plans in RonSQL, it runs as a parallel pushdown —
there is no fallback to a slow, single-threaded plan. That means the latency of a
complex query is something the application can actually reason about up front,
instead of discovering it in production.
RonSQL grew out of the needs of AI applications built on Feature
Stores, and in particular on-demand (real-time)
transformations in Hopsworks.
Traditionally an online Feature Store only does primary-key lookups. To keep
those lookups fast, every feature has to be pre-computed and
written back before serving. That works, but it has two costs:
RonSQL attacks both problems:
CTEs (Common Table Expressions, the SQL WITH clause) are what let
you combine these two ideas in a single, readable query: aggregate the fresh fact
rows in a CTE, then join the result against your normalised dimension tables.
Consider a fraud-scoring model. At inference time it needs a feature vector for
one card, computed over that card’s most recent activity. The raw transactions
arrive continuously and are inserted straight into RonDB:
-- Fact table: one row per card transaction, inserted in real time.
CREATE TABLE txn (
txn_id BIGINT NOT NULL,
cc_num BIGINT NOT NULL, -- card / account identifier
merchantkey INT NOT NULL, -- references merchant.m_merchantkey
amount INT NOT NULL, -- minor units (cents)
txn_time DATETIME(6) NOT NULL,
is_declined TINYINT NOT NULL,
PRIMARY KEY USING HASH (txn_id),
-- Ordered index: range-scan one card's recent activity cheaply.
INDEX idx_card_time (cc_num, txn_time)
) ENGINE=NDB
COMMENT='NDB_TABLE=TTL=604800@txn_time'; -- auto-expire rows after 7 days
-- Small dimension table: replaces a per-card Avro BLOB of merchant attributes.
CREATE TABLE merchant (
m_merchantkey INT NOT NULL,
m_category VARCHAR(16) NOT NULL,
m_risk_score INT NOT NULL,
PRIMARY KEY USING HASH (m_merchantkey)
) ENGINE=NDB;
The simplest on-demand feature is a scalar aggregate over the card’s last hour
of transactions. No pre-computation, no BLOB — just an index range scan that
includes whatever was inserted milliseconds ago:
SELECT
COUNT(*) AS txns_1h,
SUM(amount) AS amount_1h,
MAX(amount) AS max_amount_1h,
AVG(amount) AS avg_amount_1h,
SUM(CASE WHEN is_declined = 1 THEN 1 ELSE 0 END) AS declines_1h
FROM txn
WHERE cc_num = 4716253018273645
AND txn_time >= DATE_SUB('2026-06-29 14:30:00', INTERVAL 1 HOUR);
RonSQL turns the WHERE into an ordered-index range
scan on idx_card_time — it touches only this card’s
last hour — and pushes the COUNT/SUM/MAX/AVG
and the CASE expression down to the data nodes, which aggregate in
parallel and return a single row.
Now suppose the model wants spend broken down by merchant category.
The category does not live on the transaction — it lives on the
merchant dimension. The classic Feature Store approach would
denormalise the category into a packed BLOB per card. With RonSQL we keep the
data normalised and join at query time:
WITH spend_by_merchant AS (
SELECT merchantkey AS m,
SUM(amount) AS spend,
COUNT(*) AS txns
FROM txn
WHERE cc_num = 4716253018273645
AND txn_time >= DATE_SUB('2026-06-29 14:30:00', INTERVAL 1 HOUR)
GROUP BY merchantkey
)
SELECT m.m_category AS category,
SUM(spend_by_merchant.spend) AS spend_last_hour,
SUM(spend_by_merchant.txns) AS txns_last_hour
FROM merchant AS m
JOIN spend_by_merchant ON spend_by_merchant.m = m.m_merchantkey
GROUP BY m.m_category;
This query is easy to reason about, top to bottom:
spend_by_merchant runs anidx_card_time, restricted to one cardSUM(amount) and COUNT(*) grouped bymerchantkey, returning just a handful of rows (one per merchantm_merchantkey is the primary key of merchant, so eachmerchant is a small dimension table.m_category, producing one row per merchant category — aEvery stage is a pushdown, and stages such as the index scan and the lookups
run in parallel across the data nodes. We could even execute several CTEs in
parallel. Because RonSQL guarantees the whole thing pushes down, the latency is
bounded and predictable — which is exactly the contract a real-time
inference path needs.
RonSQL is reachable two ways:
rondb-cli shell sends a line straight to it withRONSQL prefix.ronsql_cli. A standalone client for scripting--execute,--execute-file, or stdin and can emit results as JSONTEXT.Both paths support EXPLAIN. Prefixing a query with
EXPLAIN shows the chosen pushdown plan — which index drives
each scan, which joins become lookups, and where the aggregation happens —
so “will this be fast?” is a question you answer before you
ship, not after.
RonSQL is a read-only, aggregation-focused SQL subset designed so that
everything it accepts can be pushed down:
SELECT only (plusEXPLAIN). No DDL/DML.WITH clausesINNER JOIN,LEFT [OUTER] JOIN, self-joins, and comma cross-joins over scalara.x = b.x AND a.y = b.y).WHERE —= <> < <= > >=, LIKE,IN (list), IS [NOT] NULL,AND/OR/XOR/NOT, arithmetic, bitwise ops, andCASE WHEN.EXISTS,IN (subquery), and scalar subqueries.COUNT(*), COUNT(expr),SUM, MIN, MAX, AVG.GROUP BYHAVING,ORDER BY ASC/DESC, LIMIT.CASE WHEN,GREATEST/LEAST, and date/time functionsDATE_ADD, DATE_SUB, EXTRACT,INTERVAL.FORCE INDEX,USE INDEX, IGNORE INDEX.Because the Feature Store has to compute the same feature in two very
different settings. Batch training and batch inference run on
engines like Spark SQL and DuckDB — both
batch query engines, chosen for different characteristics (Spark scales the work
across a cluster for very large datasets; DuckDB runs embedded and is hard to
beat on a single node for moderate data). Online serving runs on
RonSQL, computing the feature fresh at inference time. When all
of them speak SQL, the same feature logic can be expressed as the same query text
on each engine, which eliminates a notorious source of
training/serving skew — features that subtly differ
between the model’s training data and what it sees live at inference.
RonSQL is already useful, but there is a clear roadmap, much of it driven
directly by Feature Store needs:
COUNT(DISTINCT ...),DISTINCT and OFFSET more generally.STDDEV andVARIANCE (for z-score features), andGROUP_CONCAT.FROM, UNION, RIGHT/FULL OUTER
JOIN, and recursive CTEs for hierarchy/graph features.The core contribution stays the same: predictable low latency for
complex queries over fresh data, expressed in portable SQL —
exactly what an online Feature Store needs to serve fresh, skew-free features to
an AI model.
Planet for the MySQL Community
https://webyog.com/wp-content/uploads/2017/11/connections-and-buffer-pool-usage-1.png
As databases power increasingly complex workloads — from AI-driven applications to cloud-native
microservices and containerized deployments — the ability to monitor MySQL performance with precision has
never been more important. Whether you’re running MySQL 8.0/8.4 LTS in a stable production environment or
experimenting with the MySQL 9.x Innovation series, the fundamentals of connection management and buffer
pool tuning remain the bedrock of a healthy database.
This post, part of our ongoing MySQL monitoring series, dives into two critical areas: connection metrics and
the InnoDB buffer pool. Mastering these will help you catch problems before they become outages.
Modern infrastructure has raised the stakes for database performance. AI inference workloads generate
high-throughput, low-latency query patterns. Cloud deployments scale horizontally but introduce new failure
modes. Containerized MySQL instances — whether on Kubernetes or ECS — spin up and down rapidly, making
consistent monitoring essential.
The good news: MySQL’s built-in instrumentation is richer than ever. MySQL 8.4 LTS and the 9.x Innovation
releases ship with improved Performance Schema coverage, enhanced replication visibility, and better
diagnostics for connection errors. Knowing which metrics to watch — and what thresholds to act on — separates
reactive firefighting from proactive operations.
Every client request to MySQL passes through the connection manager thread. MySQL maintains a pool of
threads to handle these connections, and each active connection consumes memory and CPU. In
high-concurrency workloads — think e-commerce flash sales, real-time analytics pipelines, or AI feature stores
— connection pressure is one of the first things that breaks.
The default max_connections value is 151, which is appropriate for development but far too low for
production. Most production environments should set this to hundreds or even thousands, depending on
available RAM and workload patterns.
| Metric | What It Tells You |
| Threads_connected | Number of currently open connections |
| Threads_running | Connections actively executing queries (not idle) |
| Connections | Cumulative total connections since server start |
| Connection_errors_internal | Errors from internal server issues |
| Aborted_connects | Failed connection attempts |
| Aborted_clients | Failed connection attempts |
Threads_running is arguably the most important of these. A spike here — especially if it approaches
max_connections — signals that your server is under stress. If Threads_running climbs while
Threads_connected stays flat, you likely have slow queries piling up.
MySQL surfaces granular connection error counters that help you diagnose the root cause of failed connections:
The InnoDB buffer pool is MySQL’s most important memory structure. It caches table data and index pages in
RAM, reducing the need for expensive disk reads. A well-sized buffer pool can serve the majority of reads from
memory — dramatically reducing latency and I/O load.
The default buffer pool size is 128MB, which is a reasonable starting point for development. For dedicated
database servers, the best practice is to allocate approximately 80% of available RAM to the buffer pool.
The buffer pool size must align with this formula:
innodb_buffer_pool_size = N × innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances
MySQL 8.x allows online resizing of the buffer pool, meaning you can adjust it without a restart — a significant
operational improvement. In cloud environments where instance sizes change frequently, this matters.
InnoDB uses a variant of the Least Recently Used (LRU) algorithm to manage which pages stay in the buffer
pool. Rather than a simple LRU list, MySQL uses a midpoint insertion strategy: newly loaded pages enter at
the midpoint of the list, not the head. This prevents large full-table scans from flushing your hot working set out of
the pool.
Two tuning parameters control this behavior:
For OLTP workloads with repeated access to the same rows, the defaults work well. For mixed workloads
running analytical queries alongside transactional ones — increasingly common as AI pipelines run batch
feature extraction alongside live serving — you may need to tune these values to protect your hot page set.
| Metric | What It Tells You |
| Innodb_buffer_pool_read_requests | Total logical read requests (memory hits + disk reads) |
| Innodb_buffer_pool_reads | Physical reads from disk (cache misses) |
| Innodb_buffer_pool_pages_total | Total pages in the buffer pool |
| Innodb_buffer_pool_pages_free | Pages currently available (not in use) |
Cache Miss Rate = (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) × 100
A healthy production system should have a cache miss rate below 1% — meaning 99%+ of reads are served
from memory. If your miss rate climbs above this threshold, your buffer pool is undersized for your working data
set.
Watch Innodb_buffer_pool_pages_free as well. A consistently low free page count means the pool is
under memory pressure, and MySQL is spending time evicting pages rather than serving data.
Manual monitoring with SHOW GLOBAL STATUS is a starting point, but it doesn’t scale. For teams running
MySQL in production — especially across multiple instances or cloud regions — a dedicated monitoring tool is essential.
SQL Diagnostic Manager for MySQL (part of the Webyog/IDERA family) provides real-time dashboards for all
the metrics discussed here, plus automated alerting, query analysis, and root cause diagnostics. Whether you’re
managing a single server or dozens of replicas behind a load balancer, having these metrics in one place makes
the difference between proactive tuning and reactive recovery.
Connection management and buffer pool sizing are foundational to MySQL performance. In 2026’s environment
of AI workloads, cloud scaling, and containerized deployments, these metrics deserve continuous attention —
not just one-time configuration.
Stay tuned for the next post in this series, where we cover InnoDB I/O metrics and query performance
diagnostics.
In most workloads, Threads_running should stay well below max_connections. If it consistently exceeds
20–30% of your connection limit, investigate slow queries or lock contention. A sudden spike often points to a
rogue query or a batch job gone wrong.
Check your cache miss rate: (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
× 100. A value above 1% is a warning sign. Also watch Innodb_buffer_pool_pages_free — if free pages hover near zero, MySQL is under memory pressure and evicting data it needs.
Yes — MySQL 8.x supports online buffer pool resizing via SET GLOBAL innodb_buffer_pool_size. The
resize happens in chunks and may take a few seconds to minutes depending on pool size. No restart required.
Aborted clients typically indicate application-side connection leaks — connections opened but not properly
closed. Check your application’s connection pooling configuration and ensure connections are returned to the
pool after each operation.
Critical metrics like Threads_running and buffer pool efficiency should be monitored continuously with
alerting thresholds. Review trends weekly and investigate any sustained drift from your baseline.
Yes. These metrics are exposed in cloud-managed MySQL instances and most providers surface them in their
native monitoring dashboards. You can also connect SQL Diagnostic Manager for MySQL to RDS and Cloud
SQL instances for deeper analysis.
Stop flying blind on your MySQL performance. SQL Diagnostic Manager for MySQL gives you real-time
dashboards, automated alerting, and root cause analysis — covering every metric discussed in this post and
more.
Visit webyog.com to get started today.
Planet for the MySQL Community