Working With Large PostgreSQL Databases

It’s a funny thing when the topic of database sizes comes up. Calling one small, medium, large, or even huge isn’t as straightforward as you’d think. Classifying the size of a database rests on a number of factors whose characteristics can be described as either “tangible”, things that you can measure in an objective manner, or “intangible”, attributes best expressed using the catch-all phrase “it depends”. For example, a 2TB database is, to many people, a large database. On the other hand, a veteran DBA might only describe a PostgreSQL database cluster as large once it enters the realm of petabytes.

Here’s a recap of some of PostgreSQL’s basic capabilities:

  • database size: unlimited
  • number of databases: 4,294,950,911
  • relations per database: 1,431,650,303
  • table size: 32TB
  • rows per table: limited by the number of tuples that can fit onto 4,294,967,295 pages
  • fields per table: 1,600
  • field size: 1GB
  • identifier length: 63 bytes
  • indexes per table: unlimited
  • columns per index: 32
  • partition keys: 32

NB: Despite the physical constraints one faces when creating large numbers of schemas, there is no theoretical limit to the number that can be created in PostgreSQL.
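
For a sense of where your own installation sits on that spectrum, a quick size check against the catalog is enough; pg_size_pretty() and pg_database_size() are standard built-in functions:

    -- List every database in the cluster with its on-disk size, largest first.
    SELECT datname,
           pg_size_pretty(pg_database_size(datname)) AS size
    FROM   pg_database
    ORDER  BY pg_database_size(datname) DESC;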

I’ve come to differentiate a small database from a large one using the following observations. And while it is true that some of the points made about a large database can be applied to a small one, and vice versa, most of the setups out there in the wild follow these patterns:

  1. Small databases are often administered by a single person.
  2. Small databases can be managed manually.
  3. Small databases are minimally tuned.
  4. Small databases can typically tolerate production inefficiencies more than large ones.
  5. Large databases are managed using automated tools.
  6. Large databases must be constantly monitored and go through an active tuning life cycle.
  7. Large databases require rapid response to imminent and ongoing production issues to maintain optimal performance.
  8. Large databases are particularly sensitive to technical debt.

Large databases often bring up the following questions and issues:

  • Is the system performance especially sensitive to changes in production load?
  • Is the system performance especially sensitive to minor tuning effects?
  • Are there large amounts of data churn? (See the query sketch after this list.)
  • Does the database load system saturate your hardware’s capabilities?
  • Do the maintenance activities, such as logical backups and repacking tables, take several days or even more than a week?
  • Does your Disaster Recovery Protocol require having a very small Recovery Point Objective (RPO) or Recovery Time Objective (RTO)?
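
The data churn question in particular can be answered from the statistics views. The sketch below reads the standard pg_stat_user_tables view; the LIMIT is arbitrary, and the counters only cover the period since statistics were last reset:

    -- Tables with the most write activity (inserts, updates, deletes)
    -- since the statistics were last reset.
    SELECT relname,
           n_tup_ins  AS inserts,
           n_tup_upd  AS updates,
           n_tup_del  AS deletes,
           n_dead_tup AS dead_tuples
    FROM   pg_stat_user_tables
    ORDER  BY (n_tup_ins + n_tup_upd + n_tup_del) DESC
    LIMIT  10;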

The key difference between small and large databases is how they are administered:

  1. Whereas small databases are commonly administered by hand, even though that is not best practice, automation is the industry’s default mode of operation for large databases.
  2. Because circumstances can change quickly, large databases are particularly sensitive to production issues.
  3. Tuning is constantly evolving; while it is true that newly installed architectures are often well-tuned, circumstances change as systems age, and large databases are especially vulnerable to that drift.

Good planning is your friend: the goal is to address potential issues for a large database by anticipating future conditions, for example by testing the entire infrastructure before it goes into production.

Scripting your build environment using tools such as Ansible, Puppet, or Terraform mitigates human error when provisioning the underlying infrastructure. It’s important to be able to build in a consistent and repeatable manner.

Once a database is in production it must be monitored and wired with alerts for the various critical thresholds. Aside from the standard errors, consider configuring your monitoring solution to follow the “Rule Of Three”: select and watch just three metrics that track and alert on a specific “change of state”. This is not to be confused with following a particular issue; rather, it is meant to tell you that something has changed from what is considered normal behavior and deserves your attention. Depending on your preferences, you may want to watch for known production issues, or, when the system is stable, you might be more interested in trending alerts, such as query performance that has degraded beyond a predefined threshold.
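
As an illustration of a trending metric, the sketch below pulls the slowest statements by average execution time. It assumes the pg_stat_statements extension is installed and uses the mean_exec_time column from PostgreSQL 13 onward (earlier releases call it mean_time); the threshold you alert on is up to you:

    -- Top three statements by average execution time, a candidate
    -- "change of state" metric for query performance trending.
    SELECT query,
           calls,
           round(mean_exec_time::numeric, 2) AS avg_ms
    FROM   pg_stat_statements
    ORDER  BY mean_exec_time DESC
    LIMIT  3;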

In regards to system tuning: while small databases can, after a fashion, perform in a satisfactory manner using the default values, large databases cannot. Configuring initial tuning parameters such as shared_buffers is de rigueur, but you should also monitor the environment in order to trend issues such as bloat and long-term query performance. Remember, the most common problem experienced by an otherwise stable and well-thought-out architecture is table and index bloat; addressing bloat by tuning the autovacuum characteristics is essential.
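
A minimal sketch of what that looks like in practice follows. The shared_buffers value and the table name "orders" are placeholders, not recommendations, and the per-table autovacuum settings are simply one way of making vacuum more aggressive on a heavily updated relation:

    -- Cluster-wide setting; takes effect only after a restart.
    ALTER SYSTEM SET shared_buffers = '8GB';

    -- Make autovacuum trigger sooner on one heavily updated table.
    ALTER TABLE orders
      SET (autovacuum_vacuum_scale_factor = 0.01,
           autovacuum_vacuum_cost_delay   = 2);

    -- Quick bloat indicator: dead-to-live tuple ratio per table.
    SELECT relname,
           n_live_tup,
           n_dead_tup,
           round(n_dead_tup::numeric / greatest(n_live_tup, 1), 2) AS dead_ratio
    FROM   pg_stat_user_tables
    ORDER  BY n_dead_tup DESC
    LIMIT  10;

The dead-tuple ratio is only a rough signal; the pgstattuple extension gives a more accurate picture when you need one.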

Monitoring, especially before and after maintenance windows, is required because it can catch potential problems with an update before they become production issues.

Pay close attention to the regular maintenance activities during the life cycle of your system:

  • Logical backups and misc database redundancies
  • Architectural evolution:
    • application stack updates, upgrades, and rollbacks
    • application/database scaling
  • PostgreSQL server upgrades:
    • minor
    • major
  • Database migrations
  • Hardware upgrades
  • Scaling the cluster by adding and removing server nodes

Maintenance activities such as logical backups and PostgreSQL minor upgrades are performed at regular intervals.

Plan for space utilization requirements of logical dumps and WAL archives.
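
Two quick queries can feed that planning. Note that pg_ls_waldir() reports the live pg_wal directory (not an external archive) and typically requires superuser or pg_monitor privileges, so treat the numbers as a lower bound:

    -- Current size of the WAL directory.
    SELECT pg_size_pretty(sum(size)) AS wal_dir_size
    FROM   pg_ls_waldir();

    -- Largest tables, a rough proxy for logical dump size and duration.
    SELECT relname,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM   pg_class
    WHERE  relkind = 'r'
    ORDER  BY pg_total_relation_size(oid) DESC
    LIMIT  10;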

In regards to logical backups: it can be difficult to justify backing up an entire database when doing so can take a week. Differential backups are a potential alternative: tables that are updated and deleted regularly can be backed up more frequently, while slower-changing tables can be archived once and retained for a longer period of time. This approach, however, requires the appropriate architectural design considerations, such as table partitioning.
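
A sketch of that design follows, using range partitioning by date; the table and column names are illustrative only. Recent partitions holding the fast-moving data can then be dumped frequently (for example with pg_dump --table on the individual partition), while older, static partitions are archived once:

    -- Parent table partitioned by date so that hot and cold data
    -- live in separate, individually dumpable partitions.
    CREATE TABLE events (
        event_id   bigint      NOT NULL,
        created_at timestamptz NOT NULL,
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE events_2022_q1 PARTITION OF events
        FOR VALUES FROM ('2022-01-01') TO ('2022-04-01');

    CREATE TABLE events_2022_q2 PARTITION OF events
        FOR VALUES FROM ('2022-04-01') TO ('2022-07-01');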

An alternative to logical backups is to consider Copy On Write (COW), or stacked, file systems such as ZFS and BTRFS. Containerized environments, for example, can leverage snapshots and clones, allowing for near-instant recoveries in a disaster recovery scenario.

Complex operations, such as hardware and database scaling, encompass many sub-activities and can often involve working with several teams at the same time. In this case, maintaining reference documentation is critical. Activities such as these are best tracked and planned in a Kanban, or Scrum, environment.

In regards to Disaster Recovery (DR) consider automating the following operations:

  • Recovery via Logical backups
  • Failover of a PRIMARY to a REPLICA (see the promotion sketch after this list)
  • Dropping/Building a new REPLICA
  • Point In Time Recovery (PITR): rebuilding a cluster to a point in time
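
For the failover step, promotion itself is a single call on the replica (PostgreSQL 12 and later); in practice an automation or high-availability tool usually wraps it together with fencing the old primary and repointing the application:

    -- Run on the REPLICA that is to become the new primary.
    SELECT pg_is_in_recovery();   -- true while still a replica
    SELECT pg_promote();          -- returns true once promotion has begun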

As an aside on PITR: instead of rebuilding an entire database cluster from scratch to a particular point in time, one can maintain a STANDBY host that replays WAL on a delay and can be recovered to a particular point in time or promoted in its current state. Refer to the run-time parameter recovery_min_apply_delay for more information.
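
A minimal sketch of such a delayed standby, assuming PostgreSQL 12 or later where recovery settings are ordinary configuration parameters; the four-hour delay is an arbitrary example value:

    -- On the delayed STANDBY: keep replay four hours behind the primary.
    -- Only a configuration reload is needed, not a restart.
    ALTER SYSTEM SET recovery_min_apply_delay = '4h';
    SELECT pg_reload_conf();

    -- If a damaging change reaches the primary, pause replay on the
    -- standby (or promote it) before the change is applied here.
    SELECT pg_wal_replay_pause();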

In conclusion: while small databases can be administered in an ad hoc manner, a large database must always be managed using a more rigorous and conscientious approach. And what you learn from administering a large database can be carried over to administering a small one.
