Writing SQL that works on PostgreSQL, MySQL and SQLite

I am one of those crazy people who attempts to write SQL that works on
SQlite, MySQL and PostgreSQL. First I should explain why:
This is all for my project sabre/dav. sabre/dav is a server for CalDAV,
CardDAV and WebDAV. One of the big design goals is that it this project has to
be a library first, and should be easily integratable into existing
applications.
To do this effectively, it’s important that it’s largely agnostic to the host
platform, and one of the best ways (in my opinion) to achieve that is to have
as little dependencies as possible. Adding dependencies such as Doctrine
is a great idea for applications or more opinionated frameworks, but for
sabre/dav lightweight is key, and I need people to be able to understand the
extension points fairly easily, without requiring them to familiarize them
with the details of the dependency graph.
So while you are completely free to choose to add Doctrine or Propel
yourself, the core adapters (wich function both as a default implementation
and as samples for writing your own), all only depend on an instance of
PDO.
The nice thing is that ORMs such as Doctrine and Propel, you can get access
to the underlying PDO connection object and pass that, thus reusing your
existing configuration.
For the longest time we only supported SQLite and MySQL, but I’m now working
on adding PostgreSQL support. So I figured, I might as well write down my
notes.
But how feasable is it to write SQL that works everywhere?
Well, it turns out that this is actually not super easy. There is such as
thing as Standard SQL, but all of these databases have many of their
own extensions and deviations.
The most important thing is that this will likely only work well for you if
you have a very simple schema and simple queries.
Well, this blog post is not intended as a full guide, I’m just listing the
particular things I’ve ran into. If you have your own, you can edit this blog
post on github, or leave a comment.
My approach
I try to keep my queries as simple as possible.
If I can rewrite a query to work on every database, that query will have the
preference.
I avoid stored procedures, triggers, functions, views. I’m really just
dealing with tables and indexes.
Even if that means that it’s not the most optimal query. So I’m ok with
sarcrificing some performance, if that means my queries can stay generic,
within reason.
If there’s no possible way to do things in a generic way, I fall back on
something like this:
<?php
if ($pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === ‘pgsql’) {
$query = "…";
} else {
$query = "…’;
}
$stmt = $pdo->prepare($query);
?>
DDL
First there is the “Data Definition Language” and “Data Manipulation Language”
the former is used for queries starting with CREATE, ALTER, DROP, etc,
and the latter SELECT, UPDATE, DELETE, INSERT.
There really is no sane way to generalize your CREATE TABLE queries, as the
types and syntax are vastly different.
So for those we have a set of .sql files for every server.
Quoting
In MySQL and SQlite you can use either quotes ‘ or double quotes " to wrap
a string.
In PostgreSQL, you always have to use single quotes ‘.
In MySQL and SQLite you use backticks for identifiers. PostgreSQL uses single
quotes. SQlite can also use single quotes here if the result is unambigious,
but I would strongly suggest to avoid that.
This means that this MySQL query:
SELECT * FROM `foo` WHERE `a` = "b"
is equivalent to this PostgreSQL query:
SELECT * FROM "foo" WHERE "a" = ‘b’
Luckily you can often just write this query, which works for all databases:
SELECT * FROM foo WHERE a = ‘b’
But keep in mind that when you create your tables, using double quotes will
cause PostgreSQL to retain the upper/lower case characters. If you do not use
quotes, it will normalize everything to lower case.
For compatibility I would therefore suggest to make sure that all your table
and column names are in lower case.
REPLACE INTO
The REPLACE INTO is a useful extension that is supported by both SQLite and
MySQL. The syntax is identical INSERT INTO, except that if it runs into a
key conflict, it will overwrite the existing record instead of inserting a new
one.
So REPLACE INTO basically either updates or inserts a new record.
This works on both SQLite and MySQL, but not PostgreSQL. Since version 9.5
PostgreSQL gained a new feature that allows you to achieve the same effect.
This statement from MySQL or SQLite:
REPLACE INTO blog (uuid, title) VALUES (:uuid, :title)
then might become something like this in PostgreSQL:
INSERT INTO blog (uuid, title) VALUES (:uuid, :title)
ON CONFLICT (uuid) DO UPDATE SET title = :title
So the major difference here is with PostgreSQL we specifically have to tell
it which key conflict we’re handling (uuid) and what to do in that case
(UPDATE).
In addition to REPLACE INTO, MySQL also has this syntax to do the same thing:
INSERT INTO blog (uuid, title) VALUES (:uuid, :title)
ON DUPLICATE KEY UPDATE title = :title
But as far as I know SQLite does not have a direct equivalent.
BLOB
SQLite and MySQL have a BLOB type. This type is used for storing data as-is.
Whatever (binary) string you store, you will retrieve again and no conversion
is attempted for different character sets.
PostgreSQL has two types that have a similar purpose: Large Objects and
the bytea type.
The best way to describe large objects, is that they are stored ‘separate’ from
the table, and instead of inserting the object itself, you store a reference to
the object (in the form of an id).
bytea is more similar to BLOB, so I opted to use that. But there are some
differences.
First, if you do a select such as this:
<?php
$stmt = $pdo->prepare(‘SELECT myblob FROM binaries WHERE id = :id’);
$stmt->execute([‘id’ => $id]);
echo $stmt->fetchColumn();
?>
On MySQL and Sqlite this will just work. The myblob field is represented as
a string.
On PostgreSQL, byta is represented as a PHP stream. So you might have to
rewrite that last statement as:
<?php
echo stream_get_contents($stmt->fetchColumn());
?>
Or:
<?php
stream_copy_to_stream($stmt->fetchColumn(), STDOUT);
?>
Luckily in sabre/dav we pretty much support streams where we also support
strings, so we were already agnositic to this, but some unittests had to be
adjusted.
Inserting bytea is also a bit different. I’m not a fan of of using
PDOStatement::bindValue and PDOStatement::bindParam, instead
I prefer to just send all my bound parameters at once using execute:
<?php
$stmt = $pdo->prepare(‘INSERT INTO binaries (myblob) (:myblob)’);
$stmt->execute([
‘myblob’ => $blob
]);
?>
While that works for PostgreSQL for some strings, it will throw errors
when you give it data that’s invalid in the current character set. It’s also
dangerous, as PostgreSQL might try to transcode the data into a different
character set.
If you truly need to store binary data (like I do) you must do this:
<?php
$stmt = $pdo->prepare(‘INSERT INTO binaries (myblob) (:myblob)’);
$stmt->bindParam(‘myblob’, $blob, PDO::PARAM_LOB);
$stmt->execute();
?>
Luckily this also just works in SQlite and MySQL.
String concatenation
Standard SQL has a string concatenation operator. It works like this:
SELECT ‘foo’ || ‘bar’
// Output: foobar
This works in PostgreSQL and Sqlite. MySQL has a function for this:
SELECT CONCAT(‘foo’, ‘bar’)
PostgreSQL also has this function, but SQLite does not. You can enable Standard
SQL concatenation in MySQL by enabling it:
SET SESSION sql_mode = ‘PIPES_AS_CONCAT’
I’m not sure why this isn’t the default.
Last insert ID
The PDO object has a lastInsertId() function. For SQLite and MySQL you can
just call it as such:
<?php
$id = $pdo->lastInsertId();
?>
However, PostgreSQL requires an explicit sequence identifier. By default this
follows the format tablename_idfield_seq, so we might specifiy as this:
<?php
$id = $pdo->lastInsertId(‘articles_id_seq’);
?>
Luckily the parameter gets ignored by SQLite and MySQL, so we can just specify
it all the time.
Type casting
If you have an INT field (or similar) and you access it in this way:
<?php
$result = $pdo->query(‘SELECT id FROM articles’);
$id = $result->fetchColumn();
?>
With PostgreSQL $id will actually have the type php type integer. If you
use MySQL or SQlite, everything gets cast to a php string, which is
unfortunate.
The sane thing to do is to cast everything to int after the fact, so you can
correctly do PHP 7 strict typing with these in the future.
Testing
I unittest my database code. Yep, you read that right! I’m one of those people.
It’s been tremendously useful.
Since adding PostgreSQL I was able to come up with a nice structure. Every
unittest that does something with PDO now generally looks like this:
<?php
abstract PDOTest extends \PHPUnit_Framework_TestCase {
abstract function getPDO();
/** all the unittests go here **/
}
?>
Then I create one subclass for PostgreSQL, Sqlite and MySQL that each only
implement the getPDO() function.
This way all my tests are repeated for each driver.
I’ve also rigged up Travis CI to have a MySQL and a PostgreSQL database
server running, so everything automatically gets checked every time.
If a developer is testing locally, we detect if a database server is running,
and automatically just skip the tests if this was not the case. In most cases
this means only the Sqlite tests get hit, which is fine.
Conclusions
Created a monster.
PostgreSQL is by far the sanest database, and I would recommend everyone to
move from MySQL towards it.
via Planet MySQL
Writing SQL that works on PostgreSQL, MySQL and SQLite