No such thing as smooth Database switch

Unless you summon the rulers of Olympus.

When designing a database abstraction layer, it is very tempting to fall into the trap of wanting to create a single module that handles MySQL, Postgres, MSSQL or any other database engine. I know I did.

A dream come true

SQL, like the name suggests is a very simple query language. You can easily read it and learn the basics in a matter of days. Sure, it can become very complex depending on the project but I would say any programmer that spends enough time and efforts can learn it for his project fairly quickly.

SELECT `name`, `address`, `website` FROM `contractor`

The query itself is self explanatory. In the example above, we are trying to get the name, address and website of contractors from our database.

There has been a lot of effort made to turn applications that use SQL database engines into a business readable entities. What this means, the guys in charge of making business decisions can read and write an even simpler version of the query language without caring much about the underlying technology.

In reality of course the business guys and gals have no business looking at the code. But because of this effort we now have Object Relational Models or ORM for short.

$result = $orm->find('name','address','homepage')
        ->from("contractor")
        ->execute();

This is very readable indeed. The person writing it does not need to know what kind of database engine is holding the data. When I started a new project in Symfony, my team used Doctrine as an ORM.

It started well, we used code like the example above. But as the project grew and became more complex, there were some queries that were very inefficient when they were automatically generated by the ORM. So we started writing our own queries which defeated the purpose of using an ORM. We still used it on other places because it required less lines of code:

$result = $em->get("TableName")->findOneByName("jeremy creek");

The function findOneByName does not exists but Doctrine is smart enough to read the name of the function and automatically generate the correct query.

However, the time came where our project became bloated and very slow. The DBA team told us that MySQL was the bottleneck in some places and suggested that we switched to the more robust Postgres Database engine. "Thank God, we used Doctrine", is what the lead dev said.

The illusion of Doctrine fades away.

Doctrine is an ORM service that you use when you don't want to write SQL yourself. It is very complex and can do almost anything you want if you take the time to customize it. It abstracts any SQL from your code. As a matter of fact, if you use it properly, you don't need to write a single line of SQL. Doctrine is pretty damn slow, and it uses a lot of memory for the simplest task. But we used it anyway because it allowed us to have working prototypes in a matter of days instead of weeks.

But it is inevitable, as your application grow, no matter how much you planned it there will be a time where you need to customize it to your needs. And sometimes customization means not following the framework rules. And that means you will not have the smooth transition you hoped for when switching database engine.

Each database engines have small syntactical differences. A simple different between MySQL and Postgres is when MySQL uses back tics ``` while the other uses single quotes '.

Of course on our side the back tics were minimal. But the whole queries themselves needed to be updated. Making the switch was not a simple task, it was a whole project on its own. Making sure that the old and new queries performed the exact same tasks and returned the same data.

One thing to remember is that the database is not something to take lightly. You don't change databases like you change JavaScript frameworks. Your database is an integral part of your application. You wouldn't switch your programming languages just because one is slightly faster would you? You make the change when there are obvious benefits that outweigh the time and money it takes to make the change.

Despite Angular being very popular and supported by Google, Google didn't switch all their applications to run on it. It takes time to make the switch and even if it might be better, the cost of moving all your developers to use the new paradigm is not worth it.

So back to SQL. Here I am giving a generous example by using a popular ORM like Doctrine. But in reality, people make their own ORMs catered to their custom applications. They try to make it as abstract as possible to make the transition to different Databases swift. Projects rarely have to switch databases, and our cases was a rare one that was fought harshly but still approved.

You will probably never port your database to another engine, and if you do, no matter how abstracted it is, you will almost entirely rewrite it from scratch. Abstracting your data layer to the level of a database is like using Java to create your Applet. You can dream as much as you want but it's not going to work on your mac as well as your windows machine.

Easing the switch

The best way to make it easy to switch, is to organize your code from the beginning. Make your code look wrong when it is wrong. If all your SQL code is organized in known places, where you can close your eyes and find them then you know that you don't have to wear your hunting hat to find SQL in the wild.

If you find SQL code anywhere inside your controller or in a template then you know it is wrong. Let's look at a good structure and a bad structure.

Example of badly organized code

<html>
 ....
<?php
    $con = new mysqli("localhost","user","password","test");
?>
<div>
    <?php
        // Recent articles
        $sql = "SELECT * 
        FROM articles
        WHERE published = 1
        ORDER BY date_added DES
        LIMIT 10";

        $result = $con->query($sql);
        ...

    foreach($articles as $art):
    ?>
        <h2><?php echo $art->title?></h2>
        ...
    <?php endforeach;?>


    <?php
    // recent comments
    get_recent_comments();
    ?>
</div>
...

In this case, you can find SQL inside the html template, and another function generates the comments and is probably defined in a completely different file. If you want to upgrade the database, you will have to hunt down all the places you think there is a query and hoping not to forget any.

Example of well organized

class HomeController extend Controller {

    ...

    public function indexAction() {

        $em = $this->getDatabaseManager();

        // Recent Articles
        $articles = $em->get("Article")->getRecent(self::ARTICLE_PER_PAGE);

        // Recent Comments
        $comments = $em->get("Comment")->getLatest(self::COMMENT_LATEST_COUNT);

        // Most Popular
        $popular_articles = $em->get("Article")->getMostPopular(self::HOMEPAGE_POPULAR_COUNT);
        ...

    }

}

In this example, you don't see any SQL on the files where you call them (the controller in this example). The Database is abstracted, only so I don't have to have see SQL all over my code. We can have an Article class that is contains all the queries related to the article table. The same for the Comments and any other table you have. They can all be in their own folder so when the time comes to make a database switch, all we need to do is go through that folder and update a file at at time.


There is rarely a need to switch database engines. If you get to the point where you do, you probably are making lots of money and can afford to have specialized people to do the job for you.

But we can all dream, and hope for a future were we can have a unified SQL syntax that works exactly the same on any engine. But MySQL users like the way they write things, Postgres users have their own circle. How about MSSQL users? Which syntax is to be the standards? And don't even get me started with NoSQL engines.


Comments

There are no comments added yet.

Let's hear your thoughts

For my eyes only